bpf: Ringbuf: Use an acquire load for the ring buffer entry length
On the kernel side, the length is updated with xchg(), which implies a
memory barrier, when the data is actually committed to the ring
buffer [1].
On the user space side, the volatile read is not sufficient to prevent
the data read from being reordered before the load of the length. Load
the length with acquire semantics instead.
[1] https://github.com/torvalds/linux/blob/a20971c187522f5a7cd8e961e7e9c88f31ea2bed/kernel/bpf/ringbuf.c#L484
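The ordering requirement described above can be illustrated with a minimal,
self-contained sketch. It is not the BpfRingbuf.h code: Record, kBusyBit,
Commit, TryConsume and RunOnce are hypothetical names, and a release store
stands in for the kernel's xchg() on commit. The point is only that the
consumer's acquire load of the length must come before any read of the data.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for BPF_RINGBUF_BUSY_BIT.
constexpr uint32_t kBusyBit = 1u << 31;

// Hypothetical record: a length header followed by the payload.
struct Record {
    std::atomic<uint32_t> length{kBusyBit};  // busy until committed
    uint64_t payload = 0;
};

// Producer: write the payload first, then publish the length.
// The release store is an assumption standing in for the kernel's xchg().
inline void Commit(Record& rec, uint64_t value) {
    rec.payload = value;
    rec.length.store(sizeof(rec.payload), std::memory_order_release);
}

// Consumer: returns true and fills *out only if the record is committed.
inline bool TryConsume(const Record& rec, uint64_t* out) {
    // The acquire load keeps the payload read below from being
    // reordered before the length load.
    uint32_t length = rec.length.load(std::memory_order_acquire);
    if (length & kBusyBit) return false;  // not committed yet
    assert(length == sizeof(rec.payload));
    *out = rec.payload;
    return true;
}

// Runs one producer thread against a polling consumer and returns the
// value the consumer observed.
inline uint64_t RunOnce(uint64_t value) {
    Record rec;
    uint64_t seen = 0;
    std::thread producer([&] { Commit(rec, value); });
    while (!TryConsume(rec, &seen)) {
        std::this_thread::yield();
    }
    producer.join();
    return seen;
}
```

With a plain volatile read in place of the acquire load, the compiler or CPU
would be free to read payload before length on weakly ordered architectures,
which is the reordering this change rules out.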
Bug: 374722456
Bug: 368624834
Bug: 376536942
Change-Id: I75eee3deee2afce83c1b760e6df383375f926ebb
Signed-off-by: Kalesh Singh <kaleshsingh@google.com>
diff --git a/bpf/headers/include/bpf/BpfRingbuf.h b/bpf/headers/include/bpf/BpfRingbuf.h
index 4bcd259..5fe4ef7 100644
--- a/bpf/headers/include/bpf/BpfRingbuf.h
+++ b/bpf/headers/include/bpf/BpfRingbuf.h
@@ -99,6 +99,7 @@
// 32-bit kernel will just ignore the high-order bits.
std::atomic_uint64_t* mConsumerPos = nullptr;
std::atomic_uint32_t* mProducerPos = nullptr;
+ std::atomic_uint32_t* mLength = nullptr;
// In order to guarantee atomic access in a 32 bit userspace environment, atomic_uint64_t is used
// in addition to std::atomic<T>::is_always_lock_free that guarantees that read / write operations
@@ -247,7 +248,8 @@
// u32 len;
// u32 pg_off;
// };
- uint32_t length = *reinterpret_cast<volatile uint32_t*>(start_ptr);
+ mLength = reinterpret_cast<decltype(mLength)>(start_ptr);
+ uint32_t length = mLength->load(std::memory_order_acquire);
// If the sample isn't committed, we're caught up with the producer.
if (length & BPF_RINGBUF_BUSY_BIT) return count;