BpfSyscallWrappers: grab shared lock on writable map open

Also add an accessor that grabs an exclusive lock when opening a map R/W
(a map held exclusively could then be accessed via a write-through cache).

Note: we can't use flock(), as that would lock the entire inode,
and all bpf maps (currently) share the *same* anonymous inode.

As such we actually grab a lock on a range (a single byte),
the offset being determined by the unique bpf map id.

We include some very simple, but sufficient, correctness tests
in the critical boot path: this is to prevent any surprises
caused by kernel implementation changes.
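
For illustration only, the single-byte range lock and the kind of sanity
check described above might look roughly like the sketch below. The names
lockMapByte() and selfCheck() are hypothetical and are not taken from this
change, and a temporary file stands in for a bpf map fd so that the sketch
can run anywhere; it assumes Linux >= 3.15 (F_OFD_SETLK).

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  // F_OFD_SETLK is a Linux-specific extension
#endif
#include <fcntl.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: take a non-blocking OFD lock on the single byte at
// offset `mapId`. lockType is F_RDLCK (shared) or F_WRLCK (exclusive).
// Returns 0 on success, -1 with errno set on failure.
inline int lockMapByte(int fd, uint32_t mapId, short lockType) {
    assert(lockType == F_RDLCK || lockType == F_WRLCK);
    struct flock fl = {};    // zeroed: l_pid must be 0 for OFD lock commands
    fl.l_type = lockType;
    fl.l_whence = SEEK_SET;
    fl.l_start = mapId;      // the offset encodes the map id
    fl.l_len = 1;            // lock exactly one byte
    return fcntl(fd, F_OFD_SETLK, &fl);
}

// A conflicting non-blocking lock request fails with EAGAIN (or EACCES).
inline bool wouldBlock() { return errno == EAGAIN || errno == EACCES; }

// Stand-in sanity check: two independent opens of one temporary file play
// the role of two opens of the same map (an assumption for this sketch).
inline bool selfCheck() {
    char path[] = "/tmp/ofd-lock-XXXXXX";
    int a = mkstemp(path);                   // first open file description
    if (a < 0) return false;
    int b = open(path, O_RDWR | O_CLOEXEC);  // second, independent one
    unlink(path);
    if (b < 0) { close(a); return false; }

    bool ok = lockMapByte(a, 7, F_RDLCK) == 0                   // shared
           && lockMapByte(b, 7, F_RDLCK) == 0                   // shared + shared coexist
           && lockMapByte(b, 7, F_WRLCK) == -1 && wouldBlock()  // exclusive blocked by a
           && lockMapByte(b, 8, F_WRLCK) == 0                   // other offset is independent
           && lockMapByte(a, 8, F_RDLCK) == -1 && wouldBlock(); // shared blocked by b
    close(a);
    close(b);
    return ok;
}
```

Calling selfCheck() once early on and aborting if it returns false is one
way to notice a change in kernel locking semantics.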

$ adb root && sleep 1 && adb wait-for-device shell grep OFDLCK /proc/locks
id: OFDLCK ADVISORY [WRITE|READ] pid blkmaj:min:inode min_offset max_offset
11: OFDLCK ADVISORY  READ -1 00:0e:1048 36 36
14: OFDLCK ADVISORY  READ -1 00:0e:1048 35 35
15: OFDLCK ADVISORY  READ -1 00:0e:1048 41 41
16: OFDLCK ADVISORY  READ -1 00:0e:1048 40 40
22: OFDLCK ADVISORY  READ -1 00:0e:1048 24 24
23: OFDLCK ADVISORY  READ -1 00:0e:1048 17 17
24: OFDLCK ADVISORY  READ -1 00:0e:1048 16 16
25: OFDLCK ADVISORY  READ -1 00:0e:1048 13 13

OFDLCK stands for 'Open File Description LoCK': such a lock is
associated with (held by) the open file description and not a process/pid,
on the given (anonymous in this case) block device + inode.

Here READ means shared and WRITE means exclusive.
There are (as yet) no exclusive locks held post boot.

The pid field is unfortunately always -1 (and cannot be manually set).

The 00:0e:1048 (or at least the inode portion) is not a stable value
(it likely depends on boot ordering).

The final two fields (min and max offset) are the bpf map id.
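
As a rough sketch of how the output above could be consumed, the helper
below pulls the map id and lock mode out of one /proc/locks line. The
OfdLock struct and parseOfdLockLine() are hypothetical names invented for
this example; the field layout is assumed to match the sample lines shown
above.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical result type for this sketch.
struct OfdLock {
    unsigned long long mapId;  // byte offset, i.e. the bpf map id per the text above
    bool exclusive;            // WRITE => exclusive, READ => shared
};

// Parses a line such as:
//   11: OFDLCK ADVISORY  READ -1 00:0e:1048 36 36
// Returns true and fills *out only for single-byte OFD locks.
inline bool parseOfdLockLine(const char* line, OfdLock* out) {
    assert(line != nullptr && out != nullptr);
    int ordinal = 0, pid = 0;
    unsigned major = 0, minor = 0;  // printed in hex
    unsigned long inode = 0;        // printed in decimal
    unsigned long long start = 0, end = 0;
    char kind[16] = {}, mode[16] = {}, access[16] = {};

    int n = std::sscanf(line, "%d: %15s %15s %15s %d %x:%x:%lu %llu %llu",
                        &ordinal, kind, mode, access, &pid,
                        &major, &minor, &inode, &start, &end);
    if (n != 10) return false;                       // e.g. the end offset is "EOF"
    if (std::strcmp(kind, "OFDLCK") != 0) return false;
    if (start != end) return false;                  // not a one-byte range
    bool isRead = std::strcmp(access, "READ") == 0;
    bool isWrite = std::strcmp(access, "WRITE") == 0;
    if (!isRead && !isWrite) return false;

    out->mapId = start;
    out->exclusive = isWrite;
    return true;
}
```

Feeding it each line of `grep OFDLCK /proc/locks` yields the set of map ids
that currently have a lock held on them.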

Test: TreeHugger
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Change-Id: I208e3450da3fe4689ad5fd578539f401f25a4fef
3 files changed