# zygote-start is what officially starts netd (see //system/core/rootdir/init.rc)
# However, on some hardware it's started from post-fs-data as well, which is just
# a tad earlier.  There's no benefit to that though, since on 4.9+ P+ devices netd
# will just block until bpfloader finishes and sets the bpf.progs_loaded property.
#
# It is important that we start bpfloader after:
#   - /sys/fs/bpf is already mounted,
#   - apex (incl. rollback) is initialized (so that in the future we can load bpf
#     programs shipped as part of apex mainline modules),
#   - logd is ready for us to log stuff.
#
# At the same time we want to be as early as possible to reduce races and thus
# failures (before memory is fragmented, and cpu is busy running tons of other
# stuff) and we absolutely want to be before netd and before the system boot slot
# is considered to have booted successfully.
#
on load_bpf_programs
    exec_start bpfloader

service bpfloader /system/bin/netbpfload
    # netbpfload will do network bpf loading, then execute /system/bin/bpfloader
    capabilities CHOWN SYS_ADMIN NET_ADMIN
    # The following group memberships are a workaround for lack of DAC_OVERRIDE
    # and allow us to open (among other things) files that we created and are
    # no longer root owned (due to CHOWN) but are still group readable by
    # one of the following groups.  This is not perfect, but a more correct
    # solution requires significantly more effort to implement.
    group root graphics network_stack net_admin net_bw_acct net_bw_stats net_raw system
    user root
    #
    # Set RLIMIT_MEMLOCK to 1GiB for bpfloader
    #
    # Actually only 8MiB would be needed if bpfloader ran as its own uid.
    #
    # However, while the rlimit is per-process, the accounting is system wide
    # (per uid).  So, for example, if the graphics stack has already allocated
    # 10MiB of memlock data before bpfloader even gets a chance to run, it would
    # fail if its memlock rlimit is only 8MiB - since there would be none left for it.
    #
    # bpfloader succeeding is critical to system health, since a failure will
    # cause a netd crashloop and thus a system server crashloop... and the only
    # recovery is a full kernel reboot.
    #
    # We've had issues where devices would sometimes (rarely) boot into
    # a crashloop because bpfloader would occasionally lose a boot time
    # race against the graphics stack's boot time locked memory allocation.
    #
    # Thus bpfloader's memlock has to be 8MiB higher than the locked memory
    # consumption of the root uid anywhere else in the system...
    # But we don't know what that is for all possible devices...
    #
    # Ideally, we'd simply grant bpfloader the IPC_LOCK capability and it
    # would simply ignore its memlock rlimit... but it turns out that this
    # capability is not even checked by the kernel's bpf system call.
    #
    # As such we simply use 1GiB as a reasonable approximation of infinity.
    #
    rlimit memlock 1073741824 1073741824
    oneshot
    #
    # How to debug bootloops caused by 'bpfloader-failed'.
    #
    # 1. On some lower RAM devices (like wembley) you may need to first enable developer mode
    #    (from the Settings app UI), and change the developer option "Logger buffer sizes"
    #    from the default (wembley: 64kB) to the maximum (1M) per log buffer.
    #    Otherwise the buffer will overflow before you manage to dump it and you'll get useless logs.
    #
    # 2. comment out 'reboot_on_failure reboot,bpfloader-failed' below
    # 3. rebuild/reflash/reboot
    # 4. as the device is booting up capture bpfloader logs via:
    #    adb logcat -s 'bpfloader:*' 'LibBpfLoader:*' 'NetBpfLoad:*' 'NetBpfLoader:*'
    #
    #    something like:
    #    $ adb reboot; sleep 1; adb wait-for-device; adb root; sleep 1; adb wait-for-device; adb logcat -s 'bpfloader:*' 'LibBpfLoader:*' 'NetBpfLoad:*' 'NetBpfLoader:*'
    #    will take care of capturing logs as early as possible
    #
    # 5. look through the logs from the kernel's bpf verifier that bpfloader dumps out.
    #    It usually makes sense to search back from the end and find the particular
    #    bpf verifier failure that caused bpfloader to terminate early with an error code.
    #    This will probably be something along the lines of 'too many jumps',
    #    'cannot prove return value is 0 or 1', 'unsupported / unknown operation / helper',
    #    'invalid bpf_context access', etc.
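    #
    # A hedged aside (an assumption, not one of the steps above): if the device stays
    # up long enough for 'adb shell' to work, the bpf.progs_loaded property mentioned
    # at the top of this file can be read to check whether loading completed:
    #    $ adb shell getprop bpf.progs_loaded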
    #
    reboot_on_failure reboot,bpfloader-failed
    # we're not really updatable, but want to be able to load bpf programs shipped in apexes
    updatable