Maciej Żenczykowski | 8c21593 | 2024-03-05 19:02:11 -0800 | [diff] [blame] | 1 | # Note: This will actually execute /apex/com.android.tethering/bin/netbpfload |
| 2 | # by virtue of 'service bpfloader' being overridden by the apex-shipped .rc |
| 3 | # Warning: most of the below settings are irrelevant unless the apex is missing. |
| 4 | service bpfloader /system/bin/false |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 5 | # netbpfload will do network bpf loading, then execute /system/bin/bpfloader |
Maciej Żenczykowski | 8c21593 | 2024-03-05 19:02:11 -0800 | [diff] [blame] | 6 | #! capabilities CHOWN SYS_ADMIN NET_ADMIN |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 7 | # The following group memberships are a workaround for lack of DAC_OVERRIDE |
| 8 | # and allow us to open (among other things) files that we created and are |
| 9 | # no longer root owned (due to CHOWN) but still have group read access to |
| 10 | # one of the following groups. This is not perfect, but a more correct |
| 11 | # solution requires significantly more effort to implement. |
Maciej Żenczykowski | 8c21593 | 2024-03-05 19:02:11 -0800 | [diff] [blame] | 12 | #! group root graphics network_stack net_admin net_bw_acct net_bw_stats net_raw system |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 13 | user root |
| 14 | # |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 15 | # Set RLIMIT_MEMLOCK to 1GiB for bpfloader |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 16 | # |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 17 | # Actually only 8MiB would be needed if bpfloader ran as its own uid. |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 18 | # |
| 19 | # However, while the rlimit is per-process, the accounting is per uid across the whole system. |
| 20 | # So, for example, if the graphics stack has already allocated 10MiB of |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 21 | # memlock data before bpfloader even gets a chance to run, it would fail |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 22 | # if its memlock rlimit is only 8MiB - since there would be none left for it. |
| 23 | # |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 24 | # bpfloader succeeding is critical to system health, since a failure will |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 25 | # cause a netd crashloop and thus a system server crashloop... and the only |
| 26 | # recovery is a full kernel reboot. |
| 27 | # |
| 28 | # We've had issues where devices would sometimes (rarely) boot into |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 29 | # a crashloop because bpfloader would occasionally lose a boot time |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 30 | # race against the graphics stack's boot time locked memory allocation. |
| 31 | # |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 32 | # Thus bpfloader's memlock has to be 8MiB higher than the locked memory |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 33 | # consumption of the root uid anywhere else in the system... |
| 34 | # But we don't know what that is for all possible devices... |
| 35 | # |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 36 | # Ideally, we'd simply grant bpfloader the IPC_LOCK capability and it |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 37 | # would simply ignore its memlock rlimit... but it turns out that this |
| 38 | # capability is not even checked by the kernel's bpf system call. |
| 39 | # |
| 40 | # As such we simply use 1GiB as a reasonable approximation of infinity. |
| 41 | # |
Maciej Żenczykowski | 8c21593 | 2024-03-05 19:02:11 -0800 | [diff] [blame] | 42 | #! rlimit memlock 1073741824 1073741824 |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 43 | oneshot |
| 44 | # |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 45 | # How to debug bootloops caused by 'bpfloader-failed'. |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 46 | # |
| 47 | # 1. On some lower RAM devices (like wembley) you may need to first enable developer mode |
| 48 | # (from the Settings app UI), and change the developer option "Logger buffer sizes" |
| 49 | # from the default (wembley: 64kB) to the maximum (1M) per log buffer. |
| 50 | # Otherwise the buffer will overflow before you manage to dump it and you'll get useless logs. |
| 51 | # |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 52 | # 2. comment out 'reboot_on_failure reboot,bpfloader-failed' below |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 53 | # 3. rebuild/reflash/reboot |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 54 | # 4. as the device is booting up capture bpfloader logs via: |
| 55 | # adb logcat -s 'bpfloader:*' 'LibBpfLoader:*' 'NetBpfLoad:*' 'NetBpfLoader:*' |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 56 | # |
| 57 | # something like: |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 58 | # $ adb reboot; sleep 1; adb wait-for-device; adb root; sleep 1; adb wait-for-device; adb logcat -s 'bpfloader:*' 'LibBpfLoader:*' 'NetBpfLoad:*' 'NetBpfLoader:*' |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 59 | # will take care of capturing logs as early as possible |
| 60 | # |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 61 | # 5. look through the logs from the kernel's bpf verifier that bpfloader dumps out; |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 62 | # it usually makes sense to search back from the end and find the particular |
Maciej Żenczykowski | 7da54d9 | 2023-10-24 02:11:09 -0700 | [diff] [blame] | 63 | # bpf verifier failure that caused bpfloader to terminate early with an error code. |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 64 | # This will probably be something along the lines of 'too many jumps' or |
| 65 | # 'cannot prove return value is 0 or 1' or 'unsupported / unknown operation / helper', |
| 66 | # 'invalid bpf_context access', etc. |
| 67 | # |
Maciej Żenczykowski | 8c21593 | 2024-03-05 19:02:11 -0800 | [diff] [blame] | 68 | reboot_on_failure reboot,netbpfload-missing |
Maciej Żenczykowski | 7db65c6 | 2023-10-19 16:51:15 -0700 | [diff] [blame] | 69 | updatable |
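
Steps 4 and 5 of the debugging comment above could be scripted roughly as follows. This is only a sketch and not part of the rc file. The function names `capture_bpfloader_logs` and `last_verifier_failure` are hypothetical. The adb invocation is copied from the comment. The search patterns are loose approximations of the example messages quoted in step 5, and the exact verifier wording may differ between kernel versions.

```shell
#!/bin/sh
# Hypothetical helpers for debugging a bpfloader boot failure.
# Nothing here runs until one of the functions is called.

# Step 4: reboot the device and capture loader logs as early as possible.
# $1 is the (illustrative) path of the file that receives the logcat output.
# logcat keeps streaming, so interrupt it once the device has finished booting.
capture_bpfloader_logs() {
    adb reboot; sleep 1; adb wait-for-device
    adb root; sleep 1; adb wait-for-device
    adb logcat -s 'bpfloader:*' 'LibBpfLoader:*' 'NetBpfLoad:*' 'NetBpfLoader:*' > "$1"
}

# Step 5: print the last line in the captured log that looks like a
# verifier failure, prefixed with its line number. Prints nothing if no
# line matches. The patterns are assumptions based on the examples in the
# comment and are intentionally broad.
last_verifier_failure() {
    grep -n -E 'too many jumps|cannot prove return value|unsupported|unknown|invalid bpf_context access' "$1" \
        | tail -n 1
}
```

For example, running `capture_bpfloader_logs boot.log`, interrupting it after boot, and then running `last_verifier_failure boot.log` would give a line number from which to read backwards through the verifier output.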