init: add builtin check for perf_event LSM hooks
Historically, the perf_event_open syscall was controlled by a system-wide
perf_event_paranoid sysctl, which is not flexible enough to allow only
specific processes to use the syscall. However, SELinux support for the
syscall has been upstreamed recently[1] (and is being backported to
Android R release common kernels).
[1] https://github.com/torvalds/linux/commit/da97e18458fb42d7c00fac5fd1c56a3896ec666e
As the presence of these hooks is not guaranteed on all Android R
platforms (since we support upgrades while keeping an older kernel), we
need to test for the feature dynamically. The LSM hooks themselves cannot
be detected directly, so we instead test for their effects: we perform
several syscalls and look for a specific success/failure combination that
corresponds to the platform's SELinux policy.
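For illustration only, the probe described above could look roughly like the
sketch below. The message does not spell out which success/failure pair is
checked, so the pair used here (opening a cpu-scoped event succeeds, ioctl()
on the resulting fd fails with EACCES) is an assumption, and the names
ProbeResult, LooksLikeLsmHooks and RunPerfProbe are hypothetical.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <linux/perf_event.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>

// Outcome of the two probe syscalls: 0 on success, otherwise the errno value.
struct ProbeResult {
  int open_errno;
  int ioctl_errno;
};

// ASSUMPTION (not stated in the commit message): the SELinux policy for the
// calling domain allows opening a cpu-scoped event but denies ioctl() on it,
// so "open succeeds, ioctl fails with EACCES" is taken as the hook signature.
bool LooksLikeLsmHooks(const ProbeResult& r) {
  assert(r.open_errno >= 0 && r.ioctl_errno >= 0);  // errno values are non-negative
  return r.open_errno == 0 && r.ioctl_errno == EACCES;
}

// Performs the two syscalls and records how each one ended.
ProbeResult RunPerfProbe() {
  perf_event_attr attr;
  memset(&attr, 0, sizeof(attr));
  attr.size = sizeof(attr);
  attr.type = PERF_TYPE_SOFTWARE;
  attr.config = PERF_COUNT_SW_TASK_CLOCK;

  // pid = -1, cpu = 0: a cpu-scoped event, no group leader.
  int fd = static_cast<int>(syscall(__NR_perf_event_open, &attr, -1, 0, -1,
                                    PERF_FLAG_FD_CLOEXEC));
  if (fd == -1) return {errno, 0};

  int ioctl_errno = (ioctl(fd, PERF_EVENT_IOC_RESET, 0) == -1) ? errno : 0;
  close(fd);
  return {0, ioctl_errno};
}
```

Keeping the decision in a pure function separate from the syscalls makes the
expected combination easy to change if the policy differs.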
If hooks are detected, perf_event_paranoid is set to -1 (unrestricted),
as the SELinux policy is then sufficient to control access.
This is done within init for several reasons:
* CAP_SYS_ADMIN side-steps perf_event_paranoid, so the tests can be done
even if non-root users aren't allowed to use the syscall (the default).
* init is already the setter of the paranoid value (see init.rc), which
is also a privileged operation.
* the test itself is simple (a couple of syscalls), so having a dedicated
test binary/domain felt excessive.
I decided to go through a new sysprop (set by a builtin test in
second-stage init), while keeping the actuation in init.rc. We can change
it to an immediate write to the paranoid value if a use-case comes up
that requires the decision to be made earlier in the init sequence.
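As a reading aid, the perf_event_paranoid actions in the init.rc hunk of this
change amount to the mapping sketched below. This is a model of the property
triggers, not code from the change, and the function name
ParanoidValueForProps is hypothetical.

```cpp
#include <string>

// Models the perf_event_paranoid triggers in init.rc. Returns true and sets
// *value when one of the actions would write the sysctl, false otherwise.
bool ParanoidValueForProps(const std::string& perf_lsm_hooks,
                           const std::string& perf_harden, int* value) {
  if (perf_lsm_hooks == "1") {
    *value = -1;  // hooks present: unrestricted, SELinux policy controls access
    return true;
  }
  if (perf_lsm_hooks.empty()) {
    if (perf_harden == "0") {
      *value = 1;  // lowered via security.perf_harden=0
      return true;
    }
    if (perf_harden == "1") {
      *value = 3;  // default: allow only root
      return true;
    }
  }
  return false;  // no matching action
}
```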
Bug: 137092007
Change-Id: Ib13a31fee896f17a28910d993df57168a83a4b3d
diff --git a/rootdir/init.rc b/rootdir/init.rc
index 7a3339d..470c912 100644
--- a/rootdir/init.rc
+++ b/rootdir/init.rc
@@ -936,14 +936,33 @@
on property:sys.sysctl.tcp_def_init_rwnd=*
write /proc/sys/net/ipv4/tcp_default_init_rwnd ${sys.sysctl.tcp_def_init_rwnd}
-on property:security.perf_harden=0
+# perf_event_open syscall security:
+# Newer kernels have the ability to control the use of the syscall via SELinux
+# hooks. init tests for this, and sets sys.init.perf_lsm_hooks to 1 if the
+# kernel has the hooks. In this case, the system-wide perf_event_paranoid
+# sysctl is set to -1 (unrestricted use), and the SELinux policy is used for
+# controlling access. On older kernels, the paranoid value is the only means of
+# controlling access. It is normally 3 (allow only root), but the shell user
+# can lower it to 1 (allowing thread-scoped profiling) via security.perf_harden.
+on property:sys.init.perf_lsm_hooks=1
+ write /proc/sys/kernel/perf_event_paranoid -1
+on property:security.perf_harden=0 && property:sys.init.perf_lsm_hooks=""
write /proc/sys/kernel/perf_event_paranoid 1
+on property:security.perf_harden=1 && property:sys.init.perf_lsm_hooks=""
+ write /proc/sys/kernel/perf_event_paranoid 3
+
+# Additionally, simpleperf profiler uses debug.* and security.perf_harden
+# sysprops to be able to indirectly set these sysctls.
+on property:security.perf_harden=0
write /proc/sys/kernel/perf_event_max_sample_rate ${debug.perf_event_max_sample_rate:-100000}
write /proc/sys/kernel/perf_cpu_time_max_percent ${debug.perf_cpu_time_max_percent:-25}
write /proc/sys/kernel/perf_event_mlock_kb ${debug.perf_event_mlock_kb:-516}
-
+# Default values.
on property:security.perf_harden=1
- write /proc/sys/kernel/perf_event_paranoid 3
+ write /proc/sys/kernel/perf_event_max_sample_rate 100000
+ write /proc/sys/kernel/perf_cpu_time_max_percent 25
+ write /proc/sys/kernel/perf_event_mlock_kb 516
+
# on shutdown
# In device's init.rc, this trigger can be used to do device-specific actions