riscv64: switch from x18 to gp for shadow call stack.

We want to give back a useful callee-saved general-purpose register
(x18) that was only "chosen" because it was what LLVM allowed for
historical reasons. gp is a better choice because it's otherwise
effectively unused.

Unfortunately, that means we need extra space in jmp_buf (which I
reserved in an earlier change, e7b3b8b467bad2cd32470b5edd5cb9938b934316).
Since that changes the layout anyway, let's also rearrange the entries
in jmp_buf to match their order in the register file.

Bug: https://github.com/google/android-riscv64/issues/72
Bug: http://b/277909695
Test: treehugger
Change-Id: Ia629409a894c1a83d2052885702bbdd895c758e1
diff --git a/libc/bionic/pthread_create.cpp b/libc/bionic/pthread_create.cpp
index 15d6d6d..7bf9b40 100644
--- a/libc/bionic/pthread_create.cpp
+++ b/libc/bionic/pthread_create.cpp
@@ -133,14 +133,14 @@
size_t scs_offset =
(getpid() == 1) ? 0 : (arc4random_uniform(SCS_GUARD_REGION_SIZE / SCS_SIZE - 1) * SCS_SIZE);
- // Make the stack readable and writable and store its address in x18.
- // This is deliberately the only place where the address is stored.
+ // Make the stack read-write, and store its address in the register we're using as the shadow
+ // stack pointer. This is deliberately the only place where the address is stored.
char* scs = scs_aligned_guard_region + scs_offset;
mprotect(scs, SCS_SIZE, PROT_READ | PROT_WRITE);
#if defined(__aarch64__)
__asm__ __volatile__("mov x18, %0" ::"r"(scs));
#elif defined(__riscv)
- __asm__ __volatile__("mv x18, %0" ::"r"(scs));
+ __asm__ __volatile__("mv gp, %0" ::"r"(scs));
#endif
#endif
}
diff --git a/libc/bionic/pthread_internal.h b/libc/bionic/pthread_internal.h
index 083c2ed..a3a4ccd 100644
--- a/libc/bionic/pthread_internal.h
+++ b/libc/bionic/pthread_internal.h
@@ -110,7 +110,8 @@
// are actually used.
//
// This address is only used to deallocate the shadow call stack on thread
- // exit; the address of the stack itself is stored only in the x18 register.
+ // exit; the address of the stack itself is stored only in the register used
+ // as the shadow stack pointer (x18 on arm64, gp on riscv64).
//
// Because the protection offered by SCS relies on the secrecy of the stack
// address, storing the address here weakens the protection, but only
@@ -119,22 +120,24 @@
// to other allocations), but not the stack itself, which is <0.1% of the size
// of the guard region.
//
- // longjmp()/setjmp() don't store all the bits of x18, only the bottom bits
- // covered by SCS_MASK. Since longjmp()/setjmp() between different threads is
- // undefined behavior (and unsupported on Android), we can retrieve the high
- // bits of x18 from the current value in x18 --- all the jmp_buf needs to store
- // is where exactly the shadow stack pointer is in the thread's shadow stack:
- // the bottom bits of x18.
+ // longjmp()/setjmp() don't store all the bits of the shadow stack pointer,
+ // only the bottom bits covered by SCS_MASK. Since longjmp()/setjmp() between
+ // different threads is undefined behavior (and unsupported on Android), we
+ // can retrieve the high bits of the shadow stack pointer from the current
+ // value in the register --- all the jmp_buf needs to store is where exactly
+ // the shadow stack pointer is *within* the thread's shadow stack: the bottom
+ // bits of the register.
//
// There are at least two other options for discovering the start address of
// the guard region on thread exit, but they are not as simple as storing in
// TLS.
//
- // 1) Derive it from the value of the x18 register. This is only possible in
- // processes that do not contain legacy code that might clobber x18,
- // therefore each process must declare early during process startup whether
- // it might load legacy code.
- // TODO: riscv64 has no legacy code, so we can actually go this route there!
+ // 1) Derive it from the current value of the shadow stack pointer. This is
+ // only possible in processes that do not contain legacy code that might
+ // clobber x18 on arm64, therefore each process must declare early during
+ // process startup whether it might load legacy code.
+ // TODO: riscv64 has no legacy code, so we can actually go this route
+ // there, but hopefully we'll actually get the Zsslpcfi extension instead.
// 2) Mark the guard region as such using prctl(PR_SET_VMA_ANON_NAME) and
// discover its address by reading /proc/self/maps. One issue with this is
// that reading /proc/self/maps can race with allocations, so we may need