Merge "Use libmemory_trace for writing trace data." into main
diff --git a/docs/mte.md b/docs/mte.md
new file mode 100644
index 0000000..3890283
--- /dev/null
+++ b/docs/mte.md
@@ -0,0 +1,244 @@
+# Arm Memory Tagging Extension (MTE) implementation
+
+AOSP supports Arm MTE to detect invalid memory accesses. The implementation is
+spread across multiple components, both inside and outside the AOSP tree. This
+document gives an overview of how the various MTE features are implemented,
+with pointers to the relevant code.
+
+For documentation of the behavior rather than the implementation, see the
+[SAC page on MTE] instead. For MTE for apps, see the [NDK page on MTE].
+
+The relevant components are:
+
+* [LLVM Project] (out of AOSP tree)
+    * Stack tagging instrumentation pass
+    * Scudo memory allocator
+* bionic
+    * libc
+    * dynamic loader
+* Zygote
+* debuggerd
+* [NDK]
+
+## MTE enablement
+
+The way MTE is requested and enabled differs between native binaries and Java
+apps. This is necessarily so, because Java apps get forked from the Zygote,
+while native executables get initialized by the linker.
+
+### Native binaries
+
+Both AOSP and the NDK allow you to compile C/C++ code that uses MTE to detect
+memory safety issues. The [NDK legacy cmake toolchain] and the
+[NDK new cmake toolchain] both support "memtag" as an argument for
+`ANDROID_SANITIZE`. NDK make has no specific support for MTE, but the
+relevant flags can be passed directly as `CFLAGS` and `LDFLAGS`.
+
+For the OS itself, [Soong] supports "memtag_[heap|stack|globals]" as
+`SANITIZE_TARGET` and as a `sanitize:` attribute in Android.bp files;
+[Android make] supports the same environment variables as Soong. This passes
+the appropriate flags to the clang driver for both compile and link steps.
+
+#### Linker
+
+* For **dynamic executables** LLD has support to
+  [add appropriate dynamic sections] as defined in the [ELF standard]
+* For **static executables** and as a fallback for older devices, LLD
+  also supports [adding the Android-specific ELF note]
+
+Both of the above are controlled by the linker flag `--android-memtag-mode`
+which is [passed in by the clang driver] if
+`-fsanitize=memtag-[stack|heap|globals]` is [passed in].
+`-fsanitize=memtag` [enables all three] (even for API levels that don't
+implement the runtime for globals, which means builds from old versions
+of clang may not work with newer platform versions that support globals).
+`-fsanitize-memtag-mode` allows choosing between ASYNC and SYNC.
+
+This information can be queried using `llvm-readelf --memtag`.
+
+This information is [picked up by libc init] to decide whether to enable MTE.
+`-fsanitize=memtag-heap` controls both whether scudo tags allocations, and whether
+tag checking is enabled.
+
+#### Runtime environment (dynamic loader, libc)
+
+There are two different initialization sequences for libc, both of which end up
+calling `__libc_init_mte`.
+
+N.B. the linker has its own copy of libc, which is used when executing these
+functions. That is why we have to use `__libc_shared_globals` to communicate
+with the libc of the process we are starting.
+
+* **static executables**: `__libc_init` is called from `crtbegin.c`, which calls
+  `__libc_init_mte`
+* **dynamic executables**: the linker calls `__libc_init_mte`
+
+`__libc_init_mte` figures out the appropriate MTE level that is requested by
+the process, calls `prctl` to request this from the kernel, and stores data in
+`__libc_shared_globals` which gets picked up later to enable MTE in scudo.
+
+It also does work related to stack tagging and permissive mode, which will be
+detailed later.
+
+### Apps
+
+Apps can request MTE be enabled for their process via the manifest attribute
+`android:memtagMode`. This gets interpreted by Zygote, which always runs with
+`ASYNC` MTE enabled, because once a process has been initialized (see
+[Native binaries](#native-binaries)), MTE can only be disabled, not enabled.
+
+[decideTaggingLevel] in the Zygote figures out whether to enable MTE for
+an app, and stores it in the `runtimeFlags`, which get picked up by
+[SpecializeCommon] after forking from the Zygote.
+
+## MTE implementation
+
+### Heap Tagging
+
+Heap tagging is implemented in the scudo allocator. On `malloc` and `free`,
+scudo will update the memory's tags to prevent use-after-free and buffer
+overflows.
+
+[scudo's memtag.h] contains helper functions to deal with MTE tag management,
+which are used in [combined.h] and [secondary.h].
+
+
+### Stack Tagging
+
+Stack tagging requires instrumenting function bodies. It is implemented as
+an instrumentation pass in LLVM called [AArch64StackTagging], which sets
+the tags according to the lifetime of stack objects.
+
+The instrumentation pass also supports recording stack history, consisting of:
+
+* PC
+* Frame pointer
+* Base tag
+
+This can be used to reconstruct which stack object was referred to in an
+invalid access. The logic to reconstruct this can be found in the
+[stack script].
+
+
+Stack tagging is enabled in one of two circumstances:
+* at process startup, if the main binary or any of its dependencies are
+  compiled with `memtag-stack`
+* a library compiled with `memtag-stack` is `dlopen`ed later, either directly or
+  as a dependency of a `dlopen`ed library. In this case, the
+  [__pthread_internal_remap_stack_with_mte] function is used (called from
+  `memtag_stack_dlopen_callback`). Because `dlopen`
+  is handled by the linker, we have to [store a function pointer] to the
+  process's version of the function in `__libc_shared_globals`.
+
+Enabling stack MTE consists of two operations:
+* Remapping the stacks as `PROT_MTE`
+* Allocating a stack history buffer.
+
+The first operation is only necessary when the process is running with MTE
+enabled. The second operation is also necessary when the process is not running
+with MTE enabled, because the writes to the stack history buffer are
+unconditional.
+
+libc keeps track of this through two globals:
+
+* `__libc_memtag_stack`:  whether stack MTE is enabled on the process, i.e.
+  whether the stack pages are mapped with PROT\_MTE. This is always false if
+  MTE is disabled for the process (i.e. `libc_globals.memtag` is false).
+* `__libc_memtag_stack_abi`: whether the process contains any code that was
+  compiled with memtag-stack. This is true even if the process does not have
+  MTE enabled.
+
+### Globals Tagging
+
+TODO(fmayer): write once submitted
+
+### Crash reporting
+
+For MTE crashes, debuggerd serializes special information into the Tombstone
+proto:
+
+* Tags around fault address
+* Scudo allocation history
+
+This is done in [tombstone\_proto.cpp]. The information is converted to a text
+proto in [tombstone\_proto\_to\_text.cpp].
+
+## Bootloader control
+
+The bootloader API allows userspace to enable MTE on devices that do not ship
+with MTE enabled by default.
+
+See [SAC MTE bootloader support] for the API definition. In AOSP, this API is
+implemented in [system/extras/mtectrl]. mtectrl.rc handles the property
+changes and invokes mtectrl to update the misc partition to communicate
+with the bootloader.
+
+There is also an [API in Device Policy Manager] that allows the device admin
+to enable or disable MTE under certain circumstances.
+
+The device can opt in or out of these APIs by a set of system properties:
+
+* `ro.arm64.memtag.bootctl_supported`: the system property API is supported,
+  and an option is displayed in Developer Options.
+* `ro.arm64.memtag.bootctl_settings_toggle`: an option is displayed in the
+  normal settings. This requires `ro.arm64.memtag.bootctl_supported` to be
+  true. This implies `ro.arm64.memtag.bootctl_device_policy_manager`, if it
+  is not explicitly set.
+* `ro.arm64.memtag.bootctl_device_policy_manager`: the Device Policy Manager
+  API is supported.
+
+## Permissive MTE
+
+Permissive MTE refers to a mode which, instead of crashing the process on an
+MTE fault, records a tombstone but then continues execution of the process.
+An important caveat is that system calls with invalid pointers (where the
+pointer tag does not match the memory tag) still return an error code.
+
+This mode is only available for system services, not apps. It is implemented
+in the [debuggerd\_signal_handler] by disabling MTE for the faulting thread.
+Optionally, the user can ask for MTE to be re-enabled after some time.
+This is achieved by arming a timer that calls [enable_mte_signal_handler]
+upon expiry.
+
+## MTE Mode Upgrade
+
+When a system service [crashes in ASYNC mode], we set an impossible signal
+as an exit code (because that signal is always gracefully handled by libc),
+and [in init] we set `BIONIC_MEMTAG_UPGRADE_SECS`, which gets handled by
+[libc startup].
+
+[SpecializeCommon]: https://cs.android.com/android/platform/superproject/main/+/main:frameworks/base/core/jni/com_android_internal_os_Zygote.cpp?q=f:frameworks%2Fbase%2Fcore%2Fjni%2Fcom_android_internal_os_Zygote.cpp%20%22%20mallopt(M_BIONIC_SET_HEAP_TAGGING_LEVEL,%22&ss=android%2Fplatform%2Fsuperproject%2Fmain
+[LLVM Project]: https://github.com/llvm/llvm-project/
+[NDK]: https://android.googlesource.com/platform/ndk/
+[NDK legacy cmake toolchain]: https://android.googlesource.com/platform/ndk/+/refs/heads/main/build/cmake/android-legacy.toolchain.cmake#490
+[NDK new cmake toolchain]: https://android.googlesource.com/platform/ndk/+/refs/heads/main/build/cmake/flags.cmake#56
+[Soong]: https://cs.android.com/android/platform/superproject/main/+/main:build/soong/cc/sanitize.go?q=sanitize.go&ss=android%2Fplatform%2Fsuperproject%2Fmain
+[decideTaggingLevel]: https://cs.android.com/android/platform/superproject/main/+/main:frameworks/base/core/java/com/android/internal/os/Zygote.java?q=symbol:decideTaggingLevel
+[picked up by libc init]: https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/bionic/libc_init_static.cpp?q=symbol:__get_tagging_level%20f:bionic
+[enables all three]: https://github.com/llvm/llvm-project/blob/e732d1ce86783b1d7fe30645fcb30434109505b9/clang/include/clang/Basic/Sanitizers.def#L62
+[passed in]: https://github.com/llvm/llvm-project/blob/ff2e619dfcd77328812a42d2ba2b11c3ff96f410/clang/lib/Driver/SanitizerArgs.cpp#L719
+[passed in by the clang driver]: https://github.com/llvm/llvm-project/blob/ff2e619dfcd77328812a42d2ba2b11c3ff96f410/clang/lib/Driver/ToolChains/CommonArgs.cpp#L1595
+[adding the Android-specific ELF note]: https://github.com/llvm/llvm-project/blob/435cb0dc5eca08cdd8d9ed0d887fa1693cc2bf33/lld/ELF/Driver.cpp#L1258
+[ELF standard]: https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#6dynamic-section
+[add appropriate dynamic sections]: https://github.com/llvm/llvm-project/blob/7022498ac2f236e411e8a0f9a48669e754000a4b/lld/ELF/SyntheticSections.cpp#L1473
+[storeTags]: https://cs.android.com/android/platform/superproject/main/+/main:external/scudo/standalone/memtag.h?q=f:scudo%20f:memtag.h%20function:storeTags
+[SAC page on MTE]: https://source.android.com/docs/security/test/memory-safety/arm-mte
+[NDK page on MTE]: https://developer.android.com/ndk/guides/arm-mte
+[AArch64StackTagging]: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64StackTagging.cpp
+[scudo's memtag.h]: https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/scudo/standalone/memtag.h
+[combined.h]: https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/scudo/standalone/combined.h
+[secondary.h]: https://github.com/llvm/llvm-project/blob/main/compiler-rt/lib/scudo/standalone/secondary.h
+[__pthread_internal_remap_stack_with_mte]: https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/bionic/pthread_internal.cpp?q=__pthread_internal_remap_stack_with_mte
+[stack script]: https://cs.android.com/android/platform/superproject/main/+/main:development/scripts/stack?q=stack
+[Android make]: https://cs.android.com/android/platform/superproject/main/+/main:build/make/core/config_sanitizers.mk
+[store a function pointer]: https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/bionic/libc_init_dynamic.cpp;l=168?q=memtag_stack_dlopen_callback
+[tombstone\_proto.cpp]: https://cs.android.com/android/platform/superproject/main/+/main:system/core/debuggerd/libdebuggerd/tombstone_proto.cpp?q=tombstone_proto.cpp
+[tombstone\_proto\_to\_text.cpp]: https://cs.android.com/android/platform/superproject/main/+/main:system/core/debuggerd/libdebuggerd/tombstone_proto_to_text.cpp
+[SAC MTE bootloader support]: https://source.android.com/docs/security/test/memory-safety/bootloader-support
+[system/extras/mtectrl]: https://cs.android.com/android/platform/superproject/main/+/main:system/extras/mtectrl/
+[API in Device Policy Manager]: https://cs.android.com/android/platform/superproject/main/+/main:frameworks/base/core/java/android/app/admin/DevicePolicyManager.java?q=symbol:setMtePolicy%20f:DevicePolicyManager.java
+[debuggerd\_signal_handler]: https://cs.android.com/android/platform/superproject/main/+/main:system/core/debuggerd/handler/debuggerd_handler.cpp?q=f:debuggerd_handler.cpp%20symbol:debuggerd_signal_handler
+[enable_mte_signal_handler]: https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/bionic/libc_init_static.cpp?q=symbol:__enable_mte_signal_handler
+[in init]: https://cs.android.com/android/platform/superproject/main/+/main:system/core/init/service.cpp?q=f:system%2Fcore%2Finit%2Fservice.cpp%20should_upgrade_mte
+[crashes in ASYNC mode]: https://cs.android.com/android/platform/superproject/main/+/main:system/core/debuggerd/handler/debuggerd_handler.cpp;l=799?q=BIONIC_SIGNAL_ART_PROFILER
+[libc startup]: https://cs.android.com/android/platform/superproject/main/+/main:bionic/libc/bionic/libc_init_static.cpp?q=BIONIC_MEMTAG_UPGRADE_SECS
diff --git a/libc/Android.bp b/libc/Android.bp
index d9b3658..5ae8c4f 100644
--- a/libc/Android.bp
+++ b/libc/Android.bp
@@ -264,6 +264,7 @@
     name: "libc_init_static",
     defaults: ["libc_defaults"],
     srcs: [
+        "bionic/libc_init_mte.cpp",
         "bionic/libc_init_static.cpp",
         ":elf_note_sources",
     ],
@@ -1489,6 +1490,7 @@
     srcs: [
         "arch-common/bionic/crtbegin_so.c",
         "arch-common/bionic/crtbrand.S",
+        "bionic/android_mallopt.cpp",
         "bionic/gwp_asan_wrappers.cpp",
         "bionic/heap_tagging.cpp",
         "bionic/icu.cpp",
@@ -1507,6 +1509,7 @@
 filegroup {
     name: "libc_sources_static",
     srcs: [
+        "bionic/android_mallopt.cpp",
         "bionic/gwp_asan_wrappers.cpp",
         "bionic/heap_tagging.cpp",
         "bionic/icu_static.cpp",
diff --git a/libc/bionic/__libc_init_main_thread.cpp b/libc/bionic/__libc_init_main_thread.cpp
index 1b539f2..0d557f1 100644
--- a/libc/bionic/__libc_init_main_thread.cpp
+++ b/libc/bionic/__libc_init_main_thread.cpp
@@ -44,7 +44,7 @@
 // Declared in "private/bionic_ssp.h".
 uintptr_t __stack_chk_guard = 0;
 
-static pthread_internal_t main_thread;
+BIONIC_USED_BEFORE_LINKER_RELOCATES static pthread_internal_t main_thread;
 
 // Setup for the main thread. For dynamic executables, this is called by the
 // linker _before_ libc is mapped in memory. This means that all writes to
diff --git a/libc/bionic/android_mallopt.cpp b/libc/bionic/android_mallopt.cpp
new file mode 100644
index 0000000..79e4072
--- /dev/null
+++ b/libc/bionic/android_mallopt.cpp
@@ -0,0 +1,146 @@
+/*
+ * Copyright (C) 2009 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <errno.h>
+#include <stdatomic.h>
+
+#include <platform/bionic/malloc.h>
+#include <private/bionic_globals.h>
+
+#include "gwp_asan_wrappers.h"
+#include "malloc_limit.h"
+
+#if !defined(LIBC_STATIC)
+#include <stdio.h>
+
+#include <private/bionic_defs.h>
+
+#include "malloc_heapprofd.h"
+
+extern bool gZygoteChild;
+extern _Atomic bool gZygoteChildProfileable;
+
+bool WriteMallocLeakInfo(FILE* fp);
+bool GetMallocLeakInfo(android_mallopt_leak_info_t* leak_info);
+bool FreeMallocLeakInfo(android_mallopt_leak_info_t* leak_info);
+#endif
+
+// =============================================================================
+// Platform-internal mallopt variant.
+// =============================================================================
+#if !defined(LIBC_STATIC)
+__BIONIC_WEAK_FOR_NATIVE_BRIDGE
+#endif
+extern "C" bool android_mallopt(int opcode, void* arg, size_t arg_size) {
+  // Functionality available in both static and dynamic libc.
+  if (opcode == M_GET_DECAY_TIME_ENABLED) {
+    if (arg == nullptr || arg_size != sizeof(bool)) {
+      errno = EINVAL;
+      return false;
+    }
+    *reinterpret_cast<bool*>(arg) = atomic_load(&__libc_globals->decay_time_enabled);
+    return true;
+  }
+  if (opcode == M_INITIALIZE_GWP_ASAN) {
+    if (arg == nullptr || arg_size != sizeof(android_mallopt_gwp_asan_options_t)) {
+      errno = EINVAL;
+      return false;
+    }
+
+    return EnableGwpAsan(*reinterpret_cast<android_mallopt_gwp_asan_options_t*>(arg));
+  }
+  if (opcode == M_MEMTAG_STACK_IS_ON) {
+    if (arg == nullptr || arg_size != sizeof(bool)) {
+      errno = EINVAL;
+      return false;
+    }
+    *reinterpret_cast<bool*>(arg) = atomic_load(&__libc_memtag_stack);
+    return true;
+  }
+  if (opcode == M_SET_ALLOCATION_LIMIT_BYTES) {
+    return LimitEnable(arg, arg_size);
+  }
+
+#if defined(LIBC_STATIC)
+  errno = ENOTSUP;
+  return false;
+#else
+  if (opcode == M_SET_ZYGOTE_CHILD) {
+    if (arg != nullptr || arg_size != 0) {
+      errno = EINVAL;
+      return false;
+    }
+    gZygoteChild = true;
+    return true;
+  }
+  if (opcode == M_INIT_ZYGOTE_CHILD_PROFILING) {
+    if (arg != nullptr || arg_size != 0) {
+      errno = EINVAL;
+      return false;
+    }
+    atomic_store_explicit(&gZygoteChildProfileable, true, memory_order_release);
+    // Also check if heapprofd should start profiling from app startup.
+    HeapprofdInitZygoteChildProfiling();
+    return true;
+  }
+  if (opcode == M_GET_PROCESS_PROFILEABLE) {
+    if (arg == nullptr || arg_size != sizeof(bool)) {
+      errno = EINVAL;
+      return false;
+    }
+    // Native processes are considered profileable. Zygote children are considered
+    // profileable only when appropriately tagged.
+    *reinterpret_cast<bool*>(arg) =
+        !gZygoteChild || atomic_load_explicit(&gZygoteChildProfileable, memory_order_acquire);
+    return true;
+  }
+  if (opcode == M_WRITE_MALLOC_LEAK_INFO_TO_FILE) {
+    if (arg == nullptr || arg_size != sizeof(FILE*)) {
+      errno = EINVAL;
+      return false;
+    }
+    return WriteMallocLeakInfo(reinterpret_cast<FILE*>(arg));
+  }
+  if (opcode == M_GET_MALLOC_LEAK_INFO) {
+    if (arg == nullptr || arg_size != sizeof(android_mallopt_leak_info_t)) {
+      errno = EINVAL;
+      return false;
+    }
+    return GetMallocLeakInfo(reinterpret_cast<android_mallopt_leak_info_t*>(arg));
+  }
+  if (opcode == M_FREE_MALLOC_LEAK_INFO) {
+    if (arg == nullptr || arg_size != sizeof(android_mallopt_leak_info_t)) {
+      errno = EINVAL;
+      return false;
+    }
+    return FreeMallocLeakInfo(reinterpret_cast<android_mallopt_leak_info_t*>(arg));
+  }
+  // Try heapprofd's mallopt, as it handles options not covered here.
+  return HeapprofdMallopt(opcode, arg, arg_size);
+#endif
+}
diff --git a/libc/bionic/bionic_call_ifunc_resolver.cpp b/libc/bionic/bionic_call_ifunc_resolver.cpp
index e44d998..d5a812c 100644
--- a/libc/bionic/bionic_call_ifunc_resolver.cpp
+++ b/libc/bionic/bionic_call_ifunc_resolver.cpp
@@ -31,6 +31,7 @@
 #include <sys/hwprobe.h>
 #include <sys/ifunc.h>
 
+#include "bionic/macros.h"
 #include "private/bionic_auxv.h"
 
 // This code is called in the linker before it has been relocated, so minimize calls into other
@@ -40,8 +41,8 @@
 ElfW(Addr) __bionic_call_ifunc_resolver(ElfW(Addr) resolver_addr) {
 #if defined(__aarch64__)
   typedef ElfW(Addr) (*ifunc_resolver_t)(uint64_t, __ifunc_arg_t*);
-  static __ifunc_arg_t arg;
-  static bool initialized = false;
+  BIONIC_USED_BEFORE_LINKER_RELOCATES static __ifunc_arg_t arg;
+  BIONIC_USED_BEFORE_LINKER_RELOCATES static bool initialized = false;
   if (!initialized) {
     initialized = true;
     arg._size = sizeof(__ifunc_arg_t);
diff --git a/libc/bionic/libc_init_mte.cpp b/libc/bionic/libc_init_mte.cpp
new file mode 100644
index 0000000..3c8ef7d
--- /dev/null
+++ b/libc/bionic/libc_init_mte.cpp
@@ -0,0 +1,325 @@
+/*
+ * Copyright (C) 2024 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <android/api-level.h>
+#include <elf.h>
+#include <errno.h>
+#include <malloc.h>
+#include <signal.h>
+#include <stddef.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/auxv.h>
+#include <sys/mman.h>
+
+#include "async_safe/log.h"
+#include "heap_tagging.h"
+#include "libc_init_common.h"
+#include "platform/bionic/macros.h"
+#include "platform/bionic/mte.h"
+#include "platform/bionic/page.h"
+#include "platform/bionic/reserved_signals.h"
+#include "private/KernelArgumentBlock.h"
+#include "private/bionic_asm.h"
+#include "private/bionic_asm_note.h"
+#include "private/bionic_call_ifunc_resolver.h"
+#include "private/bionic_elf_tls.h"
+#include "private/bionic_globals.h"
+#include "private/bionic_tls.h"
+#include "private/elf_note.h"
+#include "pthread_internal.h"
+#include "sys/system_properties.h"
+#include "sysprop_helpers.h"
+
+#ifdef __aarch64__
+extern "C" const char* __gnu_basename(const char* path);
+
+static HeapTaggingLevel __get_memtag_level_from_note(const ElfW(Phdr) * phdr_start, size_t phdr_ct,
+                                                     const ElfW(Addr) load_bias, bool* stack) {
+  const ElfW(Nhdr) * note;
+  const char* desc;
+  if (!__find_elf_note(NT_ANDROID_TYPE_MEMTAG, "Android", phdr_start, phdr_ct, &note, &desc,
+                       load_bias)) {
+    return M_HEAP_TAGGING_LEVEL_TBI;
+  }
+
+  // Previously (in Android 12), if the note was != 4 bytes, we check-failed
+  // here. Let's be more permissive to allow future expansion.
+  if (note->n_descsz < 4) {
+    async_safe_fatal("unrecognized android.memtag note: n_descsz = %d, expected >= 4",
+                     note->n_descsz);
+  }
+
+  // `desc` is always aligned due to ELF requirements, enforced in __find_elf_note().
+  ElfW(Word) note_val = *reinterpret_cast<const ElfW(Word)*>(desc);
+  *stack = (note_val & NT_MEMTAG_STACK) != 0;
+
+  // Warning: In Android 12, any value outside of bits [0..3] resulted in a check-fail.
+  if (!(note_val & (NT_MEMTAG_HEAP | NT_MEMTAG_STACK))) {
+    async_safe_format_log(ANDROID_LOG_INFO, "libc",
+                          "unrecognised memtag note_val did not specify heap or stack: %u",
+                          note_val);
+    return M_HEAP_TAGGING_LEVEL_TBI;
+  }
+
+  unsigned mode = note_val & NT_MEMTAG_LEVEL_MASK;
+  switch (mode) {
+    case NT_MEMTAG_LEVEL_NONE:
+      // Note, previously (in Android 12), NT_MEMTAG_LEVEL_NONE was
+      // NT_MEMTAG_LEVEL_DEFAULT, which implied SYNC mode. This was never used
+      // by anyone, but we note it (heh) here for posterity, in case the zero
+      // level becomes meaningful, and binaries with this note can be executed
+      // on Android 12 devices.
+      return M_HEAP_TAGGING_LEVEL_TBI;
+    case NT_MEMTAG_LEVEL_ASYNC:
+      return M_HEAP_TAGGING_LEVEL_ASYNC;
+    case NT_MEMTAG_LEVEL_SYNC:
+    default:
+      // We allow future extensions to specify mode 3 (currently unused), with
+      // the idea that it might be used for ASYMM mode or something else. On
+      // this version of Android, it falls back to SYNC mode.
+      return M_HEAP_TAGGING_LEVEL_SYNC;
+  }
+}
+
+// Returns true if there's an environment setting (either sysprop or env var)
+// that should override the ELF note, and places the equivalent heap tagging
+// level into *level.
+static bool get_environment_memtag_setting(HeapTaggingLevel* level) {
+  static const char kMemtagPrognameSyspropPrefix[] = "arm64.memtag.process.";
+  static const char kMemtagGlobalSysprop[] = "persist.arm64.memtag.default";
+  static const char kMemtagOverrideSyspropPrefix[] =
+      "persist.device_config.memory_safety_native.mode_override.process.";
+
+  const char* progname = __libc_shared_globals()->init_progname;
+  if (progname == nullptr) return false;
+
+  const char* basename = __gnu_basename(progname);
+
+  char options_str[PROP_VALUE_MAX];
+  char sysprop_name[512];
+  async_safe_format_buffer(sysprop_name, sizeof(sysprop_name), "%s%s", kMemtagPrognameSyspropPrefix,
+                           basename);
+  char remote_sysprop_name[512];
+  async_safe_format_buffer(remote_sysprop_name, sizeof(remote_sysprop_name), "%s%s",
+                           kMemtagOverrideSyspropPrefix, basename);
+  const char* sys_prop_names[] = {sysprop_name, remote_sysprop_name, kMemtagGlobalSysprop};
+
+  if (!get_config_from_env_or_sysprops("MEMTAG_OPTIONS", sys_prop_names, arraysize(sys_prop_names),
+                                       options_str, sizeof(options_str))) {
+    return false;
+  }
+
+  if (strcmp("sync", options_str) == 0) {
+    *level = M_HEAP_TAGGING_LEVEL_SYNC;
+  } else if (strcmp("async", options_str) == 0) {
+    *level = M_HEAP_TAGGING_LEVEL_ASYNC;
+  } else if (strcmp("off", options_str) == 0) {
+    *level = M_HEAP_TAGGING_LEVEL_TBI;
+  } else {
+    async_safe_format_log(
+        ANDROID_LOG_ERROR, "libc",
+        "unrecognized memtag level: \"%s\" (options are \"sync\", \"async\", or \"off\").",
+        options_str);
+    return false;
+  }
+
+  return true;
+}
+
+// Returns the initial heap tagging level. Note: This function will never return
+// M_HEAP_TAGGING_LEVEL_NONE; if MTE isn't enabled for this process, we enable
+// M_HEAP_TAGGING_LEVEL_TBI.
+static HeapTaggingLevel __get_tagging_level(const memtag_dynamic_entries_t* memtag_dynamic_entries,
+                                            const void* phdr_start, size_t phdr_ct,
+                                            uintptr_t load_bias, bool* stack) {
+  HeapTaggingLevel level = M_HEAP_TAGGING_LEVEL_TBI;
+
+  // If the dynamic entries exist, use those. Otherwise, fall back to the old
+  // Android note, which is still used for fully static executables. When
+  // -fsanitize=memtag* is used in newer toolchains, currently both the dynamic
+  // entries and the old note are created, but we'd expect to move to just the
+  // dynamic entries for dynamically linked executables in the future. In
+  // addition, there's still some cleanup of the build system (that uses a
+  // manually-constructed note) needed. For more information about the dynamic
+  // entries, see:
+  // https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#dynamic-section
+  if (memtag_dynamic_entries && memtag_dynamic_entries->has_memtag_mode) {
+    switch (memtag_dynamic_entries->memtag_mode) {
+      case 0:
+        level = M_HEAP_TAGGING_LEVEL_SYNC;
+        break;
+      case 1:
+        level = M_HEAP_TAGGING_LEVEL_ASYNC;
+        break;
+      default:
+        async_safe_format_log(ANDROID_LOG_INFO, "libc",
+                              "unrecognised DT_AARCH64_MEMTAG_MODE value: %u",
+                              memtag_dynamic_entries->memtag_mode);
+    }
+    *stack = memtag_dynamic_entries->memtag_stack;
+  } else {
+    level = __get_memtag_level_from_note(reinterpret_cast<const ElfW(Phdr)*>(phdr_start), phdr_ct,
+                                         load_bias, stack);
+  }
+
+  // We can't short-circuit the environment override, as `stack` is still inherited from the
+  // binary's settings.
+  get_environment_memtag_setting(&level);
+  return level;
+}
+
+static void __enable_mte_signal_handler(int, siginfo_t* info, void*) {
+  if (info->si_code != SI_TIMER) {
+    async_safe_format_log(ANDROID_LOG_ERROR, "libc", "Got BIONIC_ENABLE_MTE not from SI_TIMER");
+    return;
+  }
+  int tagged_addr_ctrl = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0);
+  if (tagged_addr_ctrl < 0) {
+    async_safe_fatal("failed to PR_GET_TAGGED_ADDR_CTRL: %m");
+  }
+  if ((tagged_addr_ctrl & PR_MTE_TCF_MASK) != PR_MTE_TCF_NONE) {
+    return;
+  }
+  async_safe_format_log(ANDROID_LOG_INFO, "libc",
+                        "Re-enabling MTE, value: %x (tagged_addr_ctrl %lu)",
+                        info->si_value.sival_int, info->si_value.sival_int & PR_MTE_TCF_MASK);
+  tagged_addr_ctrl =
+      (tagged_addr_ctrl & ~PR_MTE_TCF_MASK) | (info->si_value.sival_int & PR_MTE_TCF_MASK);
+  if (prctl(PR_SET_TAGGED_ADDR_CTRL, tagged_addr_ctrl, 0, 0, 0) < 0) {
+    async_safe_fatal("failed to PR_SET_TAGGED_ADDR_CTRL %d: %m", tagged_addr_ctrl);
+  }
+}
+
+static int64_t __get_memtag_upgrade_secs() {
+  char* env = getenv("BIONIC_MEMTAG_UPGRADE_SECS");
+  if (!env) return 0;
+  int64_t timed_upgrade = 0;
+  static const char kAppProcessName[] = "app_process64";
+  const char* progname = __libc_shared_globals()->init_progname;
+  progname = progname ? __gnu_basename(progname) : nullptr;
+  // disable timed upgrade for zygote, as the thread spawned will violate the requirement
+  // that it be single-threaded.
+  if (!progname || strncmp(progname, kAppProcessName, sizeof(kAppProcessName)) != 0) {
+    char* endptr;
+    timed_upgrade = strtoll(env, &endptr, 10);
+    if (*endptr != '\0' || timed_upgrade < 0) {
+      async_safe_format_log(ANDROID_LOG_ERROR, "libc",
+                            "Invalid value for BIONIC_MEMTAG_UPGRADE_SECS: %s", env);
+      timed_upgrade = 0;
+    }
+  }
+  // Make sure that this does not get passed to potential processes inheriting
+  // this environment.
+  unsetenv("BIONIC_MEMTAG_UPGRADE_SECS");
+  return timed_upgrade;
+}
+
+// Figure out the desired memory tagging mode (sync/async, heap/globals/stack) for this executable.
+// This function is called from the linker before the main executable is relocated.
+__attribute__((no_sanitize("hwaddress", "memtag"))) void __libc_init_mte(
+    const memtag_dynamic_entries_t* memtag_dynamic_entries, const void* phdr_start, size_t phdr_ct,
+    uintptr_t load_bias) {
+  bool memtag_stack = false;
+  HeapTaggingLevel level =
+      __get_tagging_level(memtag_dynamic_entries, phdr_start, phdr_ct, load_bias, &memtag_stack);
+  if (memtag_stack) __libc_shared_globals()->initial_memtag_stack_abi = true;
+
+  if (int64_t timed_upgrade = __get_memtag_upgrade_secs()) {
+    if (level == M_HEAP_TAGGING_LEVEL_ASYNC) {
+      async_safe_format_log(ANDROID_LOG_INFO, "libc",
+                            "Attempting timed MTE upgrade from async to sync.");
+      __libc_shared_globals()->heap_tagging_upgrade_timer_sec = timed_upgrade;
+      level = M_HEAP_TAGGING_LEVEL_SYNC;
+    } else if (level != M_HEAP_TAGGING_LEVEL_SYNC) {
+      async_safe_format_log(ANDROID_LOG_ERROR, "libc",
+                            "Requested timed MTE upgrade from invalid %s to sync. Ignoring.",
+                            DescribeTaggingLevel(level));
+    }
+  }
+  if (level == M_HEAP_TAGGING_LEVEL_SYNC || level == M_HEAP_TAGGING_LEVEL_ASYNC) {
+    unsigned long prctl_arg = PR_TAGGED_ADDR_ENABLE | PR_MTE_TAG_SET_NONZERO;
+    prctl_arg |= (level == M_HEAP_TAGGING_LEVEL_SYNC) ? PR_MTE_TCF_SYNC : PR_MTE_TCF_ASYNC;
+
+    // When entering ASYNC mode, specify that we want to allow upgrading to SYNC by OR'ing in the
+    // SYNC flag. But if the kernel doesn't support specifying multiple TCF modes, fall back to
+    // specifying a single mode.
+    if (prctl(PR_SET_TAGGED_ADDR_CTRL, prctl_arg | PR_MTE_TCF_SYNC, 0, 0, 0) == 0 ||
+        prctl(PR_SET_TAGGED_ADDR_CTRL, prctl_arg, 0, 0, 0) == 0) {
+      __libc_shared_globals()->initial_heap_tagging_level = level;
+
+      struct sigaction action = {};
+      action.sa_flags = SA_SIGINFO | SA_RESTART;
+      action.sa_sigaction = __enable_mte_signal_handler;
+      sigaction(BIONIC_ENABLE_MTE, &action, nullptr);
+      return;
+    }
+  }
+
+  // MTE was either not enabled or isn't supported on this device. Fall back to
+  // TBI.
+  if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == 0) {
+    __libc_shared_globals()->initial_heap_tagging_level = M_HEAP_TAGGING_LEVEL_TBI;
+  }
+  // We did not enable MTE, so we do not need to arm the upgrade timer.
+  __libc_shared_globals()->heap_tagging_upgrade_timer_sec = 0;
+}
+
+// Figure out whether we need to map the stack as PROT_MTE.
+// For dynamic executables, this has to be called after loading all
+// DT_NEEDED libraries, in case one of them needs stack MTE.
+__attribute__((no_sanitize("hwaddress", "memtag"))) void __libc_init_mte_stack(void* stack_top) {
+  if (!__libc_shared_globals()->initial_memtag_stack_abi) {
+    return;
+  }
+
+  // Even if the device doesn't support MTE, we have to allocate stack
+  // history buffers for code compiled for stack MTE. That is because the
+  // codegen expects a buffer to be present in TLS_SLOT_STACK_MTE either
+  // way.
+  __get_bionic_tcb()->tls_slot(TLS_SLOT_STACK_MTE) = __allocate_stack_mte_ringbuffer(0, nullptr);
+
+  if (__libc_mte_enabled()) {
+    __libc_shared_globals()->initial_memtag_stack = true;
+    void* pg_start = reinterpret_cast<void*>(page_start(reinterpret_cast<uintptr_t>(stack_top)));
+    if (mprotect(pg_start, page_size(), PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN)) {
+      async_safe_fatal("error: failed to set PROT_MTE on main thread stack: %m");
+    }
+  }
+}
+
+#else   // __aarch64__
+void __libc_init_mte(const memtag_dynamic_entries_t*, const void*, size_t, uintptr_t) {}
+void __libc_init_mte_stack(void*) {}
+#endif  // __aarch64__
+
+bool __libc_mte_enabled() {
+  HeapTaggingLevel lvl = __libc_shared_globals()->initial_heap_tagging_level;
+  return lvl == M_HEAP_TAGGING_LEVEL_SYNC || lvl == M_HEAP_TAGGING_LEVEL_ASYNC;
+}
diff --git a/libc/bionic/libc_init_static.cpp b/libc/bionic/libc_init_static.cpp
index 7c46113..cd96375 100644
--- a/libc/bionic/libc_init_static.cpp
+++ b/libc/bionic/libc_init_static.cpp
@@ -157,260 +157,6 @@
 
   layout.finish_layout();
 }
-
-#ifdef __aarch64__
-static HeapTaggingLevel __get_memtag_level_from_note(const ElfW(Phdr) * phdr_start, size_t phdr_ct,
-                                                     const ElfW(Addr) load_bias, bool* stack) {
-  const ElfW(Nhdr) * note;
-  const char* desc;
-  if (!__find_elf_note(NT_ANDROID_TYPE_MEMTAG, "Android", phdr_start, phdr_ct, &note, &desc,
-                       load_bias)) {
-    return M_HEAP_TAGGING_LEVEL_TBI;
-  }
-
-  // Previously (in Android 12), if the note was != 4 bytes, we check-failed
-  // here. Let's be more permissive to allow future expansion.
-  if (note->n_descsz < 4) {
-    async_safe_fatal("unrecognized android.memtag note: n_descsz = %d, expected >= 4",
-                     note->n_descsz);
-  }
-
-  // `desc` is always aligned due to ELF requirements, enforced in __find_elf_note().
-  ElfW(Word) note_val = *reinterpret_cast<const ElfW(Word)*>(desc);
-  *stack = (note_val & NT_MEMTAG_STACK) != 0;
-
-  // Warning: In Android 12, any value outside of bits [0..3] resulted in a check-fail.
-  if (!(note_val & (NT_MEMTAG_HEAP | NT_MEMTAG_STACK))) {
-    async_safe_format_log(ANDROID_LOG_INFO, "libc",
-                          "unrecognised memtag note_val did not specificy heap or stack: %u",
-                          note_val);
-    return M_HEAP_TAGGING_LEVEL_TBI;
-  }
-
-  unsigned mode = note_val & NT_MEMTAG_LEVEL_MASK;
-  switch (mode) {
-    case NT_MEMTAG_LEVEL_NONE:
-      // Note, previously (in Android 12), NT_MEMTAG_LEVEL_NONE was
-      // NT_MEMTAG_LEVEL_DEFAULT, which implied SYNC mode. This was never used
-      // by anyone, but we note it (heh) here for posterity, in case the zero
-      // level becomes meaningful, and binaries with this note can be executed
-      // on Android 12 devices.
-      return M_HEAP_TAGGING_LEVEL_TBI;
-    case NT_MEMTAG_LEVEL_ASYNC:
-      return M_HEAP_TAGGING_LEVEL_ASYNC;
-    case NT_MEMTAG_LEVEL_SYNC:
-    default:
-      // We allow future extensions to specify mode 3 (currently unused), with
-      // the idea that it might be used for ASYMM mode or something else. On
-      // this version of Android, it falls back to SYNC mode.
-      return M_HEAP_TAGGING_LEVEL_SYNC;
-  }
-}
-
-// Returns true if there's an environment setting (either sysprop or env var)
-// that should overwrite the ELF note, and places the equivalent heap tagging
-// level into *level.
-static bool get_environment_memtag_setting(HeapTaggingLevel* level) {
-  static const char kMemtagPrognameSyspropPrefix[] = "arm64.memtag.process.";
-  static const char kMemtagGlobalSysprop[] = "persist.arm64.memtag.default";
-  static const char kMemtagOverrideSyspropPrefix[] =
-      "persist.device_config.memory_safety_native.mode_override.process.";
-
-  const char* progname = __libc_shared_globals()->init_progname;
-  if (progname == nullptr) return false;
-
-  const char* basename = __gnu_basename(progname);
-
-  char options_str[PROP_VALUE_MAX];
-  char sysprop_name[512];
-  async_safe_format_buffer(sysprop_name, sizeof(sysprop_name), "%s%s", kMemtagPrognameSyspropPrefix,
-                           basename);
-  char remote_sysprop_name[512];
-  async_safe_format_buffer(remote_sysprop_name, sizeof(remote_sysprop_name), "%s%s",
-                           kMemtagOverrideSyspropPrefix, basename);
-  const char* sys_prop_names[] = {sysprop_name, remote_sysprop_name, kMemtagGlobalSysprop};
-
-  if (!get_config_from_env_or_sysprops("MEMTAG_OPTIONS", sys_prop_names, arraysize(sys_prop_names),
-                                       options_str, sizeof(options_str))) {
-    return false;
-  }
-
-  if (strcmp("sync", options_str) == 0) {
-    *level = M_HEAP_TAGGING_LEVEL_SYNC;
-  } else if (strcmp("async", options_str) == 0) {
-    *level = M_HEAP_TAGGING_LEVEL_ASYNC;
-  } else if (strcmp("off", options_str) == 0) {
-    *level = M_HEAP_TAGGING_LEVEL_TBI;
-  } else {
-    async_safe_format_log(
-        ANDROID_LOG_ERROR, "libc",
-        "unrecognized memtag level: \"%s\" (options are \"sync\", \"async\", or \"off\").",
-        options_str);
-    return false;
-  }
-
-  return true;
-}
-
-// Returns the initial heap tagging level. Note: This function will never return
-// M_HEAP_TAGGING_LEVEL_NONE, if MTE isn't enabled for this process we enable
-// M_HEAP_TAGGING_LEVEL_TBI.
-static HeapTaggingLevel __get_tagging_level(const memtag_dynamic_entries_t* memtag_dynamic_entries,
-                                            const void* phdr_start, size_t phdr_ct,
-                                            uintptr_t load_bias, bool* stack) {
-  HeapTaggingLevel level = M_HEAP_TAGGING_LEVEL_TBI;
-
-  // If the dynamic entries exist, use those. Otherwise, fall back to the old
-  // Android note, which is still used for fully static executables. When
-  // -fsanitize=memtag* is used in newer toolchains, currently both the dynamic
-  // entries and the old note are created, but we'd expect to move to just the
-  // dynamic entries for dynamically linked executables in the future. In
-  // addition, there's still some cleanup of the build system (that uses a
-  // manually-constructed note) needed. For more information about the dynamic
-  // entries, see:
-  // https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#dynamic-section
-  if (memtag_dynamic_entries && memtag_dynamic_entries->has_memtag_mode) {
-    switch (memtag_dynamic_entries->memtag_mode) {
-      case 0:
-        level = M_HEAP_TAGGING_LEVEL_SYNC;
-        break;
-      case 1:
-        level = M_HEAP_TAGGING_LEVEL_ASYNC;
-        break;
-      default:
-        async_safe_format_log(ANDROID_LOG_INFO, "libc",
-                              "unrecognised DT_AARCH64_MEMTAG_MODE value: %u",
-                              memtag_dynamic_entries->memtag_mode);
-    }
-    *stack = memtag_dynamic_entries->memtag_stack;
-  } else {
-    level = __get_memtag_level_from_note(reinterpret_cast<const ElfW(Phdr)*>(phdr_start), phdr_ct,
-                                         load_bias, stack);
-  }
-
-  // We can't short-circuit the environment override, as `stack` is still inherited from the
-  // binary's settings.
-  get_environment_memtag_setting(&level);
-  return level;
-}
-
-static void __enable_mte_signal_handler(int, siginfo_t* info, void*) {
-  if (info->si_code != SI_TIMER) {
-    async_safe_format_log(ANDROID_LOG_ERROR, "libc", "Got BIONIC_ENABLE_MTE not from SI_TIMER");
-    return;
-  }
-  int tagged_addr_ctrl = prctl(PR_GET_TAGGED_ADDR_CTRL, 0, 0, 0, 0);
-  if (tagged_addr_ctrl < 0) {
-    async_safe_fatal("failed to PR_GET_TAGGED_ADDR_CTRL: %m");
-  }
-  if ((tagged_addr_ctrl & PR_MTE_TCF_MASK) != PR_MTE_TCF_NONE) {
-    return;
-  }
-  async_safe_format_log(ANDROID_LOG_INFO, "libc",
-                        "Re-enabling MTE, value: %x (tagged_addr_ctrl %lu)",
-                        info->si_value.sival_int, info->si_value.sival_int & PR_MTE_TCF_MASK);
-  tagged_addr_ctrl =
-      (tagged_addr_ctrl & ~PR_MTE_TCF_MASK) | (info->si_value.sival_int & PR_MTE_TCF_MASK);
-  if (prctl(PR_SET_TAGGED_ADDR_CTRL, tagged_addr_ctrl, 0, 0, 0) < 0) {
-    async_safe_fatal("failed to PR_SET_TAGGED_ADDR_CTRL %d: %m", tagged_addr_ctrl);
-  }
-}
-
-static int64_t __get_memtag_upgrade_secs() {
-  char* env = getenv("BIONIC_MEMTAG_UPGRADE_SECS");
-  if (!env) return 0;
-  int64_t timed_upgrade = 0;
-  static const char kAppProcessName[] = "app_process64";
-  const char* progname = __libc_shared_globals()->init_progname;
-  progname = progname ? __gnu_basename(progname) : nullptr;
-  // disable timed upgrade for zygote, as the thread spawned will violate the requirement
-  // that it be single-threaded.
-  if (!progname || strncmp(progname, kAppProcessName, sizeof(kAppProcessName)) != 0) {
-    char* endptr;
-    timed_upgrade = strtoll(env, &endptr, 10);
-    if (*endptr != '\0' || timed_upgrade < 0) {
-      async_safe_format_log(ANDROID_LOG_ERROR, "libc",
-                            "Invalid value for BIONIC_MEMTAG_UPGRADE_SECS: %s", env);
-      timed_upgrade = 0;
-    }
-  }
-  // Make sure that this does not get passed to potential processes inheriting
-  // this environment.
-  unsetenv("BIONIC_MEMTAG_UPGRADE_SECS");
-  return timed_upgrade;
-}
-
-// Figure out the desired memory tagging mode (sync/async, heap/globals/stack) for this executable.
-// This function is called from the linker before the main executable is relocated.
-__attribute__((no_sanitize("hwaddress", "memtag"))) void __libc_init_mte(
-    const memtag_dynamic_entries_t* memtag_dynamic_entries, const void* phdr_start, size_t phdr_ct,
-    uintptr_t load_bias, void* stack_top) {
-  bool memtag_stack = false;
-  HeapTaggingLevel level =
-      __get_tagging_level(memtag_dynamic_entries, phdr_start, phdr_ct, load_bias, &memtag_stack);
-  // initial_memtag_stack is used by the linker (in linker.cpp) to communicate than any library
-  // linked by this executable enables memtag-stack.
-  // memtag_stack is also set for static executables if they request memtag stack via the note,
-  // in which case it will differ from initial_memtag_stack.
-  if (__libc_shared_globals()->initial_memtag_stack || memtag_stack) {
-    memtag_stack = true;
-    __libc_shared_globals()->initial_memtag_stack_abi = true;
-    __get_bionic_tcb()->tls_slot(TLS_SLOT_STACK_MTE) = __allocate_stack_mte_ringbuffer(0, nullptr);
-  }
-  if (int64_t timed_upgrade = __get_memtag_upgrade_secs()) {
-    if (level == M_HEAP_TAGGING_LEVEL_ASYNC) {
-      async_safe_format_log(ANDROID_LOG_INFO, "libc",
-                            "Attempting timed MTE upgrade from async to sync.");
-      __libc_shared_globals()->heap_tagging_upgrade_timer_sec = timed_upgrade;
-      level = M_HEAP_TAGGING_LEVEL_SYNC;
-    } else if (level != M_HEAP_TAGGING_LEVEL_SYNC) {
-      async_safe_format_log(
-          ANDROID_LOG_ERROR, "libc",
-          "Requested timed MTE upgrade from invalid %s to sync. Ignoring.",
-          DescribeTaggingLevel(level));
-    }
-  }
-  if (level == M_HEAP_TAGGING_LEVEL_SYNC || level == M_HEAP_TAGGING_LEVEL_ASYNC) {
-    unsigned long prctl_arg = PR_TAGGED_ADDR_ENABLE | PR_MTE_TAG_SET_NONZERO;
-    prctl_arg |= (level == M_HEAP_TAGGING_LEVEL_SYNC) ? PR_MTE_TCF_SYNC : PR_MTE_TCF_ASYNC;
-
-    // When entering ASYNC mode, specify that we want to allow upgrading to SYNC by OR'ing in the
-    // SYNC flag. But if the kernel doesn't support specifying multiple TCF modes, fall back to
-    // specifying a single mode.
-    if (prctl(PR_SET_TAGGED_ADDR_CTRL, prctl_arg | PR_MTE_TCF_SYNC, 0, 0, 0) == 0 ||
-        prctl(PR_SET_TAGGED_ADDR_CTRL, prctl_arg, 0, 0, 0) == 0) {
-      __libc_shared_globals()->initial_heap_tagging_level = level;
-      __libc_shared_globals()->initial_memtag_stack = memtag_stack;
-
-      if (memtag_stack) {
-        void* pg_start =
-            reinterpret_cast<void*>(page_start(reinterpret_cast<uintptr_t>(stack_top)));
-        if (mprotect(pg_start, page_size(), PROT_READ | PROT_WRITE | PROT_MTE | PROT_GROWSDOWN)) {
-          async_safe_fatal("error: failed to set PROT_MTE on main thread stack: %m");
-        }
-      }
-      struct sigaction action = {};
-      action.sa_flags = SA_SIGINFO | SA_RESTART;
-      action.sa_sigaction = __enable_mte_signal_handler;
-      sigaction(BIONIC_ENABLE_MTE, &action, nullptr);
-      return;
-    }
-  }
-
-  // MTE was either not enabled, or wasn't supported on this device. Try and use
-  // TBI.
-  if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == 0) {
-    __libc_shared_globals()->initial_heap_tagging_level = M_HEAP_TAGGING_LEVEL_TBI;
-  }
-  // We did not enable MTE, so we do not need to arm the upgrade timer.
-  __libc_shared_globals()->heap_tagging_upgrade_timer_sec = 0;
-  // We also didn't enable memtag_stack.
-  __libc_shared_globals()->initial_memtag_stack = false;
-}
-#else   // __aarch64__
-void __libc_init_mte(const memtag_dynamic_entries_t*, const void*, size_t, uintptr_t, void*) {}
-#endif  // __aarch64__
-
 void __libc_init_profiling_handlers() {
   // The dynamic variant of this function is more interesting, but this
   // at least ensures that static binaries aren't killed by the kernel's
@@ -436,7 +182,8 @@
   __libc_init_common();
   __libc_init_mte(/*memtag_dynamic_entries=*/nullptr,
                   reinterpret_cast<ElfW(Phdr)*>(getauxval(AT_PHDR)), getauxval(AT_PHNUM),
-                  /*load_bias = */ 0, /*stack_top = */ raw_args);
+                  /*load_bias = */ 0);
+  __libc_init_mte_stack(/*stack_top = */ raw_args);
   __libc_init_scudo();
   __libc_init_profiling_handlers();
   __libc_init_fork_handler();
@@ -508,6 +255,6 @@
 // compiled with -ffreestanding to avoid implicit string.h function calls. (It shouldn't strictly
 // be necessary, though.)
 __LIBC_HIDDEN__ libc_shared_globals* __libc_shared_globals() {
-  static libc_shared_globals globals;
+  BIONIC_USED_BEFORE_LINKER_RELOCATES static libc_shared_globals globals;
   return &globals;
 }
diff --git a/libc/bionic/malloc_common.cpp b/libc/bionic/malloc_common.cpp
index 596a1fc..441d884 100644
--- a/libc/bionic/malloc_common.cpp
+++ b/libc/bionic/malloc_common.cpp
@@ -332,44 +332,6 @@
 #endif
 // =============================================================================
 
-// =============================================================================
-// Platform-internal mallopt variant.
-// =============================================================================
-#if defined(LIBC_STATIC)
-extern "C" bool android_mallopt(int opcode, void* arg, size_t arg_size) {
-  if (opcode == M_SET_ALLOCATION_LIMIT_BYTES) {
-    return LimitEnable(arg, arg_size);
-  }
-  if (opcode == M_INITIALIZE_GWP_ASAN) {
-    if (arg == nullptr || arg_size != sizeof(android_mallopt_gwp_asan_options_t)) {
-      errno = EINVAL;
-      return false;
-    }
-
-    return EnableGwpAsan(*reinterpret_cast<android_mallopt_gwp_asan_options_t*>(arg));
-  }
-  if (opcode == M_MEMTAG_STACK_IS_ON) {
-    if (arg == nullptr || arg_size != sizeof(bool)) {
-      errno = EINVAL;
-      return false;
-    }
-    *reinterpret_cast<bool*>(arg) = atomic_load(&__libc_memtag_stack);
-    return true;
-  }
-  if (opcode == M_GET_DECAY_TIME_ENABLED) {
-    if (arg == nullptr || arg_size != sizeof(bool)) {
-      errno = EINVAL;
-      return false;
-    }
-    *reinterpret_cast<bool*>(arg) = atomic_load(&__libc_globals->decay_time_enabled);
-    return true;
-  }
-  errno = ENOTSUP;
-  return false;
-}
-#endif
-// =============================================================================
-
 static constexpr MallocDispatch __libc_malloc_default_dispatch __attribute__((unused)) = {
   Malloc(calloc),
   Malloc(free),
diff --git a/libc/bionic/malloc_common_dynamic.cpp b/libc/bionic/malloc_common_dynamic.cpp
index 6db6251..7b6d7d4 100644
--- a/libc/bionic/malloc_common_dynamic.cpp
+++ b/libc/bionic/malloc_common_dynamic.cpp
@@ -80,7 +80,7 @@
 
 _Atomic bool gGlobalsMutating = false;
 
-static bool gZygoteChild = false;
+bool gZygoteChild = false;
 
 // In a Zygote child process, this is set to true if profiling of this process
 // is allowed. Note that this is set at a later time than gZygoteChild. The
@@ -89,7 +89,7 @@
 // domains if applicable). These two flags are read by the
 // BIONIC_SIGNAL_PROFILER handler, which does nothing if the process is not
 // profileable.
-static _Atomic bool gZygoteChildProfileable = false;
+_Atomic bool gZygoteChildProfileable = false;
 
 // =============================================================================
 
@@ -471,93 +471,6 @@
 }
 // =============================================================================
 
-// =============================================================================
-// Platform-internal mallopt variant.
-// =============================================================================
-__BIONIC_WEAK_FOR_NATIVE_BRIDGE
-extern "C" bool android_mallopt(int opcode, void* arg, size_t arg_size) {
-  if (opcode == M_SET_ZYGOTE_CHILD) {
-    if (arg != nullptr || arg_size != 0) {
-      errno = EINVAL;
-      return false;
-    }
-    gZygoteChild = true;
-    return true;
-  }
-  if (opcode == M_INIT_ZYGOTE_CHILD_PROFILING) {
-    if (arg != nullptr || arg_size != 0) {
-      errno = EINVAL;
-      return false;
-    }
-    atomic_store_explicit(&gZygoteChildProfileable, true, memory_order_release);
-    // Also check if heapprofd should start profiling from app startup.
-    HeapprofdInitZygoteChildProfiling();
-    return true;
-  }
-  if (opcode == M_GET_PROCESS_PROFILEABLE) {
-    if (arg == nullptr || arg_size != sizeof(bool)) {
-      errno = EINVAL;
-      return false;
-    }
-    // Native processes are considered profileable. Zygote children are considered
-    // profileable only when appropriately tagged.
-    *reinterpret_cast<bool*>(arg) =
-        !gZygoteChild || atomic_load_explicit(&gZygoteChildProfileable, memory_order_acquire);
-    return true;
-  }
-  if (opcode == M_SET_ALLOCATION_LIMIT_BYTES) {
-    return LimitEnable(arg, arg_size);
-  }
-  if (opcode == M_WRITE_MALLOC_LEAK_INFO_TO_FILE) {
-    if (arg == nullptr || arg_size != sizeof(FILE*)) {
-      errno = EINVAL;
-      return false;
-    }
-    return WriteMallocLeakInfo(reinterpret_cast<FILE*>(arg));
-  }
-  if (opcode == M_GET_MALLOC_LEAK_INFO) {
-    if (arg == nullptr || arg_size != sizeof(android_mallopt_leak_info_t)) {
-      errno = EINVAL;
-      return false;
-    }
-    return GetMallocLeakInfo(reinterpret_cast<android_mallopt_leak_info_t*>(arg));
-  }
-  if (opcode == M_FREE_MALLOC_LEAK_INFO) {
-    if (arg == nullptr || arg_size != sizeof(android_mallopt_leak_info_t)) {
-      errno = EINVAL;
-      return false;
-    }
-    return FreeMallocLeakInfo(reinterpret_cast<android_mallopt_leak_info_t*>(arg));
-  }
-  if (opcode == M_INITIALIZE_GWP_ASAN) {
-    if (arg == nullptr || arg_size != sizeof(android_mallopt_gwp_asan_options_t)) {
-      errno = EINVAL;
-      return false;
-    }
-
-    return EnableGwpAsan(*reinterpret_cast<android_mallopt_gwp_asan_options_t*>(arg));
-  }
-  if (opcode == M_MEMTAG_STACK_IS_ON) {
-    if (arg == nullptr || arg_size != sizeof(bool)) {
-      errno = EINVAL;
-      return false;
-    }
-    *reinterpret_cast<bool*>(arg) = atomic_load(&__libc_memtag_stack);
-    return true;
-  }
-  if (opcode == M_GET_DECAY_TIME_ENABLED) {
-    if (arg == nullptr || arg_size != sizeof(bool)) {
-      errno = EINVAL;
-      return false;
-    }
-    *reinterpret_cast<bool*>(arg) = atomic_load(&__libc_globals->decay_time_enabled);
-    return true;
-  }
-  // Try heapprofd's mallopt, as it handles options not covered here.
-  return HeapprofdMallopt(opcode, arg, arg_size);
-}
-// =============================================================================
-
 #if !defined(__LP64__) && defined(__arm__)
 // =============================================================================
 // Old platform only functions that some old 32 bit apps are still using.
diff --git a/libc/bionic/pthread_create.cpp b/libc/bionic/pthread_create.cpp
index ba20c51..54bfa20 100644
--- a/libc/bionic/pthread_create.cpp
+++ b/libc/bionic/pthread_create.cpp
@@ -351,15 +351,20 @@
 
 extern "C" int __rt_sigprocmask(int, const sigset64_t*, sigset64_t*, size_t);
 
-__attribute__((no_sanitize("hwaddress")))
+__attribute__((no_sanitize("hwaddress", "memtag")))
 #if defined(__aarch64__)
 // This function doesn't return, but it does appear in stack traces. Avoid using return PAC in this
 // function because we may end up resetting IA, which may confuse unwinders due to mismatching keys.
 __attribute__((target("branch-protection=bti")))
 #endif
-static int __pthread_start(void* arg) {
+static int
+__pthread_start(void* arg) {
   pthread_internal_t* thread = reinterpret_cast<pthread_internal_t*>(arg);
-
+#if defined(__aarch64__)
+  if (thread->should_allocate_stack_mte_ringbuffer) {
+    thread->bionic_tcb->tls_slot(TLS_SLOT_STACK_MTE) = __allocate_stack_mte_ringbuffer(0, thread);
+  }
+#endif
   __hwasan_thread_enter();
 
   // Wait for our creating thread to release us. This lets it have time to
@@ -450,9 +455,9 @@
 // This has to be done under g_thread_creation_lock or g_thread_list_lock to avoid racing with
 // __pthread_internal_remap_stack_with_mte.
 #ifdef __aarch64__
-  if (__libc_memtag_stack_abi) {
-    tcb->tls_slot(TLS_SLOT_STACK_MTE) = __allocate_stack_mte_ringbuffer(0, thread);
-  }
+  thread->should_allocate_stack_mte_ringbuffer = __libc_memtag_stack_abi;
+#else
+  thread->should_allocate_stack_mte_ringbuffer = false;
 #endif
 
   sigset64_t block_all_mask;
diff --git a/libc/bionic/pthread_exit.cpp b/libc/bionic/pthread_exit.cpp
index 0181aba..27d05c2 100644
--- a/libc/bionic/pthread_exit.cpp
+++ b/libc/bionic/pthread_exit.cpp
@@ -33,10 +33,11 @@
 #include <string.h>
 #include <sys/mman.h>
 
-#include "private/bionic_constants.h"
-#include "private/bionic_defs.h"
+#include "platform/bionic/mte.h"
 #include "private/ScopedRWLock.h"
 #include "private/ScopedSignalBlocker.h"
+#include "private/bionic_constants.h"
+#include "private/bionic_defs.h"
 #include "pthread_internal.h"
 
 extern "C" __noreturn void _exit_with_stack_teardown(void*, size_t);
@@ -67,7 +68,7 @@
 }
 
 __BIONIC_WEAK_FOR_NATIVE_BRIDGE
-void pthread_exit(void* return_value) {
+__attribute__((no_sanitize("memtag"))) void pthread_exit(void* return_value) {
   // Call dtors for thread_local objects first.
   __cxa_thread_finalize();
 
@@ -138,6 +139,13 @@
   __notify_thread_exit_callbacks();
   __hwasan_thread_exit();
 
+#if defined(__aarch64__)
+  if (void* stack_mte_tls = thread->bionic_tcb->tls_slot(TLS_SLOT_STACK_MTE)) {
+    stack_mte_free_ringbuffer(reinterpret_cast<uintptr_t>(stack_mte_tls));
+  }
+#endif
+  // Everything below this line needs to be no_sanitize("memtag").
+
   if (old_state == THREAD_DETACHED && thread->mmap_size != 0) {
     // We need to free mapped space for detached threads when they exit.
     // That's not something we can do in C.
diff --git a/libc/bionic/pthread_internal.cpp b/libc/bionic/pthread_internal.cpp
index 14cc7da..c6426ed 100644
--- a/libc/bionic/pthread_internal.cpp
+++ b/libc/bionic/pthread_internal.cpp
@@ -34,6 +34,7 @@
 #include <string.h>
 #include <sys/mman.h>
 #include <sys/prctl.h>
+#include <sys/types.h>
 
 #include <async_safe/log.h>
 #include <bionic/mte.h>
@@ -76,15 +77,6 @@
 }
 
 static void __pthread_internal_free(pthread_internal_t* thread) {
-#ifdef __aarch64__
-  if (void* stack_mte_tls = thread->bionic_tcb->tls_slot(TLS_SLOT_STACK_MTE)) {
-    size_t size =
-        stack_mte_ringbuffer_size_from_pointer(reinterpret_cast<uintptr_t>(stack_mte_tls));
-    void* ptr = reinterpret_cast<void*>(reinterpret_cast<uintptr_t>(stack_mte_tls) &
-                                        ((1ULL << 56ULL) - 1ULL));
-    munmap(ptr, size);
-  }
-#endif
   if (thread->mmap_size != 0) {
     // Free mapped space, including thread stack and pthread_internal_t.
     munmap(thread->mmap_base, thread->mmap_size);
@@ -216,7 +208,10 @@
   __libc_memtag_stack_abi = true;
 
   for (pthread_internal_t* t = g_thread_list; t != nullptr; t = t->next) {
-    if (t->terminating) continue;
+    // should_allocate_stack_mte_ringbuffer indicates the thread is already
+    // aware that this process requires stack MTE, and will allocate the
+    // ring buffer in __pthread_start.
+    if (t->terminating || t->should_allocate_stack_mte_ringbuffer) continue;
     t->bionic_tcb->tls_slot(TLS_SLOT_STACK_MTE) =
         __allocate_stack_mte_ringbuffer(0, t->is_main() ? nullptr : t);
   }
diff --git a/libc/bionic/pthread_internal.h b/libc/bionic/pthread_internal.h
index 5db42ab..cbaa9a6 100644
--- a/libc/bionic/pthread_internal.h
+++ b/libc/bionic/pthread_internal.h
@@ -181,6 +181,7 @@
 
   bionic_tcb* bionic_tcb;
   char stack_mte_ringbuffer_vma_name_buffer[32];
+  bool should_allocate_stack_mte_ringbuffer;
 
   bool is_main() { return start_routine == nullptr; }
 };
diff --git a/libc/include/android/versioning.h b/libc/include/android/versioning.h
index 1676a72..1cf6e51 100644
--- a/libc/include/android/versioning.h
+++ b/libc/include/android/versioning.h
@@ -80,10 +80,3 @@
 #define __INTRODUCED_IN_32(api_level)
 #define __INTRODUCED_IN_64(api_level) __BIONIC_AVAILABILITY(introduced=api_level)
 #endif
-
-// Vendor and product modules do not follow SDK versioning. Ignore NDK guards for these modules.
-#if defined(__ANDROID_VNDK__)
-#undef __BIONIC_AVAILABILITY
-#define __BIONIC_AVAILABILITY(api_level, ...)
-#define __BIONIC_AVAILABILITY_GUARD(api_level) 1
-#endif // defined(__ANDROID_VNDK__)
diff --git a/libc/include/ctype.h b/libc/include/ctype.h
index cb926a4..dc3f673 100644
--- a/libc/include/ctype.h
+++ b/libc/include/ctype.h
@@ -95,7 +95,7 @@
 
 /** Internal implementation detail. Do not use. */
 __attribute__((__no_sanitize__("unsigned-integer-overflow")))
-static inline int __bionic_ctype_in_range(unsigned __lo, int __ch, unsigned __hi) {
+__BIONIC_CTYPE_INLINE int __bionic_ctype_in_range(unsigned __lo, int __ch, unsigned __hi) {
   return (__BIONIC_CAST(static_cast, unsigned, __ch) - __lo) < (__hi - __lo + 1);
 }
 
diff --git a/libc/include/malloc.h b/libc/include/malloc.h
index dc2ca2b..ac27467 100644
--- a/libc/include/malloc.h
+++ b/libc/include/malloc.h
@@ -89,9 +89,9 @@
  */
 #if __ANDROID_API__ >= 29
 __nodiscard void* _Nullable reallocarray(void* _Nullable __ptr, size_t __item_count, size_t __item_size) __BIONIC_ALLOC_SIZE(2, 3) __INTRODUCED_IN(29);
-#else
+#elif defined(__ANDROID_UNAVAILABLE_SYMBOLS_ARE_WEAK__)
 #include <errno.h>
-static __inline __nodiscard void* _Nullable reallocarray(void* _Nullable __ptr, size_t __item_count, size_t __item_size) {
+static __inline __nodiscard void* _Nullable reallocarray(void* _Nullable __ptr, size_t __item_count, size_t __item_size) __BIONIC_ALLOC_SIZE(2, 3) {
   size_t __new_size;
   if (__builtin_mul_overflow(__item_count, __item_size, &__new_size)) {
     errno = ENOMEM;
diff --git a/libc/kernel/tools/defaults.py b/libc/kernel/tools/defaults.py
index 06afb25..a71318e 100644
--- a/libc/kernel/tools/defaults.py
+++ b/libc/kernel/tools/defaults.py
@@ -133,6 +133,9 @@
           # These are required to support the above functions.
           "__fswahw32",
           "__fswahb32",
+          # As are these, for ILP32.
+          "__arch_swab32",
+          "__arch_swab64",
           # This is used by various macros in <linux/ioprio.h>.
           "ioprio_value",
 
diff --git a/libc/kernel/uapi/asm-arm/asm/swab.h b/libc/kernel/uapi/asm-arm/asm/swab.h
index 7684c22..3fff953 100644
--- a/libc/kernel/uapi/asm-arm/asm/swab.h
+++ b/libc/kernel/uapi/asm-arm/asm/swab.h
@@ -11,7 +11,18 @@
 #ifndef __STRICT_ANSI__
 #define __SWAB_64_THRU_32__
 #endif
+static inline __attribute__((__const__)) __u32 __arch_swab32(__u32 x) {
+  __u32 t;
 #ifndef __thumb__
+  if(! __builtin_constant_p(x)) {
+    asm("eor\t%0, %1, %1, ror #16" : "=r" (t) : "r" (x));
+  } else
 #endif
+  t = x ^ ((x << 16) | (x >> 16));
+  x = (x << 24) | (x >> 8);
+  t &= ~0x00FF0000;
+  x ^= (t >> 8);
+  return x;
+}
 #define __arch_swab32 __arch_swab32
 #endif
diff --git a/libc/kernel/uapi/asm-x86/asm/swab.h b/libc/kernel/uapi/asm-x86/asm/swab.h
index 31c850d..ce43658 100644
--- a/libc/kernel/uapi/asm-x86/asm/swab.h
+++ b/libc/kernel/uapi/asm-x86/asm/swab.h
@@ -8,9 +8,27 @@
 #define _ASM_X86_SWAB_H
 #include <linux/types.h>
 #include <linux/compiler.h>
+static inline __attribute__((__const__)) __u32 __arch_swab32(__u32 val) {
+  asm("bswapl %0" : "=r" (val) : "0" (val));
+  return val;
+}
 #define __arch_swab32 __arch_swab32
+static inline __attribute__((__const__)) __u64 __arch_swab64(__u64 val) {
 #ifdef __i386__
+  union {
+    struct {
+      __u32 a;
+      __u32 b;
+    } s;
+    __u64 u;
+  } v;
+  v.u = val;
+  asm("bswapl %0; bswapl %1; xchgl %0,%1" : "=r" (v.s.a), "=r" (v.s.b) : "0" (v.s.a), "1" (v.s.b));
+  return v.u;
 #else
+  asm("bswapq %0" : "=r" (val) : "0" (val));
+  return val;
 #endif
+}
 #define __arch_swab64 __arch_swab64
 #endif
diff --git a/libc/platform/bionic/macros.h b/libc/platform/bionic/macros.h
index 93268c1..b2d6f96 100644
--- a/libc/platform/bionic/macros.h
+++ b/libc/platform/bionic/macros.h
@@ -97,3 +97,26 @@
 static inline T* _Nonnull untag_address(T* _Nonnull p) {
   return reinterpret_cast<T*>(untag_address(reinterpret_cast<uintptr_t>(p)));
 }
+
+// MTE globals protects internal and external global variables. One of the main
+// things that MTE globals does is force all global variable accesses to go
+// through the GOT. In the linker though, some global variables are accessed (or
+// address-taken) prior to relocations being processed. Because relocations
+// haven't run yet, the GOT entry hasn't been populated, and this leads to
+// crashes. Thus, any globals used by the linker prior to relocation should be
+// annotated with this attribute, which suppresses tagging of this global
+// variable, restoring the pc-relative address computation.
+//
+// A way to find global variables that need this attribute is to build the
+// linker/libc with `SANITIZE_TARGET=memtag_globals`, push them onto a device
+// (it doesn't have to be MTE capable), and then run an executable using
+// LD_LIBRARY_PATH and using the linker in interpreter mode (e.g.
+// `LD_LIBRARY_PATH=/data/tmp/ /data/tmp/linker64 /data/tmp/my_binary`). A
+// good heuristic is that the global variable is in a file that should be
+// compiled with `-ffreestanding` (but there are global variables there that
+// don't need this attribute).
+#if __has_feature(memtag_globals)
+#define BIONIC_USED_BEFORE_LINKER_RELOCATES __attribute__((no_sanitize("memtag")))
+#else  // __has_feature(memtag_globals)
+#define BIONIC_USED_BEFORE_LINKER_RELOCATES
+#endif  // __has_feature(memtag_globals)
diff --git a/libc/platform/bionic/mte.h b/libc/platform/bionic/mte.h
index 98b3d27..610cb45 100644
--- a/libc/platform/bionic/mte.h
+++ b/libc/platform/bionic/mte.h
@@ -28,6 +28,7 @@
 
 #pragma once
 
+#include <stddef.h>
 #include <sys/auxv.h>
 #include <sys/mman.h>
 #include <sys/prctl.h>
@@ -49,6 +50,36 @@
   return supported;
 }
 
+inline void* get_tagged_address(const void* ptr) {
+#if defined(__aarch64__)
+  if (mte_supported()) {
+    __asm__ __volatile__(".arch_extension mte; ldg %0, [%0]" : "+r"(ptr));
+  }
+#endif  // aarch64
+  return const_cast<void*>(ptr);
+}
+
+// Inserts a random tag into `ptr`, using any of the set lower 16 bits in
+// `mask` to exclude the corresponding tag from being generated. Note: This does
+// not tag memory. This generates a pointer to be used with set_memory_tag.
+inline void* insert_random_tag(const void* ptr, __attribute__((unused)) uint64_t mask = 0) {
+#if defined(__aarch64__)
+  if (mte_supported() && ptr) {
+    __asm__ __volatile__(".arch_extension mte; irg %0, %0, %1" : "+r"(ptr) : "r"(mask));
+  }
+#endif  // aarch64
+  return const_cast<void*>(ptr);
+}
+
+// Stores the address tag in `ptr` to memory, at `ptr`.
+inline void set_memory_tag(__attribute__((unused)) void* ptr) {
+#if defined(__aarch64__)
+  if (mte_supported()) {
+    __asm__ __volatile__(".arch_extension mte; stg %0, [%0]" : "+r"(ptr));
+  }
+#endif  // aarch64
+}
+
 #ifdef __aarch64__
 class ScopedDisableMTE {
   size_t prev_tco_;
@@ -86,6 +117,12 @@
   return ptr | ((1ULL << size_cls) << 56ULL);
 }
 
+inline void stack_mte_free_ringbuffer(uintptr_t stack_mte_tls) {
+  size_t size = stack_mte_ringbuffer_size_from_pointer(stack_mte_tls);
+  void* ptr = reinterpret_cast<void*>(stack_mte_tls & ((1ULL << 56ULL) - 1ULL));
+  munmap(ptr, size);
+}
+
 inline void* stack_mte_ringbuffer_allocate(size_t n, const char* name) {
   if (n > 7) return nullptr;
   // Allocation needs to be aligned to 2*size to make the fancy code-gen work.
diff --git a/libc/private/bionic_constants.h b/libc/private/bionic_constants.h
index 6274fe2..ce484d8 100644
--- a/libc/private/bionic_constants.h
+++ b/libc/private/bionic_constants.h
@@ -16,6 +16,7 @@
 
 #pragma once
 
+#define US_PER_S 1'000'000LL
 #define NS_PER_S 1'000'000'000LL
 
 // Size of the shadow call stack. This can be small because these stacks only
diff --git a/libc/private/bionic_globals.h b/libc/private/bionic_globals.h
index a1bebda..cd6dca9 100644
--- a/libc/private/bionic_globals.h
+++ b/libc/private/bionic_globals.h
@@ -157,6 +157,10 @@
 };
 
 __LIBC_HIDDEN__ libc_shared_globals* __libc_shared_globals();
+__LIBC_HIDDEN__ bool __libc_mte_enabled();
+__LIBC_HIDDEN__ void __libc_init_mte(const memtag_dynamic_entries_t*, const void*, size_t,
+                                     uintptr_t);
+__LIBC_HIDDEN__ void __libc_init_mte_stack(void*);
 __LIBC_HIDDEN__ void __libc_init_fdsan();
 __LIBC_HIDDEN__ void __libc_init_fdtrack();
 __LIBC_HIDDEN__ void __libc_init_profiling_handlers();
diff --git a/libc/private/bionic_time_conversions.h b/libc/private/bionic_time_conversions.h
index c6b3c78..ce7de0d 100644
--- a/libc/private/bionic_time_conversions.h
+++ b/libc/private/bionic_time_conversions.h
@@ -26,8 +26,7 @@
  * SUCH DAMAGE.
  */
 
-#ifndef _BIONIC_TIME_CONVERSIONS_H
-#define _BIONIC_TIME_CONVERSIONS_H
+#pragma once
 
 #include <errno.h>
 #include <time.h>
@@ -35,20 +34,21 @@
 
 #include "private/bionic_constants.h"
 
-__BEGIN_DECLS
+bool timespec_from_timeval(timespec& ts, const timeval& tv);
+void timespec_from_ms(timespec& ts, const int ms);
 
-__LIBC_HIDDEN__ bool timespec_from_timeval(timespec& ts, const timeval& tv);
-__LIBC_HIDDEN__ void timespec_from_ms(timespec& ts, const int ms);
+void timeval_from_timespec(timeval& tv, const timespec& ts);
 
-__LIBC_HIDDEN__ void timeval_from_timespec(timeval& tv, const timespec& ts);
+void monotonic_time_from_realtime_time(timespec& monotonic_time, const timespec& realtime_time);
+void realtime_time_from_monotonic_time(timespec& realtime_time, const timespec& monotonic_time);
 
-__LIBC_HIDDEN__ void monotonic_time_from_realtime_time(timespec& monotonic_time,
-                                                       const timespec& realtime_time);
+static inline int64_t to_ns(const timespec& ts) {
+  return ts.tv_sec * NS_PER_S + ts.tv_nsec;
+}
 
-__LIBC_HIDDEN__ void realtime_time_from_monotonic_time(timespec& realtime_time,
-                                                       const timespec& monotonic_time);
-
-__END_DECLS
+static inline int64_t to_us(const timeval& tv) {
+  return tv.tv_sec * US_PER_S + tv.tv_usec;
+}
 
 static inline int check_timespec(const timespec* ts, bool null_allowed) {
   if (null_allowed && ts == nullptr) {
@@ -76,5 +76,3 @@
   }
 }
 #endif
-
-#endif
diff --git a/linker/Android.bp b/linker/Android.bp
index a06ca29..847a9b2 100644
--- a/linker/Android.bp
+++ b/linker/Android.bp
@@ -108,6 +108,12 @@
 
     // We need to access Bionic private headers in the linker.
     include_dirs: ["bionic/libc"],
+
+    sanitize: {
+        // Supporting memtag_globals in the linker would be tricky,
+        // because it relocates itself very early.
+        memtag_globals: false,
+    },
 }
 
 // ========================================================
diff --git a/linker/dlfcn.cpp b/linker/dlfcn.cpp
index fee19f4..82f2728 100644
--- a/linker/dlfcn.cpp
+++ b/linker/dlfcn.cpp
@@ -331,6 +331,7 @@
     __libdl_info->gnu_bloom_filter_ = linker_si.gnu_bloom_filter_;
     __libdl_info->gnu_bucket_ = linker_si.gnu_bucket_;
     __libdl_info->gnu_chain_ = linker_si.gnu_chain_;
+    __libdl_info->memtag_dynamic_entries_ = linker_si.memtag_dynamic_entries_;
 
     __libdl_info->ref_count_ = 1;
     __libdl_info->strtab_size_ = linker_si.strtab_size_;
diff --git a/linker/linker.cpp b/linker/linker.cpp
index fe7d348..8f78915 100644
--- a/linker/linker.cpp
+++ b/linker/linker.cpp
@@ -51,6 +51,7 @@
 #include <android-base/scopeguard.h>
 #include <async_safe/log.h>
 #include <bionic/pthread_internal.h>
+#include <platform/bionic/mte.h>
 
 // Private C library headers.
 
@@ -1697,11 +1698,19 @@
     }
   }
 
+  // The WebView loader uses RELRO sharing in order to promote page sharing of the large RELRO
+  // segment, as it's full of C++ vtables. Because MTE globals, by default, applies random tags to
+  // each global variable, the RELRO segment is polluted and unique for each process. In order to
+  // allow sharing, but still provide some protection, we use deterministic global tagging schemes
+  // for DSOs that are loaded through android_dlopen_ext, such as those loaded by WebView.
+  bool dlext_use_relro =
+      extinfo && extinfo->flags & (ANDROID_DLEXT_WRITE_RELRO | ANDROID_DLEXT_USE_RELRO);
+
   // Step 3: pre-link all DT_NEEDED libraries in breadth first order.
   bool any_memtag_stack = false;
   for (auto&& task : load_tasks) {
     soinfo* si = task->get_soinfo();
-    if (!si->is_linked() && !si->prelink_image()) {
+    if (!si->is_linked() && !si->prelink_image(dlext_use_relro)) {
       return false;
     }
     // si->memtag_stack() needs to be called after si->prelink_image() which populates
@@ -1720,7 +1729,7 @@
     } else {
       // find_library is used by the initial linking step, so we communicate that we
       // want memtag_stack enabled to __libc_init_mte.
-      __libc_shared_globals()->initial_memtag_stack = true;
+      __libc_shared_globals()->initial_memtag_stack_abi = true;
     }
   }
 
@@ -2361,7 +2370,7 @@
         void* tls_block = get_tls_block_for_this_thread(tls_module, /*should_alloc=*/true);
         *symbol = static_cast<char*>(tls_block) + sym->st_value;
       } else {
-        *symbol = reinterpret_cast<void*>(found->resolve_symbol_address(sym));
+        *symbol = get_tagged_address(reinterpret_cast<void*>(found->resolve_symbol_address(sym)));
       }
       failure_guard.Disable();
       LD_LOG(kLogDlsym,
@@ -2791,15 +2800,25 @@
   return true;
 }
 
-static void apply_relr_reloc(ElfW(Addr) offset, ElfW(Addr) load_bias) {
-  ElfW(Addr) address = offset + load_bias;
-  *reinterpret_cast<ElfW(Addr)*>(address) += load_bias;
+static void apply_relr_reloc(ElfW(Addr) offset, ElfW(Addr) load_bias, bool has_memtag_globals) {
+  ElfW(Addr) destination = offset + load_bias;
+  if (!has_memtag_globals) {
+    *reinterpret_cast<ElfW(Addr)*>(destination) += load_bias;
+    return;
+  }
+
+  ElfW(Addr)* tagged_destination =
+      reinterpret_cast<ElfW(Addr)*>(get_tagged_address(reinterpret_cast<void*>(destination)));
+  ElfW(Addr) tagged_value = reinterpret_cast<ElfW(Addr)>(
+      get_tagged_address(reinterpret_cast<void*>(*tagged_destination + load_bias)));
+  *tagged_destination = tagged_value;
 }
 
 // Process relocations in SHT_RELR section (experimental).
 // Details of the encoding are described in this post:
 //   https://groups.google.com/d/msg/generic-abi/bX460iggiKg/Pi9aSwwABgAJ
-bool relocate_relr(const ElfW(Relr)* begin, const ElfW(Relr)* end, ElfW(Addr) load_bias) {
+bool relocate_relr(const ElfW(Relr) * begin, const ElfW(Relr) * end, ElfW(Addr) load_bias,
+                   bool has_memtag_globals) {
   constexpr size_t wordsize = sizeof(ElfW(Addr));
 
   ElfW(Addr) base = 0;
@@ -2810,7 +2829,7 @@
     if ((entry&1) == 0) {
       // Even entry: encodes the offset for next relocation.
       offset = static_cast<ElfW(Addr)>(entry);
-      apply_relr_reloc(offset, load_bias);
+      apply_relr_reloc(offset, load_bias, has_memtag_globals);
       // Set base offset for subsequent bitmap entries.
       base = offset + wordsize;
       continue;
@@ -2821,7 +2840,7 @@
     while (entry != 0) {
       entry >>= 1;
       if ((entry&1) != 0) {
-        apply_relr_reloc(offset, load_bias);
+        apply_relr_reloc(offset, load_bias, has_memtag_globals);
       }
       offset += wordsize;
     }
@@ -2836,7 +2855,7 @@
 // An empty list of soinfos
 static soinfo_list_t g_empty_list;
 
-bool soinfo::prelink_image() {
+bool soinfo::prelink_image(bool dlext_use_relro) {
   if (flags_ & FLAG_PRELINKED) return true;
   /* Extract dynamic section */
   ElfW(Word) dynamic_flags = 0;
@@ -3325,6 +3344,18 @@
   // it each time we look up a symbol with a version.
   if (!validate_verdef_section(this)) return false;
 
+  // MTE globals requires remapping data segments with PROT_MTE as anonymous mappings, because file
+  // based mappings may not be backed by tag-capable memory (see "MAP_ANONYMOUS" on
+  // https://www.kernel.org/doc/html/latest/arch/arm64/memory-tagging-extension.html). This is only
+  // done if the binary has MTE globals (evidenced by the dynamic table entries), as it destroys
+  // page sharing. It's also only done on devices that support MTE, because the act of remapping
+  // pages is unnecessary on non-MTE devices (where we might still run MTE-globals enabled code).
+  if (should_tag_memtag_globals() &&
+      remap_memtag_globals_segments(phdr, phnum, base) == 0) {
+    tag_globals(dlext_use_relro);
+    protect_memtag_globals_ro_segments(phdr, phnum, base);
+  }
+
   flags_ |= FLAG_PRELINKED;
   return true;
 }
@@ -3397,6 +3428,13 @@
     return false;
   }
 
+  if (should_tag_memtag_globals()) {
+    std::list<std::string>* vma_names_ptr = vma_names();
+    // should_tag_memtag_globals -> __aarch64__ -> vma_names() != nullptr
+    CHECK(vma_names_ptr);
+    name_memtag_globals_segments(phdr, phnum, base, get_realpath(), vma_names_ptr);
+  }
+
   /* Handle serializing/sharing the RELRO segment */
   if (extinfo && (extinfo->flags & ANDROID_DLEXT_WRITE_RELRO)) {
     if (phdr_table_serialize_gnu_relro(phdr, phnum, load_bias,
@@ -3435,6 +3473,54 @@
   return true;
 }
 
+// https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst#global-variable-tagging
+void soinfo::tag_globals(bool dlext_use_relro) {
+  if (is_linked()) return;
+  if (flags_ & FLAG_GLOBALS_TAGGED) return;
+  flags_ |= FLAG_GLOBALS_TAGGED;
+
+  constexpr size_t kTagGranuleSize = 16;
+  const uint8_t* descriptor_stream = reinterpret_cast<const uint8_t*>(memtag_globals());
+
+  if (memtag_globalssz() == 0) {
+    DL_ERR("Invalid memtag descriptor pool size: %zu", memtag_globalssz());
+  }
+
+  uint64_t addr = load_bias;
+  uleb128_decoder decoder(descriptor_stream, memtag_globalssz());
+  // Don't ever generate tag zero, to easily distinguish between tagged and
+  // untagged globals in register/tag dumps.
+  uint64_t last_tag_mask = 1;
+  uint64_t last_tag = 1;
+  constexpr uint64_t kDistanceReservedBits = 3;
+
+  while (decoder.has_bytes()) {
+    uint64_t value = decoder.pop_front();
+    uint64_t distance = (value >> kDistanceReservedBits) * kTagGranuleSize;
+    uint64_t ngranules = value & ((1 << kDistanceReservedBits) - 1);
+    if (ngranules == 0) {
+      ngranules = decoder.pop_front() + 1;
+    }
+
+    addr += distance;
+    void* tagged_addr;
+    if (dlext_use_relro) {
+      tagged_addr = reinterpret_cast<void*>(addr | (last_tag++ << 56));
+      if (last_tag > (1 << kTagGranuleSize)) last_tag = 1;
+    } else {
+      tagged_addr = insert_random_tag(reinterpret_cast<void*>(addr), last_tag_mask);
+      uint64_t tag = (reinterpret_cast<uint64_t>(tagged_addr) >> 56) & 0x0f;
+      last_tag_mask = 1 | (1 << tag);
+    }
+
+    for (size_t k = 0; k < ngranules; k++) {
+      auto* granule = static_cast<uint8_t*>(tagged_addr) + k * kTagGranuleSize;
+      set_memory_tag(static_cast<void*>(granule));
+    }
+    addr += ngranules * kTagGranuleSize;
+  }
+}
+
 static std::vector<android_namespace_t*> init_default_namespace_no_config(bool is_asan, bool is_hwasan) {
   g_default_namespace.set_isolated(false);
   auto default_ld_paths = is_asan ? kAsanDefaultLdPaths : (
diff --git a/linker/linker.h b/linker/linker.h
index ac2222d..b696fd9 100644
--- a/linker/linker.h
+++ b/linker/linker.h
@@ -179,7 +179,8 @@
 int get_application_target_sdk_version();
 ElfW(Versym) find_verdef_version_index(const soinfo* si, const version_info* vi);
 bool validate_verdef_section(const soinfo* si);
-bool relocate_relr(const ElfW(Relr)* begin, const ElfW(Relr)* end, ElfW(Addr) load_bias);
+bool relocate_relr(const ElfW(Relr) * begin, const ElfW(Relr) * end, ElfW(Addr) load_bias,
+                   bool has_memtag_globals);
 
 struct platform_properties {
 #if defined(__aarch64__)
diff --git a/linker/linker_main.cpp b/linker/linker_main.cpp
index 48ed723..f65f82d 100644
--- a/linker/linker_main.cpp
+++ b/linker/linker_main.cpp
@@ -46,6 +46,7 @@
 #include "linker_tls.h"
 #include "linker_utils.h"
 
+#include "platform/bionic/macros.h"
 #include "private/KernelArgumentBlock.h"
 #include "private/bionic_call_ifunc_resolver.h"
 #include "private/bionic_globals.h"
@@ -71,7 +72,9 @@
 static void set_bss_vma_name(soinfo* si);
 
 void __libc_init_mte(const memtag_dynamic_entries_t* memtag_dynamic_entries, const void* phdr_start,
-                     size_t phdr_count, uintptr_t load_bias, void* stack_top);
+                     size_t phdr_count, uintptr_t load_bias);
+
+void __libc_init_mte_stack(void* stack_top);
 
 static void __linker_cannot_link(const char* argv0) {
   __linker_error("CANNOT LINK EXECUTABLE \"%s\": %s", argv0, linker_get_error_buffer());
@@ -365,6 +368,8 @@
   init_link_map_head(*solinker);
 
 #if defined(__aarch64__)
+  __libc_init_mte(somain->memtag_dynamic_entries(), somain->phdr, somain->phnum, somain->load_bias);
+
   if (exe_to_load == nullptr) {
     // Kernel does not add PROT_BTI to executable pages of the loaded ELF.
     // Apply appropriate protections here if it is needed.
@@ -465,8 +470,7 @@
 #if defined(__aarch64__)
   // This has to happen after the find_libraries, which will have collected any possible
   // libraries that request memtag_stack in the dynamic section.
-  __libc_init_mte(somain->memtag_dynamic_entries(), somain->phdr, somain->phnum, somain->load_bias,
-                  args.argv);
+  __libc_init_mte_stack(args.argv);
 #endif
 
   linker_finalize_static_tls();
@@ -625,8 +629,13 @@
     // Apply RELR relocations first so that the GOT is initialized for ifunc
     // resolvers.
     if (relr && relrsz) {
+      // Nothing has tagged the memtag globals here, so there is no point in
+      // handling them: the tags will be zero anyway.
+      // That is moot though, because the linker does not use memtag_globals
+      // in the first place.
       relocate_relr(reinterpret_cast<ElfW(Relr*)>(ehdr + relr),
-                    reinterpret_cast<ElfW(Relr*)>(ehdr + relr + relrsz), ehdr);
+                    reinterpret_cast<ElfW(Relr*)>(ehdr + relr + relrsz), ehdr,
+                    /*has_memtag_globals=*/ false);
     }
     if (pltrel && pltrelsz) {
       call_ifunc_resolvers_for_section(reinterpret_cast<RelType*>(ehdr + pltrel),
@@ -646,6 +655,16 @@
   }
 }
 
+// Remapping MTE globals segments happens before the linker relocates itself, and so can't use
+// memcpy() from string.h. This function is compiled with -ffreestanding.
+void linker_memcpy(void* dst, const void* src, size_t n) {
+  char* dst_bytes = reinterpret_cast<char*>(dst);
+  const char* src_bytes = reinterpret_cast<const char*>(src);
+  for (size_t i = 0; i < n; ++i) {
+    dst_bytes[i] = src_bytes[i];
+  }
+}
+
 // Detect an attempt to run the linker on itself. e.g.:
 //   /system/bin/linker64 /system/bin/linker64
 // Use priority-1 to run this constructor before other constructors.
diff --git a/linker/linker_main.h b/linker/linker_main.h
index 724f43c..ffbcf0f 100644
--- a/linker/linker_main.h
+++ b/linker/linker_main.h
@@ -70,3 +70,5 @@
 soinfo* solist_get_head();
 soinfo* solist_get_somain();
 soinfo* solist_get_vdso();
+
+void linker_memcpy(void* dst, const void* src, size_t n);
diff --git a/linker/linker_phdr.cpp b/linker/linker_phdr.cpp
index 7691031..c37066b 100644
--- a/linker/linker_phdr.cpp
+++ b/linker/linker_phdr.cpp
@@ -37,9 +37,12 @@
 #include <unistd.h>
 
 #include "linker.h"
+#include "linker_debug.h"
 #include "linker_dlwarning.h"
 #include "linker_globals.h"
-#include "linker_debug.h"
+#include "linker_logger.h"
+#include "linker_main.h"
+#include "linker_soinfo.h"
 #include "linker_utils.h"
 
 #include "private/bionic_asm_note.h"
@@ -1172,6 +1175,125 @@
                                    should_use_16kib_app_compat);
 }
 
+static bool segment_needs_memtag_globals_remapping(const ElfW(Phdr) * phdr) {
+  // For now, MTE globals is only supported on writeable data segments.
+  return phdr->p_type == PT_LOAD && !(phdr->p_flags & PF_X) && (phdr->p_flags & PF_W);
+}
+
+/* When MTE globals are requested by the binary, and when the hardware supports
+ * it, remap the executable's PT_LOAD data pages to have PROT_MTE.
+ *
+ * Returns 0 on success, -1 on failure (error code in errno).
+ */
+int remap_memtag_globals_segments(const ElfW(Phdr) * phdr_table __unused,
+                                  size_t phdr_count __unused, ElfW(Addr) load_bias __unused) {
+#if defined(__aarch64__)
+  for (const ElfW(Phdr)* phdr = phdr_table; phdr < phdr_table + phdr_count; phdr++) {
+    if (!segment_needs_memtag_globals_remapping(phdr)) {
+      continue;
+    }
+
+    uintptr_t seg_page_start = page_start(phdr->p_vaddr) + load_bias;
+    uintptr_t seg_page_end = page_end(phdr->p_vaddr + phdr->p_memsz) + load_bias;
+    size_t seg_page_aligned_size = seg_page_end - seg_page_start;
+
+    int prot = PFLAGS_TO_PROT(phdr->p_flags);
+    // For anonymous private mappings, it may be possible to simply mprotect()
+    // the PROT_MTE flag over the top. For file-based mappings, this will fail,
+    // and we'll need to fall back. We also allow PROT_WRITE here to allow
+    // writing memory tags (in `soinfo::tag_globals()`), and set these sections
+    // back to read-only after tags are applied (similar to RELRO).
+    prot |= PROT_MTE;
+    if (mprotect(reinterpret_cast<void*>(seg_page_start), seg_page_aligned_size,
+                 prot | PROT_WRITE) == 0) {
+      continue;
+    }
+
+    void* mapping_copy = mmap(nullptr, seg_page_aligned_size, PROT_READ | PROT_WRITE,
+                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+    linker_memcpy(mapping_copy, reinterpret_cast<void*>(seg_page_start), seg_page_aligned_size);
+
+    void* seg_addr = mmap(reinterpret_cast<void*>(seg_page_start), seg_page_aligned_size,
+                          prot | PROT_WRITE, MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+    if (seg_addr == MAP_FAILED) return -1;
+
+    linker_memcpy(seg_addr, mapping_copy, seg_page_aligned_size);
+    munmap(mapping_copy, seg_page_aligned_size);
+  }
+#endif  // defined(__aarch64__)
+  return 0;
+}
+
+void protect_memtag_globals_ro_segments(const ElfW(Phdr) * phdr_table __unused,
+                                        size_t phdr_count __unused, ElfW(Addr) load_bias __unused) {
+#if defined(__aarch64__)
+  for (const ElfW(Phdr)* phdr = phdr_table; phdr < phdr_table + phdr_count; phdr++) {
+    int prot = PFLAGS_TO_PROT(phdr->p_flags);
+    if (!segment_needs_memtag_globals_remapping(phdr) || (prot & PROT_WRITE)) {
+      continue;
+    }
+
+    prot |= PROT_MTE;
+
+    uintptr_t seg_page_start = page_start(phdr->p_vaddr) + load_bias;
+    uintptr_t seg_page_end = page_end(phdr->p_vaddr + phdr->p_memsz) + load_bias;
+    size_t seg_page_aligned_size = seg_page_end - seg_page_start;
+    mprotect(reinterpret_cast<void*>(seg_page_start), seg_page_aligned_size, prot);
+  }
+#endif  // defined(__aarch64__)
+}
+
+void name_memtag_globals_segments(const ElfW(Phdr) * phdr_table, size_t phdr_count,
+                                  ElfW(Addr) load_bias, const char* soname,
+                                  std::list<std::string>* vma_names) {
+  for (const ElfW(Phdr)* phdr = phdr_table; phdr < phdr_table + phdr_count; phdr++) {
+    if (!segment_needs_memtag_globals_remapping(phdr)) {
+      continue;
+    }
+
+    uintptr_t seg_page_start = page_start(phdr->p_vaddr) + load_bias;
+    uintptr_t seg_page_end = page_end(phdr->p_vaddr + phdr->p_memsz) + load_bias;
+    size_t seg_page_aligned_size = seg_page_end - seg_page_start;
+
+    // For file-based mappings that we're now forcing to be anonymous mappings, set the VMA name to
+    // make debugging easier.
+    // Once we are targeting only devices that run kernel 5.10 or newer (and thus include
+    // https://android-review.git.corp.google.com/c/kernel/common/+/1934723 which causes the
+    // VMA_ANON_NAME to be copied into the kernel), we can get rid of the storage here.
+    // For now, that is not the case:
+    // https://source.android.com/docs/core/architecture/kernel/android-common#compatibility-matrix
+    constexpr int kVmaNameLimit = 80;
+    std::string& vma_name = vma_names->emplace_back(kVmaNameLimit, '\0');
+    int full_vma_length =
+        async_safe_format_buffer(vma_name.data(), kVmaNameLimit, "mt:%s+%" PRIxPTR, soname,
+                                 page_start(phdr->p_vaddr)) +
+        /* include the null terminator */ 1;
+    // There's an upper limit of 80 characters, including the null terminator, in the anonymous VMA
+    // name. If we run over that limit, we end up truncating the segment offset and parts of the
+    // DSO's name, starting on the right hand side of the basename. Because the basename is the most
+    // important thing, chop off the soname from the left hand side first.
+    //
+    // Example (with '#' as the null terminator):
+    //   - "mt:/data/nativetest64/bionic-unit-tests/bionic-loader-test-libs/libdlext_test.so+e000#"
+    //     has a `full_vma_length` of 86.
+    //
+    // We need to left-truncate (86 - 80) 6 characters from the soname, plus the
+    // `vma_truncation_prefix`, so 9 characters total.
+    if (full_vma_length > kVmaNameLimit) {
+      const char vma_truncation_prefix[] = "...";
+      int soname_truncated_bytes =
+          full_vma_length - kVmaNameLimit + sizeof(vma_truncation_prefix) - 1;
+      async_safe_format_buffer(vma_name.data(), kVmaNameLimit, "mt:%s%s+%" PRIxPTR,
+                               vma_truncation_prefix, soname + soname_truncated_bytes,
+                               page_start(phdr->p_vaddr));
+    }
+    if (prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, reinterpret_cast<void*>(seg_page_start),
+              seg_page_aligned_size, vma_name.data()) != 0) {
+      DL_WARN("Failed to rename memtag global segment: %m");
+    }
+  }
+}
+
 /* Change the protection of all loaded segments in memory to writable.
  * This is useful before performing relocations. Once completed, you
  * will have to call phdr_table_protect_segments to restore the original
diff --git a/linker/linker_phdr.h b/linker/linker_phdr.h
index 2f159f3..e15ece4 100644
--- a/linker/linker_phdr.h
+++ b/linker/linker_phdr.h
@@ -39,6 +39,8 @@
 #include "linker_mapped_file_fragment.h"
 #include "linker_note_gnu_property.h"
 
+#include <list>
+
 #define MAYBE_MAP_FLAG(x, from, to)  (((x) & (from)) ? (to) : 0)
 #define PFLAGS_TO_PROT(x)            (MAYBE_MAP_FLAG((x), PF_X, PROT_EXEC) | \
                                       MAYBE_MAP_FLAG((x), PF_R, PROT_READ) | \
@@ -188,3 +190,13 @@
                                             ElfW(Addr) load_bias);
 
 bool page_size_migration_supported();
+
+int remap_memtag_globals_segments(const ElfW(Phdr) * phdr_table, size_t phdr_count,
+                                  ElfW(Addr) load_bias);
+
+void protect_memtag_globals_ro_segments(const ElfW(Phdr) * phdr_table, size_t phdr_count,
+                                        ElfW(Addr) load_bias);
+
+void name_memtag_globals_segments(const ElfW(Phdr) * phdr_table, size_t phdr_count,
+                                  ElfW(Addr) load_bias, const char* soname,
+                                  std::list<std::string>* vma_names);
diff --git a/linker/linker_relocate.cpp b/linker/linker_relocate.cpp
index 0470f87..bbf8359 100644
--- a/linker/linker_relocate.cpp
+++ b/linker/linker_relocate.cpp
@@ -44,6 +44,8 @@
 #include "linker_soinfo.h"
 #include "private/bionic_globals.h"
 
+#include <platform/bionic/mte.h>
+
 static bool is_tls_reloc(ElfW(Word) type) {
   switch (type) {
     case R_GENERIC_TLS_DTPMOD:
@@ -163,7 +165,8 @@
 static bool process_relocation_impl(Relocator& relocator, const rel_t& reloc) {
   constexpr bool IsGeneral = Mode == RelocMode::General;
 
-  void* const rel_target = reinterpret_cast<void*>(reloc.r_offset + relocator.si->load_bias);
+  void* const rel_target = reinterpret_cast<void*>(
+      relocator.si->apply_memtag_if_mte_globals(reloc.r_offset + relocator.si->load_bias));
   const uint32_t r_type = ELFW(R_TYPE)(reloc.r_info);
   const uint32_t r_sym = ELFW(R_SYM)(reloc.r_info);
 
@@ -316,6 +319,7 @@
     // common in non-platform binaries.
     if (r_type == R_GENERIC_ABSOLUTE) {
       count_relocation_if<IsGeneral>(kRelocAbsolute);
+      if (found_in) sym_addr = found_in->apply_memtag_if_mte_globals(sym_addr);
       const ElfW(Addr) result = sym_addr + get_addend_rel();
       LD_DEBUG(reloc && IsGeneral, "RELO ABSOLUTE %16p <- %16p %s",
                rel_target, reinterpret_cast<void*>(result), sym_name);
@@ -326,6 +330,7 @@
       // document (IHI0044F) specifies that R_ARM_GLOB_DAT has an addend, but Bionic isn't adding
       // it.
       count_relocation_if<IsGeneral>(kRelocAbsolute);
+      if (found_in) sym_addr = found_in->apply_memtag_if_mte_globals(sym_addr);
       const ElfW(Addr) result = sym_addr + get_addend_norel();
       LD_DEBUG(reloc && IsGeneral, "RELO GLOB_DAT %16p <- %16p %s",
                rel_target, reinterpret_cast<void*>(result), sym_name);
@@ -335,7 +340,18 @@
       // In practice, r_sym is always zero, but if it weren't, the linker would still look up the
       // referenced symbol (and abort if the symbol isn't found), even though it isn't used.
       count_relocation_if<IsGeneral>(kRelocRelative);
-      const ElfW(Addr) result = relocator.si->load_bias + get_addend_rel();
+      ElfW(Addr) result = relocator.si->load_bias + get_addend_rel();
+      // MTE globals reuses the place bits for additional tag-derivation metadata for
+      // R_AARCH64_RELATIVE relocations, which makes it incompatible with
+      // `-Wl,--apply-dynamic-relocs`. This is enforced by lld, however there's nothing stopping
+      // Android binaries (particularly prebuilts) from building with this linker flag if they're
+      // not built with MTE globals. Thus, don't use the new relocation semantics if this DSO
+      // doesn't have MTE globals.
+      if (relocator.si->should_tag_memtag_globals()) {
+        int64_t* place = static_cast<int64_t*>(rel_target);
+        int64_t offset = *place;
+        result = relocator.si->apply_memtag_if_mte_globals(result + offset) - offset;
+      }
       LD_DEBUG(reloc && IsGeneral, "RELO RELATIVE %16p <- %16p",
                rel_target, reinterpret_cast<void*>(result));
       *static_cast<ElfW(Addr)*>(rel_target) = result;
@@ -600,7 +616,7 @@
     LD_DEBUG(reloc, "[ relocating %s relr ]", get_realpath());
     const ElfW(Relr)* begin = relr_;
     const ElfW(Relr)* end = relr_ + relr_count_;
-    if (!relocate_relr(begin, end, load_bias)) {
+    if (!relocate_relr(begin, end, load_bias, should_tag_memtag_globals())) {
       return false;
     }
   }
diff --git a/linker/linker_sleb128.h b/linker/linker_sleb128.h
index 6bb3199..f48fda8 100644
--- a/linker/linker_sleb128.h
+++ b/linker/linker_sleb128.h
@@ -69,3 +69,32 @@
   const uint8_t* current_;
   const uint8_t* const end_;
 };
+
+class uleb128_decoder {
+ public:
+  uleb128_decoder(const uint8_t* buffer, size_t count) : current_(buffer), end_(buffer + count) {}
+
+  uint64_t pop_front() {
+    uint64_t value = 0;
+
+    size_t shift = 0;
+    uint8_t byte;
+
+    do {
+      if (current_ >= end_) {
+        async_safe_fatal("uleb128_decoder ran out of bounds");
+      }
+      byte = *current_++;
+      value |= (static_cast<uint64_t>(byte & 127) << shift);
+      shift += 7;
+    } while (byte & 128);
+
+    return value;
+  }
+
+  bool has_bytes() { return current_ < end_; }
+
+ private:
+  const uint8_t* current_;
+  const uint8_t* const end_;
+};
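
Reviewer note: for readers unfamiliar with ULEB128, the decoder added above consumes 7 payload bits per byte, least-significant group first, and stops at the first byte whose high bit is clear. The following is a minimal standalone sketch of the same loop, not the linker's code: `decode_uleb128` is a hypothetical helper that returns `std::nullopt` on truncated input instead of calling `async_safe_fatal`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper, not part of the patch. Decodes one ULEB128 value
// starting at buf[pos] and advances pos past it.
std::optional<uint64_t> decode_uleb128(const std::vector<uint8_t>& buf, size_t& pos) {
  uint64_t value = 0;
  unsigned shift = 0;
  while (pos < buf.size()) {
    const uint8_t byte = buf[pos++];
    // Ignore payload bits that would be shifted past bit 63.
    if (shift < 64) value |= static_cast<uint64_t>(byte & 0x7f) << shift;
    shift += 7;
    if ((byte & 0x80) == 0) return value;  // High bit clear: last byte.
  }
  return std::nullopt;  // Ran out of input before the terminating byte.
}
```

For example, the byte sequence `0xE5 0x8E 0x26` decodes to 624485 (101 + (14 << 7) + (38 << 14)).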
diff --git a/linker/linker_soinfo.cpp b/linker/linker_soinfo.cpp
index 0549d36..176c133 100644
--- a/linker/linker_soinfo.cpp
+++ b/linker/linker_soinfo.cpp
@@ -44,6 +44,8 @@
 #include "linker_logger.h"
 #include "linker_relocate.h"
 #include "linker_utils.h"
+#include "platform/bionic/mte.h"
+#include "private/bionic_globals.h"
 
 SymbolLookupList::SymbolLookupList(soinfo* si)
     : sole_lib_(si->get_lookup_lib()), begin_(&sole_lib_), end_(&sole_lib_ + 1) {
@@ -304,6 +306,12 @@
   return is_gnu_hash() ? gnu_lookup(symbol_name, vi) : elf_lookup(symbol_name, vi);
 }
 
+ElfW(Addr) soinfo::apply_memtag_if_mte_globals(ElfW(Addr) sym_addr) const {
+  if (!should_tag_memtag_globals()) return sym_addr;
+  if (sym_addr == 0) return sym_addr;  // Handle undefined weak symbols.
+  return reinterpret_cast<ElfW(Addr)>(get_tagged_address(reinterpret_cast<void*>(sym_addr)));
+}
+
 const ElfW(Sym)* soinfo::gnu_lookup(SymbolName& symbol_name, const version_info* vi) const {
   const uint32_t hash = symbol_name.gnu_hash();
 
diff --git a/linker/linker_soinfo.h b/linker/linker_soinfo.h
index 9bec2aa..4d02676 100644
--- a/linker/linker_soinfo.h
+++ b/linker/linker_soinfo.h
@@ -30,6 +30,7 @@
 
 #include <link.h>
 
+#include <list>
 #include <memory>
 #include <string>
 #include <vector>
@@ -66,6 +67,7 @@
                                          // soinfo is executed and this flag is
                                          // unset.
 #define FLAG_PRELINKED        0x00000400 // prelink_image has successfully processed this soinfo
+#define FLAG_GLOBALS_TAGGED   0x00000800 // globals have been tagged by MTE.
 #define FLAG_NEW_SOINFO       0x40000000 // new soinfo format
 
 #define SOINFO_VERSION 6
@@ -252,11 +254,14 @@
   void call_constructors();
   void call_destructors();
   void call_pre_init_constructors();
-  bool prelink_image();
+  bool prelink_image(bool deterministic_memtag_globals = false);
   bool link_image(const SymbolLookupList& lookup_list, soinfo* local_group_root,
                   const android_dlextinfo* extinfo, size_t* relro_fd_offset);
   bool protect_relro();
 
+  void tag_globals(bool deterministic_memtag_globals);
+  ElfW(Addr) apply_memtag_if_mte_globals(ElfW(Addr) sym_addr) const;
+
   void add_child(soinfo* child);
   void remove_all_links();
 
@@ -295,7 +300,7 @@
 #else
     // If you make this return non-true in the case where
     // __work_around_b_24465209__ is not defined, you will have to change
-    // memtag_dynamic_entries.
+    // memtag_dynamic_entries() and vma_names().
     return true;
 #endif
   }
@@ -394,6 +399,18 @@
    should_pad_segments_ = should_pad_segments;
   }
   bool should_pad_segments() const { return should_pad_segments_; }
+  bool should_tag_memtag_globals() const {
+    return !is_linker() && memtag_globals() && memtag_globalssz() > 0 && __libc_mte_enabled();
+  }
+  std::list<std::string>* vma_names() {
+#ifdef __aarch64__
+#ifdef __work_around_b_24465209__
+#error "Assuming aarch64 does not use versioned soinfo."
+#endif
+    return &vma_names_;
+#endif
+    return nullptr;
+  }
 
   void set_should_use_16kib_app_compat(bool should_use_16kib_app_compat) {
     should_use_16kib_app_compat_ = should_use_16kib_app_compat;
@@ -489,6 +506,7 @@
 
   // __aarch64__ only, which does not use versioning.
   memtag_dynamic_entries_t memtag_dynamic_entries_;
+  std::list<std::string> vma_names_;
 
   // Pad gaps between segments when memory mapping?
   bool should_pad_segments_ = false;
diff --git a/tests/Android.bp b/tests/Android.bp
index b537483..22fa542 100644
--- a/tests/Android.bp
+++ b/tests/Android.bp
@@ -428,6 +428,7 @@
         "malloc_test.cpp",
         "math_test.cpp",
         "membarrier_test.cpp",
+        "memtag_globals_test.cpp",
         "memtag_stack_test.cpp",
         "mntent_test.cpp",
         "mte_test.cpp",
@@ -891,6 +892,11 @@
         "ld_preload_test_helper",
         "ld_preload_test_helper_lib1",
         "ld_preload_test_helper_lib2",
+        "memtag_globals_binary",
+        "memtag_globals_binary_static",
+        "memtag_globals_dso",
+        "mte_globals_relr_regression_test_b_314038442",
+        "mte_globals_relr_regression_test_b_314038442_mte",
         "ns_hidden_child_helper",
         "preinit_getauxval_test_helper",
         "preinit_syscall_test_helper",
@@ -1199,6 +1205,35 @@
 }
 
 cc_test {
+    name: "memtag_stack_abi_test",
+    enabled: false,
+    // This does not use bionic_tests_defaults because it is not supported on
+    // host.
+    arch: {
+        arm64: {
+            enabled: true,
+        },
+    },
+    // We don't use `sanitize:` so we generate the appropriate ELF note, but
+    // still support non-MTE devices.
+    // TODO(fmayer): also add a test that enables stack MTE for MTE devices,
+    // which would test for more bugs.
+    ldflags: ["-fsanitize=memtag-stack"],
+    // Turn off all other sanitizers from SANITIZE_TARGET.
+    sanitize: {
+        never: true,
+    },
+    shared_libs: [
+        "libbase",
+    ],
+    srcs: [
+        "memtag_stack_abi_test.cpp",
+    ],
+    header_libs: ["bionic_libc_platform_headers"],
+    test_suites: ["device-tests"],
+}
+
+cc_test {
     name: "bionic-stress-tests",
     defaults: [
         "bionic_tests_defaults",
@@ -1287,6 +1322,11 @@
         "heap_tagging_static_disabled_helper",
         "heap_tagging_static_sync_helper",
         "heap_tagging_sync_helper",
+        "memtag_globals_binary",
+        "memtag_globals_binary_static",
+        "memtag_globals_dso",
+        "mte_globals_relr_regression_test_b_314038442",
+        "mte_globals_relr_regression_test_b_314038442_mte",
         "stack_tagging_helper",
         "stack_tagging_static_helper",
     ],
diff --git a/tests/dlext_test.cpp b/tests/dlext_test.cpp
index 570da2a..8b26cb0 100644
--- a/tests/dlext_test.cpp
+++ b/tests/dlext_test.cpp
@@ -21,6 +21,7 @@
 #include <errno.h>
 #include <fcntl.h>
 #include <inttypes.h>
+#include <link.h>
 #include <stdio.h>
 #include <string.h>
 #include <unistd.h>
@@ -40,11 +41,13 @@
 #include <procinfo/process_map.h>
 #include <ziparchive/zip_archive.h>
 
+#include "bionic/mte.h"
+#include "bionic/page.h"
 #include "core_shared_libs.h"
-#include "gtest_globals.h"
-#include "utils.h"
 #include "dlext_private.h"
 #include "dlfcn_symlink_support.h"
+#include "gtest_globals.h"
+#include "utils.h"
 
 #define ASSERT_DL_NOTNULL(ptr) \
     ASSERT_TRUE((ptr) != nullptr) << "dlerror: " << dlerror()
@@ -1958,6 +1961,14 @@
   dlclose(ns_a_handle3);
 }
 
+static inline int MapPflagsToProtFlags(uint32_t flags) {
+  int prot_flags = 0;
+  if (PF_X & flags) prot_flags |= PROT_EXEC;
+  if (PF_W & flags) prot_flags |= PROT_WRITE;
+  if (PF_R & flags) prot_flags |= PROT_READ;
+  return prot_flags;
+}
+
 TEST(dlext, ns_anonymous) {
   static const char* root_lib = "libnstest_root.so";
   std::string shared_libs = g_core_shared_libs + ":" + g_public_lib;
@@ -1999,30 +2010,45 @@
   typedef const char* (*fn_t)();
   fn_t ns_get_dlopened_string_private = reinterpret_cast<fn_t>(ns_get_dlopened_string_addr);
 
-  std::vector<map_record> maps;
-  Maps::parse_maps(&maps);
-
+  Dl_info private_library_info;
+  ASSERT_NE(dladdr(reinterpret_cast<void*>(ns_get_dlopened_string_addr), &private_library_info), 0)
+      << dlerror();
+  std::vector<map_record> maps_to_copy;
+  bool has_executable_segment = false;
   uintptr_t addr_start = 0;
   uintptr_t addr_end = 0;
-  bool has_executable_segment = false;
-  std::vector<map_record> maps_to_copy;
+  std::tuple dl_iterate_arg = {&private_library_info, &maps_to_copy, &has_executable_segment,
+                               &addr_start, &addr_end};
+  ASSERT_EQ(
+      1, dl_iterate_phdr(
+             [](dl_phdr_info* info, size_t /*size*/, void* data) -> int {
+               auto [private_library_info, maps_to_copy, has_executable_segment, addr_start,
+                     addr_end] = *reinterpret_cast<decltype(dl_iterate_arg)*>(data);
+               if (info->dlpi_addr != reinterpret_cast<ElfW(Addr)>(private_library_info->dli_fbase))
+                 return 0;
 
-  for (const auto& rec : maps) {
-    if (rec.pathname == private_library_absolute_path) {
-      if (addr_start == 0) {
-        addr_start = rec.addr_start;
-      }
-      addr_end = rec.addr_end;
-      has_executable_segment = has_executable_segment || (rec.perms & PROT_EXEC) != 0;
-
-      maps_to_copy.push_back(rec);
-    }
-  }
+               for (size_t i = 0; i < info->dlpi_phnum; ++i) {
+                 const ElfW(Phdr)* phdr = info->dlpi_phdr + i;
+                 if (phdr->p_type != PT_LOAD) continue;
+                 *has_executable_segment |= phdr->p_flags & PF_X;
+                 uintptr_t mapping_start = page_start(info->dlpi_addr + phdr->p_vaddr);
+                 uintptr_t mapping_end = page_end(info->dlpi_addr + phdr->p_vaddr + phdr->p_memsz);
+                 if (*addr_start == 0 || mapping_start < *addr_start) *addr_start = mapping_start;
+                 if (*addr_end == 0 || mapping_end > *addr_end) *addr_end = mapping_end;
+                 maps_to_copy->push_back({
+                     .addr_start = mapping_start,
+                     .addr_end = mapping_end,
+                     .perms = MapPflagsToProtFlags(phdr->p_flags),
+                 });
+               }
+               return 1;
+             },
+             &dl_iterate_arg));
 
   // Some validity checks.
+  ASSERT_NE(maps_to_copy.size(), 0u);
   ASSERT_TRUE(addr_start > 0);
   ASSERT_TRUE(addr_end > 0);
-  ASSERT_TRUE(maps_to_copy.size() > 0);
   ASSERT_TRUE(ns_get_dlopened_string_addr > addr_start);
   ASSERT_TRUE(ns_get_dlopened_string_addr < addr_end);
 
@@ -2052,19 +2078,26 @@
   ASSERT_EQ(ret, 0) << "Failed to stat library";
   size_t file_size = file_stat.st_size;
 
-  for (const auto& rec : maps_to_copy) {
-    uintptr_t offset = rec.addr_start - addr_start;
-    size_t size = rec.addr_end - rec.addr_start;
-    void* addr = reinterpret_cast<void*>(reserved_addr + offset);
-    void* map = mmap(addr, size, PROT_READ | PROT_WRITE,
-                     MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
-    ASSERT_TRUE(map != MAP_FAILED);
-    // Attempting the below memcpy from a portion of the map that is off the end of
-    // the backing file will cause the kernel to throw a SIGBUS
-    size_t _size = ::android::procinfo::MappedFileSize(rec.addr_start, rec.addr_end,
-                                                       rec.offset, file_size);
-    memcpy(map, reinterpret_cast<void*>(rec.addr_start), _size);
-    mprotect(map, size, rec.perms);
+  {
+    // Disable MTE while copying the PROT_MTE-protected global variables from
+    // the existing mappings. We don't really care about turning on PROT_MTE for
+    // the new copy of the mappings, as this isn't the behaviour under test and
+    // tags will be ignored. This only applies for MTE-enabled devices.
+    ScopedDisableMTE disable_mte_for_copying_global_variables;
+    for (const auto& rec : maps_to_copy) {
+      uintptr_t offset = rec.addr_start - addr_start;
+      size_t size = rec.addr_end - rec.addr_start;
+      void* addr = reinterpret_cast<void*>(reserved_addr + offset);
+      void* map =
+          mmap(addr, size, PROT_READ | PROT_WRITE, MAP_ANON | MAP_PRIVATE | MAP_FIXED, -1, 0);
+      ASSERT_TRUE(map != MAP_FAILED);
+      // Attempting the below memcpy from a portion of the map that is off the end of
+      // the backing file will cause the kernel to throw a SIGBUS
+      size_t _size =
+          ::android::procinfo::MappedFileSize(rec.addr_start, rec.addr_end, rec.offset, file_size);
+      memcpy(map, reinterpret_cast<void*>(rec.addr_start), _size);
+      mprotect(map, size, rec.perms);
+    }
   }
 
   // call the function copy
diff --git a/tests/libs/Android.bp b/tests/libs/Android.bp
index d13ee60..5b86e78 100644
--- a/tests/libs/Android.bp
+++ b/tests/libs/Android.bp
@@ -1903,3 +1903,89 @@
         " $(location soong_zip) -o $(out).unaligned -L 0 -C $(genDir)/zipdir -D $(genDir)/zipdir &&" +
         " $(location bionic_tests_zipalign) 16384 $(out).unaligned $(out)",
 }
+
+cc_defaults {
+    name: "memtag_globals_defaults",
+    defaults: [
+        "bionic_testlib_defaults",
+        "bionic_targets_only"
+    ],
+    cflags: [
+        "-Wno-array-bounds",
+        "-Wno-unused-variable",
+    ],
+    header_libs: ["bionic_libc_platform_headers"],
+    sanitize: {
+        hwaddress: false,
+        memtag_heap: true,
+        memtag_globals: true,
+        diag: {
+            memtag_heap: true,
+        }
+    },
+}
+
+cc_test_library {
+    name: "memtag_globals_dso",
+    defaults: [ "memtag_globals_defaults" ],
+    srcs: ["memtag_globals_dso.cpp"],
+}
+
+cc_test {
+    name: "memtag_globals_binary",
+    defaults: [ "memtag_globals_defaults" ],
+    srcs: ["memtag_globals_binary.cpp"],
+    shared_libs: [ "memtag_globals_dso" ],
+    // This binary is used in the bionic-unit-tests as a data dependency, and is
+    // in the same folder as memtag_globals_dso. However, the default cc_test
+    // rules make this binary (when explicitly built and pushed to
+    // /data/nativetest64/) end up in a subfolder called
+    // 'memtag_globals_binary'. When this happens, the explicit build fails to
+    // find the DSO because the default rpath is just ${ORIGIN}. Because we want
+    // this to be usable both from bionic-unit-tests and explicit builds, don't
+    // put it in a subdirectory.
+    no_named_install_directory: true,
+}
+
+cc_test {
+    name: "memtag_globals_binary_static",
+    defaults: [ "memtag_globals_defaults" ],
+    srcs: ["memtag_globals_binary.cpp"],
+    static_libs: [ "memtag_globals_dso" ],
+    no_named_install_directory: true,
+    static_executable: true,
+}
+
+// This is a regression test for b/314038442, where binaries built *without* MTE
+// globals would have out-of-bounds RELR relocations, which were then `ldg`'d,
+// which resulted in linker crashes.
+cc_test {
+    name: "mte_globals_relr_regression_test_b_314038442",
+  defaults: [
+        "bionic_testlib_defaults",
+        "bionic_targets_only"
+    ],
+    cflags: [ "-Wno-array-bounds" ],
+    ldflags: [ "-Wl,--pack-dyn-relocs=relr" ],
+    srcs: ["mte_globals_relr_regression_test_b_314038442.cpp"],
+    no_named_install_directory: true,
+    sanitize: {
+        memtag_globals: false,
+    },
+}
+
+// Same test as above, but built with MTE globals, for completeness.
+cc_test {
+    name: "mte_globals_relr_regression_test_b_314038442_mte",
+  defaults: [
+        "bionic_testlib_defaults",
+        "bionic_targets_only"
+    ],
+    cflags: [ "-Wno-array-bounds" ],
+    ldflags: [ "-Wl,--pack-dyn-relocs=relr" ],
+    srcs: ["mte_globals_relr_regression_test_b_314038442.cpp"],
+    no_named_install_directory: true,
+    sanitize: {
+        memtag_globals: true,
+    },
+}
diff --git a/tests/libs/memtag_globals.h b/tests/libs/memtag_globals.h
new file mode 100644
index 0000000..a03abae
--- /dev/null
+++ b/tests/libs/memtag_globals.h
@@ -0,0 +1,43 @@
+/*
+ * Copyright (C) 2024 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <utility>
+#include <vector>
+
+void check_tagged(const void* a);
+void check_untagged(const void* a);
+void check_matching_tags(const void* a, const void* b);
+void check_eq(const void* a, const void* b);
+
+void dso_check_assertions(bool enforce_tagged);
+void dso_print_variables();
+
+void print_variable_address(const char* name, const void* ptr);
+void print_variables(const char* header,
+                     const std::vector<std::pair<const char*, const void*>>& tagged_variables,
+                     const std::vector<std::pair<const char*, const void*>>& untagged_variables);
diff --git a/tests/libs/memtag_globals_binary.cpp b/tests/libs/memtag_globals_binary.cpp
new file mode 100644
index 0000000..9248728
--- /dev/null
+++ b/tests/libs/memtag_globals_binary.cpp
@@ -0,0 +1,195 @@
+/*
+ * Copyright (C) 2024 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <string>
+#include <vector>
+
+#include "memtag_globals.h"
+
+// Adapted from the LLD test suite: lld/test/ELF/Inputs/aarch64-memtag-globals.s
+
+/// Global variables defined here, of various semantics.
+char global[30] = {};
+__attribute__((no_sanitize("memtag"))) int global_untagged = 0;
+const int const_global = 0;
+static const int hidden_const_global = 0;
+static char hidden_global[12] = {};
+__attribute__((visibility("hidden"))) int hidden_attr_global = 0;
+__attribute__((visibility("hidden"))) const int hidden_attr_const_global = 0;
+
+/// Should be untagged.
+__thread int tls_global;
+__thread static int hidden_tls_global;
+
+/// Tagged, from the other file.
+extern int global_extern;
+/// Untagged, from the other file.
+extern __attribute__((no_sanitize("memtag"))) int global_extern_untagged;
+/// Tagged here, but untagged in the definition found in the sister objfile
+/// (explicitly).
+extern int global_extern_untagged_definition_but_tagged_import;
+
+/// ABS64 relocations. Also, forces symtab entries for local and external
+/// globals.
+char* pointer_to_global = &global[0];
+char* pointer_inside_global = &global[17];
+char* pointer_to_global_end = &global[30];
+char* pointer_past_global_end = &global[48];
+int* pointer_to_global_untagged = &global_untagged;
+const int* pointer_to_const_global = &const_global;
+/// RELATIVE relocations.
+const int* pointer_to_hidden_const_global = &hidden_const_global;
+char* pointer_to_hidden_global = &hidden_global[0];
+int* pointer_to_hidden_attr_global = &hidden_attr_global;
+const int* pointer_to_hidden_attr_const_global = &hidden_attr_const_global;
+/// RELATIVE relocations with special AArch64 MemtagABI semantics, with the
+/// offset ('12' or '16') encoded in the place.
+char* pointer_to_hidden_global_end = &hidden_global[12];
+char* pointer_past_hidden_global_end = &hidden_global[16];
+/// ABS64 relocations.
+int* pointer_to_global_extern = &global_extern;
+int* pointer_to_global_extern_untagged = &global_extern_untagged;
+int* pointer_to_global_extern_untagged_definition_but_tagged_import =
+    &global_extern_untagged_definition_but_tagged_import;
+
+// Force materialization of these globals into the symtab.
+int* get_address_to_tls_global() {
+  return &tls_global;
+}
+int* get_address_to_hidden_tls_global() {
+  return &hidden_tls_global;
+}
+
+static const std::vector<std::pair<const char*, const void*>>& get_expected_tagged_vars() {
+  static std::vector<std::pair<const char*, const void*>> expected_tagged_vars = {
+      {"global", &global},
+      {"pointer_inside_global", pointer_inside_global},
+      {"pointer_to_global_end", pointer_to_global_end},
+      {"pointer_past_global_end", pointer_past_global_end},
+      {"hidden_global", &hidden_global},
+      {"hidden_attr_global", &hidden_attr_global},
+      {"global_extern", &global_extern},
+  };
+  return expected_tagged_vars;
+}
+
+static const std::vector<std::pair<const char*, const void*>>& get_expected_untagged_vars() {
+  static std::vector<std::pair<const char*, const void*>> expected_untagged_vars = {
+      {"global_extern_untagged", &global_extern_untagged},
+      {"global_extern_untagged_definition_but_tagged_import",
+       &global_extern_untagged_definition_but_tagged_import},
+      {"global_untagged", &global_untagged},
+      {"const_global", &const_global},
+      {"hidden_const_global", &hidden_const_global},
+      {"hidden_attr_const_global", &hidden_attr_const_global},
+      {"tls_global", &tls_global},
+      {"hidden_tls_global", &hidden_tls_global},
+  };
+  return expected_untagged_vars;
+}
+
+void exe_print_variables() {
+  print_variables("  Variables accessible from the binary:\n", get_expected_tagged_vars(),
+                  get_expected_untagged_vars());
+}
+
+// Dump the addresses of the global variables to stderr
+void dso_print();
+void dso_print_others();
+
+void exe_check_assertions(bool check_pointers_are_tagged) {
+  // Check that non-const variables are writeable.
+  *pointer_to_global = 0;
+  *pointer_inside_global = 0;
+  *(pointer_to_global_end - 1) = 0;
+  *pointer_to_global_untagged = 0;
+  *pointer_to_hidden_global = 0;
+  *pointer_to_hidden_attr_global = 0;
+  *(pointer_to_hidden_global_end - 1) = 0;
+  *pointer_to_global_extern = 0;
+  *pointer_to_global_extern_untagged = 0;
+  *pointer_to_global_extern_untagged_definition_but_tagged_import = 0;
+
+  if (check_pointers_are_tagged) {
+    for (const auto& [_, pointer] : get_expected_tagged_vars()) {
+      check_tagged(pointer);
+    }
+  }
+
+  for (const auto& [_, pointer] : get_expected_untagged_vars()) {
+    check_untagged(pointer);
+  }
+
+  check_matching_tags(pointer_to_global, pointer_inside_global);
+  check_matching_tags(pointer_to_global, pointer_to_global_end);
+  check_matching_tags(pointer_to_global, pointer_past_global_end);
+  check_eq(pointer_inside_global, pointer_to_global + 17);
+  check_eq(pointer_to_global_end, pointer_to_global + 30);
+  check_eq(pointer_past_global_end, pointer_to_global + 48);
+
+  check_matching_tags(pointer_to_hidden_global, pointer_to_hidden_global_end);
+  check_matching_tags(pointer_to_hidden_global, pointer_past_hidden_global_end);
+  check_eq(pointer_to_hidden_global_end, pointer_to_hidden_global + 12);
+  check_eq(pointer_past_hidden_global_end, pointer_to_hidden_global + 16);
+}
+
+void crash() {
+  *pointer_past_global_end = 0;
+}
+
+int main(int argc, char** argv) {
+  bool check_pointers_are_tagged = false;
+  // For an MTE-capable device, provide argv[1] == '1' to enable the assertions
+  // that pointers should be tagged.
+  if (argc >= 2 && argv[1][0] == '1') {
+    check_pointers_are_tagged = true;
+  }
+
+  char* heap_ptr = static_cast<char*>(malloc(1));
+  print_variable_address("heap address", heap_ptr);
+  *heap_ptr = 0;
+  if (check_pointers_are_tagged) check_tagged(heap_ptr);
+  free(heap_ptr);
+
+  exe_print_variables();
+  dso_print_variables();
+
+  exe_check_assertions(check_pointers_are_tagged);
+  dso_check_assertions(check_pointers_are_tagged);
+
+  printf("Assertions were passed. Now doing a global-buffer-overflow.\n");
+  fflush(stdout);
+  crash();
+  printf("global-buffer-overflow went uncaught.\n");
+  return 0;
+}
diff --git a/tests/libs/memtag_globals_dso.cpp b/tests/libs/memtag_globals_dso.cpp
new file mode 100644
index 0000000..9ed264e
--- /dev/null
+++ b/tests/libs/memtag_globals_dso.cpp
@@ -0,0 +1,165 @@
+/*
+ * Copyright (C) 2024 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <vector>
+
+#include "memtag_globals.h"
+
+// Adapted from the LLD test suite: lld/test/ELF/Inputs/aarch64-memtag-globals.s
+
+int global_extern;
+static int global_extern_hidden;
+__attribute__((no_sanitize("memtag"))) int global_extern_untagged;
+__attribute__((no_sanitize("memtag"))) int global_extern_untagged_definition_but_tagged_import;
+
+void assertion_failure() {
+  exit(1);
+}
+
+void check_tagged(const void* a) {
+  uintptr_t a_uptr = reinterpret_cast<uintptr_t>(a);
+#if defined(__aarch64__)
+  if ((a_uptr >> 56) == 0) {
+    fprintf(stderr, "**********************************\n");
+    fprintf(stderr, "Failed assertion:\n");
+    fprintf(stderr, "  tag(0x%zx) != 0\n", a_uptr);
+    fprintf(stderr, "**********************************\n");
+
+    assertion_failure();
+  }
+#endif  // defined(__aarch64__)
+}
+
+void check_untagged(const void* a) {
+  uintptr_t a_uptr = reinterpret_cast<uintptr_t>(a);
+#if defined(__aarch64__)
+  if ((a_uptr >> 56) != 0) {
+    fprintf(stderr, "**********************************\n");
+    fprintf(stderr, "Failed assertion:\n");
+    fprintf(stderr, "  tag(0x%zx) == 0\n", a_uptr);
+    fprintf(stderr, "**********************************\n");
+
+    assertion_failure();
+  }
+#endif  // defined(__aarch64__)
+}
+
+void check_matching_tags(const void* a, const void* b) {
+  uintptr_t a_uptr = reinterpret_cast<uintptr_t>(a);
+  uintptr_t b_uptr = reinterpret_cast<uintptr_t>(b);
+#if defined(__aarch64__)
+  if (a_uptr >> 56 != b_uptr >> 56) {
+    fprintf(stderr, "**********************************\n");
+    fprintf(stderr, "Failed assertion:\n");
+    fprintf(stderr, "  tag(0x%zx) != tag(0x%zx)\n", a_uptr, b_uptr);
+    fprintf(stderr, "**********************************\n");
+
+    assertion_failure();
+  }
+#endif  // defined(__aarch64__)
+}
+
+void check_eq(const void* a, const void* b) {
+  if (a != b) {
+    fprintf(stderr, "**********************************\n");
+    fprintf(stderr, "Failed assertion:\n");
+    fprintf(stderr, "  %p != %p\n", a, b);
+    fprintf(stderr, "**********************************\n");
+
+    assertion_failure();
+  }
+}
+
+#define LONGEST_VARIABLE_NAME "51"
+void print_variable_address(const char* name, const void* ptr) {
+  printf("%" LONGEST_VARIABLE_NAME "s: %16p\n", name, ptr);
+}
+
+static const std::vector<std::pair<const char*, const void*>>& get_expected_tagged_vars() {
+  static std::vector<std::pair<const char*, const void*>> expected_tagged_vars = {
+      {"global_extern", &global_extern},
+      {"global_extern_hidden", &global_extern_hidden},
+  };
+  return expected_tagged_vars;
+}
+
+static const std::vector<std::pair<const char*, const void*>>& get_expected_untagged_vars() {
+  static std::vector<std::pair<const char*, const void*>> expected_untagged_vars = {
+      {"global_extern_untagged", &global_extern_untagged},
+      {"global_extern_untagged_definition_but_tagged_import",
+       &global_extern_untagged_definition_but_tagged_import},
+  };
+  return expected_untagged_vars;
+}
+
+void dso_print_variables() {
+  print_variables("  Variables declared in the DSO:\n", get_expected_tagged_vars(),
+                  get_expected_untagged_vars());
+}
+
+void print_variables(const char* header,
+                     const std::vector<std::pair<const char*, const void*>>& tagged_variables,
+                     const std::vector<std::pair<const char*, const void*>>& untagged_variables) {
+  printf("==========================================================\n");
+  printf("%s", header);
+  printf("==========================================================\n");
+  printf(" Variables expected to be tagged:\n");
+  printf("----------------------------------------------------------\n");
+  for (const auto& [name, pointer] : tagged_variables) {
+    print_variable_address(name, pointer);
+  }
+
+  printf("\n----------------------------------------------------------\n");
+  printf(" Variables expected to be untagged:\n");
+  printf("----------------------------------------------------------\n");
+  for (const auto& [name, pointer] : untagged_variables) {
+    print_variable_address(name, pointer);
+  }
+  printf("\n");
+}
+
+void dso_check_assertions(bool check_pointers_are_tagged) {
+  // Check that non-const variables are writeable.
+  global_extern = 0;
+  global_extern_hidden = 0;
+  global_extern_untagged = 0;
+  global_extern_untagged_definition_but_tagged_import = 0;
+
+  if (check_pointers_are_tagged) {
+    for (const auto& [_, pointer] : get_expected_tagged_vars()) {
+      check_tagged(pointer);
+    }
+  }
+
+  for (const auto& [_, pointer] : get_expected_untagged_vars()) {
+    check_untagged(pointer);
+  }
+}
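
Reviewer note: the `check_*` helpers above compare the whole top byte of the pointer (`>> 56`), whereas the regression test's `get_tag()` masks the result down to four bits. A small sketch of the distinction follows, assuming the usual AArch64 layout in which the MTE logical tag sits in bits 56-59 of a 64-bit pointer. `mte_tag`, `top_byte`, and `strip_top_byte` are hypothetical names, not bionic APIs.

```cpp
#include <cstdint>

// Hypothetical helpers, not part of the patch.

// The 4-bit MTE logical tag (bits 56-59).
constexpr uint8_t mte_tag(uint64_t addr) {
  return static_cast<uint8_t>((addr >> 56) & 0xf);
}

// The full top byte (bits 56-63), which is what `a_uptr >> 56` inspects.
constexpr uint8_t top_byte(uint64_t addr) {
  return static_cast<uint8_t>(addr >> 56);
}

// The address with the top byte cleared.
constexpr uint64_t strip_top_byte(uint64_t addr) {
  return addr & ((uint64_t{1} << 56) - 1);
}
```

The two checks agree whenever bits 60-63 are zero, which is what the tests above expect for global and heap pointers.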
diff --git a/tests/libs/mte_globals_relr_regression_test_b_314038442.cpp b/tests/libs/mte_globals_relr_regression_test_b_314038442.cpp
new file mode 100644
index 0000000..20bbba9
--- /dev/null
+++ b/tests/libs/mte_globals_relr_regression_test_b_314038442.cpp
@@ -0,0 +1,55 @@
+/*
+ * Copyright (C) 2024 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <stdint.h>
+#include <stdio.h>
+
+static volatile char array[0x10000];
+volatile char* volatile oob_ptr = &array[0x111111111];
+
+unsigned char get_tag(__attribute__((unused)) volatile void* ptr) {
+#if defined(__aarch64__)
+  return static_cast<unsigned char>(reinterpret_cast<uintptr_t>(ptr) >> 56) & 0xf;
+#else   // !defined(__aarch64__)
+  return 0;
+#endif  // defined(__aarch64__)
+}
+
+int main() {
+  printf("Program loaded successfully. %p %p. ", array, oob_ptr);
+  if (get_tag(array) != get_tag(oob_ptr)) {
+    printf("Tags are mismatched!\n");
+    return 1;
+  }
+  if (get_tag(array) == 0) {
+    printf("Tags are zero!\n");
+  } else {
+    printf("Tags are non-zero\n");
+  }
+  return 0;
+}
diff --git a/tests/memtag_globals_test.cpp b/tests/memtag_globals_test.cpp
new file mode 100644
index 0000000..ff93e7b
--- /dev/null
+++ b/tests/memtag_globals_test.cpp
@@ -0,0 +1,113 @@
+/*
+ * Copyright (C) 2024 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <gtest/gtest.h>
+
+#if defined(__BIONIC__)
+#include "gtest_globals.h"
+#include "utils.h"
+#endif  // defined(__BIONIC__)
+
+#include <android-base/test_utils.h>
+#include <sys/stat.h>
+#include <unistd.h>
+#include <string>
+#include <tuple>
+
+#include "platform/bionic/mte.h"
+
+class MemtagGlobalsTest : public testing::TestWithParam<bool> {};
+
+TEST_P(MemtagGlobalsTest, test) {
+  SKIP_WITH_HWASAN << "MTE globals tests are incompatible with HWASan";
+#if defined(__BIONIC__) && defined(__aarch64__)
+  std::string binary = GetTestLibRoot() + "/memtag_globals_binary";
+  bool is_static = MemtagGlobalsTest::GetParam();
+  if (is_static) {
+    binary += "_static";
+  }
+
+  chmod(binary.c_str(), 0755);
+  ExecTestHelper eth;
+  eth.SetArgs({binary.c_str(), nullptr});
+  eth.Run(
+      [&]() {
+        execve(binary.c_str(), eth.GetArgs(), eth.GetEnv());
+        GTEST_FAIL() << "Failed to execve: " << strerror(errno) << " " << binary.c_str();
+      },
+      // We expect the global-buffer-overflow to be caught (and the binary to crash)
+      // only when MTE globals is supported. MTE globals is unsupported for fully
+      // static executables, but the binary must still pass its assertions there;
+      // its global variables just won't be tagged.
+      (mte_supported() && !is_static) ? -SIGSEGV : 0, "Assertions were passed");
+#else
+  GTEST_SKIP() << "bionic/arm64 only";
+#endif
+}
+
+INSTANTIATE_TEST_SUITE_P(MemtagGlobalsTest, MemtagGlobalsTest, testing::Bool(),
+                         [](const ::testing::TestParamInfo<MemtagGlobalsTest::ParamType>& info) {
+                           if (info.param) return "MemtagGlobalsTest_static";
+                           return "MemtagGlobalsTest";
+                         });
+
+TEST(MemtagGlobalsTest, RelrRegressionTestForb314038442) {
+  SKIP_WITH_HWASAN << "MTE globals tests are incompatible with HWASan";
+#if defined(__BIONIC__) && defined(__aarch64__)
+  std::string binary = GetTestLibRoot() + "/mte_globals_relr_regression_test_b_314038442";
+  chmod(binary.c_str(), 0755);
+  ExecTestHelper eth;
+  eth.SetArgs({binary.c_str(), nullptr});
+  eth.Run(
+      [&]() {
+        execve(binary.c_str(), eth.GetArgs(), eth.GetEnv());
+        GTEST_FAIL() << "Failed to execve: " << strerror(errno) << " " << binary.c_str();
+      },
+      /* exit code */ 0, "Program loaded successfully.*Tags are zero!");
+#else
+  GTEST_SKIP() << "bionic/arm64 only";
+#endif
+}
+
+TEST(MemtagGlobalsTest, RelrRegressionTestForb314038442WithMteGlobals) {
+  if (!mte_supported()) GTEST_SKIP() << "Must have MTE support.";
+#if defined(__BIONIC__) && defined(__aarch64__)
+  std::string binary = GetTestLibRoot() + "/mte_globals_relr_regression_test_b_314038442_mte";
+  chmod(binary.c_str(), 0755);
+  ExecTestHelper eth;
+  eth.SetArgs({binary.c_str(), nullptr});
+  eth.Run(
+      [&]() {
+        execve(binary.c_str(), eth.GetArgs(), eth.GetEnv());
+        GTEST_FAIL() << "Failed to execve: " << strerror(errno) << " " << binary.c_str();
+      },
+      /* exit code */ 0, "Program loaded successfully.*Tags are non-zero");
+#else
+  GTEST_SKIP() << "bionic/arm64 only";
+#endif
+}
diff --git a/tests/memtag_stack_abi_test.cpp b/tests/memtag_stack_abi_test.cpp
new file mode 100644
index 0000000..4725c8d
--- /dev/null
+++ b/tests/memtag_stack_abi_test.cpp
@@ -0,0 +1,102 @@
+/*
+ * Copyright (C) 2024 The Android Open Source Project
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *  * Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  * Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
+ * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
+ * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
+ * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
+ * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
+ * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
+ * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+ * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
+ * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include <filesystem>
+#include <fstream>
+#include <iterator>
+#include <string>
+#include <thread>
+
+#include <dlfcn.h>
+#include <stdlib.h>
+
+#include <android-base/logging.h>
+#include <gtest/gtest.h>
+
+static size_t NumberBuffers() {
+  size_t bufs = 0;
+  std::ifstream file("/proc/self/maps");
+  CHECK(file.is_open());
+  std::string line;
+  while (std::getline(file, line)) {
+    if (line.find("stack_mte_ring") != std::string::npos) {
+      ++bufs;
+    }
+  }
+  return bufs;
+}
+
+static size_t NumberThreads() {
+  std::filesystem::directory_iterator di("/proc/self/task");
+  return std::distance(begin(di), end(di));
+}
+
+TEST(MemtagStackAbiTest, MainThread) {
+#if defined(__BIONIC__) && defined(__aarch64__)
+  ASSERT_EQ(NumberBuffers(), 1U);
+  ASSERT_EQ(NumberBuffers(), NumberThreads());
+#else
+  GTEST_SKIP() << "requires bionic arm64";
+#endif
+}
+
+TEST(MemtagStackAbiTest, JoinableThread) {
+#if defined(__BIONIC__) && defined(__aarch64__)
+  ASSERT_EQ(NumberBuffers(), 1U);
+  ASSERT_EQ(NumberBuffers(), NumberThreads());
+  std::thread th([] {
+    ASSERT_EQ(NumberBuffers(), 2U);
+    ASSERT_EQ(NumberBuffers(), NumberThreads());
+  });
+  th.join();
+  ASSERT_EQ(NumberBuffers(), 1U);
+  ASSERT_EQ(NumberBuffers(), NumberThreads());
+#else
+  GTEST_SKIP() << "requires bionic arm64";
+#endif
+}
+
+TEST(MemtagStackAbiTest, DetachedThread) {
+#if defined(__BIONIC__) && defined(__aarch64__)
+  ASSERT_EQ(NumberBuffers(), 1U);
+  ASSERT_EQ(NumberBuffers(), NumberThreads());
+  std::thread th([] {
+    ASSERT_EQ(NumberBuffers(), 2U);
+    ASSERT_EQ(NumberBuffers(), NumberThreads());
+  });
+  th.detach();
+  // Give the detached thread up to 3 seconds to exit.
+  for (int i = 0; NumberBuffers() != 1 && i < 3; ++i) {
+    sleep(1);
+  }
+  ASSERT_EQ(NumberBuffers(), 1U);
+  ASSERT_EQ(NumberBuffers(), NumberThreads());
+#else
+  GTEST_SKIP() << "requires bionic arm64";
+#endif
+}
diff --git a/tests/pthread_test.cpp b/tests/pthread_test.cpp
index 2bf755b..7223784 100644
--- a/tests/pthread_test.cpp
+++ b/tests/pthread_test.cpp
@@ -45,6 +45,7 @@
 #include <android-base/test_utils.h>
 
 #include "private/bionic_constants.h"
+#include "private/bionic_time_conversions.h"
 #include "SignalUtils.h"
 #include "utils.h"
 
@@ -2437,23 +2438,25 @@
   ts.tv_sec = -1;
   ASSERT_EQ(ETIMEDOUT, lock_function(&m, &ts));
 
-  // check we wait long enough for the lock.
+  // Check we wait long enough for the lock before timing out...
+
+  // What time is it before we start?
   ASSERT_EQ(0, clock_gettime(clock, &ts));
-  const int64_t start_ns = ts.tv_sec * NS_PER_S + ts.tv_nsec;
-
-  // add a second to get deadline.
+  const int64_t start_ns = to_ns(ts);
+  // Add a second to get the deadline, and wait until we time out.
   ts.tv_sec += 1;
-
   ASSERT_EQ(ETIMEDOUT, lock_function(&m, &ts));
 
+  // What time is it now that we've timed out?
+  timespec ts2;
+  ASSERT_EQ(0, clock_gettime(clock, &ts2));
+  const int64_t end_ns = to_ns(ts2);
+
   // The timedlock must have waited at least 1 second before returning.
-  clock_gettime(clock, &ts);
-  const int64_t end_ns = ts.tv_sec * NS_PER_S + ts.tv_nsec;
-  ASSERT_GT(end_ns - start_ns, NS_PER_S);
+  ASSERT_GE(end_ns - start_ns, NS_PER_S);
 
   // If the mutex is unlocked, pthread_mutex_timedlock should succeed.
   ASSERT_EQ(0, pthread_mutex_unlock(&m));
-
   ASSERT_EQ(0, clock_gettime(clock, &ts));
   ts.tv_sec += 1;
   ASSERT_EQ(0, lock_function(&m, &ts));
@@ -2474,12 +2477,19 @@
 #endif  // __BIONIC__
 }
 
-TEST(pthread, pthread_mutex_clocklock) {
+TEST(pthread, pthread_mutex_clocklock_MONOTONIC) {
 #if defined(__BIONIC__)
   pthread_mutex_timedlock_helper(
       CLOCK_MONOTONIC, [](pthread_mutex_t* __mutex, const timespec* __timeout) {
         return pthread_mutex_clocklock(__mutex, CLOCK_MONOTONIC, __timeout);
       });
+#else   // __BIONIC__
+  GTEST_SKIP() << "pthread_mutex_clocklock not available";
+#endif  // __BIONIC__
+}
+
+TEST(pthread, pthread_mutex_clocklock_REALTIME) {
+#if defined(__BIONIC__)
   pthread_mutex_timedlock_helper(
       CLOCK_REALTIME, [](pthread_mutex_t* __mutex, const timespec* __timeout) {
         return pthread_mutex_clocklock(__mutex, CLOCK_REALTIME, __timeout);
diff --git a/tests/struct_layout_test.cpp b/tests/struct_layout_test.cpp
index 1f04344..b9fd315 100644
--- a/tests/struct_layout_test.cpp
+++ b/tests/struct_layout_test.cpp
@@ -30,7 +30,7 @@
 #define CHECK_OFFSET(name, field, offset) \
     check_offset(#name, #field, offsetof(name, field), offset);
 #ifdef __LP64__
-  CHECK_SIZE(pthread_internal_t, 816);
+  CHECK_SIZE(pthread_internal_t, 824);
   CHECK_OFFSET(pthread_internal_t, next, 0);
   CHECK_OFFSET(pthread_internal_t, prev, 8);
   CHECK_OFFSET(pthread_internal_t, tid, 16);
@@ -57,6 +57,7 @@
   CHECK_OFFSET(pthread_internal_t, errno_value, 768);
   CHECK_OFFSET(pthread_internal_t, bionic_tcb, 776);
   CHECK_OFFSET(pthread_internal_t, stack_mte_ringbuffer_vma_name_buffer, 784);
+  CHECK_OFFSET(pthread_internal_t, should_allocate_stack_mte_ringbuffer, 816);
   CHECK_SIZE(bionic_tls, 12200);
   CHECK_OFFSET(bionic_tls, key_data, 0);
   CHECK_OFFSET(bionic_tls, locale, 2080);
@@ -74,7 +75,7 @@
   CHECK_OFFSET(bionic_tls, bionic_systrace_disabled, 12193);
   CHECK_OFFSET(bionic_tls, padding, 12194);
 #else
-  CHECK_SIZE(pthread_internal_t, 704);
+  CHECK_SIZE(pthread_internal_t, 708);
   CHECK_OFFSET(pthread_internal_t, next, 0);
   CHECK_OFFSET(pthread_internal_t, prev, 4);
   CHECK_OFFSET(pthread_internal_t, tid, 8);
@@ -101,6 +102,7 @@
   CHECK_OFFSET(pthread_internal_t, errno_value, 664);
   CHECK_OFFSET(pthread_internal_t, bionic_tcb, 668);
   CHECK_OFFSET(pthread_internal_t, stack_mte_ringbuffer_vma_name_buffer, 672);
+  CHECK_OFFSET(pthread_internal_t, should_allocate_stack_mte_ringbuffer, 704);
   CHECK_SIZE(bionic_tls, 11080);
   CHECK_OFFSET(bionic_tls, key_data, 0);
   CHECK_OFFSET(bionic_tls, locale, 1040);
diff --git a/tests/sys_time_test.cpp b/tests/sys_time_test.cpp
index ff9271f..b0e52aa 100644
--- a/tests/sys_time_test.cpp
+++ b/tests/sys_time_test.cpp
@@ -23,6 +23,7 @@
 
 #include <android-base/file.h>
 
+#include "private/bionic_time_conversions.h"
 #include "utils.h"
 
 // http://b/11383777
@@ -147,14 +148,6 @@
   ASSERT_EQ(0, syscall(__NR_gettimeofday, &tv2, nullptr));
 
   // What's the difference between the two?
-  tv2.tv_sec -= tv1.tv_sec;
-  tv2.tv_usec -= tv1.tv_usec;
-  if (tv2.tv_usec < 0) {
-    --tv2.tv_sec;
-    tv2.tv_usec += 1000000;
-  }
-
   // To try to avoid flakiness we'll accept answers within 10,000us (0.01s).
-  ASSERT_EQ(0, tv2.tv_sec);
-  ASSERT_LT(tv2.tv_usec, 10'000);
+  ASSERT_LT(to_us(tv2) - to_us(tv1), 10'000);
 }