Merge "Start using libarm-optimized-routines in libm."
diff --git a/android-changes-for-ndk-developers.md b/android-changes-for-ndk-developers.md
new file mode 120000
index 0000000..328d3eb
--- /dev/null
+++ b/android-changes-for-ndk-developers.md
@@ -0,0 +1 @@
+docs/android-changes-for-ndk-developers.md
\ No newline at end of file
diff --git a/android-changes-for-ndk-developers.md b/docs/android-changes-for-ndk-developers.md
similarity index 100%
rename from android-changes-for-ndk-developers.md
rename to docs/android-changes-for-ndk-developers.md
diff --git a/docs/libc_assembler.md b/docs/libc_assembler.md
new file mode 100644
index 0000000..151f265
--- /dev/null
+++ b/docs/libc_assembler.md
@@ -0,0 +1,148 @@
+Validating libc Assembler Routines
+==================================
+This document describes how to verify incoming assembler libc routines.
+
+## Quick Start
+* First, benchmark the previous version of the routine.
+* Update the routine, then run the bionic unit tests to verify that the routine
+doesn't have any bugs. See the [Testing](#Testing) section for details about
+how to verify that the routine is being properly tested.
+* Rerun the benchmarks using the updated image that uses the code for
+the new routine. See the [Performance](#Performance) section for details about
+benchmarking.
+* Verify that the unwind information for the new routine looks sane. See the [Unwind Info](#unwind-info) section for details about how to verify this.
+
+When benchmarking, it's best to verify on the latest Pixel device supported.
+Make sure that you benchmark both the big and little cores to verify that
+there is no major difference in performance on each.
+
+Benchmark 64 bit memcmp:
+
+ /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --bionic_xml=string.xml memcmp
+
+Benchmark 32 bit memcmp:
+
+ /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --bionic_xml=string.xml memcmp
+
+Locking to a specific cpu:
+
+ /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --bionic_cpu=2 --bionic_xml=string.xml memcmp
+
+## Performance
+The bionic benchmarks are used to verify the performance of changes to
+routines. For most routines, there should already be benchmarks available.
+
+Building
+--------
+The bionic benchmarks are not built by default; they must be built separately
+and pushed onto the device. The commands below show how to do this.
+
+ mmma -j bionic/benchmarks
+ adb sync data
+
+Running
+-------
+There are two bionic benchmark executables:
+
+ /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks
+
+This is for 64 bit libc routines.
+
+ /data/benchmarktest/bionic-benchmarks/bionic-benchmarks
+
+This is for 32 bit libc routines.
+
+Here is an example of how the benchmark should be executed. For this
+command to work, you need to change directory to one of the above
+directories.
+
+ bionic-benchmarks --bionic_xml=suites/string.xml memcmp
+
+The last argument is the name of the one function that you want to
+benchmark.
+
+Almost all routines are already defined in the **string.xml** file in
+**bionic/benchmarks/suites**. Look at the examples in that file to see
+how to add a benchmark for a function that doesn't already exist.
+
+It can take a long time to run these benchmarks since they attempt to test a
+large number of sizes and alignments.
+
+Results
+-------
+The bionic benchmarks are based on the [Google Benchmark](https://github.com/google/benchmark)
+library. An example of the output looks like this:
+
+ Run on (8 X 1844 MHz CPU s)
+ CPU Caches:
+ L1 Data 32K (x8)
+ L1 Instruction 32K (x8)
+ L2 Unified 512K (x2)
+ ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+ -------------------------------------------------------------------------------------------
+ Benchmark Time CPU Iterations
+ -------------------------------------------------------------------------------------------
+ BM_string_memcmp/1/0/0 6 ns 6 ns 120776418 164.641MB/s
+ BM_string_memcmp/1/1/1 6 ns 6 ns 120856788 164.651MB/s
+
+The smaller the time, the better the performance.
+
+Caveats
+-------
+When running the benchmarks, CPU scaling is normally enabled. This means
+that if the device does not get up to the maximum cpu frequency, the results
+can vary wildly. It's possible to lock the cpu to the maximum frequency, but
+doing so is beyond the scope of this document. However, most of the benchmarks
+max out the cpu very quickly on Pixel devices, so scaling doesn't affect the results.
+
+Another potential issue is that the device can overheat when running the
+benchmarks. To avoid this, you can run the device in a cool environment,
+or choose a device that is less likely to overheat. To detect this kind
+of issue, you can run a subset of the tests again. At the very least, it's
+always a good idea to rerun the suite a couple of times to verify that
+there isn't a high variation in the numbers.
+
+## Testing
+
+Run the bionic tests to verify that the new routines are valid. However,
+you should also verify that the tests actually cover the new routines. This is
+especially important if this is the first time a routine is implemented in assembler.
+
+Caveats
+-------
+When verifying an assembler routine that operates on buffer data (such as
+memcpy/strcpy), it's important to verify these corner cases:
+
+* Verify the routine does not read past the end of the buffers. Many
+assembler routines optimize by reading multiple bytes at a time and can
+read past the end. This kind of bug results in an infrequent and
+difficult-to-diagnose crash.
+* Verify the routine handles unaligned buffers properly. A failure here
+usually results in an unaligned access exception.
+* Verify the routine handles different sized buffers.
+
+If there are not sufficient tests for a new routine, there is a set of helper
+functions that can be used to verify the above corner cases. See the
+header **bionic/tests/buffer\_tests.h** for these routines and look at
+**bionic/tests/string\_test.cpp** for examples of how to use them.
+
+## Unwind Info
+It is also important to verify that the unwind information for these
+routines is properly set up. Here is a quick checklist of what to check:
+
+* Verify that all labels are of the format .LXXX, where XXX is any valid string
+for a label. If any other label is used, entries in the symbol table
+will be generated that include these labels. In that case, you will get
+an unwind with incorrect function information.
+* Verify that every push, pop, or other instruction that modifies the
+sp in any way has corresponding cfi information. Along with this item,
+verify that whenever registers are pushed on the stack, there is cfi
+information indicating how to recover each register.
+* Verify that only cfi directives are being used. This only matters for
+arm32, where it's possible to use ARM specific unwind directives.
+
+This list is not meant to be exhaustive, but a minimal set of items to verify
+before submitting a new libc assembler routine. Some unwind cases are
+difficult to verify, such as around branches, where unwind information
+can be drastically different for the target of the branch and for the
+code after a branch instruction.
diff --git a/linker/linker_main.cpp b/linker/linker_main.cpp
index 43f12d3..3410f90 100644
--- a/linker/linker_main.cpp
+++ b/linker/linker_main.cpp
@@ -56,6 +56,9 @@
static ElfW(Addr) get_elf_exec_load_bias(const ElfW(Ehdr)* elf);
+static void get_elf_base_from_phdr(const ElfW(Phdr)* phdr_table, size_t phdr_count,
+ ElfW(Addr)* base, ElfW(Addr)* load_bias);
+
// These should be preserved static to avoid emitting
// RELATIVE relocations for the part of the code running
// before linker links itself.
@@ -321,24 +324,8 @@
si->phdr = reinterpret_cast<ElfW(Phdr)*>(args.getauxval(AT_PHDR));
si->phnum = args.getauxval(AT_PHNUM);
- /* Compute the value of si->base. We can't rely on the fact that
- * the first entry is the PHDR because this will not be true
- * for certain executables (e.g. some in the NDK unit test suite)
- */
- si->base = 0;
+ get_elf_base_from_phdr(si->phdr, si->phnum, &si->base, &si->load_bias);
si->size = phdr_table_get_load_size(si->phdr, si->phnum);
- si->load_bias = 0;
- for (size_t i = 0; i < si->phnum; ++i) {
- if (si->phdr[i].p_type == PT_PHDR) {
- si->load_bias = reinterpret_cast<ElfW(Addr)>(si->phdr) - si->phdr[i].p_vaddr;
- si->base = reinterpret_cast<ElfW(Addr)>(si->phdr) - si->phdr[i].p_offset;
- break;
- }
- }
-
- if (si->base == 0) {
- async_safe_fatal("Could not find a PHDR: broken executable?");
- }
si->dynamic = nullptr;
@@ -503,6 +490,23 @@
return 0;
}
+/* Find the load bias and base address of an executable or shared object loaded
+ * by the kernel. The ELF file's PHDR table must have a PT_PHDR entry.
+ *
+ * A VDSO doesn't have a PT_PHDR entry in its PHDR table.
+ */
+static void get_elf_base_from_phdr(const ElfW(Phdr)* phdr_table, size_t phdr_count,
+ ElfW(Addr)* base, ElfW(Addr)* load_bias) {
+ for (size_t i = 0; i < phdr_count; ++i) {
+ if (phdr_table[i].p_type == PT_PHDR) {
+ *load_bias = reinterpret_cast<ElfW(Addr)>(phdr_table) - phdr_table[i].p_vaddr;
+ *base = reinterpret_cast<ElfW(Addr)>(phdr_table) - phdr_table[i].p_offset;
+ return;
+ }
+ }
+ async_safe_fatal("Could not find a PHDR: broken executable?");
+}
+
static ElfW(Addr) __attribute__((noinline))
__linker_init_post_relocation(KernelArgumentBlock& args,
ElfW(Addr) linker_addr,
@@ -524,22 +528,17 @@
__libc_init_sysinfo(args);
#endif
- // AT_BASE is set to 0 in the case when linker is run by iself
- // so in order to link the linker it needs to calcuate AT_BASE
- // using information at hand. The trick below takes advantage
- // of the fact that the value of linktime_addr before relocations
- // are run is an offset and this can be used to calculate AT_BASE.
- static uintptr_t linktime_addr = reinterpret_cast<uintptr_t>(&linktime_addr);
- ElfW(Addr) linker_addr = reinterpret_cast<uintptr_t>(&linktime_addr) - linktime_addr;
-
-#if defined(__clang_analyzer__)
- // The analyzer assumes that linker_addr will always be null. Make it an
- // unknown value so we don't have to mark N places with NOLINTs.
- //
- // (`+=`, rather than `=`, allows us to sidestep a potential "unused store"
- // complaint)
- linker_addr += reinterpret_cast<uintptr_t>(raw_args);
-#endif
+ ElfW(Addr) linker_addr = args.getauxval(AT_BASE);
+ if (linker_addr == 0) {
+ // When the linker is run by itself (rather than as an interpreter for
+ // another program), AT_BASE is 0. In that case, the AT_PHDR and AT_PHNUM
+ // aux values describe the linker, so use the phdr to find the linker's
+ // base address.
+ ElfW(Addr) load_bias;
+ get_elf_base_from_phdr(
+ reinterpret_cast<ElfW(Phdr)*>(args.getauxval(AT_PHDR)), args.getauxval(AT_PHNUM),
+ &linker_addr, &load_bias);
+ }
ElfW(Ehdr)* elf_hdr = reinterpret_cast<ElfW(Ehdr)*>(linker_addr);
ElfW(Phdr)* phdr = reinterpret_cast<ElfW(Phdr)*>(linker_addr + elf_hdr->e_phoff);
diff --git a/tests/eventfd_test.cpp b/tests/eventfd_test.cpp
index aa88a3b..68d9192 100644
--- a/tests/eventfd_test.cpp
+++ b/tests/eventfd_test.cpp
@@ -19,20 +19,7 @@
#include <gtest/gtest.h>
-#if defined(__BIONIC__) // Android's prebuilt gcc's header files don't include <sys/eventfd.h>.
#include <sys/eventfd.h>
-#else
-// Include the necessary components of sys/eventfd.h right here.
-#include <stdint.h>
-
-typedef uint64_t eventfd_t;
-
-__BEGIN_DECLS
-extern int eventfd(int, int);
-extern int eventfd_read(int, eventfd_t*);
-extern int eventfd_write(int, eventfd_t);
-__END_DECLS
-#endif
TEST(eventfd, smoke) {
unsigned int initial_value = 2;