Merge "Start using libarm-optimized-routines in libm."

commit: e2cab6422a54ab2c8b88f4db043fab94db79bead [log] [tgz]
author: Treehugger Robot <treehugger-gerrit@google.com> Fri Aug 03 17:48:33 2018 +0000
committer: Gerrit Code Review <noreply-gerritcodereview@google.com> Fri Aug 03 17:48:33 2018 +0000
tree: 2b9905f0231c51b1d9d0d90770256a72b3c2d651
parent: 65f82092a17518080178ff7004cc6db362ebfbcd [diff]
parent: bc46fb86d009232109ab6097d205a77eb21f404e [diff]
diff --git a/android-changes-for-ndk-developers.md b/android-changes-for-ndk-developers.md
new file mode 120000
index 0000000..328d3eb
--- /dev/null
+++ b/android-changes-for-ndk-developers.md

@@ -0,0 +1 @@
+docs/android-changes-for-ndk-developers.md
\ No newline at end of file

diff --git a/android-changes-for-ndk-developers.md b/docs/android-changes-for-ndk-developers.md
similarity index 100%
rename from android-changes-for-ndk-developers.md
rename to docs/android-changes-for-ndk-developers.md


diff --git a/docs/libc_assembler.md b/docs/libc_assembler.md
new file mode 100644
index 0000000..151f265
--- /dev/null
+++ b/docs/libc_assembler.md

@@ -0,0 +1,148 @@
+Validating libc Assembler Routines
+==================================
+This document describes how to verify incoming assembler libc routines.
+
+## Quick Start
+* First, benchmark the previous version of the routine.
+* Update the routine, then run the bionic unit tests to verify that it doesn't
+have any bugs. See the [Testing](#Testing) section for details about how to
+verify that the routine is being properly tested.
+* Rerun the benchmarks using the updated image that uses the code for
+the new routine. See the [Performance](#Performance) section for details about
+benchmarking.
+* Verify that unwind information for the new routine looks sane. See the [Unwind Info](#unwind-info) section for details about how to verify this.
+
+When benchmarking, it's best to verify on the latest Pixel device supported.
+Make sure that you benchmark both the big and little cores to verify that
+there is no major difference in performance on each.
+
+Benchmark 64 bit memcmp:
+
+    /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks --bionic_xml=string.xml memcmp
+
+Benchmark 32 bit memcmp:
+
+    /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --bionic_xml=string.xml memcmp
+
+Locking to a specific cpu:
+
+    /data/benchmarktest/bionic-benchmarks/bionic-benchmarks --bionic_cpu=2 --bionic_xml=string.xml memcmp
+
+## Performance
+The bionic benchmarks are used to verify the performance of changes to
+routines. For most routines, there should already be benchmarks available.
+
+Building
+--------
+The bionic benchmarks are not built by default; they must be built separately
+and pushed onto the device. The commands below show how to do this.
+
+    mmma -j bionic/benchmarks
+    adb sync data
+
+Running
+-------
+There are two bionic benchmark executables:
+
+    /data/benchmarktest64/bionic-benchmarks/bionic-benchmarks
+
+This is for 64 bit libc routines.
+
+    /data/benchmarktest/bionic-benchmarks/bionic-benchmarks
+
+This is for 32 bit libc routines.
+
+Here is an example of how the benchmark should be executed. For this
+command to work, you need to change directory to one of the above
+directories.
+
+    bionic-benchmarks --bionic_xml=suites/string.xml memcmp
+
+The last argument is the name of the one function that you want to
+benchmark.
+
+Almost all routines are already defined in the **string.xml** file in
+**bionic/benchmarks/suites**. Look at the examples in that file to see
+how to add a benchmark for a function that doesn't already exist.
+
+It can take a long time to run these benchmarks since they cover a
+large number of sizes and alignments.
+
+Results
+-------
+The bionic benchmarks are based on the [Google Benchmark](https://github.com/google/benchmark)
+library. An example of the output looks like this:
+
+    Run on (8 X 1844 MHz CPU s)
+    CPU Caches:
+      L1 Data 32K (x8)
+      L1 Instruction 32K (x8)
+      L2 Unified 512K (x2)
+    ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
+    -------------------------------------------------------------------------------------------
+    Benchmark                                                    Time           CPU Iterations
+    -------------------------------------------------------------------------------------------
+    BM_string_memcmp/1/0/0                                       6 ns          6 ns  120776418   164.641MB/s
+    BM_string_memcmp/1/1/1                                       6 ns          6 ns  120856788   164.651MB/s
+
+The smaller the time, the better the performance.
+
+Caveats
+-------
+When running the benchmarks, CPU scaling is normally enabled. This means
+that if the device does not get up to the maximum cpu frequency, the results
+can vary wildly. It's possible to lock the cpu to the maximum frequency, but that
+is beyond the scope of this document. However, most of the benchmarks max
+out the cpu very quickly on Pixel devices, so this rarely affects the results.
+
+Another potential issue is that the device can overheat when running the
+benchmarks. To avoid this, you can run the device in a cool environment,
+or choose a device that is less likely to overheat. To detect these kinds
+of issues, you can run a subset of the tests again. At the very least, it's
+always a good idea to rerun the suite a couple of times to verify that
+there isn't a high variation in the numbers.
+
+## Testing
+
+Run the bionic tests to verify that the new routines are valid. However,
+you should also verify that the tests actually cover the new routines. This is
+especially important if this is the first time a routine is implemented in assembler.
+
+Caveats
+-------
+When verifying an assembler routine that operates on buffer data (such as
+memcpy/strcpy), it's important to verify these corner cases:
+
+* Verify the routine does not read past the end of the buffers. Many
+assembler routines optimize by reading multiple bytes at a time and can
+read past the end. This kind of bug results in an infrequent and
+difficult-to-diagnose crash.
+* Verify the routine handles unaligned buffers properly. Usually, a failure
+can result in an unaligned exception.
+* Verify the routine handles different sized buffers.
+
+If there are not sufficient tests for a new routine, there is a set of helper
+functions that can be used to verify the above corner cases. See the
+header **bionic/tests/buffer\_tests.h** for these routines and look at
+**bionic/tests/string\_test.cpp** for examples of how to use them.
+
+## Unwind Info
+It is also important to verify that the unwind information for these
+routines is properly set up. Here is a quick checklist of what to check:
+
+* Verify that all labels are of the format .LXXX, where XXX is any valid string
+for a label. If any other label is used, entries in the symbol table
+will be generated that include these labels. In that case, you will get
+an unwind with incorrect function information.
+* Verify that every push, pop, or other instruction that modifies the
+sp in any way has corresponding cfi information. Along with this item,
+verify that when registers are pushed on the stack, there is cfi
+information indicating how to recover each register.
+* Verify that only cfi directives are being used. This only matters for
+arm32, where it's possible to use ARM specific unwind directives.
+
+This list is not meant to be exhaustive, but a minimal set of items to verify
+before submitting a new libc assembler routine. Some unwind cases are difficult
+to verify, such as around branches, where unwind information
+can be drastically different for the target of the branch and for the
+code after a branch instruction.

diff --git a/linker/linker_main.cpp b/linker/linker_main.cpp
index 43f12d3..3410f90 100644
--- a/linker/linker_main.cpp
+++ b/linker/linker_main.cpp

@@ -56,6 +56,9 @@
 
 static ElfW(Addr) get_elf_exec_load_bias(const ElfW(Ehdr)* elf);
 
+static void get_elf_base_from_phdr(const ElfW(Phdr)* phdr_table, size_t phdr_count,
+                                   ElfW(Addr)* base, ElfW(Addr)* load_bias);
+
 // These should be preserved static to avoid emitting
 // RELATIVE relocations for the part of the code running
 // before linker links itself.
@@ -321,24 +324,8 @@
   si->phdr = reinterpret_cast<ElfW(Phdr)*>(args.getauxval(AT_PHDR));
   si->phnum = args.getauxval(AT_PHNUM);
 
-  /* Compute the value of si->base. We can't rely on the fact that
-   * the first entry is the PHDR because this will not be true
-   * for certain executables (e.g. some in the NDK unit test suite)
-   */
-  si->base = 0;
+  get_elf_base_from_phdr(si->phdr, si->phnum, &si->base, &si->load_bias);
   si->size = phdr_table_get_load_size(si->phdr, si->phnum);
-  si->load_bias = 0;
-  for (size_t i = 0; i < si->phnum; ++i) {
-    if (si->phdr[i].p_type == PT_PHDR) {
-      si->load_bias = reinterpret_cast<ElfW(Addr)>(si->phdr) - si->phdr[i].p_vaddr;
-      si->base = reinterpret_cast<ElfW(Addr)>(si->phdr) - si->phdr[i].p_offset;
-      break;
-    }
-  }
-
-  if (si->base == 0) {
-    async_safe_fatal("Could not find a PHDR: broken executable?");
-  }
 
   si->dynamic = nullptr;
 
@@ -503,6 +490,23 @@
   return 0;
 }
 
+/* Find the load bias and base address of an executable or shared object loaded
+ * by the kernel. The ELF file's PHDR table must have a PT_PHDR entry.
+ *
+ * A VDSO doesn't have a PT_PHDR entry in its PHDR table.
+ */
+static void get_elf_base_from_phdr(const ElfW(Phdr)* phdr_table, size_t phdr_count,
+                                   ElfW(Addr)* base, ElfW(Addr)* load_bias) {
+  for (size_t i = 0; i < phdr_count; ++i) {
+    if (phdr_table[i].p_type == PT_PHDR) {
+      *load_bias = reinterpret_cast<ElfW(Addr)>(phdr_table) - phdr_table[i].p_vaddr;
+      *base = reinterpret_cast<ElfW(Addr)>(phdr_table) - phdr_table[i].p_offset;
+      return;
+    }
+  }
+  async_safe_fatal("Could not find a PHDR: broken executable?");
+}
+
 static ElfW(Addr) __attribute__((noinline))
 __linker_init_post_relocation(KernelArgumentBlock& args,
                               ElfW(Addr) linker_addr,
@@ -524,22 +528,17 @@
   __libc_init_sysinfo(args);
 #endif
 
-  // AT_BASE is set to 0 in the case when linker is run by iself
-  // so in order to link the linker it needs to calcuate AT_BASE
-  // using information at hand. The trick below takes advantage
-  // of the fact that the value of linktime_addr before relocations
-  // are run is an offset and this can be used to calculate AT_BASE.
-  static uintptr_t linktime_addr = reinterpret_cast<uintptr_t>(&linktime_addr);
-  ElfW(Addr) linker_addr = reinterpret_cast<uintptr_t>(&linktime_addr) - linktime_addr;
-
-#if defined(__clang_analyzer__)
-  // The analyzer assumes that linker_addr will always be null. Make it an
-  // unknown value so we don't have to mark N places with NOLINTs.
-  //
-  // (`+=`, rather than `=`, allows us to sidestep a potential "unused store"
-  // complaint)
-  linker_addr += reinterpret_cast<uintptr_t>(raw_args);
-#endif
+  ElfW(Addr) linker_addr = args.getauxval(AT_BASE);
+  if (linker_addr == 0) {
+    // When the linker is run by itself (rather than as an interpreter for
+    // another program), AT_BASE is 0. In that case, the AT_PHDR and AT_PHNUM
+    // aux values describe the linker, so use the phdr to find the linker's
+    // base address.
+    ElfW(Addr) load_bias;
+    get_elf_base_from_phdr(
+      reinterpret_cast<ElfW(Phdr)*>(args.getauxval(AT_PHDR)), args.getauxval(AT_PHNUM),
+      &linker_addr, &load_bias);
+  }
 
   ElfW(Ehdr)* elf_hdr = reinterpret_cast<ElfW(Ehdr)*>(linker_addr);
   ElfW(Phdr)* phdr = reinterpret_cast<ElfW(Phdr)*>(linker_addr + elf_hdr->e_phoff);
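
As a rough illustration of the arithmetic in `get_elf_base_from_phdr` above, the sketch below recomputes the base address and load bias from a made-up program header table. It is not the linker's code: it assumes a 64-bit host and plain `Elf64_*` types from `<elf.h>` instead of `ElfW(...)`, it returns `false` rather than calling `async_safe_fatal`, and the names `find_base_from_phdr` and `self_check` are hypothetical.

```cpp
#include <elf.h>

#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-alone variant of the helper in the diff above.
// The PT_PHDR entry describes the program header table itself, so the
// table's in-memory address minus that entry's p_vaddr gives the load
// bias, and the address minus p_offset gives the address of file offset 0.
static bool find_base_from_phdr(const Elf64_Phdr* phdr_table, size_t phdr_count,
                                Elf64_Addr* base, Elf64_Addr* load_bias) {
  const Elf64_Addr table_addr = reinterpret_cast<Elf64_Addr>(phdr_table);
  for (size_t i = 0; i < phdr_count; ++i) {
    if (phdr_table[i].p_type == PT_PHDR) {
      *load_bias = table_addr - phdr_table[i].p_vaddr;
      *base = table_addr - phdr_table[i].p_offset;
      return true;
    }
  }
  // No PT_PHDR entry (for example, a VDSO); the real helper aborts here.
  return false;
}

// Exercises the helper on a fabricated two-entry table. The offsets are
// invented for the example and do not describe any real binary.
static void self_check() {
  Elf64_Phdr table[2] = {};
  table[0].p_type = PT_LOAD;
  table[1].p_type = PT_PHDR;
  table[1].p_offset = 0x40;    // table sits 0x40 bytes into the file
  table[1].p_vaddr = 0x1040;   // and is linked at virtual address 0x1040

  Elf64_Addr base = 0;
  Elf64_Addr load_bias = 0;
  const Elf64_Addr table_addr = reinterpret_cast<Elf64_Addr>(table);

  bool found = find_base_from_phdr(table, 2, &base, &load_bias);
  assert(found);
  assert(base == table_addr - 0x40);
  assert(load_bias == table_addr - 0x1040);

  // Without a PT_PHDR entry the lookup fails and the outputs are untouched.
  table[1].p_type = PT_LOAD;
  found = find_base_from_phdr(table, 2, &base, &load_bias);
  assert(!found);
  (void)found;
}
```

Returning a status instead of aborting is only a convenience for experimenting outside the linker; the change above keeps the fatal error because a kernel-loaded executable without a PT_PHDR entry cannot be handled.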

diff --git a/tests/eventfd_test.cpp b/tests/eventfd_test.cpp
index aa88a3b..68d9192 100644
--- a/tests/eventfd_test.cpp
+++ b/tests/eventfd_test.cpp

@@ -19,20 +19,7 @@
 
 #include <gtest/gtest.h>
 
-#if defined(__BIONIC__) // Android's prebuilt gcc's header files don't include <sys/eventfd.h>.
 #include <sys/eventfd.h>
-#else
-// Include the necessary components of sys/eventfd.h right here.
-#include <stdint.h>
-
-typedef uint64_t eventfd_t;
-
-__BEGIN_DECLS
-extern int eventfd(int, int);
-extern int eventfd_read(int, eventfd_t*);
-extern int eventfd_write(int, eventfd_t);
-__END_DECLS
-#endif
 
 TEST(eventfd, smoke) {
   unsigned int initial_value = 2;