[AArch64] Improve strncmp for mutually misaligned inputs

This patch was originally written by Siddhesh Poyarekar and pushed to
cortex-strings [1]. Mutually misaligned inputs on aarch64 are currently
compared one byte at a time, which is not very efficient.
This patch speeds up the comparison, in the same way as strcmp, by
loading a double-word at a time.
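
As a rough illustration of the double-word idea only (this is not the
patch's code, which is AArch64 assembly), a minimal C sketch could look
like the one below. The name strncmp_dword_sketch is hypothetical, and
the sketch ASSUMES both buffers have at least n readable bytes, which is
a stronger requirement than the real strncmp contract.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  0x0101010101010101ULL
#define HIGHS 0x8080808080808080ULL

/* Nonzero if any of the eight bytes in w is zero. */
static int has_nul_byte(uint64_t w) {
    return ((w - ONES) & ~w & HIGHS) != 0;
}

/* Hypothetical sketch. ASSUMPTION: s1 and s2 each have at least n
 * readable bytes, so 8-byte loads within the first n bytes are safe. */
int strncmp_dword_sketch(const char *s1, const char *s2, size_t n) {
    assert(s1 != NULL && s2 != NULL);

    /* Compare 8 bytes per iteration. memcpy performs the load without
     * requiring either pointer to be aligned. */
    while (n >= 8) {
        uint64_t w1, w2;
        memcpy(&w1, s1, 8);
        memcpy(&w2, s2, 8);
        if (w1 != w2 || has_nul_byte(w1)) {
            break; /* Let the byte loop find the exact position. */
        }
        s1 += 8;
        s2 += 8;
        n -= 8;
    }

    /* Byte loop: handles the tail and resolves a mismatch or NUL. */
    for (; n > 0; n--) {
        unsigned char c1 = (unsigned char)*s1++;
        unsigned char c2 = (unsigned char)*s2++;
        if (c1 != c2) {
            return c1 < c2 ? -1 : 1;
        }
        if (c1 == 0) {
            return 0;
        }
    }
    return 0;
}
```

The sketch trades the over-read handling of a real implementation for
a simpler precondition, so it should be read as an explanation of the
technique rather than a drop-in replacement.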

Comparing the default bionic routine with the proposed optimized
routine shows the following performance changes on A54 (using the
new proposed memcmp input data from test_strncmp.xml):

  - No noticeable change on aligned inputs or on inputs that share the
    same alignment.

  - Large improvements on mutually misaligned inputs for sizes of
    16 bytes and larger.

Benchmark                               Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------
BM_string_strncmp/1/0/0              -0.0954         -0.0954            19            17            19            17
BM_string_strncmp/2/0/0              -0.0344         -0.0344            19            18            19            18
BM_string_strncmp/3/0/0              +0.1768         +0.1768            15            18            15            18
BM_string_strncmp/4/0/0              -0.0344         -0.0344            19            18            19            18
BM_string_strncmp/5/0/0              -0.0344         -0.0344            19            18            19            18
BM_string_strncmp/6/0/0              +0.1589         +0.1589            15            18            15            18
BM_string_strncmp/7/0/0              -0.0344         -0.0344            19            18            19            18
BM_string_strncmp/8/0/0              -0.0998         -0.0998            19            17            19            17
BM_string_strncmp/9/0/0              -0.0277         -0.0277            23            22            23            22
BM_string_strncmp/10/0/0             -0.0270         -0.0270            23            22            23            22
BM_string_strncmp/11/0/0             -0.0331         -0.0331            23            22            23            22
BM_string_strncmp/12/0/0             -0.0270         -0.0270            23            22            23            22
BM_string_strncmp/13/0/0             -0.0284         -0.0284            23            22            23            22
BM_string_strncmp/14/0/0             +0.1042         +0.1042            20            22            20            22
BM_string_strncmp/15/0/0             -0.0277         -0.0277            23            22            23            22
BM_string_strncmp/16/0/0             +0.0214         +0.0215            22            22            22            22
BM_string_strncmp/24/0/0             -0.1291         -0.1291            24            21            24            21
BM_string_strncmp/32/0/0             -0.0470         -0.0470            27            26            27            26
BM_string_strncmp/40/0/0             -0.0433         -0.0433            29            28            29            28
BM_string_strncmp/48/0/0             -0.0301         -0.0301            31            30            31            30
BM_string_strncmp/56/0/0             -0.0800         -0.0800            33            31            33            31
BM_string_strncmp/64/0/0             +0.0188         +0.0188            34            34            34            34
BM_string_strncmp/72/0/0             -0.0334         -0.0334            38            37            38            37
BM_string_strncmp/80/0/0             -0.0000         -0.0000            40            40            40            40
BM_string_strncmp/88/0/0             +0.0413         +0.0413            61            64            61            64
BM_string_strncmp/96/0/0             -0.0215         -0.0216            69            67            69            67
BM_string_strncmp/104/0/0            -0.0208         -0.0208            72            70            72            70
BM_string_strncmp/112/0/0            -0.0173         -0.0173            75            74            75            74
BM_string_strncmp/120/0/0            -0.0166         -0.0166            78            77            78            77
BM_string_strncmp/128/0/0            -0.0158         -0.0158            81            80            81            80
BM_string_strncmp/136/0/0            -0.0149         -0.0149            84            83            84            83
BM_string_strncmp/144/0/0            -0.0201         -0.0201            88            86            88            86
BM_string_strncmp/160/0/0            -0.0136         -0.0136            94            93            94            93
BM_string_strncmp/176/0/0            +0.0224         +0.0224            96            98            96            98
BM_string_strncmp/192/0/0            +0.0289         +0.0289           102           105           102           105
BM_string_strncmp/208/0/0            +0.0101         +0.0101           111           112           111           112
BM_string_strncmp/224/0/0            -0.0107         -0.0107           119           118           119           118
BM_string_strncmp/240/0/0            -0.0088         -0.0088           126           125           126           125
BM_string_strncmp/256/0/0            -0.0101         -0.0101           132           131           132           131
BM_string_strncmp/512/0/0            -0.0056         -0.0056           235           233           235           233
BM_string_strncmp/1024/0/0           -0.0030         -0.0030           439           437           439           437
BM_string_strncmp/8192/0/0           -0.0431         -0.0431          3799          3635          3799          3635
BM_string_strncmp/16384/0/0          -0.0069         -0.0069          6778          6732          6779          6732
BM_string_strncmp/32768/0/0          -0.0001         -0.0002         13405         13403         13405         13403
BM_string_strncmp/65536/0/0          +0.0005         +0.0005         26968         26981         26968         26981
BM_string_strncmp/131072/0/0         -0.0057         -0.0057         53959         53650         53958         53650
BM_string_strncmp/1/4/0              -0.1352         -0.1352            12            10            12            10
BM_string_strncmp/2/4/0              +0.0020         +0.0020            15            15            15            15
BM_string_strncmp/3/4/0              -0.1560         -0.1560            20            17            20            17
BM_string_strncmp/4/4/0              +0.0296         +0.0296            22            22            22            22
BM_string_strncmp/5/4/0              +0.0573         +0.0573            22            23            22            23
BM_string_strncmp/6/4/0              -0.0340         -0.0340            25            24            25            24
BM_string_strncmp/7/4/0              +0.0185         +0.0185            26            26            26            26
BM_string_strncmp/8/4/0              -0.0050         -0.0050            27            27            27            27
BM_string_strncmp/9/4/0              -0.1294         -0.1294            28            24            28            24
BM_string_strncmp/10/4/0             +0.0109         +0.0109            29            29            29            29
BM_string_strncmp/11/4/0             -0.0000         -0.0001            30            30            30            30
BM_string_strncmp/12/4/0             +0.0055         +0.0055            50            50            50            50
BM_string_strncmp/13/4/0             -0.0249         -0.0249            51            50            51            50
BM_string_strncmp/14/4/0             -0.0289         -0.0289            53            52            53            52
BM_string_strncmp/15/4/0             -0.0205         -0.0205            55            54            55            54
BM_string_strncmp/16/4/0             -0.4616         -0.4616            57            31            57            31
BM_string_strncmp/24/4/0             -0.4871         -0.4871            72            37            72            37
BM_string_strncmp/32/4/0             -0.5549         -0.5549            87            39            87            39
BM_string_strncmp/40/4/0             -0.5964         -0.5964           103            42           103            42
BM_string_strncmp/48/4/0             -0.6647         -0.6647           118            40           118            40
BM_string_strncmp/56/4/0             -0.6551         -0.6551           134            46           134            46
BM_string_strncmp/64/4/0             -0.6609         -0.6609           145            49           145            49
BM_string_strncmp/72/4/0             -0.5709         -0.5710           164            70           164            70
BM_string_strncmp/80/4/0             -0.5929         -0.5929           180            73           180            73
BM_string_strncmp/88/4/0             -0.6051         -0.6051           195            77           195            77
BM_string_strncmp/96/4/0             -0.6160         -0.6160           210            81           210            81
BM_string_strncmp/104/4/0            -0.6199         -0.6199           223            85           223            85
BM_string_strncmp/112/4/0            -0.6293         -0.6293           240            89           240            89
BM_string_strncmp/120/4/0            -0.6439         -0.6439           255            91           255            91
BM_string_strncmp/128/4/0            -0.6493         -0.6493           271            95           271            95
BM_string_strncmp/136/4/0            -0.6704         -0.6704           287            95           287            95
BM_string_strncmp/144/4/0            -0.6744         -0.6744           302            98           302            98
BM_string_strncmp/160/4/0            -0.6700         -0.6700           333           110           333           110
BM_string_strncmp/176/4/0            -0.6821         -0.6821           364           116           364           116
BM_string_strncmp/192/4/0            -0.6887         -0.6887           394           123           394           123
BM_string_strncmp/208/4/0            -0.6949         -0.6949           425           130           425           130
BM_string_strncmp/224/4/0            -0.7069         -0.7069           456           134           456           134
BM_string_strncmp/240/4/0            -0.7042         -0.7042           486           144           486           144
BM_string_strncmp/256/4/0            -0.7043         -0.7043           514           152           514           152
BM_string_strncmp/1/0/4              +0.0227         +0.0227            14            14            14            14
BM_string_strncmp/2/0/4              +0.0442         +0.0442            15            16            15            16
BM_string_strncmp/3/0/4              +0.5829         +0.5829            17            27            17            27
BM_string_strncmp/4/0/4              -0.1593         -0.1593            22            19            22            19
BM_string_strncmp/5/0/4              -0.0516         -0.0516            23            22            23            22
BM_string_strncmp/6/0/4              -0.1684         -0.1684            25            20            25            20
BM_string_strncmp/7/0/4              +0.0170         +0.0170            26            26            26            26
BM_string_strncmp/8/0/4              +0.0006         +0.0006            27            27            27            27
BM_string_strncmp/9/0/4              +0.1272         +0.1272            25            28            25            28
BM_string_strncmp/10/0/4             +0.0108         +0.0108            29            29            29            29
BM_string_strncmp/11/0/4             -0.0001         -0.0001            30            30            30            30
BM_string_strncmp/12/0/4             -0.3557         -0.3557            50            32            50            32
BM_string_strncmp/13/0/4             -0.3370         -0.3370            51            34            51            34
BM_string_strncmp/14/0/4             -0.3444         -0.3444            53            35            53            35
BM_string_strncmp/15/0/4             +0.0946         +0.0946            51            56            51            56
BM_string_strncmp/16/0/4             -0.5203         -0.5203            53            25            53            25
BM_string_strncmp/24/0/4             -0.6109         -0.6109            72            28            72            28
BM_string_strncmp/32/0/4             -0.6934         -0.6934            88            27            88            27
BM_string_strncmp/40/0/4             -0.6833         -0.6833           103            33           103            33
BM_string_strncmp/48/0/4             -0.6973         -0.6973           118            36           118            36
BM_string_strncmp/56/0/4             -0.7116         -0.7116           134            39           134            39
BM_string_strncmp/64/0/4             -0.6017         -0.6018           149            59           149            59
BM_string_strncmp/72/0/4             -0.6268         -0.6268           164            61           164            61
BM_string_strncmp/80/0/4             -0.6409         -0.6409           179            64           179            64
BM_string_strncmp/88/0/4             -0.6465         -0.6465           195            69           195            69
BM_string_strncmp/96/0/4             -0.6551         -0.6551           210            72           210            72
BM_string_strncmp/104/0/4            -0.6662         -0.6662           227            76           227            76
BM_string_strncmp/112/0/4            -0.6700         -0.6700           240            79           240            79
BM_string_strncmp/120/0/4            -0.6740         -0.6740           256            83           256            83
BM_string_strncmp/128/0/4            -0.6862         -0.6862           271            85           271            85
BM_string_strncmp/136/0/4            -0.6883         -0.6883           287            89           287            89
BM_string_strncmp/144/0/4            -0.7031         -0.7031           297            88           297            88
BM_string_strncmp/160/0/4            -0.6985         -0.6985           333           100           333           100
BM_string_strncmp/176/0/4            -0.7082         -0.7082           364           106           364           106
BM_string_strncmp/192/0/4            -0.7223         -0.7223           396           110           396           110
BM_string_strncmp/208/0/4            -0.7135         -0.7135           421           121           421           121
BM_string_strncmp/224/0/4            -0.7194         -0.7194           455           128           455           128
BM_string_strncmp/240/0/4            -0.7233         -0.7233           487           135           487           135
BM_string_strncmp/256/0/4            -0.7239         -0.7239           516           143           516           143
BM_string_strncmp/1/4/4              +0.0224         +0.0225            21            22            21            22
BM_string_strncmp/2/4/4              -0.0001         -0.0001            22            22            22            22
BM_string_strncmp/3/4/4              -0.0001         -0.0001            22            22            22            22
BM_string_strncmp/4/4/4              -0.0435         -0.0435            22            21            22            21
BM_string_strncmp/5/4/4              -0.0118         -0.0118            27            27            27            27
BM_string_strncmp/6/4/4              -0.0118         -0.0118            27            27            27            27
BM_string_strncmp/7/4/4              -0.0117         -0.0117            27            27            27            27
BM_string_strncmp/8/4/4              -0.0118         -0.0118            27            27            27            27
BM_string_strncmp/9/4/4              -0.0117         -0.0117            27            27            27            27
BM_string_strncmp/10/4/4             +0.1447         +0.1447            23            27            23            27
BM_string_strncmp/11/4/4             -0.0062         -0.0062            27            27            27            27
BM_string_strncmp/12/4/4             -0.0454         -0.0454            28            27            28            27
BM_string_strncmp/13/4/4             -0.1507         -0.1507            29            24            29            24
BM_string_strncmp/14/4/4             -0.0003         -0.0003            29            29            29            29
BM_string_strncmp/15/4/4             -0.0002         -0.0003            29            29            29            29
BM_string_strncmp/16/4/4             +0.0047         +0.0047            29            29            29            29
BM_string_strncmp/24/4/4             -0.0104         -0.0104            31            30            31            30
BM_string_strncmp/32/4/4             -0.0290         -0.0290            33            32            33            32
BM_string_strncmp/40/4/4             -0.0189         -0.0189            34            33            34            33
BM_string_strncmp/48/4/4             -0.0059         -0.0059            36            36            36            36
BM_string_strncmp/56/4/4             +0.0000         +0.0000            39            39            39            39
BM_string_strncmp/64/4/4             +0.0000         +0.0000            42            42            42            42
BM_string_strncmp/72/4/4             +0.0000         +0.0000            45            45            45            45
BM_string_strncmp/80/4/4             +0.0391         +0.0392            65            68            65            68
BM_string_strncmp/88/4/4             -0.0090         -0.0090            71            70            71            70
BM_string_strncmp/96/4/4             -0.0034         -0.0034            74            74            74            74
BM_string_strncmp/104/4/4            -0.0482         -0.0482            77            73            77            73
BM_string_strncmp/112/4/4            +0.0387         +0.0387            77            80            77            80
BM_string_strncmp/120/4/4            -0.0072         -0.0073            84            83            84            83
BM_string_strncmp/128/4/4            -0.0071         -0.0071            87            86            87            86
BM_string_strncmp/136/4/4            +0.0366         +0.0366            86            89            86            89
BM_string_strncmp/144/4/4            -0.0068         -0.0068            93            93            93            93
BM_string_strncmp/160/4/4            -0.0064         -0.0064           100            99           100            99
BM_string_strncmp/176/4/4            -0.0063         -0.0063           106           105           106           105
BM_string_strncmp/192/4/4            -0.0012         -0.0012           112           112           112           112
BM_string_strncmp/208/4/4            -0.0098         -0.0098           119           118           119           118
BM_string_strncmp/224/4/4            -0.0050         -0.0050           125           125           125           125
BM_string_strncmp/240/4/4            -0.0060         -0.0060           132           131           132           131
BM_string_strncmp/256/4/4            -0.0046         -0.0046           138           137           138           137
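
The Time and CPU columns above look like relative changes between the
old and new measurements, i.e. (new - old) / old, so negative values
mean the new routine is faster. A minimal sketch of that arithmetic,
with relative_change as a hypothetical helper name:

```c
#include <assert.h>

/* Relative change from an old to a new measurement: (new - old) / old.
 * A negative result means the new measurement is smaller (faster). */
double relative_change(double old_time, double new_time) {
    assert(old_time > 0.0);
    return (new_time - old_time) / old_time;
}
```

For the BM_string_strncmp/256/4/0 row, relative_change(514, 152) is
about -0.704, in line with the -0.7043 shown. Rows with small times may
not reproduce exactly, presumably because the displayed times are
rounded.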

[1] Commit id: 26cc4faec37a55529e5d0a39949f7b6ec81008f9

Test: bionic tests and benchmarks on aarch64.
Change-Id: Ied579d2044b4092fc95fad486af6541d1eb71dc3
1 file changed
tree: c9805f240955fa1784c5ffab7928717931e4bef8
README.md

Using bionic

See the additional documentation.

Working on bionic

What are the big pieces of bionic?

libc/ --- libc.so, libc.a

The C library. Stuff like fopen(3) and kill(2).

libm/ --- libm.so, libm.a

The math library. Traditionally Unix systems kept stuff like sin(3) and cos(3) in a separate library to save space in the days before shared libraries.

libdl/ --- libdl.so

The dynamic linker interface library. This is actually just a bunch of stubs that the dynamic linker replaces with pointers to its own implementation at runtime. This is where stuff like dlopen(3) lives.

libstdc++/ --- libstdc++.so

The C++ ABI support functions. The C++ compiler doesn't know how to implement thread-safe static initialization and the like, so it just calls functions that are supplied by the system. Stuff like __cxa_guard_acquire and __cxa_pure_virtual live here.

linker/ --- /system/bin/linker and /system/bin/linker64

The dynamic linker. When you run a dynamically-linked executable, its ELF file has a PT_INTERP program header that says "use the following program to start me". On Android, that's either linker or linker64 (depending on whether it's a 32-bit or 64-bit executable). It's responsible for loading the ELF executable into memory and resolving references to symbols (so that when your code tries to jump to fopen(3), say, it lands in the right place).
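
To see which program an executable asks for, you can read the interpreter path out of its PT_INTERP program header. The following is a minimal sketch for 64-bit ELF files only, using the definitions from <elf.h>; the function name read_elf64_interp is hypothetical and not part of bionic.

```c
#include <assert.h>
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the interpreter path of a 64-bit ELF file
 * into out. Returns 0 on success, or -1 if the file cannot be read, is
 * not a 64-bit ELF file, has no PT_INTERP entry, or out is too small. */
int read_elf64_interp(const char *path, char *out, size_t out_len) {
    assert(path != NULL && out != NULL);

    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }

    Elf64_Ehdr eh;
    int rc = -1;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64) {
        goto done;
    }

    /* Walk the program header table looking for PT_INTERP. */
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        long off = (long)(eh.e_phoff + (Elf64_Off)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            goto done;
        }
        if (ph.p_type != PT_INTERP) {
            continue;
        }
        if (ph.p_filesz == 0 || ph.p_filesz > out_len) {
            goto done;
        }
        if (fseek(f, (long)ph.p_offset, SEEK_SET) != 0 ||
            fread(out, 1, ph.p_filesz, f) != ph.p_filesz) {
            goto done;
        }
        out[ph.p_filesz - 1] = '\0'; /* The stored path is NUL-terminated. */
        rc = 0;
        goto done;
    }

done:
    fclose(f);
    return rc;
}
```

On a 64-bit Android executable this would be expected to yield the linker64 path; statically-linked executables have no PT_INTERP entry, so the helper returns -1 for them.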

tests/ --- unit tests

The tests/ directory contains unit tests. Roughly arranged as one file per publicly-exported header file.

benchmarks/ --- benchmarks

The benchmarks/ directory contains benchmarks, with its own documentation.

What's in libc/?

Adding libc wrappers for system calls

The first question you should ask is "should I add a libc wrapper for this system call?". The answer is usually "no".

The answer is "yes" if the system call is part of the POSIX standard.

The answer is probably "yes" if the system call has a wrapper in at least one other C library.

The answer may be "yes" if the system call has three or four distinct users in different projects, and there isn't a more specific library that would make more sense as the place to add the wrapper.

In all other cases, you should use syscall(3) instead.
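
For reference, calling a system call through syscall(3) instead of a dedicated wrapper looks roughly like this. It is a minimal sketch: SYS_gettid is chosen only because it takes no arguments, and raw_gettid is a hypothetical name.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: invokes the gettid system call by number
 * through syscall(3) rather than through a libc wrapper. */
long raw_gettid(void) {
    long tid = syscall(SYS_gettid);
    assert(tid > 0); /* gettid is documented as always succeeding. */
    return tid;
}
```

A system call that takes arguments passes them after the number, and failures are reported as -1 with errno set, just as with a regular wrapper.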

Adding a system call usually involves:

  1. Add entries to SYSCALLS.TXT. See SYSCALLS.TXT itself for documentation on the format.
  2. Run the gensyscalls.py script.
  3. Add constants (and perhaps types) to the appropriate header file. Note that you should check to see whether the constants are already in kernel uapi header files, in which case you just need to make sure that the appropriate POSIX header file in libc/include/ includes the relevant file or files.
  4. Add function declarations to the appropriate header file. Don't forget to include the appropriate __INTRODUCED_IN().
  5. Add the function name to the correct section in libc/libc.map.txt and run ./libc/tools/genversion-scripts.py.
  6. Add at least basic tests. Even a test that deliberately supplies an invalid argument helps check that we're generating the right symbol and have the right declaration in the header file, and that you correctly updated the maps in step 5. (You can use strace(1) to confirm that the correct system call is being made.)
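
The "deliberately invalid argument" test from step 6 can be very small. Bionic's real tests are gtest-based C++ files under tests/; the plain-C sketch below only illustrates the shape, using close(2) with an invalid descriptor as a stand-in for a newly added call. The function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical test body: an invalid file descriptor must make the
 * call fail with EBADF. Linking this proves the symbol exists, and
 * compiling it proves the declaration is visible in the header. */
void test_close_invalid_fd(void) {
    errno = 0;
    assert(close(-1) == -1);
    assert(errno == EBADF);
}
```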

Updating kernel header files

As mentioned above, this is currently a two-step process:

  1. Use generate_uapi_headers.sh to go from a Linux source tree to appropriate contents for external/kernel-headers/.
  2. Run update_all.py to scrub those headers and import them into bionic.

Note that if you're actually just trying to expose device-specific headers to build your device drivers, you shouldn't modify bionic. Instead use TARGET_DEVICE_KERNEL_HEADERS and friends described in config.mk.

Updating tzdata

This is fully automated (and these days handled by the libcore team, because they own icu, and that needs to be updated in sync with bionic):

  1. Run update-tzdata.py in external/icu/tools/.

Verifying changes

If you make a change that is likely to have a wide effect on the tree (such as a libc header change), you should run make checkbuild. A regular make will not build the entire tree; just the minimum number of projects that are required for the device. Tests, additional developer tools, and various other modules will not be built. Note that make checkbuild will not be complete either, as make tests covers a few additional modules, but generally speaking make checkbuild is enough.

Running the tests

The tests are all built from the tests/ directory.

Device tests

$ mma # In $ANDROID_ROOT/bionic.
$ adb root && adb remount && adb sync
$ adb shell /data/nativetest/bionic-unit-tests/bionic-unit-tests
$ adb shell \
    /data/nativetest/bionic-unit-tests-static/bionic-unit-tests-static
# Only for 64-bit targets
$ adb shell /data/nativetest64/bionic-unit-tests/bionic-unit-tests
$ adb shell \
    /data/nativetest64/bionic-unit-tests-static/bionic-unit-tests-static

Note that we use our own custom gtest runner that offers a superset of the options documented at https://github.com/google/googletest/blob/master/googletest/docs/AdvancedGuide.md#running-test-programs-advanced-options, in particular for test isolation and parallelism (both on by default).

Device tests via CTS

Most of the unit tests are executed by CTS. By default, CTS runs as a non-root user, so the unit tests must also pass when not run as root. Some tests cannot do any useful work unless run as root. In this case, the test should check getuid() == 0 and do nothing otherwise (typically we log in this case to prevent accidents!). Obviously, if the test can be rewritten to not require root, that's an even better solution.
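
The "check getuid() == 0 and do nothing otherwise" pattern might be sketched as follows. This is plain C for illustration rather than bionic's gtest code, and run_if_root, test_result, and sample_body are all hypothetical names. The uid is passed in as a parameter so that both paths can be exercised; a real test would pass getuid().

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

typedef enum { TEST_SKIPPED, TEST_PASSED, TEST_FAILED } test_result;

/* Hypothetical stand-in for a test body that needs root to do
 * anything useful. Returns nonzero on success. */
int sample_body(void) {
    return 1;
}

/* Runs body only when uid is root; otherwise logs and skips, so the
 * test still passes under CTS, which runs as a non-root user. */
test_result run_if_root(uid_t uid, int (*body)(void)) {
    assert(body != NULL);
    if (uid != 0) {
        fprintf(stderr, "skipping: this test requires root\n");
        return TEST_SKIPPED;
    }
    return body() ? TEST_PASSED : TEST_FAILED;
}
```

A caller would typically write run_if_root(getuid(), sample_body) and treat TEST_SKIPPED as a pass.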

Currently, the list of bionic CTS tests is generated at build time by running a host version of the test executable and dumping the list of all tests. In order for this to continue to work, all architectures must have the same number of tests, and the host version of the executable must also have the same number of tests.

Running the gtests directly is orders of magnitude faster than using CTS, but in cases where you really have to run CTS:

$ make cts # In $ANDROID_ROOT.
$ adb unroot # Because real CTS doesn't run as root.
# This will sync any *test* changes, but not *code* changes:
$ cts-tradefed \
    run singleCommand cts --skip-preconditions -m CtsBionicTestCases

Host tests

The host tests require that you have lunched either an x86 or x86_64 target. Note that due to ABI limitations (specifically, the size of pthread_mutex_t), 32-bit bionic requires PIDs less than 65536. To enforce this, set /proc/sys/kernel/pid_max to 65536.
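
If you want to check the current limit before running the 32-bit host tests, something like the following minimal sketch would do. The helper names are hypothetical; the 65536 limit and the /proc path come from the paragraph above.

```c
#include <assert.h>
#include <stdio.h>

#define PID_MAX_LIMIT_32BIT 65536L /* Limit stated in the text above. */

/* Reads /proc/sys/kernel/pid_max. Returns -1 if it cannot be read. */
long read_pid_max(void) {
    FILE *f = fopen("/proc/sys/kernel/pid_max", "r");
    long value = -1;
    if (f == NULL) {
        return -1;
    }
    if (fscanf(f, "%ld", &value) != 1) {
        value = -1;
    }
    fclose(f);
    return value;
}

/* Nonzero if every PID the kernel can hand out is below 65536. */
int pid_max_ok_for_32bit_bionic(long pid_max) {
    return pid_max > 0 && pid_max <= PID_MAX_LIMIT_32BIT;
}
```

Calling pid_max_ok_for_32bit_bionic(read_pid_max()) returns zero both when the limit is too high and when the file could not be read.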

$ ./tests/run-on-host.sh 32
$ ./tests/run-on-host.sh 64   # For x86_64 *targets* only.

You can supply gtest flags as extra arguments to this script.

Against glibc

As a way to check that our tests do in fact test the correct behavior (and not just the behavior we think is correct), it is possible to run the tests against the host's glibc.

$ ./tests/run-on-host.sh glibc

Gathering test coverage

For either host or target coverage, you must first:

  • $ export NATIVE_COVERAGE=true
    • Note that the build system does not notice when this flag is toggled, i.e. if you change this flag, you will have to manually rebuild bionic.
  • Set bionic_coverage=true in libc/Android.mk and libm/Android.mk.

Coverage from device tests

$ mma
$ adb sync
$ adb shell \
    GCOV_PREFIX=/data/local/tmp/gcov \
    GCOV_PREFIX_STRIP=`echo $ANDROID_BUILD_TOP | grep -o / | wc -l` \
    /data/nativetest/bionic-unit-tests/bionic-unit-tests
$ acov

acov will pull all coverage information from the device, push it to the right directories, run lcov, and open the coverage report in your browser.
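
The GCOV_PREFIX_STRIP value in the command above is simply the number of '/' characters in $ANDROID_BUILD_TOP, which is what the echo, grep -o, and wc -l pipeline counts. The same computation as a minimal C sketch, with count_slashes as a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Counts the '/' characters in path, i.e. the number of leading path
 * components gcov is asked to strip for a build top at that path. */
size_t count_slashes(const char *path) {
    assert(path != NULL);
    size_t count = 0;
    for (; *path != '\0'; path++) {
        if (*path == '/') {
            count++;
        }
    }
    return count;
}
```

For a build top of /home/user/aosp (an example path, not one from this document), the result is 3.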

Coverage from host tests

First, build and run the host tests as usual (see above).

$ croot
$ lcov -c -d $ANDROID_PRODUCT_OUT -o coverage.info
$ genhtml -o covreport coverage.info # or lcov --list coverage.info

The coverage report is now available at covreport/index.html.

Attaching GDB to the tests

Bionic's test runner will run each test in its own process by default to prevent test failures from impacting other tests. This also has the added benefit of running them in parallel, so they are much faster.

However, this also makes it difficult to run the tests under GDB. To prevent each test from being forked, run the tests with the flag --no-isolate.

32-bit ABI bugs

See 32-bit ABI bugs.