cb199b47955623746a021bdb91a04478b3bd3a7e - android_system_core

commit	cb199b47955623746a021bdb91a04478b3bd3a7e
author	Eric Miao <ericymiao@google.com>	Wed Nov 30 16:05:49 2022 -0800
committer	Eric Miao <ericymiao@google.com>	Wed Jul 12 13:23:07 2023 -0700
tree	0651e39f42d1f1604f0df714c1cb5eb7ff1e220c
parent	dc8ae8c55a0dc78be9a8cae109294e1ee645280b

libutils: Improve performance of utf8_to_utf16/utf16_to_utf8

This CL improves the performance of the following libutils functions,
which convert between UTF-8 and UTF-16:

  - utf8_to_utf16_length
  - utf8_to_utf16
  - utf16_to_utf8_length
  - utf16_to_utf8

The basic idea is to keep the loop as tight as possible for the most
common cases. For example, in the UTF-16 --> UTF-8 direction, the most
common case is a character < 0x80 (ASCII), the next is a character
< 0x0800 (most Latin), and so on.

This implementation reduces the number of instructions needed for every
incoming UTF-8 byte compared with the original implementation, which:

  1) calculated how many bytes are needed for a given leading UTF-8 byte
     in utf8_codepoint_len(). This is a very clever approach, but it
     takes multiple instructions regardless of the input.

  2) converted to UTF-32 as an intermediate step in
     utf8_to_utf32_codepoint(), and only then to UTF-16.

The end result is about a 1.5x throughput improvement (roughly 1.4x to
1.9x depending on the function, per the numbers below).

Benchmark results on redfin (64bit)

Before the change:

  utf8_to_utf16_length: bytes_per_second=307.556M/s
  utf8_to_utf16:        bytes_per_second=246.664M/s
  utf16_to_utf8_length: bytes_per_second=482.241M/s
  utf16_to_utf8:        bytes_per_second=351.376M/s

After the change:

  utf8_to_utf16_length: bytes_per_second=544.022M/s
  utf8_to_utf16:        bytes_per_second=471.135M/s
  utf16_to_utf8_length: bytes_per_second=685.381M/s
  utf16_to_utf8:        bytes_per_second=580.004M/s

Ideas for future improvement include alignment handling and loop
unrolling to increase throughput further.

This CL also fixes the issues below:

  1. utf16_to_utf8_length() should return 0 when the source string has
     a length of 0. The original code returned -1:

       ssize_t utf16_to_utf8_length(const char16_t *src, size_t src_len) {
           if (src == nullptr || src_len == 0) {
               return -1;
           }
           ...

  2. utf8_to_utf16() should check whether the input string is valid.

Change-Id: I546138a7a8050681a524eabce9864219fc44f48e
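
The "most common case first" idea described in the commit message can be sketched for the length calculation as follows. This is only an illustration under stated assumptions, not the code from this CL: the function name `utf16_to_utf8_length_sketch` is hypothetical, `std::ptrdiff_t` stands in for `ssize_t` to stay within standard C++, and counting an unpaired surrogate as 3 bytes is an assumption about the desired behavior. It also reflects the first fix listed above by returning 0 for an empty input.

```cpp
#include <cstddef>

// Hypothetical sketch, not the libutils implementation.
// Returns the number of UTF-8 bytes needed to encode src[0..src_len),
// or -1 if src is null. An empty string yields 0.
std::ptrdiff_t utf16_to_utf8_length_sketch(const char16_t* src, std::size_t src_len) {
    if (src == nullptr) {
        return -1;
    }
    std::ptrdiff_t len = 0;
    const char16_t* const end = src + src_len;
    while (src < end) {
        const char16_t c = *src++;
        if (c < 0x0080) {
            // Most common case: ASCII, 1 byte.
            len += 1;
        } else if (c < 0x0800) {
            // Next most common case: 2 bytes.
            len += 2;
        } else if (c >= 0xD800 && c <= 0xDBFF && src < end &&
                   *src >= 0xDC00 && *src <= 0xDFFF) {
            // Valid surrogate pair: one supplementary code point, 4 bytes.
            ++src;
            len += 4;
        } else {
            // Rest of the BMP, 3 bytes. Assumption: an unpaired surrogate
            // is also counted as 3 bytes here.
            len += 3;
        }
    }
    return len;
}
```

The branches are ordered by how often each case is expected, so ASCII-heavy input takes only the first comparison per code unit. Whether the actual CL orders or handles the surrogate cases the same way is not stated in the commit message.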

tree: 0651e39f42d1f1604f0df714c1cb5eb7ff1e220c

bootstat/
cli-test/
code_coverage/
debuggerd/
diagnose_usb/
fastboot/
fs_mgr/
gatekeeperd/
healthd/
include/
init/
janitors/
libappfuse/
libasyncio/
libcrypto_utils/
libcutils/
libgrallocusage/
libkeyutils/
libmodprobe/
libnetutils/
libpackagelistparser/
libprocessgroup/
libsparse/
libstats/
libsuspend/
libsync/
libsystem/
libsysutils/
libusbhost/
libutils/
libvndksupport/
llkd/
mini_keyctl/
mkbootfs/
property_service/
reboot/
rootdir/
run-as/
sdcard/
shell_and_utilities/
storaged/
toolbox/
trusty/
usbd/
watchdogd/
.clang-format ⇨ .clang-format-4
.clang-format-2 ⇨ ../../build/soong/scripts/system-clang-format-2
.clang-format-4 ⇨ ../../build/soong/scripts/system-clang-format
rustfmt.toml ⇨ ../../build/soong/scripts/rustfmt.toml
.gitignore
CleanSpec.mk
METADATA
MODULE_LICENSE_APACHE2
OWNERS
PREUPLOAD.cfg