commit | 402c762fc93239d86206a3bded8c17f19dabcd4c | |
---|---|---|
author | Elliott Hughes <enh@google.com> | Fri Jul 06 17:18:05 2018 -0700 |
committer | Elliott Hughes <enh@google.com> | Fri Jul 06 17:18:05 2018 -0700 |
tree | 638213ab024e8feffc9ef9e171345e33f63fa0a3 | |
parent | 50acae8f2ae017c49b1d616e93ce9f97f6b3d118 | |

Fix some long-standing UTF-8 bugs.

We were incorrectly rejecting U+fffe and U+ffff, and incorrectly accepting
characters above U+10ffff (see https://tools.ietf.org/html/rfc3629 section 12
for that restriction).

Bug: http://lists.landley.net/pipermail/toybox-landley.net/2017-September/009146.html
Test: ran tests
Test: also ran the exhaustive test from that email thread
Change-Id: I8ae8e41cef01b02933bd4f653ee07791932b79a5

diff --git a/libc/bionic/mbrtoc32.cpp b/libc/bionic/mbrtoc32.cpp
index f004b78..88a077c 100644
--- a/libc/bionic/mbrtoc32.cpp
+++ b/libc/bionic/mbrtoc32.cpp
@@ -127,7 +127,7 @@
     // Malformed input; redundant encoding.
     return mbstate_reset_and_return_illegal(EILSEQ, state);
   }
-  if ((c32 >= 0xd800 && c32 <= 0xdfff) || c32 == 0xfffe || c32 == 0xffff) {
+  if ((c32 >= 0xd800 && c32 <= 0xdfff) || (c32 > 0x10ffff)) {
     // Malformed input; invalid code points.
     return mbstate_reset_and_return_illegal(EILSEQ, state);
   }
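
For illustration only, the condition in the new check can be restated as a standalone predicate. This is a minimal sketch and not code from bionic: the name `is_valid_code_point` is hypothetical, and it covers only the code point range test from the hunk above, not the surrounding decoding or the redundant-encoding check.

```cpp
// Hypothetical helper (not part of bionic): returns true when a decoded
// value passes the code point check introduced by this change.
constexpr bool is_valid_code_point(char32_t c32) {
  // Reject surrogates (U+D800..U+DFFF) and anything above U+10FFFF,
  // the upper limit RFC 3629 places on UTF-8.
  return !((c32 >= 0xd800 && c32 <= 0xdfff) || (c32 > 0x10ffff));
}

// U+FFFE and U+FFFF are noncharacters but still encodable, so they pass.
static_assert(is_valid_code_point(0xfffe), "U+FFFE should be accepted");
static_assert(is_valid_code_point(0xffff), "U+FFFF should be accepted");
// Surrogates and out-of-range values are rejected.
static_assert(!is_valid_code_point(0xd800), "surrogates should be rejected");
static_assert(!is_valid_code_point(0x110000), "values above U+10FFFF should be rejected");
```

The `static_assert` lines spell out the behavioral difference described in the commit message: U+fffe and U+ffff are now accepted, while values above U+10ffff are rejected.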