## fdsan

[TOC]

fdsan is a file descriptor sanitizer added to Android in API level 29.
In API level 29, fdsan warns when it finds a bug.
In API level 30, fdsan aborts when it finds a bug.

### Background
*What problem is fdsan trying to solve? Why should I care?*

fdsan (file descriptor sanitizer) detects mishandling of file descriptor ownership, which tends to manifest as *use-after-close* and *double-close*. These errors are direct analogues of the memory allocation *use-after-free* and *double-free* bugs, but tend to be much more difficult to diagnose and fix. With `malloc` and `free`, implementations have free rein to detect errors and abort on double free. File descriptors, on the other hand, are mandated by the POSIX standard to be allocated with the lowest available number being returned for new allocations, so a number that was just closed is immediately handed out again. As a result, many file descriptor bugs can *never* be noticed on the thread on which the error occurred, and will manifest as "impossible" behavior on another thread.

For example, given two threads running the following code:
```cpp
void thread_one() {
  int fd = open("/dev/null", O_RDONLY);
  close(fd);
  close(fd);
}

void thread_two() {
  while (true) {
    int fd = open("log", O_WRONLY | O_APPEND);
    if (write(fd, "foo", 3) != 3) {
      err(1, "write failed!");
    }
  }
}
```
the following interleaving is possible:
```cpp
thread one                                thread two
open("/dev/null", O_RDONLY) = 123
close(123) = 0
                                          open("log", O_WRONLY | O_APPEND) = 123
close(123) = 0
                                          write(123, "foo", 3) = -1 (EBADF)
                                          err(1, "write failed!")
```

Assertion failures are probably the most innocuous result that can arise from these bugs: silent data corruption [[1](#footnotes), [2](#footnotes)] or security vulnerabilities are also possible (e.g. suppose thread two was saving user data to disk when a third thread came in and opened a socket to the Internet).

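The number reuse that makes these bugs so hard to see can be reproduced in a single thread. The following is a minimal sketch, not part of fdsan; the function name is hypothetical, and it assumes no other thread opens or closes file descriptors while it runs:

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: returns true if the descriptor number released by
// close() is handed straight back by the next open().
// Assumes no other thread is opening or closing descriptors concurrently.
bool closed_fd_number_is_reused() {
  int first = open("/dev/null", O_RDONLY);
  assert(first != -1);
  close(first);  // From here on, `first` is a dangling number.

  // POSIX requires open() to return the lowest number that is not currently
  // open, which is the number that was just closed.
  int second = open("/dev/null", O_RDONLY);
  assert(second != -1);

  bool reused = (second == first);
  close(second);
  return reused;
}
```

Any stale copy of `first` held elsewhere in the process now silently refers to whatever `second` points at, which is exactly the situation in the interleaving above.
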
### Design
*What does fdsan do?*

fdsan attempts to detect and/or prevent file descriptor mismanagement by enforcing file descriptor ownership. Just as most memory allocations can have their ownership handled by types such as `std::unique_ptr`, almost all file descriptors can be associated with a unique owner which is responsible for their closure. fdsan provides functions to associate a file descriptor with an owner; if someone tries to close a file descriptor that they don't own, depending on configuration, either a warning is emitted, or the process aborts.

The way this is implemented is by providing functions to set a 64-bit closure tag on a file descriptor. The tag consists of an 8-bit type byte that identifies the type of the owner (`enum android_fdsan_owner_type` in [`<android/fdsan.h>`](https://android.googlesource.com/platform/bionic/+/main/libc/include/android/fdsan.h)), and a 56-bit value. The value should ideally be something that uniquely identifies the object (object address for native objects and `System.identityHashCode` for Java objects), but in cases where it's hard to derive an identifier for the "owner" that should close a file descriptor, even using the same value for all file descriptors in the module can be useful, since it'll catch other code that closes your file descriptors.

If a file descriptor that's been marked with a tag is closed with an incorrect tag, or without a tag, we know something has gone wrong, and can generate diagnostics or abort.

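To make the tag layout concrete, here is a minimal sketch of packing a type byte and a 56-bit value into one 64-bit tag. All names are hypothetical, and placing the type byte in the most significant 8 bits is an assumption of this sketch; real code should build tags with `android_fdsan_create_owner_tag` from `<android/fdsan.h>` rather than packing bits by hand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical constants and helpers, for illustration only.
constexpr int kValueBits = 56;
constexpr uint64_t kValueMask = (uint64_t{1} << kValueBits) - 1;

// Packs an 8-bit owner type and a 56-bit value into a single 64-bit tag.
// Assumption: the type byte occupies the most significant 8 bits.
constexpr uint64_t make_tag(uint8_t type, uint64_t value) {
  return (uint64_t{type} << kValueBits) | (value & kValueMask);
}

constexpr uint8_t tag_type(uint64_t tag) {
  return static_cast<uint8_t>(tag >> kValueBits);
}

constexpr uint64_t tag_value(uint64_t tag) {
  return tag & kValueMask;
}

// For native objects, the owner's address is a natural choice of value.
// Bits above the low 56 are discarded by make_tag().
inline uint64_t tag_for_object(uint8_t type, const void* owner) {
  return make_tag(type, reinterpret_cast<uintptr_t>(owner));
}
```

A module that cannot identify individual owners could call `make_tag` with one fixed value for all of its descriptors; that still distinguishes "closed by this module" from "closed by someone else".
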
### Enabling fdsan (as a user)
*How do I use fdsan?*

fdsan has four severity levels:
 - disabled (`ANDROID_FDSAN_ERROR_LEVEL_DISABLED`)
 - warn-once (`ANDROID_FDSAN_ERROR_LEVEL_WARN_ONCE`)
   - Upon detecting an error, emit a warning to logcat, generate a tombstone, and then continue execution with fdsan disabled.
 - warn-always (`ANDROID_FDSAN_ERROR_LEVEL_WARN_ALWAYS`)
   - Same as warn-once, except without disabling after the first warning.
 - fatal (`ANDROID_FDSAN_ERROR_LEVEL_FATAL`)
   - Abort upon detecting an error.

In Android Q, fdsan has a global default of warn-once. fdsan can be made more or less strict at runtime via the `android_fdsan_set_error_level` function in [`<android/fdsan.h>`](https://android.googlesource.com/platform/bionic/+/main/libc/include/android/fdsan.h).

The likelihood of fdsan catching a file descriptor error is proportional to the percentage of file descriptors in your process that are tagged with an owner.

### Using fdsan to fix a bug
*No, really, how do I use fdsan?*

Let's look at a simple contrived example that uses sleeps to force a particular interleaving of thread execution.

```cpp
#include <err.h>
#include <unistd.h>

#include <chrono>
#include <thread>
#include <vector>

#include <android-base/unique_fd.h>

using namespace std::chrono_literals;
using std::this_thread::sleep_for;

void victim() {
  sleep_for(300ms);
  int fd = dup(STDOUT_FILENO);
  sleep_for(200ms);
  ssize_t rc = write(fd, "good\n", 5);
  if (rc == -1) {
    err(1, "good failed to write?!");
  }
  close(fd);
}

void bystander() {
  sleep_for(100ms);
  int fd = dup(STDOUT_FILENO);
  sleep_for(300ms);
  close(fd);
}

void offender() {
  int fd = dup(STDOUT_FILENO);
  close(fd);
  sleep_for(200ms);
  close(fd);
}

int main() {
  std::vector<std::thread> threads;
  for (auto function : { victim, bystander, offender }) {
    threads.emplace_back(function);
  }
  for (auto& thread : threads) {
    thread.join();
  }
}
```

When running the program, the threads' executions will be interleaved as follows:

```cpp
// victim                         bystander                       offender
                                                                  int fd = dup(1); // 3
                                                                  close(3);
                                  int fd = dup(1); // 3
                                                                  close(3);
int fd = dup(1); // 3
                                  close(3);
write(3, "good\n") = 😞;
```

which results in the following output:

    fdsan_test: good failed to write?!: Bad file descriptor

This implies that either we're accidentally closing our file descriptor too early, or someone else is helpfully closing it for us. Let's use `android::base::unique_fd` in `victim` to guard the file descriptor with fdsan:

```diff
--- a/fdsan_test.cpp
+++ b/fdsan_test.cpp
@@ -12,13 +12,12 @@ using std::this_thread::sleep_for;

 void victim() {
   sleep_for(300ms);
-  int fd = dup(STDOUT_FILENO);
+  android::base::unique_fd fd(dup(STDOUT_FILENO));
   sleep_for(200ms);
   ssize_t rc = write(fd, "good\n", 5);
   if (rc == -1) {
     err(1, "good failed to write?!");
   }
-  close(fd);
 }

 void bystander() {
```

Now that we've guarded the file descriptor with fdsan, we should be able to find where the double close is:

```
pid: 25587, tid: 25589, name: fdsan_test >>> fdsan_test <<<
signal 35 (<debuggerd signal>), code -1 (SI_QUEUE), fault addr --------
Abort message: 'attempted to close file descriptor 3, expected to be unowned, actually owned by unique_fd 0x7bf15dc448'
    x0 0000000000000000 x1 00000000000063f5 x2 0000000000000023 x3 0000007bf14de338
    x4 0000007bf14de3b8 x5 3463643531666237 x6 3463643531666237 x7 3834346364353166
    x8 00000000000000f0 x9 0000000000000000 x10 0000000000000059 x11 0000000000000035
    x12 0000007bf1bebcfa x13 0000007bf14ddf0a x14 0000007bf14ddf0a x15 0000000000000000
    x16 0000007bf1c33048 x17 0000007bf1ba9990 x18 0000000000000000 x19 00000000000063f3
    x20 00000000000063f5 x21 0000007bf14de588 x22 0000007bf1f1b864 x23 0000000000000001
    x24 0000007bf14de130 x25 0000007bf13e1000 x26 0000007bf1f1f580 x27 0000005ab43ab8f0
    x28 0000000000000000 x29 0000007bf14de400
    sp 0000007bf14ddff0 lr 0000007bf1b5fd6c pc 0000007bf1b5fd90

backtrace:
      #00 pc 0000000000008d90 /system/lib64/libc.so (fdsan_error(char const*, ...)+384)
      #01 pc 0000000000008ba8 /system/lib64/libc.so (android_fdsan_close_with_tag+632)
      #02 pc 00000000000092a0 /system/lib64/libc.so (close+16)
      #03 pc 00000000000003e4 /system/bin/fdsan_test (bystander()+84)
      #04 pc 0000000000000918 /system/bin/fdsan_test
      #05 pc 000000000006689c /system/lib64/libc.so (__pthread_start(void*)+36)
      #06 pc 000000000000712c /system/lib64/libc.so (__start_thread+68)
```

...in the obviously correct bystander? What's going on here?

This is (hopefully!) not a bug in fdsan; it is a pattern you will commonly see when tracking down double-closes in processes that have sparse fdsan coverage. What actually happened is that the culprit closed `bystander`'s file descriptor between its open and close, which resulted in `bystander` being blamed for closing `victim`'s fd. If we store `bystander`'s fd in a `unique_fd` as well, we should get something more useful:
```diff
--- a/tmp/fdsan_test.cpp
+++ b/tmp/fdsan_test.cpp
@@ -23,9 +23,8 @@ void victim() {

 void bystander() {
   sleep_for(100ms);
-  int fd = dup(STDOUT_FILENO);
+  android::base::unique_fd fd(dup(STDOUT_FILENO));
   sleep_for(300ms);
-  close(fd);
 }
```
giving us:
```
pid: 25779, tid: 25782, name: fdsan_test >>> fdsan_test <<<
signal 35 (<debuggerd signal>), code -1 (SI_QUEUE), fault addr --------
Abort message: 'attempted to close file descriptor 3, expected to be unowned, actually owned by unique_fd 0x6fef9ff448'
    x0 0000000000000000 x1 00000000000064b6 x2 0000000000000023 x3 0000006fef901338
    x4 0000006fef9013b8 x5 3466663966656636 x6 3466663966656636 x7 3834346666396665
    x8 00000000000000f0 x9 0000000000000000 x10 0000000000000059 x11 0000000000000039
    x12 0000006ff0055cfa x13 0000006fef900f0a x14 0000006fef900f0a x15 0000000000000000
    x16 0000006ff009d048 x17 0000006ff0013990 x18 0000000000000000 x19 00000000000064b3
    x20 00000000000064b6 x21 0000006fef901588 x22 0000006ff04ff864 x23 0000000000000001
    x24 0000006fef901130 x25 0000006fef804000 x26 0000006ff0503580 x27 0000006368aa18f8
    x28 0000000000000000 x29 0000006fef901400
    sp 0000006fef900ff0 lr 0000006feffc9d6c pc 0000006feffc9d90

backtrace:
      #00 pc 0000000000008d90 /system/lib64/libc.so (fdsan_error(char const*, ...)+384)
      #01 pc 0000000000008ba8 /system/lib64/libc.so (android_fdsan_close_with_tag+632)
      #02 pc 00000000000092a0 /system/lib64/libc.so (close+16)
      #03 pc 000000000000045c /system/bin/fdsan_test (offender()+68)
      #04 pc 0000000000000920 /system/bin/fdsan_test
      #05 pc 000000000006689c /system/lib64/libc.so (__pthread_start(void*)+36)
      #06 pc 000000000000712c /system/lib64/libc.so (__start_thread+68)
```

Hooray!

In a real application, things are probably not going to be as detectable or reproducible as in our toy example. That is a good reason to use fdsan-enabled types like `unique_fd` and `ParcelFileDescriptor` as widely as possible: the more file descriptors have an owner, the better the odds that double closes in other code get detected.

### Enabling fdsan (as a C++ library implementer)

fdsan operates via two main primitives. `android_fdsan_exchange_owner_tag` modifies a file descriptor's close tag, and `android_fdsan_close_with_tag` closes a file descriptor with its tag. In the `<android/fdsan.h>` header, these are marked with `__attribute__((weak))`, so instead of passing down the platform version from JNI, availability of the functions can be queried directly. An example implementation of unique_fd follows:

```cpp
/*
 * Copyright (C) 2018 The Android Open Source Project
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *  * Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 *  * Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in
 *    the documentation and/or other materials provided with the
 *    distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
 * FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
 * COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
 * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
 * BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS
 * OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED
 * AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
 * OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
 * OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

#pragma once

#include <android/fdsan.h>
#include <unistd.h>

#include <utility>

struct unique_fd {
  unique_fd() = default;

  explicit unique_fd(int fd) {
    reset(fd);
  }

  unique_fd(const unique_fd& copy) = delete;
  unique_fd(unique_fd&& move) {
    *this = std::move(move);
  }

  ~unique_fd() {
    reset();
  }

  unique_fd& operator=(const unique_fd& copy) = delete;
  unique_fd& operator=(unique_fd&& move) {
    if (this == &move) {
      return *this;
    }

    reset();

    if (move.fd_ != -1) {
      fd_ = move.fd_;
      move.fd_ = -1;

      // Acquire ownership from the moved-from object.
      exchange_tag(fd_, move.tag(), tag());
    }

    return *this;
  }

  int get() { return fd_; }

  int release() {
    if (fd_ == -1) {
      return -1;
    }

    int fd = fd_;
    fd_ = -1;

    // Release ownership.
    exchange_tag(fd, tag(), 0);
    return fd;
  }

  void reset(int new_fd = -1) {
    if (fd_ != -1) {
      close(fd_, tag());
      fd_ = -1;
    }

    if (new_fd != -1) {
      fd_ = new_fd;

      // Acquire ownership of the presumably unowned fd.
      exchange_tag(fd_, 0, tag());
    }
  }

 private:
  int fd_ = -1;

  // The obvious choice of tag to use is the address of the object.
  uint64_t tag() {
    return reinterpret_cast<uint64_t>(this);
  }

  // These functions are marked with __attribute__((weak)), so that their
  // availability can be determined at runtime. These wrappers will use them
  // if available, and fall back to no-ops or regular close on pre-Q devices.
  static void exchange_tag(int fd, uint64_t old_tag, uint64_t new_tag) {
    if (android_fdsan_exchange_owner_tag) {
      android_fdsan_exchange_owner_tag(fd, old_tag, new_tag);
    }
  }

  static int close(int fd, uint64_t tag) {
    if (android_fdsan_close_with_tag) {
      return android_fdsan_close_with_tag(fd, tag);
    } else {
      return ::close(fd);
    }
  }
};
```

### Frequently seen bugs
 * Native APIs not making it clear when they take ownership of a file descriptor. <br/>
   * Solution: accept `unique_fd` instead of `int` in functions that take ownership.
   * [Example one](https://android-review.googlesource.com/c/platform/system/core/+/721985), [two](https://android-review.googlesource.com/c/platform/frameworks/native/+/709451)
 * Receiving a `ParcelFileDescriptor` via Intent, and then passing it into JNI code that ends up calling close on it. <br/>
   * Solution: ¯\\\_(ツ)\_/¯. Use fdsan?
   * [Example one](https://android-review.googlesource.com/c/platform/system/bt/+/710104), [two](https://android-review.googlesource.com/c/platform/frameworks/base/+/732305)

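The first solution above can be sketched as follows. `OwnedFd` is a hypothetical, fdsan-free stand-in for `unique_fd` so that the example builds outside Android, and `consume` and `is_open` are illustrative names; the point is only how the signature communicates ownership.

```cpp
#include <cassert>
#include <fcntl.h>
#include <unistd.h>

#include <utility>

// Hypothetical move-only fd owner (no fdsan tagging), standing in for unique_fd.
class OwnedFd {
 public:
  OwnedFd() = default;
  explicit OwnedFd(int fd) : fd_(fd) {}
  OwnedFd(OwnedFd&& other) noexcept : fd_(other.release()) {}
  OwnedFd& operator=(OwnedFd&& other) noexcept {
    if (this != &other) reset(other.release());
    return *this;
  }
  OwnedFd(const OwnedFd&) = delete;
  OwnedFd& operator=(const OwnedFd&) = delete;
  ~OwnedFd() { reset(); }

  int get() const { return fd_; }

  // Gives up ownership without closing.
  int release() {
    int fd = fd_;
    fd_ = -1;
    return fd;
  }

  // Closes the currently owned descriptor, if any, then adopts `fd`.
  void reset(int fd = -1) {
    if (fd_ != -1) ::close(fd_);
    fd_ = fd;
  }

 private:
  int fd_ = -1;
};

// Takes ownership: callers must write consume(std::move(fd)), and the
// descriptor is closed when the parameter is destroyed.
void consume(OwnedFd fd) {
  assert(fd.get() != -1);
  // ... use fd.get() here ...
}

// Borrows: a plain int signals that the caller keeps ownership.
bool is_open(int fd) {
  return fcntl(fd, F_GETFD) != -1;
}
```

With this shape, passing a descriptor to `consume` without `std::move` fails to compile, so the transfer of ownership is visible at every call site.
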
### Footnotes
1. [How To Corrupt An SQLite Database File](https://www.sqlite.org/howtocorrupt.html#_continuing_to_use_a_file_descriptor_after_it_has_been_closed)

2. [<b><i>50%</i></b> of Facebook's iOS crashes caused by a file descriptor double close leading to SQLite database corruption](https://code.fb.com/ios/debugging-file-corruption-on-ios/)