Fix race conditions when using mHwc off the main thread
In a few cases we were accessing mHwc off the main thread without
acquiring mStateLock, resulting in crashes and other incorrect
behavior. This CL adds the missing locks. Since the locking semantics
are somewhat hard to understand, also add a clarifying comment to the
mHwc member declaration in SurfaceFlinger.h.
Bug: 64586546
Test: I manually tested normal surface flinger operation and vr behavior
on different devices, and confirmed everything looks fine. The crashes
we saw that were caused by these mHwc race conditions are hard to
reproduce, so I couldn't empirically verify this fixes the crash. I'm
relying on manual code inspection to confirm the issue is in fact fixed.
Regarding performance, I added profiling code (not part of this CL) to
check for lock contention and hold times with the newly added locks. I
confirmed that contention is low, so these calls shouldn't be
significantly slower as a result of adding the locks. The time spent
holding these new locks is also low, except for getDisplayColorModes(),
which makes a call to hardware composer service and can in some cases
take over a millisecond. That function is called so rarely though, only
once at boot, or twice at boot after a fresh flash, that it's not worth
optimizing.
Change-Id: I3854779c12a61983aaaecddb9f6316f218e519e3
diff --git a/services/surfaceflinger/SurfaceFlinger.h b/services/surfaceflinger/SurfaceFlinger.h
index 025da0e..13a057f 100644
--- a/services/surfaceflinger/SurfaceFlinger.h
+++ b/services/surfaceflinger/SurfaceFlinger.h
@@ -192,6 +192,8 @@
// force full composition on all displays
void repaintEverything();
+ // Can only be called from the main thread or with mStateLock held
+ void repaintEverythingLocked();
// returns the default Display
sp<const DisplayDevice> getDefaultDisplayDevice() const {
@@ -343,7 +345,9 @@
* Message handling
*/
void waitForEvent();
+ // Can only be called from the main thread or with mStateLock held
void signalTransaction();
+ // Can only be called from the main thread or with mStateLock held
void signalLayerUpdate();
void signalRefresh();
@@ -386,6 +390,7 @@
*/
uint32_t getTransactionFlags(uint32_t flags);
uint32_t peekTransactionFlags();
+ // Can only be called from the main thread or with mStateLock held
uint32_t setTransactionFlags(uint32_t flags);
void commitTransaction();
uint32_t setClientStateLocked(const sp<Client>& client, const layer_state_t& s);
@@ -642,8 +647,26 @@
// access must be protected by mInvalidateLock
volatile int32_t mRepaintEverything;
- // The current hardware composer interface. When switching into and out of
- // vr, our HWComposer instance will be recreated.
+ // The current hardware composer interface.
+ //
+ // The following thread safety rules apply when accessing mHwc, either
+ // directly or via getHwComposer():
+ //
+ // 1. When recreating mHwc, acquire mStateLock. We currently recreate mHwc
+ // only when switching into and out of vr. Recreating mHwc must only be
+ // done on the main thread.
+ //
+ // 2. When accessing mHwc on the main thread, it's not necessary to acquire
+ // mStateLock.
+ //
+ // 3. When accessing mHwc on a thread other than the main thread, we always
+ // need to acquire mStateLock. This is because the main thread could be
+ // in the process of destroying the current mHwc instance.
+ //
+ // The above thread safety rules only apply to SurfaceFlinger.cpp. In
+ // SurfaceFlinger_hwc1.cpp we create mHwc at surface flinger init and never
+ // destroy it, so it's always safe to access mHwc from any thread without
+ // acquiring mStateLock.
std::unique_ptr<HWComposer> mHwc;
// constant members (no synchronization needed for access)