SF: Fix for rare flakes in Event{Control}ThreadTests
This has been a minor issue for a while, and was causing a few failures
a week in the continuous testing infrastructure.
I managed to reproduce it locally with 10k iterations of one test (to
even have a good chance of catching one failure!) -- on a phone that was
stuck in a startup boot crash loop! (Other significant activity), and
from there diagnose it to an extremely rare chance that threads would
take 20-30ms to resume execution after being unblocked, significantly
more than the 10ms maximum time allowed.
This patch increases the maximum wait time to 100ms which should
eliminate the flakes (or at least reduce them to an substantially
smaller percentage of the time).
This led to the discovery that the EventThreadTest destructor was using
the call for expecting an event when actually it should have been making
a call using a lower default threshhold to detect unexpected calls,
which I fixed.
With the tests otherwise using the calls as intended, I verified that
the typical test time was not substantially slower than it was before (in
fact it was faster after the EventThreadTest fix!)
Bug: 112104412
Test: atest libsurfaceflinger_unittest # Not slower for a typical run
Test: EventControlThreadTest.signalsVSyncEnabledThenDisabled * 100k
Change-Id: Ie778567955d746922ef514baff1a552c34794b8d
diff --git a/services/surfaceflinger/tests/unittests/AsyncCallRecorder.h b/services/surfaceflinger/tests/unittests/AsyncCallRecorder.h
index 2245ee1..8bed766 100644
--- a/services/surfaceflinger/tests/unittests/AsyncCallRecorder.h
+++ b/services/surfaceflinger/tests/unittests/AsyncCallRecorder.h
@@ -75,14 +75,16 @@
template <typename... Args>
class AsyncCallRecorder<void (*)(Args...)> {
public:
- // For the tests, we expect the wait for an expected change to be signaled
- // to be much shorter than this.
- static constexpr std::chrono::milliseconds DEFAULT_CALL_EXPECTED_TIMEOUT{10};
+ // This wait value needs to be large enough to avoid flakes caused by delays
+ // scheduling threads, but small enough that tests don't take forever if
+ // something really is wrong. Based on some empirical evidence, 100ms should
+ // be enough to avoid the former.
+ static constexpr std::chrono::milliseconds DEFAULT_CALL_EXPECTED_TIMEOUT{100};
- // The wait here is tricky. We don't expect a change, but we don't want to
- // wait forever (or for longer than the typical test function runtime). As
- // even the simplest Google Test can take 1ms (1000us) to run, we wait for
- // half that time.
+ // The wait here is tricky. It's for when We don't expect to record a call,
+ // but we don't want to wait forever (or for longer than the typical test
+ // function runtime). As even the simplest Google Test can take 1ms (1000us)
+ // to run, we wait for half that time.
static constexpr std::chrono::microseconds UNEXPECTED_CALL_TIMEOUT{500};
using ArgTuple = std::tuple<std::remove_cv_t<std::remove_reference_t<Args>>...>;
diff --git a/services/surfaceflinger/tests/unittests/EventThreadTest.cpp b/services/surfaceflinger/tests/unittests/EventThreadTest.cpp
index dd90063..459a594 100644
--- a/services/surfaceflinger/tests/unittests/EventThreadTest.cpp
+++ b/services/surfaceflinger/tests/unittests/EventThreadTest.cpp
@@ -116,7 +116,7 @@
ALOGD("**** Tearing down after %s.%s\n", test_info->test_case_name(), test_info->name());
// EventThread should unregister itself as VSyncSource callback.
- EXPECT_FALSE(expectVSyncSetCallbackCallReceived());
+ EXPECT_TRUE(!mVSyncSetCallbackCallRecorder.waitForUnexpectedCall().has_value());
}
void EventThreadTest::createThread() {