Rework how SF generates gainmapped screenshots

Previously, SF generated an HDR screenshot by:

* Asking CompositionEngine to composite layers offscreen into SDR
* Asking CompositionEngine _again_ to composite layers offscreen into
  HDR
* Asking RenderEngine to build a gainmap
* Sending the SDR buffer and the gainmap to the client

But, this had some problems:
* This path bit-rot, so the HDR layer list was totally empty :)
* LayerFE now has an assumption that only one *thing* will be consuming
  the layer and emitting a release fence back. So, this fixing the above
  caused ghost images, and undoing that assumption just for screenshots
  is a bit unappetizing.
* If the above assumption *were* undone, then the release fence for the
  layer would be delayed due to rendering to an additional render
  target, which sucks.
* We talk to RenderEngine three times, which involves thread hops, which
  sucks.

So, we solve these problems by:

* Only asking CompositionEngine to composite layers into HDR
* Then, ask RenderEngine _once_ to tonemap the HDR buffer into SDR, and
  generate a gainmap.
* Send the SDR buffer and the gainmap to the client.

And now most things are solved: LayerFE is only used to render to one
render target, so it only gets one release fence back, as expected.
Moreover, compositing into HDR is expected to be very cheap, as the
local tonemapper only has to run when tonemapping from a very large HDR
range (like a PQ video) -- when the application is already adapting to
display conditions then we don't have to do anything. And, we only have
to talk to RenderEngine twice now.

We could optimize this further by moving the gainmap generation into
CompositionEngine, but that seems like adding complexity for the moment.

Note that in theory, screenshot quality for the SDR rendition could be a
little worse, since MouriMap introduces shadowing artifacts that may now
bleed across layers. But it doesn't seem to be noticeable in practice,
and it was a pre-existing issue for application-tonemapped content.

Also, we remove CaptureArgs::attachGainmap, since we don't need it at
all to safely trigger this path.

Bug: 329465218
Flag: com.android.graphics.surfaceflinger.flags.true_hdr_screenshots
Test: Trigger screencapture
Change-Id: I5a274886acb135b21096300a034aadca20eceeb5
diff --git a/libs/gui/aidl/android/gui/CaptureArgs.aidl b/libs/gui/aidl/android/gui/CaptureArgs.aidl
index 4920344..2bbed2b 100644
--- a/libs/gui/aidl/android/gui/CaptureArgs.aidl
+++ b/libs/gui/aidl/android/gui/CaptureArgs.aidl
@@ -69,10 +69,5 @@
     // exact colorspace is not an appropriate intermediate result.
     // Note that if the caller is requesting a specific dataspace, this hint does nothing.
     boolean hintForSeamlessTransition = false;
-
-    // Allows the screenshot to attach a gainmap, which allows for a per-pixel
-    // transformation of the screenshot to another luminance range, typically
-    // mapping an SDR base image into HDR.
-    boolean attachGainmap = false;
 }
 
diff --git a/libs/renderengine/RenderEngine.cpp b/libs/renderengine/RenderEngine.cpp
index 907590a..873fc67 100644
--- a/libs/renderengine/RenderEngine.cpp
+++ b/libs/renderengine/RenderEngine.cpp
@@ -107,16 +107,15 @@
     return resultFuture;
 }
 
-ftl::Future<FenceResult> RenderEngine::drawGainmap(
-        const std::shared_ptr<ExternalTexture>& sdr, base::borrowed_fd&& sdrFence,
+ftl::Future<FenceResult> RenderEngine::tonemapAndDrawGainmap(
         const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
-        float hdrSdrRatio, ui::Dataspace dataspace,
+        float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
         const std::shared_ptr<ExternalTexture>& gainmap) {
     const auto resultPromise = std::make_shared<std::promise<FenceResult>>();
     std::future<FenceResult> resultFuture = resultPromise->get_future();
     updateProtectedContext({}, {sdr.get(), hdr.get(), gainmap.get()});
-    drawGainmapInternal(std::move(resultPromise), sdr, std::move(sdrFence), hdr,
-                        std::move(hdrFence), hdrSdrRatio, dataspace, gainmap);
+    tonemapAndDrawGainmapInternal(std::move(resultPromise), hdr, std::move(hdrFence), hdrSdrRatio,
+                                  dataspace, sdr, gainmap);
     return resultFuture;
 }
 
diff --git a/libs/renderengine/include/renderengine/RenderEngine.h b/libs/renderengine/include/renderengine/RenderEngine.h
index 95c4d03..c2dd4ae 100644
--- a/libs/renderengine/include/renderengine/RenderEngine.h
+++ b/libs/renderengine/include/renderengine/RenderEngine.h
@@ -217,12 +217,17 @@
                                                 const std::shared_ptr<ExternalTexture>& buffer,
                                                 base::unique_fd&& bufferFence);
 
-    virtual ftl::Future<FenceResult> drawGainmap(const std::shared_ptr<ExternalTexture>& sdr,
-                                                 base::borrowed_fd&& sdrFence,
-                                                 const std::shared_ptr<ExternalTexture>& hdr,
-                                                 base::borrowed_fd&& hdrFence, float hdrSdrRatio,
-                                                 ui::Dataspace dataspace,
-                                                 const std::shared_ptr<ExternalTexture>& gainmap);
+    // Tonemaps an HDR input image and draws an SDR rendition, plus a gainmap
+    // describing how to recover the HDR image.
+    //
+    // The HDR input image is ALWAYS encoded with an sRGB transfer function and
+    // is a floating point format. Accordingly, the hdrSdrRatio describes the
+    // max luminance in the HDR input image above SDR, and the dataspace
+    // describes the input primaries.
+    virtual ftl::Future<FenceResult> tonemapAndDrawGainmap(
+            const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
+            float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
+            const std::shared_ptr<ExternalTexture>& gainmap);
 
     // Clean-up method that should be called on the main thread after the
     // drawFence returned by drawLayers fires. This method will free up
@@ -310,11 +315,10 @@
             const DisplaySettings& display, const std::vector<LayerSettings>& layers,
             const std::shared_ptr<ExternalTexture>& buffer, base::unique_fd&& bufferFence) = 0;
 
-    virtual void drawGainmapInternal(
+    virtual void tonemapAndDrawGainmapInternal(
             const std::shared_ptr<std::promise<FenceResult>>&& resultPromise,
-            const std::shared_ptr<ExternalTexture>& sdr, base::borrowed_fd&& sdrFence,
             const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
-            float hdrSdrRatio, ui::Dataspace dataspace,
+            float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
             const std::shared_ptr<ExternalTexture>& gainmap) = 0;
 };
 
diff --git a/libs/renderengine/include/renderengine/mock/RenderEngine.h b/libs/renderengine/include/renderengine/mock/RenderEngine.h
index fb8331d..c42e403 100644
--- a/libs/renderengine/include/renderengine/mock/RenderEngine.h
+++ b/libs/renderengine/include/renderengine/mock/RenderEngine.h
@@ -46,17 +46,16 @@
                  ftl::Future<FenceResult>(const DisplaySettings&, const std::vector<LayerSettings>&,
                                           const std::shared_ptr<ExternalTexture>&,
                                           base::unique_fd&&));
-    MOCK_METHOD7(drawGainmap,
+    MOCK_METHOD6(tonemapAndDrawGainmap,
                  ftl::Future<FenceResult>(const std::shared_ptr<ExternalTexture>&,
-                                          base::borrowed_fd&&,
-                                          const std::shared_ptr<ExternalTexture>&,
                                           base::borrowed_fd&&, float, ui::Dataspace,
+                                          const std::shared_ptr<ExternalTexture>&,
                                           const std::shared_ptr<ExternalTexture>&));
-    MOCK_METHOD8(drawGainmapInternal,
+    MOCK_METHOD7(tonemapAndDrawGainmapInternal,
                  void(const std::shared_ptr<std::promise<FenceResult>>&&,
-                      const std::shared_ptr<ExternalTexture>&, base::borrowed_fd&&,
                       const std::shared_ptr<ExternalTexture>&, base::borrowed_fd&&, float,
-                      ui::Dataspace, const std::shared_ptr<ExternalTexture>&));
+                      ui::Dataspace, const std::shared_ptr<ExternalTexture>&,
+                      const std::shared_ptr<ExternalTexture>&));
     MOCK_METHOD5(drawLayersInternal,
                  void(const std::shared_ptr<std::promise<FenceResult>>&&, const DisplaySettings&,
                       const std::vector<LayerSettings>&, const std::shared_ptr<ExternalTexture>&,
diff --git a/libs/renderengine/skia/SkiaRenderEngine.cpp b/libs/renderengine/skia/SkiaRenderEngine.cpp
index b986967..7e8ccef 100644
--- a/libs/renderengine/skia/SkiaRenderEngine.cpp
+++ b/libs/renderengine/skia/SkiaRenderEngine.cpp
@@ -567,9 +567,7 @@
         if (usingLocalTonemap) {
             const float inputRatio =
                     hdrType == HdrRenderType::GENERIC_HDR ? 1.0f : parameters.layerDimmingRatio;
-            static MouriMap kMapper;
-            shader = kMapper.mouriMap(getActiveContext(), shader, inputRatio,
-                                      parameters.display.targetHdrSdrRatio);
+            shader = localTonemap(shader, inputRatio, parameters.display.targetHdrSdrRatio);
         }
 
         // disable tonemapping if we already locally tonemapped
@@ -610,6 +608,12 @@
     return shader;
 }
 
+sk_sp<SkShader> SkiaRenderEngine::localTonemap(sk_sp<SkShader> shader, float inputMultiplier,
+                                               float targetHdrSdrRatio) {
+    static MouriMap kMapper;
+    return kMapper.mouriMap(getActiveContext(), shader, inputMultiplier, targetHdrSdrRatio);
+}
+
 void SkiaRenderEngine::initCanvas(SkCanvas* canvas, const DisplaySettings& display) {
     if (CC_UNLIKELY(mCapture->isCaptureRunning())) {
         // Record display settings when capture is running.
@@ -1212,44 +1216,58 @@
     resultPromise->set_value(std::move(drawFence));
 }
 
-void SkiaRenderEngine::drawGainmapInternal(
+void SkiaRenderEngine::tonemapAndDrawGainmapInternal(
         const std::shared_ptr<std::promise<FenceResult>>&& resultPromise,
-        const std::shared_ptr<ExternalTexture>& sdr, base::borrowed_fd&& sdrFence,
         const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
-        float hdrSdrRatio, ui::Dataspace dataspace,
+        float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
         const std::shared_ptr<ExternalTexture>& gainmap) {
     std::lock_guard<std::mutex> lock(mRenderingMutex);
     auto context = getActiveContext();
-    auto surfaceTextureRef = getOrCreateBackendTexture(gainmap->getBuffer(), true);
-    sk_sp<SkSurface> dstSurface =
-            surfaceTextureRef->getOrCreateSurface(ui::Dataspace::V0_SRGB_LINEAR);
+    auto gainmapTextureRef = getOrCreateBackendTexture(gainmap->getBuffer(), true);
+    sk_sp<SkSurface> gainmapSurface =
+            gainmapTextureRef->getOrCreateSurface(ui::Dataspace::V0_SRGB_LINEAR);
 
-    waitFence(context, sdrFence);
-    const auto sdrTextureRef = getOrCreateBackendTexture(sdr->getBuffer(), false);
-    const auto sdrImage = sdrTextureRef->makeImage(dataspace, kPremul_SkAlphaType);
-    const auto sdrShader =
-            sdrImage->makeShader(SkTileMode::kClamp, SkTileMode::kClamp,
-                                 SkSamplingOptions({SkFilterMode::kLinear, SkMipmapMode::kNone}),
-                                 nullptr);
+    auto sdrTextureRef = getOrCreateBackendTexture(sdr->getBuffer(), true);
+    sk_sp<SkSurface> sdrSurface = sdrTextureRef->getOrCreateSurface(dataspace);
+
     waitFence(context, hdrFence);
     const auto hdrTextureRef = getOrCreateBackendTexture(hdr->getBuffer(), false);
     const auto hdrImage = hdrTextureRef->makeImage(dataspace, kPremul_SkAlphaType);
     const auto hdrShader =
             hdrImage->makeShader(SkTileMode::kClamp, SkTileMode::kClamp,
-                                 SkSamplingOptions({SkFilterMode::kLinear, SkMipmapMode::kNone}),
+                                 SkSamplingOptions({SkFilterMode::kNearest, SkMipmapMode::kNone}),
                                  nullptr);
 
+    const auto tonemappedShader = localTonemap(hdrShader, 1.0f, 1.0f);
+
     static GainmapFactory kGainmapFactory;
-    const auto gainmapShader = kGainmapFactory.createSkShader(sdrShader, hdrShader, hdrSdrRatio);
+    const auto gainmapShader =
+            kGainmapFactory.createSkShader(tonemappedShader, hdrShader, hdrSdrRatio);
 
-    const auto canvas = dstSurface->getCanvas();
-    SkPaint paint;
-    paint.setShader(gainmapShader);
-    paint.setBlendMode(SkBlendMode::kSrc);
-    canvas->drawPaint(paint);
+    sp<Fence> drawFence;
 
-    auto drawFence = sp<Fence>::make(flushAndSubmit(context, dstSurface));
-    trace(drawFence);
+    {
+        const auto canvas = sdrSurface->getCanvas();
+        SkPaint paint;
+        paint.setShader(tonemappedShader);
+        paint.setBlendMode(SkBlendMode::kSrc);
+        canvas->drawPaint(paint);
+
+        drawFence = sp<Fence>::make(flushAndSubmit(context, sdrSurface));
+        trace(drawFence);
+    }
+
+    {
+        const auto canvas = gainmapSurface->getCanvas();
+        SkPaint paint;
+        paint.setShader(gainmapShader);
+        paint.setBlendMode(SkBlendMode::kSrc);
+        canvas->drawPaint(paint);
+
+        auto gmFence = sp<Fence>::make(flushAndSubmit(context, gainmapSurface));
+        trace(gmFence);
+        drawFence = Fence::merge("gm-ss", drawFence, gmFence);
+    }
     resultPromise->set_value(std::move(drawFence));
 }
 
diff --git a/libs/renderengine/skia/SkiaRenderEngine.h b/libs/renderengine/skia/SkiaRenderEngine.h
index 7be4c25..92b7af9 100644
--- a/libs/renderengine/skia/SkiaRenderEngine.h
+++ b/libs/renderengine/skia/SkiaRenderEngine.h
@@ -143,13 +143,11 @@
                             const std::vector<LayerSettings>& layers,
                             const std::shared_ptr<ExternalTexture>& buffer,
                             base::unique_fd&& bufferFence) override final;
-    void drawGainmapInternal(const std::shared_ptr<std::promise<FenceResult>>&& resultPromise,
-                             const std::shared_ptr<ExternalTexture>& sdr,
-                             base::borrowed_fd&& sdrFence,
-                             const std::shared_ptr<ExternalTexture>& hdr,
-                             base::borrowed_fd&& hdrFence, float hdrSdrRatio,
-                             ui::Dataspace dataspace,
-                             const std::shared_ptr<ExternalTexture>& gainmap) override final;
+    void tonemapAndDrawGainmapInternal(
+            const std::shared_ptr<std::promise<FenceResult>>&& resultPromise,
+            const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
+            float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
+            const std::shared_ptr<ExternalTexture>& gainmap) override final;
 
     void dump(std::string& result) override final;
 
@@ -168,6 +166,8 @@
     };
     sk_sp<SkShader> createRuntimeEffectShader(const RuntimeEffectShaderParameters&);
 
+    sk_sp<SkShader> localTonemap(sk_sp<SkShader>, float inputMultiplier, float targetHdrSdrRatio);
+
     const PixelFormat mDefaultPixelFormat;
 
     // Identifier used for various mappings of layers to various
diff --git a/libs/renderengine/skia/filters/MouriMap.cpp b/libs/renderengine/skia/filters/MouriMap.cpp
index b099bcf..aa12cef 100644
--- a/libs/renderengine/skia/filters/MouriMap.cpp
+++ b/libs/renderengine/skia/filters/MouriMap.cpp
@@ -30,12 +30,12 @@
 }
 const SkString kCrosstalkAndChunk16x16(R"(
     uniform shader bitmap;
-    uniform float hdrSdrRatio;
+    uniform float inputMultiplier;
     vec4 main(vec2 xy) {
         float maximum = 0.0;
         for (int y = 0; y < 16; y++) {
             for (int x = 0; x < 16; x++) {
-                float3 linear = toLinearSrgb(bitmap.eval((xy - 0.5) * 16 + 0.5 + vec2(x, y)).rgb) * hdrSdrRatio;
+                float3 linear = toLinearSrgb(bitmap.eval((xy - 0.5) * 16 + 0.5 + vec2(x, y)).rgb) * inputMultiplier;
                 float maxRGB = max(linear.r, max(linear.g, linear.b));
                 maximum = max(maximum, log2(max(maxRGB, 1.0)));
             }
@@ -77,12 +77,12 @@
     uniform shader image;
     uniform shader lux;
     uniform float scaleFactor;
-    uniform float hdrSdrRatio;
+    uniform float inputMultiplier;
     uniform float targetHdrSdrRatio;
     vec4 main(vec2 xy) {
         float localMax = lux.eval(xy * scaleFactor).r;
         float4 rgba = image.eval(xy);
-        float3 linear = toLinearSrgb(rgba.rgb) * hdrSdrRatio;
+        float3 linear = toLinearSrgb(rgba.rgb) * inputMultiplier;
 
         if (localMax <= targetHdrSdrRatio) {
             return float4(fromLinearSrgb(linear), rgba.a);
@@ -116,19 +116,19 @@
         mTonemap(makeEffect(kTonemap)) {}
 
 sk_sp<SkShader> MouriMap::mouriMap(SkiaGpuContext* context, sk_sp<SkShader> input,
-                                   float hdrSdrRatio, float targetHdrSdrRatio) {
-    auto downchunked = downchunk(context, input, hdrSdrRatio);
+                                   float inputMultiplier, float targetHdrSdrRatio) {
+    auto downchunked = downchunk(context, input, inputMultiplier);
     auto localLux = blur(context, downchunked.get());
-    return tonemap(input, localLux.get(), hdrSdrRatio, targetHdrSdrRatio);
+    return tonemap(input, localLux.get(), inputMultiplier, targetHdrSdrRatio);
 }
 
 sk_sp<SkImage> MouriMap::downchunk(SkiaGpuContext* context, sk_sp<SkShader> input,
-                                   float hdrSdrRatio) const {
+                                   float inputMultiplier) const {
     SkMatrix matrix;
     SkImage* image = input->isAImage(&matrix, (SkTileMode*)nullptr);
     SkRuntimeShaderBuilder crosstalkAndChunk16x16Builder(mCrosstalkAndChunk16x16);
     crosstalkAndChunk16x16Builder.child("bitmap") = input;
-    crosstalkAndChunk16x16Builder.uniform("hdrSdrRatio") = hdrSdrRatio;
+    crosstalkAndChunk16x16Builder.uniform("inputMultiplier") = inputMultiplier;
     // TODO: fp16 might be overkill. Most practical surfaces use 8-bit RGB for HDR UI and 10-bit YUV
     // for HDR video. These downsample operations compute log2(max(linear RGB, 1.0)). So we don't
     // care about LDR precision since they all resolve to LDR-max. For appropriately mastered HDR
@@ -168,7 +168,7 @@
     LOG_ALWAYS_FATAL_IF(!blurSurface, "%s: Failed to create surface!", __func__);
     return makeImage(blurSurface.get(), blurBuilder);
 }
-sk_sp<SkShader> MouriMap::tonemap(sk_sp<SkShader> input, SkImage* localLux, float hdrSdrRatio,
+sk_sp<SkShader> MouriMap::tonemap(sk_sp<SkShader> input, SkImage* localLux, float inputMultiplier,
                                   float targetHdrSdrRatio) const {
     static constexpr float kScaleFactor = 1.0f / 128.0f;
     SkRuntimeShaderBuilder tonemapBuilder(mTonemap);
@@ -177,7 +177,7 @@
             localLux->makeRawShader(SkTileMode::kClamp, SkTileMode::kClamp,
                                     SkSamplingOptions(SkFilterMode::kLinear, SkMipmapMode::kNone));
     tonemapBuilder.uniform("scaleFactor") = kScaleFactor;
-    tonemapBuilder.uniform("hdrSdrRatio") = hdrSdrRatio;
+    tonemapBuilder.uniform("inputMultiplier") = inputMultiplier;
     tonemapBuilder.uniform("targetHdrSdrRatio") = targetHdrSdrRatio;
     return tonemapBuilder.makeShader();
 }
diff --git a/libs/renderengine/skia/filters/MouriMap.h b/libs/renderengine/skia/filters/MouriMap.h
index 9ba2b6f..f4bfa15 100644
--- a/libs/renderengine/skia/filters/MouriMap.h
+++ b/libs/renderengine/skia/filters/MouriMap.h
@@ -62,10 +62,13 @@
 public:
     MouriMap();
     // Apply the MouriMap tonemmaping operator to the input.
-    // The HDR/SDR ratio describes the luminace range of the input. 1.0 means SDR. Anything larger
-    // then 1.0 means that there is headroom above the SDR region.
-    // Similarly, the target HDR/SDR ratio describes the luminance range of the output.
-    sk_sp<SkShader> mouriMap(SkiaGpuContext* context, sk_sp<SkShader> input, float inputHdrSdrRatio,
+    // The inputMultiplier informs how to interpret the luminance encoding of the input.
+    // For a fixed point input, this is necessary to interpret what "1.0" means for the input
+    // pixels, so that the luminance can be scaled appropriately, such that the operator can
+    // transform SDR values to be within 1.0. For a floating point input, "1.0" always means SDR,
+    // so the caller must pass 1.0.
+    // The target HDR/SDR ratio describes the luminance range of the output.
+    sk_sp<SkShader> mouriMap(SkiaGpuContext* context, sk_sp<SkShader> input, float inputMultiplier,
                              float targetHdrSdrRatio);
 
 private:
diff --git a/libs/renderengine/threaded/RenderEngineThreaded.cpp b/libs/renderengine/threaded/RenderEngineThreaded.cpp
index c187f93..ea5605d 100644
--- a/libs/renderengine/threaded/RenderEngineThreaded.cpp
+++ b/libs/renderengine/threaded/RenderEngineThreaded.cpp
@@ -249,11 +249,10 @@
     return;
 }
 
-void RenderEngineThreaded::drawGainmapInternal(
+void RenderEngineThreaded::tonemapAndDrawGainmapInternal(
         const std::shared_ptr<std::promise<FenceResult>>&& resultPromise,
-        const std::shared_ptr<ExternalTexture>& sdr, base::borrowed_fd&& sdrFence,
         const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
-        float hdrSdrRatio, ui::Dataspace dataspace,
+        float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
         const std::shared_ptr<ExternalTexture>& gainmap) {
     resultPromise->set_value(Fence::NO_FENCE);
     return;
@@ -281,10 +280,9 @@
     return resultFuture;
 }
 
-ftl::Future<FenceResult> RenderEngineThreaded::drawGainmap(
-        const std::shared_ptr<ExternalTexture>& sdr, base::borrowed_fd&& sdrFence,
+ftl::Future<FenceResult> RenderEngineThreaded::tonemapAndDrawGainmap(
         const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
-        float hdrSdrRatio, ui::Dataspace dataspace,
+        float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
         const std::shared_ptr<ExternalTexture>& gainmap) {
     SFTRACE_CALL();
     const auto resultPromise = std::make_shared<std::promise<FenceResult>>();
@@ -292,13 +290,14 @@
     {
         std::lock_guard lock(mThreadMutex);
         mNeedsPostRenderCleanup = true;
-        mFunctionCalls.push([resultPromise, sdr, sdrFence = std::move(sdrFence), hdr,
-                             hdrFence = std::move(hdrFence), hdrSdrRatio, dataspace,
+        mFunctionCalls.push([resultPromise, hdr, hdrFence = std::move(hdrFence), hdrSdrRatio,
+                             dataspace, sdr,
                              gainmap](renderengine::RenderEngine& instance) mutable {
-            SFTRACE_NAME("REThreaded::drawGainmap");
-            instance.updateProtectedContext({}, {sdr.get(), hdr.get(), gainmap.get()});
-            instance.drawGainmapInternal(std::move(resultPromise), sdr, std::move(sdrFence), hdr,
-                                         std::move(hdrFence), hdrSdrRatio, dataspace, gainmap);
+            SFTRACE_NAME("REThreaded::tonemapAndDrawGainmap");
+            instance.updateProtectedContext({}, {hdr.get(), sdr.get(), gainmap.get()});
+            instance.tonemapAndDrawGainmapInternal(std::move(resultPromise), hdr,
+                                                   std::move(hdrFence), hdrSdrRatio, dataspace, sdr,
+                                                   gainmap);
         });
     }
     mCondition.notify_one();
diff --git a/libs/renderengine/threaded/RenderEngineThreaded.h b/libs/renderengine/threaded/RenderEngineThreaded.h
index cb6e924..8554b55 100644
--- a/libs/renderengine/threaded/RenderEngineThreaded.h
+++ b/libs/renderengine/threaded/RenderEngineThreaded.h
@@ -55,12 +55,10 @@
                                         const std::vector<LayerSettings>& layers,
                                         const std::shared_ptr<ExternalTexture>& buffer,
                                         base::unique_fd&& bufferFence) override;
-    ftl::Future<FenceResult> drawGainmap(const std::shared_ptr<ExternalTexture>& sdr,
-                                         base::borrowed_fd&& sdrFence,
-                                         const std::shared_ptr<ExternalTexture>& hdr,
-                                         base::borrowed_fd&& hdrFence, float hdrSdrRatio,
-                                         ui::Dataspace dataspace,
-                                         const std::shared_ptr<ExternalTexture>& gainmap) override;
+    ftl::Future<FenceResult> tonemapAndDrawGainmap(
+            const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
+            float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
+            const std::shared_ptr<ExternalTexture>& gainmap) override;
 
     int getContextPriority() override;
     bool supportsBackgroundBlur() override;
@@ -77,13 +75,11 @@
                             const std::vector<LayerSettings>& layers,
                             const std::shared_ptr<ExternalTexture>& buffer,
                             base::unique_fd&& bufferFence) override;
-    void drawGainmapInternal(const std::shared_ptr<std::promise<FenceResult>>&& resultPromise,
-                             const std::shared_ptr<ExternalTexture>& sdr,
-                             base::borrowed_fd&& sdrFence,
-                             const std::shared_ptr<ExternalTexture>& hdr,
-                             base::borrowed_fd&& hdrFence, float hdrSdrRatio,
-                             ui::Dataspace dataspace,
-                             const std::shared_ptr<ExternalTexture>& gainmap) override;
+    void tonemapAndDrawGainmapInternal(
+            const std::shared_ptr<std::promise<FenceResult>>&& resultPromise,
+            const std::shared_ptr<ExternalTexture>& hdr, base::borrowed_fd&& hdrFence,
+            float hdrSdrRatio, ui::Dataspace dataspace, const std::shared_ptr<ExternalTexture>& sdr,
+            const std::shared_ptr<ExternalTexture>& gainmap) override;
 
 private:
     void threadMain(CreateInstanceFactory factory);
diff --git a/services/surfaceflinger/RegionSamplingThread.cpp b/services/surfaceflinger/RegionSamplingThread.cpp
index 21d3396..d3483b0 100644
--- a/services/surfaceflinger/RegionSamplingThread.cpp
+++ b/services/surfaceflinger/RegionSamplingThread.cpp
@@ -346,7 +346,6 @@
     constexpr bool kRegionSampling = true;
     constexpr bool kGrayscale = false;
     constexpr bool kIsProtected = false;
-    constexpr bool kAttachGainmap = false;
 
     SurfaceFlinger::RenderAreaBuilderVariant
             renderAreaBuilder(std::in_place_type<DisplayRenderAreaBuilder>, sampledBounds,
@@ -358,7 +357,7 @@
             mFlinger.getSnapshotsFromMainThread(renderAreaBuilder, getLayerSnapshotsFn, layers);
     FenceResult fenceResult =
             mFlinger.captureScreenshot(renderAreaBuilder, buffer, kRegionSampling, kGrayscale,
-                                       kIsProtected, kAttachGainmap, nullptr, displayState, layers)
+                                       kIsProtected, nullptr, displayState, layers)
                     .get();
     if (fenceResult.ok()) {
         fenceResult.value()->waitForever(LOG_TAG);
diff --git a/services/surfaceflinger/SurfaceFlinger.cpp b/services/surfaceflinger/SurfaceFlinger.cpp
index 59b1917..962e2fc 100644
--- a/services/surfaceflinger/SurfaceFlinger.cpp
+++ b/services/surfaceflinger/SurfaceFlinger.cpp
@@ -7145,8 +7145,7 @@
                                                  displayWeak, options),
                         getLayerSnapshotsFn, reqSize,
                         static_cast<ui::PixelFormat>(captureArgs.pixelFormat),
-                        captureArgs.allowProtected, captureArgs.grayscale,
-                        captureArgs.attachGainmap, captureListener);
+                        captureArgs.allowProtected, captureArgs.grayscale, captureListener);
 }
 
 void SurfaceFlinger::captureDisplay(DisplayId displayId, const CaptureArgs& args,
@@ -7203,7 +7202,7 @@
                                                  static_cast<ui::Dataspace>(args.dataspace),
                                                  displayWeak, options),
                         getLayerSnapshotsFn, size, static_cast<ui::PixelFormat>(args.pixelFormat),
-                        kAllowProtected, kGrayscale, args.attachGainmap, captureListener);
+                        kAllowProtected, kGrayscale, captureListener);
 }
 
 ScreenCaptureResults SurfaceFlinger::captureLayersSync(const LayerCaptureArgs& args) {
@@ -7314,8 +7313,7 @@
                                                  options),
                         getLayerSnapshotsFn, reqSize,
                         static_cast<ui::PixelFormat>(captureArgs.pixelFormat),
-                        captureArgs.allowProtected, captureArgs.grayscale,
-                        captureArgs.attachGainmap, captureListener);
+                        captureArgs.allowProtected, captureArgs.grayscale, captureListener);
 }
 
 // Creates a Future release fence for a layer and keeps track of it in a list to
@@ -7373,7 +7371,7 @@
 void SurfaceFlinger::captureScreenCommon(RenderAreaBuilderVariant renderAreaBuilder,
                                          GetLayerSnapshotsFunction getLayerSnapshotsFn,
                                          ui::Size bufferSize, ui::PixelFormat reqPixelFormat,
-                                         bool allowProtected, bool grayscale, bool attachGainmap,
+                                         bool allowProtected, bool grayscale,
                                          const sp<IScreenCaptureListener>& captureListener) {
     SFTRACE_CALL();
 
@@ -7388,6 +7386,10 @@
     std::vector<std::pair<Layer*, sp<LayerFE>>> layers;
     auto displayState = getSnapshotsFromMainThread(renderAreaBuilder, getLayerSnapshotsFn, layers);
 
+    const bool hasHdrLayer = std::any_of(layers.cbegin(), layers.cend(), [this](const auto& layer) {
+        return isHdrLayer(*(layer.second->mSnapshot.get()));
+    });
+
     const bool supportsProtected = getRenderEngine().supportsProtectedContent();
     bool hasProtectedLayer = false;
     if (allowProtected && supportsProtected) {
@@ -7416,9 +7418,51 @@
             renderengine::impl::ExternalTexture>(buffer, getRenderEngine(),
                                                  renderengine::impl::ExternalTexture::Usage::
                                                          WRITEABLE);
-    auto futureFence =
-            captureScreenshot(renderAreaBuilder, texture, false /* regionSampling */, grayscale,
-                              isProtected, attachGainmap, captureListener, displayState, layers);
+
+    std::shared_ptr<renderengine::impl::ExternalTexture> hdrTexture;
+    std::shared_ptr<renderengine::impl::ExternalTexture> gainmapTexture;
+
+    bool hintForSeamless = std::visit(
+            [](auto&& arg) {
+                return arg.options.test(RenderArea::Options::HINT_FOR_SEAMLESS_TRANSITION);
+            },
+            renderAreaBuilder);
+    if (hasHdrLayer && !hintForSeamless && FlagManager::getInstance().true_hdr_screenshots()) {
+        const auto hdrBuffer =
+                getFactory().createGraphicBuffer(buffer->getWidth(), buffer->getHeight(),
+                                                 HAL_PIXEL_FORMAT_RGBA_FP16, 1 /* layerCount */,
+                                                 buffer->getUsage(), "screenshot-hdr");
+        const auto gainmapBuffer =
+                getFactory().createGraphicBuffer(buffer->getWidth(), buffer->getHeight(),
+                                                 buffer->getPixelFormat(), 1 /* layerCount */,
+                                                 buffer->getUsage(), "screenshot-gainmap");
+
+        const status_t hdrBufferStatus = hdrBuffer->initCheck();
+        const status_t gainmapBufferStatus = gainmapBuffer->initCheck();
+
+        if (hdrBufferStatus != OK || gainmapBufferStatus != -OK) {
+            if (hdrBufferStatus != OK) {
+                ALOGW("%s: Buffer failed to allocate for hdr: %d. Screenshoting SDR instead.",
+                      __func__, hdrBufferStatus);
+            } else {
+                ALOGW("%s: Buffer failed to allocate for gainmap: %d. Screenshoting SDR instead.",
+                      __func__, gainmapBufferStatus);
+            }
+        } else {
+            hdrTexture = std::make_shared<
+                    renderengine::impl::ExternalTexture>(hdrBuffer, getRenderEngine(),
+                                                         renderengine::impl::ExternalTexture::
+                                                                 Usage::WRITEABLE);
+            gainmapTexture = std::make_shared<
+                    renderengine::impl::ExternalTexture>(gainmapBuffer, getRenderEngine(),
+                                                         renderengine::impl::ExternalTexture::
+                                                                 Usage::WRITEABLE);
+        }
+    }
+
+    auto futureFence = captureScreenshot(renderAreaBuilder, texture, false /* regionSampling */,
+                                         grayscale, isProtected, captureListener, displayState,
+                                         layers, hdrTexture, gainmapTexture);
     futureFence.get();
 }
 
@@ -7461,10 +7505,11 @@
 ftl::SharedFuture<FenceResult> SurfaceFlinger::captureScreenshot(
         const RenderAreaBuilderVariant& renderAreaBuilder,
         const std::shared_ptr<renderengine::ExternalTexture>& buffer, bool regionSampling,
-        bool grayscale, bool isProtected, bool attachGainmap,
-        const sp<IScreenCaptureListener>& captureListener,
-        std::optional<OutputCompositionState>& displayState,
-        std::vector<std::pair<Layer*, sp<LayerFE>>>& layers) {
+        bool grayscale, bool isProtected, const sp<IScreenCaptureListener>& captureListener,
+        const std::optional<OutputCompositionState>& displayState,
+        const std::vector<std::pair<Layer*, sp<LayerFE>>>& layers,
+        const std::shared_ptr<renderengine::ExternalTexture>& hdrBuffer,
+        const std::shared_ptr<renderengine::ExternalTexture>& gainmapBuffer) {
     SFTRACE_CALL();
 
     ScreenCaptureResults captureResults;
@@ -7480,74 +7525,40 @@
         }
         return ftl::yield<FenceResult>(base::unexpected(NO_ERROR)).share();
     }
+
     float displayBrightnessNits = displayState.value().displayBrightnessNits;
     float sdrWhitePointNits = displayState.value().sdrWhitePointNits;
 
-    ftl::SharedFuture<FenceResult> renderFuture =
-            renderScreenImpl(renderArea.get(), buffer, regionSampling, grayscale, isProtected,
-                             captureResults, displayState, layers);
+    ftl::SharedFuture<FenceResult> renderFuture;
 
-    if (captureResults.capturedHdrLayers && attachGainmap &&
-        FlagManager::getInstance().true_hdr_screenshots()) {
-        sp<GraphicBuffer> hdrBuffer =
-                getFactory().createGraphicBuffer(buffer->getWidth(), buffer->getHeight(),
-                                                 HAL_PIXEL_FORMAT_RGBA_FP16, 1 /* layerCount */,
-                                                 buffer->getUsage(), "screenshot-hdr");
-        sp<GraphicBuffer> gainmapBuffer =
-                getFactory().createGraphicBuffer(buffer->getWidth(), buffer->getHeight(),
-                                                 buffer->getPixelFormat(), 1 /* layerCount */,
-                                                 buffer->getUsage(), "screenshot-gainmap");
+    if (hdrBuffer && gainmapBuffer) {
+        ftl::SharedFuture<FenceResult> hdrRenderFuture =
+                renderScreenImpl(renderArea.get(), hdrBuffer, regionSampling, grayscale,
+                                 isProtected, captureResults, displayState, layers);
+        captureResults.buffer = buffer->getBuffer();
+        captureResults.optionalGainMap = gainmapBuffer->getBuffer();
 
-        const status_t bufferStatus = hdrBuffer->initCheck();
-        const status_t gainmapBufferStatus = gainmapBuffer->initCheck();
+        renderFuture =
+                ftl::Future(std::move(hdrRenderFuture))
+                        .then([&, displayBrightnessNits, sdrWhitePointNits,
+                               dataspace = captureResults.capturedDataspace, buffer, hdrBuffer,
+                               gainmapBuffer](FenceResult fenceResult) -> FenceResult {
+                            if (!fenceResult.ok()) {
+                                return fenceResult;
+                            }
 
-        if (bufferStatus != OK) {
-            ALOGW("%s: Buffer failed to allocate for hdr: %d. Screenshoting SDR instead.", __func__,
-                  bufferStatus);
-        } else if (gainmapBufferStatus != OK) {
-            ALOGW("%s: Buffer failed to allocate for gainmap: %d. Screenshoting SDR instead.",
-                  __func__, gainmapBufferStatus);
-        } else {
-            captureResults.optionalGainMap = gainmapBuffer;
-            const auto hdrTexture = std::make_shared<
-                    renderengine::impl::ExternalTexture>(hdrBuffer, getRenderEngine(),
-                                                         renderengine::impl::ExternalTexture::
-                                                                 Usage::WRITEABLE);
-            const auto gainmapTexture = std::make_shared<
-                    renderengine::impl::ExternalTexture>(gainmapBuffer, getRenderEngine(),
-                                                         renderengine::impl::ExternalTexture::
-                                                                 Usage::WRITEABLE);
-            ScreenCaptureResults unusedResults;
-            ftl::SharedFuture<FenceResult> hdrRenderFuture =
-                    renderScreenImpl(renderArea.get(), hdrTexture, regionSampling, grayscale,
-                                     isProtected, unusedResults, displayState, layers);
-
-            renderFuture =
-                    ftl::Future(std::move(renderFuture))
-                            .then([&, hdrRenderFuture = std::move(hdrRenderFuture),
-                                   displayBrightnessNits, sdrWhitePointNits,
-                                   dataspace = captureResults.capturedDataspace, buffer, hdrTexture,
-                                   gainmapTexture](FenceResult fenceResult) -> FenceResult {
-                                if (!fenceResult.ok()) {
-                                    return fenceResult;
-                                }
-
-                                auto hdrFenceResult = hdrRenderFuture.get();
-
-                                if (!hdrFenceResult.ok()) {
-                                    return hdrFenceResult;
-                                }
-
-                                return getRenderEngine()
-                                        .drawGainmap(buffer, fenceResult.value()->get(), hdrTexture,
-                                                     hdrFenceResult.value()->get(),
-                                                     displayBrightnessNits / sdrWhitePointNits,
-                                                     static_cast<ui::Dataspace>(dataspace),
-                                                     gainmapTexture)
-                                        .get();
-                            })
-                            .share();
-        };
+                            return getRenderEngine()
+                                    .tonemapAndDrawGainmap(hdrBuffer, fenceResult.value()->get(),
+                                                           displayBrightnessNits /
+                                                                   sdrWhitePointNits,
+                                                           static_cast<ui::Dataspace>(dataspace),
+                                                           buffer, gainmapBuffer)
+                                    .get();
+                        })
+                        .share();
+    } else {
+        renderFuture = renderScreenImpl(renderArea.get(), buffer, regionSampling, grayscale,
+                                        isProtected, captureResults, displayState, layers);
     }
 
     if (captureListener) {
@@ -7569,8 +7580,8 @@
 ftl::SharedFuture<FenceResult> SurfaceFlinger::renderScreenImpl(
         const RenderArea* renderArea, const std::shared_ptr<renderengine::ExternalTexture>& buffer,
         bool regionSampling, bool grayscale, bool isProtected, ScreenCaptureResults& captureResults,
-        std::optional<OutputCompositionState>& displayState,
-        std::vector<std::pair<Layer*, sp<LayerFE>>>& layers) {
+        const std::optional<OutputCompositionState>& displayState,
+        const std::vector<std::pair<Layer*, sp<LayerFE>>>& layers) {
     SFTRACE_CALL();
 
     for (auto& [_, layerFE] : layers) {
@@ -7638,9 +7649,8 @@
     }
 
     auto present = [this, buffer = capturedBuffer, dataspace = captureResults.capturedDataspace,
-                    sdrWhitePointNits, displayBrightnessNits, grayscale, isProtected,
-                    layers = std::move(layers), layerStack, regionSampling,
-                    renderArea = std::move(renderArea), renderIntent,
+                    sdrWhitePointNits, displayBrightnessNits, grayscale, isProtected, layers,
+                    layerStack, regionSampling, renderArea = std::move(renderArea), renderIntent,
                     enableLocalTonemapping]() -> FenceResult {
         std::unique_ptr<compositionengine::CompositionEngine> compositionEngine =
                 mFactory.createCompositionEngine();
diff --git a/services/surfaceflinger/SurfaceFlinger.h b/services/surfaceflinger/SurfaceFlinger.h
index 1e2c087..236e288 100644
--- a/services/surfaceflinger/SurfaceFlinger.h
+++ b/services/surfaceflinger/SurfaceFlinger.h
@@ -872,7 +872,7 @@
 
     void captureScreenCommon(RenderAreaBuilderVariant, GetLayerSnapshotsFunction,
                              ui::Size bufferSize, ui::PixelFormat, bool allowProtected,
-                             bool grayscale, bool attachGainmap, const sp<IScreenCaptureListener>&);
+                             bool grayscale, const sp<IScreenCaptureListener>&);
 
     std::optional<OutputCompositionState> getDisplayStateFromRenderAreaBuilder(
             RenderAreaBuilderVariant& renderAreaBuilder) REQUIRES(kMainThreadContext);
@@ -880,16 +880,17 @@
     ftl::SharedFuture<FenceResult> captureScreenshot(
             const RenderAreaBuilderVariant& renderAreaBuilder,
             const std::shared_ptr<renderengine::ExternalTexture>& buffer, bool regionSampling,
-            bool grayscale, bool isProtected, bool attachGainmap,
-            const sp<IScreenCaptureListener>& captureListener,
-            std::optional<OutputCompositionState>& displayState,
-            std::vector<std::pair<Layer*, sp<LayerFE>>>& layers);
+            bool grayscale, bool isProtected, const sp<IScreenCaptureListener>& captureListener,
+            const std::optional<OutputCompositionState>& displayState,
+            const std::vector<std::pair<Layer*, sp<LayerFE>>>& layers,
+            const std::shared_ptr<renderengine::ExternalTexture>& hdrBuffer = nullptr,
+            const std::shared_ptr<renderengine::ExternalTexture>& gainmapBuffer = nullptr);
 
     ftl::SharedFuture<FenceResult> renderScreenImpl(
             const RenderArea*, const std::shared_ptr<renderengine::ExternalTexture>&,
             bool regionSampling, bool grayscale, bool isProtected, ScreenCaptureResults&,
-            std::optional<OutputCompositionState>& displayState,
-            std::vector<std::pair<Layer*, sp<LayerFE>>>& layers);
+            const std::optional<OutputCompositionState>& displayState,
+            const std::vector<std::pair<Layer*, sp<LayerFE>>>& layers);
 
     void readPersistentProperties();