[Tuning] Disable CPU Idle in NNAPI workload with PMQoS CPU DMA Latency

To improve return-path latency, we want to keep the CPU no deeper than the WFI state (Idle_1). The PM QoS cpu_dma_latency knob prevents the CPU from entering idle states deeper than WFI, which keeps the CPU wakeup latency on the return path low. Checked with wvw@: the power impact shouldn't be too significant.
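
For reference, the kernel's user-space PM QoS interface keeps a latency request active only while the file descriptor on /dev/cpu_dma_latency stays open, which is presumably why the node below sets "HoldFd". The following is a minimal Python sketch of that mechanism, not the PowerHAL implementation; the helper name hold_cpu_dma_latency and its path parameter are hypothetical, and on a real device the write needs suitable permissions.

```python
import os
import struct
from contextlib import contextmanager


@contextmanager
def hold_cpu_dma_latency(max_latency_us, path="/dev/cpu_dma_latency"):
    """Request a CPU wakeup latency bound for the duration of the block.

    Hypothetical helper: the kernel keeps the request registered only
    while the descriptor is open, so we close it on exit to drop it.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # The device accepts the target as a native signed 32-bit integer
        # (microseconds).
        os.write(fd, struct.pack("=i", max_latency_us))
        yield fd
    finally:
        # Closing the descriptor removes the request.
        os.close(fd)


# Usage (assumes sufficient privileges on the target device):
#
#   with hold_cpu_dma_latency(44):
#       run_inference()   # hypothetical workload
```

The sketch writes the binary 32-bit form of the value; the 44 us figure is taken from the config in this change.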

The average energy cost per inference dropped from 3.85 mJ to 3.47 mJ. Average power is higher with idle disabled (3539 mW vs. 2837 mW), but because latency is lower, each inference finishes sooner and more inferences run in the same amount of time. As a result, the energy spent per inference is lower.

Measurement:
MLPerf IC model    Latency (ms)  Power (mW)  Energy/inference (mJ)  MLPerf score
Default            1.35          2837        3.85                   560
Disable CPU Idle   0.98          3539        3.47                   826
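
As a sanity check on the numbers above, energy per inference is roughly average power multiplied by latency. The short Python sketch below recomputes it from the measured values; the function name energy_mj is illustrative, and the small gap on the default row (3.83 computed vs. 3.85 reported) is presumably rounding in the reported figures.

```python
def energy_mj(power_mw, latency_ms):
    """Energy per inference in mJ: mW * ms = uJ, divided by 1000 for mJ."""
    return power_mw * latency_ms / 1000.0


# Measured values from the table above.
default_mj = energy_mj(2837, 1.35)
no_idle_mj = energy_mj(3539, 0.98)

print(round(default_mj, 2))  # 3.83
print(round(no_idle_mj, 2))  # 3.47
```

Power rises by about 25% while latency falls by about 27%, so the product, energy per inference, ends up lower.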

https://docs.google.com/presentation/d/1zx7sLkhOClmuRTCrq8-l3N1mZrrv7f-CtcdMuzV0eaI/edit?pli=1#slide=id.g12dd9e50b4b_0_0

Bug: 232183574
Test: MLPerf on Android T. Performance improved. Verified on Perfetto.
Change-Id: If067e0851bea0475043ef2127a25ed3a5fdab093
diff --git a/powerhint-oriole.json b/powerhint-oriole.json
index 48c0e51..314f634 100644
--- a/powerhint-oriole.json
+++ b/powerhint-oriole.json
@@ -197,6 +197,15 @@
       "ResetOnInit": true
     },
     {
+      "Name": "PMQoSCpuDmaLatency",
+      "Path": "/dev/cpu_dma_latency",
+      "Values": [
+        "44",
+        "1000"
+      ],
+      "HoldFd": true
+    },
+    {
       "Name": "CDPreferIdle",
       "Path": "/proc/vendor_sched/cam_prefer_idle",
       "Values": [
@@ -1593,6 +1602,12 @@
       "Value": "512"
     },
     {
+      "PowerHint": "ML_ACC",
+      "Node": "PMQoSCpuDmaLatency",
+      "Duration": 2000,
+      "Value": "44"
+    },
+    {
       "PowerHint": "DEVICE_IDLE",
       "Node": "RestrictedCpuset",
       "Duration": 0,
diff --git a/powerhint-raven.json b/powerhint-raven.json
index da30e1c..3b9b980 100644
--- a/powerhint-raven.json
+++ b/powerhint-raven.json
@@ -197,6 +197,15 @@
       "ResetOnInit": true
     },
     {
+      "Name": "PMQoSCpuDmaLatency",
+      "Path": "/dev/cpu_dma_latency",
+      "Values": [
+        "44",
+        "1000"
+      ],
+      "HoldFd": true
+    },
+    {
       "Name": "CDPreferIdle",
       "Path": "/proc/vendor_sched/cam_prefer_idle",
       "Values": [
@@ -1609,6 +1618,12 @@
       "Value": "512"
     },
     {
+      "PowerHint": "ML_ACC",
+      "Node": "PMQoSCpuDmaLatency",
+      "Duration": 2000,
+      "Value": "44"
+    },
+    {
       "PowerHint": "DEVICE_IDLE",
       "Node": "RestrictedCpuset",
       "Duration": 0,