blob: c98a76b88e7ad17cbdd7e6f4a30c9149ff3c0281 [file] [log] [blame]
Janis Danisevskis1af91262020-08-10 14:58:08 -07001// Copyright 2020, The Android Open Source Project
2//
3// Licensed under the Apache License, Version 2.0 (the "License");
4// you may not use this file except in compliance with the License.
5// You may obtain a copy of the License at
6//
7// http://www.apache.org/licenses/LICENSE-2.0
8//
9// Unless required by applicable law or agreed to in writing, software
10// distributed under the License is distributed on an "AS IS" BASIS,
11// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
12// See the License for the specific language governing permissions and
13// limitations under the License.
14
15//! This crate implements the `IKeystoreOperation` AIDL interface, which represents
16//! an ongoing key operation, as well as the operation database, which is mainly
17//! required for tracking operations for the purpose of pruning.
18//! This crate also implements an operation pruning strategy.
19//!
20//! Operations implement the API calls update, finish, and abort.
21//! Additionally, an operation can be dropped and pruned. The former
22//! happens if the client deletes a binder to the operation object.
23//! An existing operation may get pruned when running out of operation
24//! slots and a new operation takes precedence.
25//!
26//! ## Operation Lifecycle
27//! An operation gets created when the client calls `IKeystoreSecurityLevel::create`.
28//! It may receive zero or more update request. The lifecycle ends when:
29//! * `update` yields an error.
30//! * `finish` is called.
31//! * `abort` is called.
32//! * The operation gets dropped.
33//! * The operation gets pruned.
34//! `Operation` has an `Outcome` member. While the outcome is `Outcome::Unknown`,
35//! the operation is active and in a good state. Any of the above conditions may
36//! change the outcome to one of the defined outcomes Success, Abort, Dropped,
37//! Pruned, or ErrorCode. The latter is chosen in the case of an unexpected error, during
38//! `update` or `finish`. `Success` is chosen iff `finish` completes without error.
39//! Note that all operations get dropped eventually in the sense that they lose
40//! their last reference and get destroyed. At that point, the fate of the operation
41//! gets logged. However, an operation will transition to `Outcome::Dropped` iff
42//! the operation was still active (`Outcome::Unknown`) at that time.
43//!
44//! ## Operation Dropping
45//! To observe the dropping of an operation, we have to make sure that there
46//! are no strong references to the IBinder representing this operation.
47//! This would be simple enough if the operation object would need to be accessed
48//! only by transactions. But to perform pruning, we have to retain a reference to the
49//! original operation object.
50//!
51//! ## Operation Pruning
52//! Pruning an operation happens during the creation of a new operation.
53//! We have to iterate through the operation database to find a suitable
54//! candidate. Then we abort and finalize this operation setting its outcome to
55//! `Outcome::Pruned`. The corresponding KeyMint operation slot will have been freed
56//! up at this point, but the `Operation` object lingers. When the client
57//! attempts to use the operation again they will receive
58//! ErrorCode::INVALID_OPERATION_HANDLE indicating that the operation no longer
59//! exits. This should be the cue for the client to destroy its binder.
60//! At that point the operation gets dropped.
61//!
62//! ## Architecture
63//! The `IKeystoreOperation` trait is implemented by `KeystoreOperation`.
64//! This acts as a proxy object holding a strong reference to actual operation
65//! implementation `Operation`.
66//!
67//! ```
68//! struct KeystoreOperation {
69//! operation: Mutex<Option<Arc<Operation>>>,
70//! }
71//! ```
72//!
73//! The `Mutex` serves two purposes. It provides interior mutability allowing
74//! us to set the Option to None. We do this when the life cycle ends during
75//! a call to `update`, `finish`, or `abort`. As a result most of the Operation
76//! related resources are freed. The `KeystoreOperation` proxy object still
77//! lingers until dropped by the client.
78//! The second purpose is to protect operations against concurrent usage.
79//! Failing to lock this mutex yields `ResponseCode::OPERATION_BUSY` and indicates
80//! a programming error in the client.
81//!
82//! Note that the Mutex only protects the operation against concurrent client calls.
83//! We still retain weak references to the operation in the operation database:
84//!
85//! ```
86//! struct OperationDb {
87//! operations: Mutex<Vec<Weak<Operation>>>
88//! }
89//! ```
90//!
91//! This allows us to access the operations for the purpose of pruning.
92//! We do this in three phases.
93//! 1. We gather the pruning information. Besides non mutable information,
94//! we access `last_usage` which is protected by a mutex.
95//! We only lock this mutex for single statements at a time. During
96//! this phase we hold the operation db lock.
97//! 2. We choose a pruning candidate by computing the pruning resistance
98//! of each operation. We do this entirely with information we now
99//! have on the stack without holding any locks.
100//! (See `OperationDb::prune` for more details on the pruning strategy.)
101//! 3. During pruning we briefly lock the operation database again to get the
102//! the pruning candidate by index. We then attempt to abort the candidate.
103//! If the candidate was touched in the meantime or is currently fulfilling
104//! a request (i.e., the client calls update, finish, or abort),
105//! we go back to 1 and try again.
106//!
107//! So the outer Mutex in `KeystoreOperation::operation` only protects
108//! operations against concurrent client calls but not against concurrent
109//! pruning attempts. This is what the `Operation::outcome` mutex is used for.
110//!
111//! ```
112//! struct Operation {
113//! ...
114//! outcome: Mutex<Outcome>,
115//! ...
116//! }
117//! ```
118//!
119//! Any request that can change the outcome, i.e., `update`, `finish`, `abort`,
120//! `drop`, and `prune` has to take the outcome lock and check if the outcome
121//! is still `Outcome::Unknown` before entering. `prune` is special in that
122//! it will `try_lock`, because we don't want to be blocked on a potentially
123//! long running request at another operation. If it fails to get the lock
124//! the operation is either being touched, which changes its pruning resistance,
125//! or it transitions to its end-of-life, which means we may get a free slot.
126//! Either way, we have to revaluate the pruning scores.
127
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800128use crate::enforcements::AuthInfo;
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000129use crate::error::{map_km_error, map_or_log_err, Error, ErrorCode, ResponseCode};
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000130use crate::utils::Asp;
131use android_hardware_security_keymint::aidl::android::hardware::security::keymint::{
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800132 ByteArray::ByteArray, IKeyMintOperation::IKeyMintOperation,
133 KeyParameter::KeyParameter as KmParam, KeyParameterArray::KeyParameterArray,
134 KeyParameterValue::KeyParameterValue as KmParamValue, Tag::Tag,
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000135};
136use android_system_keystore2::aidl::android::system::keystore2::{
137 IKeystoreOperation::BnKeystoreOperation, IKeystoreOperation::IKeystoreOperation,
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000138};
139use anyhow::{anyhow, Context, Result};
Stephen Crane221bbb52020-12-16 15:52:10 -0800140use binder::IBinder;
Janis Danisevskis1af91262020-08-10 14:58:08 -0700141use std::{
142 collections::HashMap,
143 sync::{Arc, Mutex, MutexGuard, Weak},
144 time::Duration,
145 time::Instant,
146};
147
Janis Danisevskis1af91262020-08-10 14:58:08 -0700148/// Operations have `Outcome::Unknown` as long as they are active. They transition
149/// to one of the other variants exactly once. The distinction in outcome is mainly
150/// for the statistic.
151#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd)]
152enum Outcome {
153 Unknown,
154 Success,
155 Abort,
156 Dropped,
157 Pruned,
158 ErrorCode(ErrorCode),
159}
160
161/// Operation bundles all of the operation related resources and tracks the operation's
162/// outcome.
163#[derive(Debug)]
164pub struct Operation {
165 // The index of this operation in the OperationDb.
166 index: usize,
167 km_op: Asp,
168 last_usage: Mutex<Instant>,
169 outcome: Mutex<Outcome>,
170 owner: u32, // Uid of the operation's owner.
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800171 auth_info: Mutex<AuthInfo>,
Janis Danisevskis1af91262020-08-10 14:58:08 -0700172}
173
174struct PruningInfo {
175 last_usage: Instant,
176 owner: u32,
177 index: usize,
178}
179
Janis Danisevskis1af91262020-08-10 14:58:08 -0700180// We don't except more than 32KiB of data in `update`, `updateAad`, and `finish`.
181const MAX_RECEIVE_DATA: usize = 0x8000;
182
183impl Operation {
184 /// Constructor
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000185 pub fn new(
186 index: usize,
Stephen Crane221bbb52020-12-16 15:52:10 -0800187 km_op: binder::Strong<dyn IKeyMintOperation>,
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000188 owner: u32,
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800189 auth_info: AuthInfo,
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000190 ) -> Self {
Janis Danisevskis1af91262020-08-10 14:58:08 -0700191 Self {
192 index,
193 km_op: Asp::new(km_op.as_binder()),
194 last_usage: Mutex::new(Instant::now()),
195 outcome: Mutex::new(Outcome::Unknown),
196 owner,
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800197 auth_info: Mutex::new(auth_info),
Janis Danisevskis1af91262020-08-10 14:58:08 -0700198 }
199 }
200
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700201 fn get_pruning_info(&self) -> Option<PruningInfo> {
202 // An operation may be finalized.
203 if let Ok(guard) = self.outcome.try_lock() {
204 match *guard {
205 Outcome::Unknown => {}
206 // If the outcome is any other than unknown, it has been finalized,
207 // and we can no longer consider it for pruning.
208 _ => return None,
209 }
210 }
211 // Else: If we could not grab the lock, this means that the operation is currently
212 // being used and it may be transitioning to finalized or it was simply updated.
213 // In any case it is fair game to consider it for pruning. If the operation
214 // transitioned to a final state, we will notice when we attempt to prune, and
215 // a subsequent attempt to create a new operation will succeed.
216 Some(PruningInfo {
217 // Expect safety:
218 // `last_usage` is locked only for primitive single line statements.
219 // There is no chance to panic and poison the mutex.
Janis Danisevskis1af91262020-08-10 14:58:08 -0700220 last_usage: *self.last_usage.lock().expect("In get_pruning_info."),
221 owner: self.owner,
222 index: self.index,
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700223 })
Janis Danisevskis1af91262020-08-10 14:58:08 -0700224 }
225
226 fn prune(&self, last_usage: Instant) -> Result<(), Error> {
227 let mut locked_outcome = match self.outcome.try_lock() {
228 Ok(guard) => match *guard {
229 Outcome::Unknown => guard,
230 _ => return Err(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE)),
231 },
232 Err(_) => return Err(Error::Rc(ResponseCode::OPERATION_BUSY)),
233 };
234
235 // In `OperationDb::prune`, which is our caller, we first gather the pruning
236 // information including the last usage. When we select a candidate
237 // we call `prune` on that candidate passing the last_usage
238 // that we gathered earlier. If the actual last usage
239 // has changed since than, it means the operation was busy in the
240 // meantime, which means that we have to reevaluate the pruning score.
241 //
242 // Expect safety:
243 // `last_usage` is locked only for primitive single line statements.
244 // There is no chance to panic and poison the mutex.
245 if *self.last_usage.lock().expect("In Operation::prune()") != last_usage {
246 return Err(Error::Rc(ResponseCode::OPERATION_BUSY));
247 }
248 *locked_outcome = Outcome::Pruned;
249
Stephen Crane221bbb52020-12-16 15:52:10 -0800250 let km_op: binder::public_api::Strong<dyn IKeyMintOperation> =
251 match self.km_op.get_interface() {
252 Ok(km_op) => km_op,
253 Err(e) => {
254 log::error!("In prune: Failed to get KeyMintOperation interface.\n {:?}", e);
255 return Err(Error::sys());
256 }
257 };
Janis Danisevskis1af91262020-08-10 14:58:08 -0700258
259 // We abort the operation. If there was an error we log it but ignore it.
260 if let Err(e) = map_km_error(km_op.abort()) {
261 log::error!("In prune: KeyMint::abort failed with {:?}.", e);
262 }
263
264 Ok(())
265 }
266
267 // This function takes a Result from a KeyMint call and inspects it for errors.
268 // If an error was found it updates the given `locked_outcome` accordingly.
269 // It forwards the Result unmodified.
270 // The precondition to this call must be *locked_outcome == Outcome::Unknown.
271 // Ideally the `locked_outcome` came from a successful call to `check_active`
272 // see below.
273 fn update_outcome<T>(
274 &self,
275 locked_outcome: &mut Outcome,
276 err: Result<T, Error>,
277 ) -> Result<T, Error> {
278 match &err {
279 Err(Error::Km(e)) => *locked_outcome = Outcome::ErrorCode(*e),
280 Err(_) => *locked_outcome = Outcome::ErrorCode(ErrorCode::UNKNOWN_ERROR),
281 Ok(_) => (),
282 }
283 err
284 }
285
286 // This function grabs the outcome lock and checks the current outcome state.
287 // If the outcome is still `Outcome::Unknown`, this function returns
288 // the locked outcome for further updates. In any other case it returns
289 // ErrorCode::INVALID_OPERATION_HANDLE indicating that this operation has
290 // been finalized and is no longer active.
291 fn check_active(&self) -> Result<MutexGuard<Outcome>> {
292 let guard = self.outcome.lock().expect("In check_active.");
293 match *guard {
294 Outcome::Unknown => Ok(guard),
295 _ => Err(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE)).context(format!(
296 "In check_active: Call on finalized operation with outcome: {:?}.",
297 *guard
298 )),
299 }
300 }
301
302 // This function checks the amount of input data sent to us. We reject any buffer
303 // exceeding MAX_RECEIVE_DATA bytes as input to `update`, `update_aad`, and `finish`
304 // in order to force clients into using reasonable limits.
305 fn check_input_length(data: &[u8]) -> Result<()> {
306 if data.len() > MAX_RECEIVE_DATA {
307 // This error code is unique, no context required here.
308 return Err(anyhow!(Error::Rc(ResponseCode::TOO_MUCH_DATA)));
309 }
310 Ok(())
311 }
312
313 // Update the last usage to now.
314 fn touch(&self) {
315 // Expect safety:
316 // `last_usage` is locked only for primitive single line statements.
317 // There is no chance to panic and poison the mutex.
318 *self.last_usage.lock().expect("In touch.") = Instant::now();
319 }
320
321 /// Implementation of `IKeystoreOperation::updateAad`.
322 /// Refer to the AIDL spec at system/hardware/interfaces/keystore2 for details.
323 fn update_aad(&self, aad_input: &[u8]) -> Result<()> {
324 let mut outcome = self.check_active().context("In update_aad")?;
325 Self::check_input_length(aad_input).context("In update_aad")?;
326 self.touch();
327
Janis Danisevskis85d47932020-10-23 16:12:59 -0700328 let params = KeyParameterArray {
329 params: vec![KmParam {
330 tag: Tag::ASSOCIATED_DATA,
Janis Danisevskis398e6be2020-12-17 09:29:25 -0800331 value: KmParamValue::Blob(aad_input.into()),
Janis Danisevskis85d47932020-10-23 16:12:59 -0700332 }],
333 };
Janis Danisevskis1af91262020-08-10 14:58:08 -0700334
Janis Danisevskis85d47932020-10-23 16:12:59 -0700335 let mut out_params: Option<KeyParameterArray> = None;
336 let mut output: Option<ByteArray> = None;
Janis Danisevskis1af91262020-08-10 14:58:08 -0700337
Stephen Crane221bbb52020-12-16 15:52:10 -0800338 let km_op: binder::public_api::Strong<dyn IKeyMintOperation> =
Janis Danisevskis1af91262020-08-10 14:58:08 -0700339 self.km_op.get_interface().context("In update: Failed to get KeyMintOperation.")?;
340
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800341 let (hat, tst) = self
342 .auth_info
343 .lock()
344 .unwrap()
Qi Wub9433b52020-12-01 14:52:46 +0800345 .before_update()
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800346 .context("In update_aad: Trying to get auth tokens.")?;
347
Janis Danisevskis1af91262020-08-10 14:58:08 -0700348 self.update_outcome(
349 &mut *outcome,
350 map_km_error(km_op.update(
Janis Danisevskis85d47932020-10-23 16:12:59 -0700351 Some(&params),
352 None,
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800353 hat.as_ref(),
354 tst.as_ref(),
Janis Danisevskis1af91262020-08-10 14:58:08 -0700355 &mut out_params,
356 &mut output,
357 )),
358 )
359 .context("In update_aad: KeyMint::update failed.")?;
360
361 Ok(())
362 }
363
364 /// Implementation of `IKeystoreOperation::update`.
365 /// Refer to the AIDL spec at system/hardware/interfaces/keystore2 for details.
366 fn update(&self, input: &[u8]) -> Result<Option<Vec<u8>>> {
367 let mut outcome = self.check_active().context("In update")?;
368 Self::check_input_length(input).context("In update")?;
369 self.touch();
370
Janis Danisevskis85d47932020-10-23 16:12:59 -0700371 let mut out_params: Option<KeyParameterArray> = None;
Janis Danisevskis1af91262020-08-10 14:58:08 -0700372
Stephen Crane221bbb52020-12-16 15:52:10 -0800373 let km_op: binder::public_api::Strong<dyn IKeyMintOperation> =
Janis Danisevskis1af91262020-08-10 14:58:08 -0700374 self.km_op.get_interface().context("In update: Failed to get KeyMintOperation.")?;
375
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800376 let (hat, tst) = self
377 .auth_info
378 .lock()
379 .unwrap()
Qi Wub9433b52020-12-01 14:52:46 +0800380 .before_update()
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800381 .context("In update: Trying to get auth tokens.")?;
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000382
Janis Danisevskis002891c2021-01-31 13:07:02 -0800383 let mut result: Option<Vec<u8>> = None;
384 let mut consumed = 0usize;
385 loop {
386 let mut output: Option<ByteArray> = None;
387 consumed += self
388 .update_outcome(
389 &mut *outcome,
390 map_km_error(km_op.update(
391 None,
392 Some(&input[consumed..]),
393 hat.as_ref(),
394 tst.as_ref(),
395 &mut out_params,
396 &mut output,
397 )),
398 )
399 .context("In update: KeyMint::update failed.")? as usize;
Janis Danisevskis1af91262020-08-10 14:58:08 -0700400
Janis Danisevskis002891c2021-01-31 13:07:02 -0800401 match (output, &mut result) {
402 (Some(blob), None) => {
403 if !blob.data.is_empty() {
404 result = Some(blob.data)
405 }
Janis Danisevskis3cfd4a42020-11-23 13:42:38 -0800406 }
Janis Danisevskis002891c2021-01-31 13:07:02 -0800407 (Some(mut blob), Some(ref mut result)) => {
408 result.append(&mut blob.data);
409 }
410 (None, _) => {}
Janis Danisevskis3cfd4a42020-11-23 13:42:38 -0800411 }
Janis Danisevskis002891c2021-01-31 13:07:02 -0800412
413 if consumed == input.len() {
414 return Ok(result);
415 }
Janis Danisevskis1af91262020-08-10 14:58:08 -0700416 }
417 }
418
419 /// Implementation of `IKeystoreOperation::finish`.
420 /// Refer to the AIDL spec at system/hardware/interfaces/keystore2 for details.
421 fn finish(&self, input: Option<&[u8]>, signature: Option<&[u8]>) -> Result<Option<Vec<u8>>> {
422 let mut outcome = self.check_active().context("In finish")?;
423 if let Some(input) = input {
424 Self::check_input_length(input).context("In finish")?;
425 }
426 self.touch();
Janis Danisevskis1af91262020-08-10 14:58:08 -0700427
Janis Danisevskis85d47932020-10-23 16:12:59 -0700428 let mut out_params: Option<KeyParameterArray> = None;
Janis Danisevskis1af91262020-08-10 14:58:08 -0700429
Stephen Crane221bbb52020-12-16 15:52:10 -0800430 let km_op: binder::public_api::Strong<dyn IKeyMintOperation> =
Janis Danisevskis1af91262020-08-10 14:58:08 -0700431 self.km_op.get_interface().context("In finish: Failed to get KeyMintOperation.")?;
432
Janis Danisevskisb1673db2021-02-08 18:11:57 -0800433 let (hat, tst, confirmation_token) = self
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800434 .auth_info
435 .lock()
436 .unwrap()
Qi Wub9433b52020-12-01 14:52:46 +0800437 .before_finish()
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800438 .context("In finish: Trying to get auth tokens.")?;
Hasini Gunasinghe888dd352020-11-17 23:08:39 +0000439
Janis Danisevskisb1673db2021-02-08 18:11:57 -0800440 let in_params = confirmation_token.map(|token| KeyParameterArray {
441 params: vec![KmParam {
442 tag: Tag::CONFIRMATION_TOKEN,
443 value: KmParamValue::Blob(token),
444 }],
445 });
446
Janis Danisevskis85d47932020-10-23 16:12:59 -0700447 let output = self
448 .update_outcome(
449 &mut *outcome,
450 map_km_error(km_op.finish(
Janis Danisevskisb1673db2021-02-08 18:11:57 -0800451 in_params.as_ref(),
Janis Danisevskis85d47932020-10-23 16:12:59 -0700452 input,
453 signature,
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800454 hat.as_ref(),
455 tst.as_ref(),
Janis Danisevskis85d47932020-10-23 16:12:59 -0700456 &mut out_params,
457 )),
458 )
459 .context("In finish: KeyMint::finish failed.")?;
Janis Danisevskis1af91262020-08-10 14:58:08 -0700460
Qi Wub9433b52020-12-01 14:52:46 +0800461 self.auth_info.lock().unwrap().after_finish().context("In finish.")?;
462
Janis Danisevskis1af91262020-08-10 14:58:08 -0700463 // At this point the operation concluded successfully.
464 *outcome = Outcome::Success;
465
466 if output.is_empty() {
467 Ok(None)
468 } else {
469 Ok(Some(output))
470 }
471 }
472
473 /// Aborts the operation if it is active. IFF the operation is aborted the outcome is
474 /// set to `outcome`. `outcome` must reflect the reason for the abort. Since the operation
475 /// gets aborted `outcome` must not be `Operation::Success` or `Operation::Unknown`.
476 fn abort(&self, outcome: Outcome) -> Result<()> {
477 let mut locked_outcome = self.check_active().context("In abort")?;
478 *locked_outcome = outcome;
Stephen Crane221bbb52020-12-16 15:52:10 -0800479 let km_op: binder::public_api::Strong<dyn IKeyMintOperation> =
Janis Danisevskis1af91262020-08-10 14:58:08 -0700480 self.km_op.get_interface().context("In abort: Failed to get KeyMintOperation.")?;
481
482 map_km_error(km_op.abort()).context("In abort: KeyMint::abort failed.")
483 }
484}
485
486impl Drop for Operation {
487 fn drop(&mut self) {
488 if let Ok(Outcome::Unknown) = self.outcome.get_mut() {
489 // If the operation was still active we call abort, setting
490 // the outcome to `Outcome::Dropped`
491 if let Err(e) = self.abort(Outcome::Dropped) {
492 log::error!("While dropping Operation: abort failed:\n {:?}", e);
493 }
494 }
495 }
496}
497
498/// The OperationDb holds weak references to all ongoing operations.
499/// Its main purpose is to facilitate operation pruning.
500#[derive(Debug, Default)]
501pub struct OperationDb {
502 // TODO replace Vec with WeakTable when the weak_table crate becomes
503 // available.
504 operations: Mutex<Vec<Weak<Operation>>>,
505}
506
507impl OperationDb {
508 /// Creates a new OperationDb.
509 pub fn new() -> Self {
510 Self { operations: Mutex::new(Vec::new()) }
511 }
512
513 /// Creates a new operation.
514 /// This function takes a KeyMint operation and an associated
515 /// owner uid and returns a new Operation wrapped in a `std::sync::Arc`.
516 pub fn create_operation(
517 &self,
Stephen Crane221bbb52020-12-16 15:52:10 -0800518 km_op: binder::public_api::Strong<dyn IKeyMintOperation>,
Janis Danisevskis1af91262020-08-10 14:58:08 -0700519 owner: u32,
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800520 auth_info: AuthInfo,
Janis Danisevskis1af91262020-08-10 14:58:08 -0700521 ) -> Arc<Operation> {
522 // We use unwrap because we don't allow code that can panic while locked.
523 let mut operations = self.operations.lock().expect("In create_operation.");
524
525 let mut index: usize = 0;
526 // First we iterate through the operation slots to try and find an unused
527 // slot. If we don't find one, we append the new entry instead.
528 match (*operations).iter_mut().find(|s| {
529 index += 1;
530 s.upgrade().is_none()
531 }) {
532 Some(free_slot) => {
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800533 let new_op = Arc::new(Operation::new(index - 1, km_op, owner, auth_info));
Janis Danisevskis1af91262020-08-10 14:58:08 -0700534 *free_slot = Arc::downgrade(&new_op);
535 new_op
536 }
537 None => {
Janis Danisevskis5ed8c532021-01-11 14:19:42 -0800538 let new_op = Arc::new(Operation::new(operations.len(), km_op, owner, auth_info));
Janis Danisevskis1af91262020-08-10 14:58:08 -0700539 operations.push(Arc::downgrade(&new_op));
540 new_op
541 }
542 }
543 }
544
545 fn get(&self, index: usize) -> Option<Arc<Operation>> {
546 self.operations.lock().expect("In OperationDb::get.").get(index).and_then(|op| op.upgrade())
547 }
548
549 /// Attempts to prune an operation.
550 ///
551 /// This function is used during operation creation, i.e., by
552 /// `KeystoreSecurityLevel::create_operation`, to try and free up an operation slot
553 /// if it got `ErrorCode::TOO_MANY_OPERATIONS` from the KeyMint backend. It is not
554 /// guaranteed that an operation slot is available after this call successfully
555 /// returned for various reasons. E.g., another thread may have snatched up the newly
556 /// available slot. Callers may have to call prune multiple times before they get a
557 /// free operation slot. Prune may also return `Err(Error::Rc(ResponseCode::BACKEND_BUSY))`
558 /// which indicates that no prunable operation was found.
559 ///
560 /// To find a suitable candidate we compute the malus for the caller and each existing
561 /// operation. The malus is the inverse of the pruning power (caller) or pruning
562 /// resistance (existing operation).
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700563 ///
Janis Danisevskis1af91262020-08-10 14:58:08 -0700564 /// The malus is based on the number of sibling operations and age. Sibling
565 /// operations are operations that have the same owner (UID).
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700566 ///
Janis Danisevskis1af91262020-08-10 14:58:08 -0700567 /// Every operation, existing or new, starts with a malus of 1. Every sibling
568 /// increases the malus by one. The age is the time since an operation was last touched.
569 /// It increases the malus by log6(<age in seconds> + 1) rounded down to the next
570 /// integer. So the malus increases stepwise after 5s, 35s, 215s, ...
571 /// Of two operations with the same malus the least recently used one is considered
572 /// weaker.
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700573 ///
Janis Danisevskis1af91262020-08-10 14:58:08 -0700574 /// For the caller to be able to prune an operation it must find an operation
575 /// with a malus higher than its own.
576 ///
577 /// The malus can be expressed as
578 /// ```
579 /// malus = 1 + no_of_siblings + floor(log6(age_in_seconds + 1))
580 /// ```
581 /// where the constant `1` accounts for the operation under consideration.
582 /// In reality we compute it as
583 /// ```
584 /// caller_malus = 1 + running_siblings
585 /// ```
586 /// because the new operation has no age and is not included in the `running_siblings`,
587 /// and
588 /// ```
589 /// running_malus = running_siblings + floor(log6(age_in_seconds + 1))
590 /// ```
591 /// because a running operation is included in the `running_siblings` and it has
592 /// an age.
593 ///
594 /// ## Example
595 /// A caller with no running operations has a malus of 1. Young (age < 5s) operations
596 /// also with no siblings have a malus of one and cannot be pruned by the caller.
597 /// We have to find an operation that has at least one sibling or is older than 5s.
598 ///
599 /// A caller with one running operation has a malus of 2. Now even young siblings
600 /// or single child aging (5s <= age < 35s) operations are off limit. An aging
601 /// sibling of two, however, would have a malus of 3 and would be fair game.
602 ///
603 /// ## Rationale
604 /// Due to the limitation of KeyMint operation slots, we cannot get around pruning or
605 /// a single app could easily DoS KeyMint.
606 /// Keystore 1.0 used to always prune the least recently used operation. This at least
607 /// guaranteed that new operations can always be started. With the increased usage
608 /// of Keystore we saw increased pruning activity which can lead to a livelock
609 /// situation in the worst case.
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700610 ///
Janis Danisevskis1af91262020-08-10 14:58:08 -0700611 /// With the new pruning strategy we want to provide well behaved clients with
612 /// progress assurances while punishing DoS attempts. As a result of this
613 /// strategy we can be in the situation where no operation can be pruned and the
614 /// creation of a new operation fails. This allows single child operations which
615 /// are frequently updated to complete, thereby breaking up livelock situations
616 /// and facilitating system wide progress.
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700617 ///
618 /// ## Update
619 /// We also allow callers to cannibalize their own sibling operations if no other
620 /// slot can be found. In this case the least recently used sibling is pruned.
Janis Danisevskis1af91262020-08-10 14:58:08 -0700621 pub fn prune(&self, caller: u32) -> Result<(), Error> {
622 loop {
623 // Maps the uid of the owner to the number of operations that owner has
624 // (running_siblings). More operations per owner lowers the pruning
625 // resistance of the operations of that owner. Whereas the number of
626 // ongoing operations of the caller lowers the pruning power of the caller.
627 let mut owners: HashMap<u32, u64> = HashMap::new();
628 let mut pruning_info: Vec<PruningInfo> = Vec::new();
629
630 let now = Instant::now();
631 self.operations
632 .lock()
633 .expect("In OperationDb::prune: Trying to lock self.operations.")
634 .iter()
635 .for_each(|op| {
636 if let Some(op) = op.upgrade() {
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700637 if let Some(p_info) = op.get_pruning_info() {
638 let owner = p_info.owner;
639 pruning_info.push(p_info);
640 // Count operations per owner.
641 *owners.entry(owner).or_insert(0) += 1;
642 }
Janis Danisevskis1af91262020-08-10 14:58:08 -0700643 }
644 });
645
646 let caller_malus = 1u64 + *owners.entry(caller).or_default();
647
648 // We iterate through all operations computing the malus and finding
649 // the candidate with the highest malus which must also be higher
650 // than the caller_malus.
651 struct CandidateInfo {
652 index: usize,
653 malus: u64,
654 last_usage: Instant,
655 age: Duration,
656 }
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700657 let mut oldest_caller_op: Option<CandidateInfo> = None;
Janis Danisevskis1af91262020-08-10 14:58:08 -0700658 let candidate = pruning_info.iter().fold(
659 None,
660 |acc: Option<CandidateInfo>, &PruningInfo { last_usage, owner, index }| {
661 // Compute the age of the current operation.
662 let age = now
663 .checked_duration_since(last_usage)
664 .unwrap_or_else(|| Duration::new(0, 0));
665
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700666 // Find the least recently used sibling as an alternative pruning candidate.
667 if owner == caller {
668 if let Some(CandidateInfo { age: a, .. }) = oldest_caller_op {
669 if age > a {
670 oldest_caller_op =
671 Some(CandidateInfo { index, malus: 0, last_usage, age });
672 }
673 } else {
674 oldest_caller_op =
675 Some(CandidateInfo { index, malus: 0, last_usage, age });
676 }
677 }
678
Janis Danisevskis1af91262020-08-10 14:58:08 -0700679 // Compute the malus of the current operation.
680 // Expect safety: Every owner in pruning_info was counted in
681 // the owners map. So this unwrap cannot panic.
682 let malus = *owners
683 .get(&owner)
684 .expect("This is odd. We should have counted every owner in pruning_info.")
685 + ((age.as_secs() + 1) as f64).log(6.0).floor() as u64;
686
687 // Now check if the current operation is a viable/better candidate
688 // the one currently stored in the accumulator.
689 match acc {
690 // First we have to find any operation that is prunable by the caller.
691 None => {
692 if caller_malus < malus {
693 Some(CandidateInfo { index, malus, last_usage, age })
694 } else {
695 None
696 }
697 }
698 // If we have found one we look for the operation with the worst score.
699 // If there is a tie, the older operation is considered weaker.
700 Some(CandidateInfo { index: i, malus: m, last_usage: l, age: a }) => {
701 if malus > m || (malus == m && age > a) {
702 Some(CandidateInfo { index, malus, last_usage, age })
703 } else {
704 Some(CandidateInfo { index: i, malus: m, last_usage: l, age: a })
705 }
706 }
707 }
708 },
709 );
710
Janis Danisevskis45c5c972020-10-26 09:35:16 -0700711 // If we did not find a suitable candidate we may cannibalize our oldest sibling.
712 let candidate = candidate.or(oldest_caller_op);
713
Janis Danisevskis1af91262020-08-10 14:58:08 -0700714 match candidate {
715 Some(CandidateInfo { index, malus: _, last_usage, age: _ }) => {
716 match self.get(index) {
717 Some(op) => {
718 match op.prune(last_usage) {
719 // We successfully freed up a slot.
720 Ok(()) => break Ok(()),
721 // This means the operation we tried to prune was on its way
722 // out. It also means that the slot it had occupied was freed up.
723 Err(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE)) => break Ok(()),
724 // This means the operation we tried to prune was currently
725 // servicing a request. There are two options.
726 // * Assume that it was touched, which means that its
727 // pruning resistance increased. In that case we have
728 // to start over and find another candidate.
729 // * Assume that the operation is transitioning to end-of-life.
730 // which means that we got a free slot for free.
731 // If we assume the first but the second is true, we prune
732 // a good operation without need (aggressive approach).
733 // If we assume the second but the first is true, our
734 // caller will attempt to create a new KeyMint operation,
735 // fail with `ErrorCode::TOO_MANY_OPERATIONS`, and call
736 // us again (conservative approach).
737 Err(Error::Rc(ResponseCode::OPERATION_BUSY)) => {
738 // We choose the conservative approach, because
739 // every needlessly pruned operation can impact
740 // the user experience.
741 // To switch to the aggressive approach replace
742 // the following line with `continue`.
743 break Ok(());
744 }
745
746 // The candidate may have been touched so the score
747 // has changed since our evaluation.
748 _ => continue,
749 }
750 }
751 // This index does not exist any more. The operation
752 // in this slot was dropped. Good news, a slot
753 // has freed up.
754 None => break Ok(()),
755 }
756 }
757 // We did not get a pruning candidate.
758 None => break Err(Error::Rc(ResponseCode::BACKEND_BUSY)),
759 }
760 }
761 }
762}
763
764/// Implementation of IKeystoreOperation.
765pub struct KeystoreOperation {
766 operation: Mutex<Option<Arc<Operation>>>,
767}
768
769impl KeystoreOperation {
770 /// Creates a new operation instance wrapped in a
771 /// BnKeystoreOperation proxy object. It also
772 /// calls `IBinder::set_requesting_sid` on the new interface, because
773 /// we need it for checking Keystore permissions.
Stephen Crane221bbb52020-12-16 15:52:10 -0800774 pub fn new_native_binder(
775 operation: Arc<Operation>,
776 ) -> binder::public_api::Strong<dyn IKeystoreOperation> {
Janis Danisevskis1af91262020-08-10 14:58:08 -0700777 let result =
778 BnKeystoreOperation::new_binder(Self { operation: Mutex::new(Some(operation)) });
779 result.as_binder().set_requesting_sid(true);
780 result
781 }
782
783 /// Grabs the outer operation mutex and calls `f` on the locked operation.
784 /// The function also deletes the operation if it returns with an error or if
785 /// `delete_op` is true.
786 fn with_locked_operation<T, F>(&self, f: F, delete_op: bool) -> Result<T>
787 where
788 for<'a> F: FnOnce(&'a Operation) -> Result<T>,
789 {
790 let mut delete_op: bool = delete_op;
791 match self.operation.try_lock() {
792 Ok(mut mutex_guard) => {
793 let result = match &*mutex_guard {
794 Some(op) => {
795 let result = f(&*op);
796 // Any error here means we can discard the operation.
797 if result.is_err() {
798 delete_op = true;
799 }
800 result
801 }
802 None => Err(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE))
803 .context("In KeystoreOperation::with_locked_operation"),
804 };
805
806 if delete_op {
807 // We give up our reference to the Operation, thereby freeing up our
808 // internal resources and ending the wrapped KeyMint operation.
809 // This KeystoreOperation object will still be owned by an SpIBinder
810 // until the client drops its remote reference.
811 *mutex_guard = None;
812 }
813 result
814 }
815 Err(_) => Err(Error::Rc(ResponseCode::OPERATION_BUSY))
816 .context("In KeystoreOperation::with_locked_operation"),
817 }
818 }
819}
820
821impl binder::Interface for KeystoreOperation {}
822
823impl IKeystoreOperation for KeystoreOperation {
824 fn updateAad(&self, aad_input: &[u8]) -> binder::public_api::Result<()> {
825 map_or_log_err(
826 self.with_locked_operation(
827 |op| op.update_aad(aad_input).context("In KeystoreOperation::updateAad"),
828 false,
829 ),
830 Ok,
831 )
832 }
833
834 fn update(&self, input: &[u8]) -> binder::public_api::Result<Option<Vec<u8>>> {
835 map_or_log_err(
836 self.with_locked_operation(
837 |op| op.update(input).context("In KeystoreOperation::update"),
838 false,
839 ),
840 Ok,
841 )
842 }
843 fn finish(
844 &self,
845 input: Option<&[u8]>,
846 signature: Option<&[u8]>,
847 ) -> binder::public_api::Result<Option<Vec<u8>>> {
848 map_or_log_err(
849 self.with_locked_operation(
850 |op| op.finish(input, signature).context("In KeystoreOperation::finish"),
851 true,
852 ),
853 Ok,
854 )
855 }
856
857 fn abort(&self) -> binder::public_api::Result<()> {
858 map_or_log_err(
859 self.with_locked_operation(
860 |op| op.abort(Outcome::Abort).context("In KeystoreOperation::abort"),
861 true,
862 ),
863 Ok,
864 )
865 }
866}