Blame - keystore2/src/operation.rs - android_system_security

//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

//! This crate implements the `IKeystoreOperation` AIDL interface, which represents
//! an ongoing key operation, as well as the operation database, which is mainly
//! required for tracking operations for the purpose of pruning.
//! This crate also implements an operation pruning strategy.
//!
//! Operations implement the API calls update, finish, and abort.
//! Additionally, an operation can be dropped and pruned. The former
//! happens if the client deletes a binder to the operation object.
//! An existing operation may get pruned when running out of operation
//! slots and a new operation takes precedence.
//!
//! ## Operation Lifecycle
//! An operation gets created when the client calls `IKeystoreSecurityLevel::create`.
//! It may receive zero or more update requests. The lifecycle ends when:
//!  * `update` yields an error.
//!  * `finish` is called.
//!  * `abort` is called.
//!  * The operation gets dropped.
//!  * The operation gets pruned.
//! `Operation` has an `Outcome` member. While the outcome is `Outcome::Unknown`,
//! the operation is active and in a good state. Any of the above conditions may
//! change the outcome to one of the defined outcomes Success, Abort, Dropped,
//! Pruned, or ErrorCode. The latter is chosen in the case of an unexpected error during
//! `update` or `finish`. `Success` is chosen iff `finish` completes without error.
//! Note that all operations get dropped eventually in the sense that they lose
//! their last reference and get destroyed. At that point, the fate of the operation
//! gets logged. However, an operation will transition to `Outcome::Dropped` iff
//! the operation was still active (`Outcome::Unknown`) at that time.

//!
//! ## Operation Dropping
//! To observe the dropping of an operation, we have to make sure that there
//! are no strong references to the IBinder representing this operation.
//! This would be simple enough if the operation object needed to be accessed
//! only by transactions. But to perform pruning, we have to retain a reference to the
//! original operation object.
//!
//! ## Operation Pruning
//! Pruning an operation happens during the creation of a new operation.
//! We have to iterate through the operation database to find a suitable
//! candidate. Then we abort and finalize this operation, setting its outcome to
//! `Outcome::Pruned`. The corresponding KeyMint operation slot will have been freed
//! up at this point, but the `Operation` object lingers. When the client
//! attempts to use the operation again, they will receive
//! `ErrorCode::INVALID_OPERATION_HANDLE`, indicating that the operation no longer
//! exists. This should be the cue for the client to destroy its binder.
//! At that point the operation gets dropped.
//!

//! ## Architecture
//! The `IKeystoreOperation` trait is implemented by `KeystoreOperation`.
//! This acts as a proxy object holding a strong reference to the actual operation
//! implementation `Operation`.
//!
//! ```
//! struct KeystoreOperation {
//!     operation: Mutex<Option<Arc<Operation>>>,
//! }
//! ```
//!
//! The `Mutex` serves two purposes. It provides interior mutability allowing
//! us to set the Option to None. We do this when the life cycle ends during
//! a call to `update`, `finish`, or `abort`. As a result most of the Operation
//! related resources are freed. The `KeystoreOperation` proxy object still
//! lingers until dropped by the client.
//! The second purpose is to protect operations against concurrent usage.
//! Failing to lock this mutex yields `ResponseCode::OPERATION_BUSY` and indicates
//! a programming error in the client.
//!
//! Note that the Mutex only protects the operation against concurrent client calls.
//! We still retain weak references to the operation in the operation database:
//!
//! ```
//! struct OperationDb {
//!     operations: Mutex<Vec<Weak<Operation>>>
//! }
//! ```
//!

//! This allows us to access the operations for the purpose of pruning.
//! We do this in three phases.
//!  1. We gather the pruning information. Besides non-mutable information,
//!     we access `last_usage`, which is protected by a mutex.
//!     We only lock this mutex for single statements at a time. During
//!     this phase we hold the operation db lock.
//!  2. We choose a pruning candidate by computing the pruning resistance
//!     of each operation. We do this entirely with information we now
//!     have on the stack without holding any locks.
//!     (See `OperationDb::prune` for more details on the pruning strategy.)
//!  3. During pruning we briefly lock the operation database again to get
//!     the pruning candidate by index. We then attempt to abort the candidate.
//!     If the candidate was touched in the meantime or is currently fulfilling
//!     a request (i.e., the client calls update, finish, or abort),
//!     we go back to 1 and try again.
//!
//! So the outer Mutex in `KeystoreOperation::operation` only protects
//! operations against concurrent client calls but not against concurrent
//! pruning attempts. This is what the `Operation::outcome` mutex is used for.
//!
//! ```
//! struct Operation {
//!     ...
//!     outcome: Mutex<Outcome>,
//!     ...
//! }
//! ```
//!
//! Any request that can change the outcome, i.e., `update`, `finish`, `abort`,
//! `drop`, and `prune` has to take the outcome lock and check if the outcome
//! is still `Outcome::Unknown` before entering. `prune` is special in that
//! it will `try_lock`, because we don't want to be blocked on a potentially
//! long running request at another operation. If it fails to get the lock
//! the operation is either being touched, which changes its pruning resistance,
//! or it transitions to its end-of-life, which means we may get a free slot.
//! Either way, we have to reevaluate the pruning scores.
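
// The outcome-lock rule described above can be sketched in isolation. This is a
// minimal illustration and not the Keystore implementation: `Op`, `OpError`, and
// the trimmed-down `Outcome` are hypothetical stand-ins, and all KeyMint calls
// are omitted. Client calls block on the outcome lock, whereas `prune` uses
// `try_lock` and reports a busy operation instead of waiting.

```rust
use std::sync::Mutex;

// Reduced outcome set; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Unknown,
    Success,
    Pruned,
}

// Hypothetical error type standing in for the Keystore error codes.
#[derive(Debug, PartialEq, Eq)]
enum OpError {
    InvalidHandle,
    Busy,
}

struct Op {
    outcome: Mutex<Outcome>,
}

impl Op {
    // Client path: wait for the lock, then proceed only if still active.
    fn finish(&self) -> Result<(), OpError> {
        let mut guard = self.outcome.lock().expect("outcome lock poisoned");
        if *guard != Outcome::Unknown {
            return Err(OpError::InvalidHandle);
        }
        *guard = Outcome::Success;
        Ok(())
    }

    // Pruning path: never block on an operation that is in use.
    fn prune(&self) -> Result<(), OpError> {
        let mut guard = match self.outcome.try_lock() {
            Ok(guard) => guard,
            Err(_) => return Err(OpError::Busy),
        };
        if *guard != Outcome::Unknown {
            return Err(OpError::InvalidHandle);
        }
        *guard = Outcome::Pruned;
        Ok(())
    }
}
```

// Under these assumptions, pruning an operation whose outcome lock is held
// yields `OpError::Busy`, and any call on a finalized operation yields
// `OpError::InvalidHandle`.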

use crate::enforcements::AuthInfo;
use crate::error::{map_err_with, map_km_error, map_or_log_err, Error, ErrorCode, ResponseCode};
use crate::metrics::log_key_operation_event_stats;
use crate::utils::watchdog as wd;
use android_hardware_security_keymint::aidl::android::hardware::security::keymint::{
    IKeyMintOperation::IKeyMintOperation, KeyParameter::KeyParameter, KeyPurpose::KeyPurpose,
    SecurityLevel::SecurityLevel,
};
use android_hardware_security_keymint::binder::{BinderFeatures, Strong};
use android_system_keystore2::aidl::android::system::keystore2::{
    IKeystoreOperation::BnKeystoreOperation, IKeystoreOperation::IKeystoreOperation,
};
use anyhow::{anyhow, Context, Result};
use std::{
    collections::HashMap,
    sync::{Arc, Mutex, MutexGuard, Weak},
    time::Duration,
    time::Instant,
};

/// Operations have `Outcome::Unknown` as long as they are active. They transition
/// to one of the other variants exactly once. The distinction in outcome is mainly
/// for statistics.
#[derive(Debug, Copy, Clone, Eq, PartialEq, Ord, PartialOrd)]
pub enum Outcome {
    /// Operations have `Outcome::Unknown` as long as they are active.
    Unknown,
    /// Operation is successful.
    Success,
    /// Operation is aborted.
    Abort,
    /// Operation is dropped.
    Dropped,
    /// Operation is pruned.
    Pruned,
    /// Operation failed with the given error code.
    ErrorCode(ErrorCode),
}

/// Operation bundles all of the operation related resources and tracks the operation's
/// outcome.
#[derive(Debug)]
pub struct Operation {
    // The index of this operation in the OperationDb.
    index: usize,
    km_op: Strong<dyn IKeyMintOperation>,
    last_usage: Mutex<Instant>,
    outcome: Mutex<Outcome>,
    owner: u32, // Uid of the operation's owner.
    auth_info: Mutex<AuthInfo>,
    forced: bool,
    logging_info: LoggingInfo,
}

/// Keeps track of the information required for logging operations.
#[derive(Debug)]
pub struct LoggingInfo {
    sec_level: SecurityLevel,
    purpose: KeyPurpose,
    op_params: Vec<KeyParameter>,
    key_upgraded: bool,
}

impl LoggingInfo {
    /// Constructor
    pub fn new(
        sec_level: SecurityLevel,
        purpose: KeyPurpose,
        op_params: Vec<KeyParameter>,
        key_upgraded: bool,
    ) -> LoggingInfo {
        Self { sec_level, purpose, op_params, key_upgraded }
    }
}

struct PruningInfo {
    last_usage: Instant,
    owner: u32,
    index: usize,
    forced: bool,
}
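
// The `last_usage` and `owner` fields gathered in `PruningInfo` feed the malus
// that the documentation of `OperationDb::prune` gives as
// `running_siblings + floor(log6(age_in_seconds + 1))` for a running operation
// and `1 + running_siblings` for the caller. The following is a minimal sketch
// of that arithmetic only; the function names are hypothetical, and the integer
// loop is an assumption chosen for illustration, not necessarily how this file
// computes the value.

```rust
use std::time::Duration;

/// Returns floor(log6(age_in_seconds + 1)) using integer arithmetic.
fn age_malus(age: Duration) -> u64 {
    let mut remaining = age.as_secs().saturating_add(1);
    let mut steps = 0;
    // Each full division by 6 corresponds to one more step of the logarithm.
    while remaining >= 6 {
        remaining /= 6;
        steps += 1;
    }
    steps
}

/// Malus of a running operation; `running_siblings` counts the operation itself.
fn running_malus(running_siblings: u64, age: Duration) -> u64 {
    running_siblings + age_malus(age)
}

/// Malus of the caller; the new operation has no age and is not yet counted
/// in `running_siblings`.
fn caller_malus(running_siblings: u64) -> u64 {
    1 + running_siblings
}
```

// With this sketch the age term steps up at 5s, 35s, and 215s, as the
// documentation states. A caller can only prune an operation whose malus is
// strictly higher than its own.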

// We don't accept more than 32KiB of data in `update`, `updateAad`, and `finish`.
const MAX_RECEIVE_DATA: usize = 0x8000;

impl Operation {
    /// Constructor
    pub fn new(
        index: usize,
        km_op: binder::Strong<dyn IKeyMintOperation>,
        owner: u32,
        auth_info: AuthInfo,
        forced: bool,
        logging_info: LoggingInfo,
    ) -> Self {
        Self {
            index,
            km_op,
            last_usage: Mutex::new(Instant::now()),
            outcome: Mutex::new(Outcome::Unknown),
            owner,
            auth_info: Mutex::new(auth_info),
            forced,
            logging_info,
        }
    }

    fn get_pruning_info(&self) -> Option<PruningInfo> {
        // An operation may be finalized.
        if let Ok(guard) = self.outcome.try_lock() {
            match *guard {
                Outcome::Unknown => {}
                // If the outcome is any other than unknown, it has been finalized,
                // and we can no longer consider it for pruning.
                _ => return None,
            }
        }
        // Else: If we could not grab the lock, this means that the operation is currently
        // being used and it may be transitioning to finalized or it was simply updated.
        // In any case it is fair game to consider it for pruning. If the operation
        // transitioned to a final state, we will notice when we attempt to prune, and
        // a subsequent attempt to create a new operation will succeed.
        Some(PruningInfo {
            // Expect safety:
            // `last_usage` is locked only for primitive single line statements.
            // There is no chance to panic and poison the mutex.
            last_usage: *self.last_usage.lock().expect("In get_pruning_info."),
            owner: self.owner,
            index: self.index,
            forced: self.forced,
        })
    }

    fn prune(&self, last_usage: Instant) -> Result<(), Error> {
        let mut locked_outcome = match self.outcome.try_lock() {
            Ok(guard) => match *guard {
                Outcome::Unknown => guard,
                _ => return Err(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE)),
            },
            Err(_) => return Err(Error::Rc(ResponseCode::OPERATION_BUSY)),
        };

        // In `OperationDb::prune`, which is our caller, we first gather the pruning
        // information including the last usage. When we select a candidate
        // we call `prune` on that candidate passing the last_usage
        // that we gathered earlier. If the actual last usage
        // has changed since then, it means the operation was busy in the
        // meantime, which means that we have to reevaluate the pruning score.
        //
        // Expect safety:
        // `last_usage` is locked only for primitive single line statements.
        // There is no chance to panic and poison the mutex.
        if *self.last_usage.lock().expect("In Operation::prune()") != last_usage {
            return Err(Error::Rc(ResponseCode::OPERATION_BUSY));
        }
        *locked_outcome = Outcome::Pruned;

        let _wp = wd::watch_millis("In Operation::prune: calling abort()", 500);

        // We abort the operation. If there was an error we log it but ignore it.
        if let Err(e) = map_km_error(self.km_op.abort()) {
            log::error!("In prune: KeyMint::abort failed with {:?}.", e);
        }

        Ok(())
    }

    // This function takes a Result from a KeyMint call and inspects it for errors.
    // If an error was found it updates the given `locked_outcome` accordingly.
    // It forwards the Result unmodified.
    // The precondition to this call must be *locked_outcome == Outcome::Unknown.
    // Ideally the `locked_outcome` came from a successful call to `check_active`
    // see below.
    fn update_outcome<T>(
        &self,
        locked_outcome: &mut Outcome,
        err: Result<T, Error>,
    ) -> Result<T, Error> {
        match &err {
            Err(Error::Km(e)) => *locked_outcome = Outcome::ErrorCode(*e),
            Err(_) => *locked_outcome = Outcome::ErrorCode(ErrorCode::UNKNOWN_ERROR),
            Ok(_) => (),
        }
        err
    }

    // This function grabs the outcome lock and checks the current outcome state.
    // If the outcome is still `Outcome::Unknown`, this function returns
    // the locked outcome for further updates. In any other case it returns
    // ErrorCode::INVALID_OPERATION_HANDLE indicating that this operation has
    // been finalized and is no longer active.
    fn check_active(&self) -> Result<MutexGuard<Outcome>> {
        let guard = self.outcome.lock().expect("In check_active.");
        match *guard {
            Outcome::Unknown => Ok(guard),
            _ => Err(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE)).context(format!(
                "In check_active: Call on finalized operation with outcome: {:?}.",
                *guard
            )),
        }
    }

    // This function checks the amount of input data sent to us. We reject any buffer
    // exceeding MAX_RECEIVE_DATA bytes as input to `update`, `update_aad`, and `finish`
    // in order to force clients into using reasonable limits.
    fn check_input_length(data: &[u8]) -> Result<()> {
        if data.len() > MAX_RECEIVE_DATA {
            // This error code is unique, no context required here.
            return Err(anyhow!(Error::Rc(ResponseCode::TOO_MUCH_DATA)));
        }
        Ok(())
    }

    // Update the last usage to now.
    fn touch(&self) {
        // Expect safety:
        // `last_usage` is locked only for primitive single line statements.
        // There is no chance to panic and poison the mutex.
        *self.last_usage.lock().expect("In touch.") = Instant::now();
    }

    /// Implementation of `IKeystoreOperation::updateAad`.
    /// Refer to the AIDL spec at system/hardware/interfaces/keystore2 for details.
    fn update_aad(&self, aad_input: &[u8]) -> Result<()> {
        let mut outcome = self.check_active().context("In update_aad")?;
        Self::check_input_length(aad_input).context("In update_aad")?;
        self.touch();

        let (hat, tst) = self
            .auth_info
            .lock()
            .unwrap()
            .before_update()
            .context("In update_aad: Trying to get auth tokens.")?;

        self.update_outcome(&mut *outcome, {
            let _wp = wd::watch_millis("Operation::update_aad: calling updateAad", 500);
            map_km_error(self.km_op.updateAad(aad_input, hat.as_ref(), tst.as_ref()))
        })
        .context("In update_aad: KeyMint::updateAad failed.")?;

        Ok(())
    }

    /// Implementation of `IKeystoreOperation::update`.
    /// Refer to the AIDL spec at system/hardware/interfaces/keystore2 for details.
    fn update(&self, input: &[u8]) -> Result<Option<Vec<u8>>> {
        let mut outcome = self.check_active().context("In update")?;
        Self::check_input_length(input).context("In update")?;
        self.touch();

        let (hat, tst) = self
            .auth_info
            .lock()
            .unwrap()
            .before_update()
            .context("In update: Trying to get auth tokens.")?;

        let output = self
            .update_outcome(&mut *outcome, {
                let _wp = wd::watch_millis("Operation::update: calling update", 500);
                map_km_error(self.km_op.update(input, hat.as_ref(), tst.as_ref()))
            })
            .context("In update: KeyMint::update failed.")?;

        if output.is_empty() {
            Ok(None)
        } else {
            Ok(Some(output))
        }
    }

    /// Implementation of `IKeystoreOperation::finish`.
    /// Refer to the AIDL spec at system/hardware/interfaces/keystore2 for details.
    fn finish(&self, input: Option<&[u8]>, signature: Option<&[u8]>) -> Result<Option<Vec<u8>>> {
        let mut outcome = self.check_active().context("In finish")?;
        if let Some(input) = input {
            Self::check_input_length(input).context("In finish")?;
        }
        self.touch();

        let (hat, tst, confirmation_token) = self
            .auth_info
            .lock()
            .unwrap()
            .before_finish()
            .context("In finish: Trying to get auth tokens.")?;

        let output = self
            .update_outcome(&mut *outcome, {
                let _wp = wd::watch_millis("Operation::finish: calling finish", 500);
                map_km_error(self.km_op.finish(
                    input,
                    signature,
                    hat.as_ref(),
                    tst.as_ref(),
                    confirmation_token.as_deref(),
                ))
            })
            .context("In finish: KeyMint::finish failed.")?;

        self.auth_info.lock().unwrap().after_finish().context("In finish.")?;

        // At this point the operation concluded successfully.
        *outcome = Outcome::Success;

        if output.is_empty() {
            Ok(None)
        } else {
            Ok(Some(output))
        }
    }

    /// Aborts the operation if it is active. If and only if the operation is aborted, the
    /// outcome is set to `outcome`. `outcome` must reflect the reason for the abort. Since the
    /// operation gets aborted, `outcome` must not be `Outcome::Success` or `Outcome::Unknown`.
    fn abort(&self, outcome: Outcome) -> Result<()> {
        let mut locked_outcome = self.check_active().context("In abort")?;
        *locked_outcome = outcome;

        {
            let _wp = wd::watch_millis("Operation::abort: calling abort", 500);
            map_km_error(self.km_op.abort()).context("In abort: KeyMint::abort failed.")
        }
    }
}

impl Drop for Operation {
    fn drop(&mut self) {
        let guard = self.outcome.lock().expect("In drop.");
        log_key_operation_event_stats(
            self.logging_info.sec_level,
            self.logging_info.purpose,
            &(self.logging_info.op_params),
            &guard,
            self.logging_info.key_upgraded,
        );
        if let Outcome::Unknown = *guard {
            drop(guard);
            // If the operation was still active we call abort, setting
            // the outcome to `Outcome::Dropped`
            if let Err(e) = self.abort(Outcome::Dropped) {
                log::error!("While dropping Operation: abort failed:\n    {:?}", e);
            }
        }
    }
}
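
// The rule that an operation becomes `Dropped` only if it was still active when
// its last reference went away can be sketched without KeyMint. This is an
// illustration under stated assumptions: `Op` and its `log` field are
// hypothetical, the shared `Vec` stands in for the metrics logging, and, unlike
// the code above, the sketch records the outcome after the transition.

```rust
use std::sync::{Arc, Mutex};

// Reduced outcome set; the real enum has more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Outcome {
    Unknown,
    Success,
    Dropped,
}

struct Op {
    outcome: Mutex<Outcome>,
    // Hypothetical sink that receives the final outcome of each operation.
    log: Arc<Mutex<Vec<Outcome>>>,
}

impl Op {
    // Stand-in for a successful `finish`: finalizes the operation.
    fn finish(&self) {
        *self.outcome.lock().expect("outcome lock poisoned") = Outcome::Success;
    }
}

impl Drop for Op {
    fn drop(&mut self) {
        let mut outcome = self.outcome.lock().expect("outcome lock poisoned");
        // Only an operation that is still active transitions to `Dropped`.
        if *outcome == Outcome::Unknown {
            *outcome = Outcome::Dropped;
        }
        self.log.lock().expect("log lock poisoned").push(*outcome);
    }
}
```

// An operation that is finished before being dropped keeps `Success`; one that
// is dropped while still `Unknown` is recorded as `Dropped`.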

/// The OperationDb holds weak references to all ongoing operations.
/// Its main purpose is to facilitate operation pruning.
#[derive(Debug, Default)]
pub struct OperationDb {
    // TODO replace Vec with WeakTable when the weak_table crate becomes
    // available.
    operations: Mutex<Vec<Weak<Operation>>>,
}
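
// Holding only `Weak` references means a slot becomes reusable as soon as the
// last strong reference to its operation is gone. A minimal sketch of that
// bookkeeping, assuming a generic payload instead of `Operation`; `SlotDb`,
// `insert`, and `get` are hypothetical names used for illustration only.

```rust
use std::sync::{Arc, Mutex, Weak};

struct SlotDb<T> {
    slots: Mutex<Vec<Weak<T>>>,
}

impl<T> SlotDb<T> {
    fn new() -> Self {
        Self { slots: Mutex::new(Vec::new()) }
    }

    /// Registers `value`, reusing the first dead slot or appending a new one.
    /// Returns the slot index together with the only strong reference.
    fn insert(&self, value: T) -> (usize, Arc<T>) {
        let mut slots = self.slots.lock().expect("slot lock poisoned");
        let strong = Arc::new(value);
        let weak = Arc::downgrade(&strong);
        // A slot is free when its weak reference can no longer be upgraded.
        match slots.iter().position(|slot| slot.upgrade().is_none()) {
            Some(index) => {
                slots[index] = weak;
                (index, strong)
            }
            None => {
                slots.push(weak);
                (slots.len() - 1, strong)
            }
        }
    }

    /// Returns a strong reference if the entry at `index` is still alive.
    fn get(&self, index: usize) -> Option<Arc<T>> {
        let slots = self.slots.lock().expect("slot lock poisoned");
        slots.get(index).and_then(|slot| slot.upgrade())
    }
}
```

// Because the database never holds a strong reference, it does not keep an
// entry alive; dropping the returned `Arc` is enough to free the slot.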

impl OperationDb {
    /// Creates a new OperationDb.
    pub fn new() -> Self {
        Self { operations: Mutex::new(Vec::new()) }
    }

    /// Creates a new operation.
    /// This function takes a KeyMint operation and an associated
    /// owner uid and returns a new Operation wrapped in a `std::sync::Arc`.
    pub fn create_operation(
        &self,
        km_op: binder::public_api::Strong<dyn IKeyMintOperation>,
        owner: u32,
        auth_info: AuthInfo,
        forced: bool,
        logging_info: LoggingInfo,
    ) -> Arc<Operation> {
        // We use expect because we don't allow code that can panic while locked.
        let mut operations = self.operations.lock().expect("In create_operation.");

        let mut index: usize = 0;
        // First we iterate through the operation slots to try and find an unused
        // slot. If we don't find one, we append the new entry instead.
        match (*operations).iter_mut().find(|s| {
            index += 1;
            s.upgrade().is_none()
        }) {
            Some(free_slot) => {
                let new_op = Arc::new(Operation::new(
                    index - 1,
                    km_op,
                    owner,
                    auth_info,
                    forced,
                    logging_info,
                ));
                *free_slot = Arc::downgrade(&new_op);
                new_op
            }
            None => {
                let new_op = Arc::new(Operation::new(
                    operations.len(),
                    km_op,
                    owner,
                    auth_info,
                    forced,
                    logging_info,
                ));
                operations.push(Arc::downgrade(&new_op));
                new_op
            }
        }
    }

    fn get(&self, index: usize) -> Option<Arc<Operation>> {
        self.operations.lock().expect("In OperationDb::get.").get(index).and_then(|op| op.upgrade())
    }

542

543

/// Attempts to prune an operation.

544

///

545

/// This function is used during operation creation, i.e., by

546

/// `KeystoreSecurityLevel::create_operation`, to try and free up an operation slot

547

/// if it got `ErrorCode::TOO_MANY_OPERATIONS` from the KeyMint backend. It is not

548

/// guaranteed that an operation slot is available after this call successfully

549

/// returned for various reasons. E.g., another thread may have snatched up the newly

550

/// available slot. Callers may have to call prune multiple times before they get a

551

/// free operation slot. Prune may also return `Err(Error::Rc(ResponseCode::BACKEND_BUSY))`

552

/// which indicates that no prunable operation was found.

553

///

554

/// To find a suitable candidate we compute the malus for the caller and each existing

555

/// operation. The malus is the inverse of the pruning power (caller) or pruning

556

/// resistance (existing operation).

Janis Danisevskis

2020-10-26 09:35:16 -0700

[diff] [blame]

557

///

Janis Danisevskis

2020-08-10 14:58:08 -0700

[diff] [blame]

558

/// The malus is based on the number of sibling operations and age. Sibling

559

/// operations are operations that have the same owner (UID).

Janis Danisevskis

2020-10-26 09:35:16 -0700

[diff] [blame]

560

///

Janis Danisevskis

2020-08-10 14:58:08 -0700

[diff] [blame]

561

/// Every operation, existing or new, starts with a malus of 1. Every sibling

562

/// increases the malus by one. The age is the time since an operation was last touched.

563

/// It increases the malus by log6(<age in seconds> + 1) rounded down to the next

564

/// integer. So the malus increases stepwise after 5s, 35s, 215s, ...

565

/// Of two operations with the same malus the least recently used one is considered

566

/// weaker.

Janis Danisevskis

2020-10-26 09:35:16 -0700

[diff] [blame]

567

///

Janis Danisevskis

2020-08-10 14:58:08 -0700

[diff] [blame]

568

/// For the caller to be able to prune an operation it must find an operation

569

/// with a malus higher than its own.

570

///

571

/// The malus can be expressed as

572

/// ```

573

/// malus = 1 + no_of_siblings + floor(log6(age_in_seconds + 1))

574

/// ```

575

/// where the constant `1` accounts for the operation under consideration.

576

/// In reality we compute it as

577

/// ```

578

/// caller_malus = 1 + running_siblings

579

/// ```

580

/// because the new operation has no age and is not included in the `running_siblings`,

581

/// and

582

/// ```

583

/// running_malus = running_siblings + floor(log6(age_in_seconds + 1))

584

/// ```

585

/// because a running operation is included in the `running_siblings` and it has

/// an age.

///

    /// ## Example
    /// A caller with no running operations has a malus of 1. Young (age < 5s) operations
    /// that also have no siblings have a malus of 1 and cannot be pruned by the caller.
    /// We have to find an operation that has at least one sibling or is older than 5s.
    ///
    /// A caller with one running operation has a malus of 2. Now even young siblings
    /// or single-child aging (5s <= age < 35s) operations are off limits. An aging
    /// sibling of two, however, would have a malus of 3 and would be fair game.
    ///
    /// ## Rationale
    /// Due to the limitation of KeyMint operation slots, we cannot get around pruning or
    /// a single app could easily DoS KeyMint.
    /// Keystore 1.0 used to always prune the least recently used operation. This at least
    /// guaranteed that new operations can always be started. With the increased usage
    /// of Keystore we saw increased pruning activity which can lead to a livelock
    /// situation in the worst case.
    ///
    /// With the new pruning strategy we want to provide well-behaved clients with
    /// progress assurances while punishing DoS attempts. As a result of this
    /// strategy we can be in the situation where no operation can be pruned and the
    /// creation of a new operation fails. This allows single-child operations which
    /// are frequently updated to complete, thereby breaking up livelock situations
    /// and facilitating system-wide progress.
    ///
    /// ## Update
    /// We also allow callers to cannibalize their own sibling operations if no other
    /// slot can be found. In this case the least recently used sibling is pruned.
    pub fn prune(&self, caller: u32, forced: bool) -> Result<(), Error> {
        loop {
            // Maps the uid of the owner to the number of operations that owner has
            // (running_siblings). More operations per owner lower the pruning
            // resistance of that owner's operations, whereas the number of the
            // caller's ongoing operations lowers the pruning power of the caller.
            let mut owners: HashMap<u32, u64> = HashMap::new();
            let mut pruning_info: Vec<PruningInfo> = Vec::new();

            let now = Instant::now();
            self.operations
                .lock()
                .expect("In OperationDb::prune: Trying to lock self.operations.")
                .iter()
                .for_each(|op| {
                    if let Some(op) = op.upgrade() {
                        if let Some(p_info) = op.get_pruning_info() {
                            let owner = p_info.owner;
                            pruning_info.push(p_info);
                            // Count operations per owner.
                            *owners.entry(owner).or_insert(0) += 1;
                        }
                    }
                });

            // If the operation is forced, the caller has a malus of 0.
            let caller_malus = if forced { 0 } else { 1u64 + *owners.entry(caller).or_default() };

            // We iterate through all operations computing the malus and finding
            // the candidate with the highest malus which must also be higher
            // than the caller_malus.
            struct CandidateInfo {
                index: usize,
                malus: u64,
                last_usage: Instant,
                age: Duration,
            }
            let mut oldest_caller_op: Option<CandidateInfo> = None;
            let candidate = pruning_info.iter().fold(
                None,
                |acc: Option<CandidateInfo>, &PruningInfo { last_usage, owner, index, forced }| {
                    // Compute the age of the current operation.
                    let age = now
                        .checked_duration_since(last_usage)
                        .unwrap_or_else(|| Duration::new(0, 0));

                    // Find the least recently used sibling as an alternative pruning candidate.
                    if owner == caller {
                        if let Some(CandidateInfo { age: a, .. }) = oldest_caller_op {
                            if age > a {
                                oldest_caller_op =
                                    Some(CandidateInfo { index, malus: 0, last_usage, age });
                            }
                        } else {
                            oldest_caller_op =
                                Some(CandidateInfo { index, malus: 0, last_usage, age });
                        }
                    }

                    // Compute the malus of the current operation.
                    let malus = if forced {
                        // Forced operations have a malus of 0 and cannot even be pruned
                        // by other forced operations.
                        0
                    } else {
                        // Expect safety: Every owner in pruning_info was counted in
                        // the owners map. So this expect cannot panic.
                        *owners.get(&owner).expect(
                            "This is odd. We should have counted every owner in pruning_info.",
                        ) + ((age.as_secs() + 1) as f64).log(6.0).floor() as u64
                    };

                    // Now check if the current operation is a viable/better candidate
                    // than the one currently stored in the accumulator.
                    match acc {
                        // First we have to find any operation that is prunable by the caller.
                        None => {
                            if caller_malus < malus {
                                Some(CandidateInfo { index, malus, last_usage, age })
                            } else {
                                None
                            }
                        }
                        // If we have found one we look for the operation with the worst score.
                        // If there is a tie, the older operation is considered weaker.
                        Some(CandidateInfo { index: i, malus: m, last_usage: l, age: a }) => {
                            if malus > m || (malus == m && age > a) {
                                Some(CandidateInfo { index, malus, last_usage, age })
                            } else {
                                Some(CandidateInfo { index: i, malus: m, last_usage: l, age: a })
                            }
                        }
                    }
                },
            );

            // If we did not find a suitable candidate we may cannibalize our oldest sibling.
            let candidate = candidate.or(oldest_caller_op);

            match candidate {
                Some(CandidateInfo { index, malus: _, last_usage, age: _ }) => {
                    match self.get(index) {
                        Some(op) => {
                            match op.prune(last_usage) {
                                // We successfully freed up a slot.
                                Ok(()) => break Ok(()),
                                // This means the operation we tried to prune was on its way
                                // out. It also means that the slot it had occupied was freed up.
                                Err(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE)) => break Ok(()),
                                // This means the operation we tried to prune was currently
                                // servicing a request. There are two options.
                                // * Assume that it was touched, which means that its
                                //   pruning resistance increased. In that case we have
                                //   to start over and find another candidate.
                                // * Assume that the operation is transitioning to end-of-life,
                                //   which means that we got a free slot for free.
                                // If we assume the first but the second is true, we prune
                                // a good operation without need (aggressive approach).
                                // If we assume the second but the first is true, our
                                // caller will attempt to create a new KeyMint operation,
                                // fail with `ErrorCode::TOO_MANY_OPERATIONS`, and call
                                // us again (conservative approach).
                                Err(Error::Rc(ResponseCode::OPERATION_BUSY)) => {
                                    // We choose the conservative approach, because
                                    // every needlessly pruned operation can impact
                                    // the user experience.
                                    // To switch to the aggressive approach replace
                                    // the following line with `continue`.
                                    break Ok(());
                                }

                                // The candidate may have been touched so the score
                                // has changed since our evaluation.
                                _ => continue,
                            }
                        }
                        // This index does not exist any more. The operation
                        // in this slot was dropped. Good news, a slot
                        // has freed up.
                        None => break Ok(()),
                    }
                }
                // We did not get a pruning candidate.
                None => break Err(Error::Rc(ResponseCode::BACKEND_BUSY)),
            }
        }
    }
}
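
The malus rules documented on `prune` can be tried out in isolation. The following is only a minimal sketch under stated assumptions, not the code in this file: `Op`, `floor_log6`, `running_malus`, `caller_malus` and `pick_victim` are hypothetical names, ages are plain seconds instead of `Instant`/`Duration`, the per-operation `forced` flag is left out, and the age term uses an integer loop rather than the floating-point `log(6.0)` call used above.

```rust
use std::collections::HashMap;

/// floor(log6(n)) for n >= 1, computed with integer division (an assumption of
/// this sketch; the original uses `f64::log(6.0)`).
fn floor_log6(mut n: u64) -> u64 {
    let mut k = 0;
    while n >= 6 {
        n /= 6;
        k += 1;
    }
    k
}

/// Malus of a running operation: its owner's operation count (which includes
/// the operation itself) plus the age term.
fn running_malus(running_siblings: u64, age_secs: u64) -> u64 {
    running_siblings + floor_log6(age_secs.saturating_add(1))
}

/// Malus of the caller asking for a new slot; forced callers have a malus of 0.
fn caller_malus(running_siblings: u64, forced: bool) -> u64 {
    if forced { 0 } else { 1 + running_siblings }
}

/// Hypothetical stand-in for the pruning data of one operation.
#[derive(Clone, Copy)]
struct Op {
    owner: u32,
    age_secs: u64,
}

/// Returns the index of the operation that would be pruned, if any.
fn pick_victim(ops: &[Op], caller: u32, forced: bool) -> Option<usize> {
    // Count operations per owner.
    let mut owners: HashMap<u32, u64> = HashMap::new();
    for op in ops {
        *owners.entry(op.owner).or_insert(0) += 1;
    }
    let limit = caller_malus(owners.get(&caller).copied().unwrap_or(0), forced);

    let mut best: Option<(usize, u64, u64)> = None; // (index, malus, age)
    let mut oldest_own: Option<(usize, u64)> = None; // (index, age)
    for (i, op) in ops.iter().enumerate() {
        // Track the caller's least recently used operation as a fallback.
        if op.owner == caller && oldest_own.map_or(true, |(_, a)| op.age_secs > a) {
            oldest_own = Some((i, op.age_secs));
        }
        let malus = running_malus(owners[&op.owner], op.age_secs);
        let better = match best {
            // The first candidate must have a malus above the caller's.
            None => malus > limit,
            // Afterwards prefer the higher malus; ties go to the older operation.
            Some((_, m, a)) => malus > m || (malus == m && op.age_secs > a),
        };
        if better {
            best = Some((i, malus, op.age_secs));
        }
    }
    // If nothing is prunable, cannibalize the caller's oldest operation.
    best.map(|(i, _, _)| i).or(oldest_own.map(|(i, _)| i))
}
```

For instance, under these assumptions a caller without running operations cannot prune a lone 2-second-old operation of another owner, but it can prune a lone 10-second-old one, since that operation's malus is 2 against the caller's 1.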

/// Implementation of IKeystoreOperation.
pub struct KeystoreOperation {
    operation: Mutex<Option<Arc<Operation>>>,
}

impl KeystoreOperation {
    /// Creates a new operation instance wrapped in a
    /// BnKeystoreOperation proxy object. It also enables
    /// `BinderFeatures::set_requesting_sid` on the new interface, because
    /// we need it for checking Keystore permissions.
    pub fn new_native_binder(
        operation: Arc<Operation>,
    ) -> binder::public_api::Strong<dyn IKeystoreOperation> {
        BnKeystoreOperation::new_binder(
            Self { operation: Mutex::new(Some(operation)) },
            BinderFeatures { set_requesting_sid: true, ..BinderFeatures::default() },
        )
    }

    /// Grabs the outer operation mutex and calls `f` on the locked operation.
    /// The function also deletes the operation if it returns with an error or if
    /// `delete_op` is true.
    fn with_locked_operation<T, F>(&self, f: F, delete_op: bool) -> Result<T>
    where
        for<'a> F: FnOnce(&'a Operation) -> Result<T>,
    {
        let mut delete_op: bool = delete_op;
        match self.operation.try_lock() {
            Ok(mut mutex_guard) => {
                let result = match &*mutex_guard {
                    Some(op) => {
                        let result = f(&*op);
                        // Any error here means we can discard the operation.
                        if result.is_err() {
                            delete_op = true;
                        }
                        result
                    }
                    None => Err(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE))
                        .context("In KeystoreOperation::with_locked_operation"),
                };

                if delete_op {
                    // We give up our reference to the Operation, thereby freeing up our
                    // internal resources and ending the wrapped KeyMint operation.
                    // This KeystoreOperation object will still be owned by an SpIBinder
                    // until the client drops its remote reference.
                    *mutex_guard = None;
                }
                result
            }
            Err(_) => Err(Error::Rc(ResponseCode::OPERATION_BUSY))
                .context("In KeystoreOperation::with_locked_operation"),
        }
    }
}
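
The locking pattern in `with_locked_operation` (a non-blocking `try_lock`, an `Option` slot that is cleared on error or on request) can be illustrated with standard-library types only. This is a hedged sketch, not the Keystore implementation: `Handle`, `OpError` and `with_locked` are hypothetical names, a `String` stands in for `Operation`, and a plain enum replaces the `anyhow`-style error context used above.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical error type standing in for the Keystore error codes.
#[derive(Debug, PartialEq)]
enum OpError {
    Busy,          // plays the role of OPERATION_BUSY
    InvalidHandle, // plays the role of INVALID_OPERATION_HANDLE
    Failed,        // any error returned by the wrapped call
}

/// Hypothetical stand-in for `KeystoreOperation`; the payload is just a name.
struct Handle {
    op: Mutex<Option<Arc<String>>>,
}

impl Handle {
    fn new(name: &str) -> Self {
        Handle { op: Mutex::new(Some(Arc::new(name.to_string()))) }
    }

    /// Runs `f` on the wrapped value and clears the slot if `f` fails or if
    /// `delete_op` is set.
    fn with_locked<T, F>(&self, f: F, delete_op: bool) -> Result<T, OpError>
    where
        F: FnOnce(&str) -> Result<T, OpError>,
    {
        // try_lock never blocks: a concurrent call (or, in this sketch, a
        // poisoned mutex) is reported as Busy.
        let mut guard = self.op.try_lock().map_err(|_| OpError::Busy)?;
        let result = match &*guard {
            Some(op) => f(op.as_str()),
            None => Err(OpError::InvalidHandle),
        };
        if delete_op || result.is_err() {
            // Dropping our reference ends the wrapped operation.
            *guard = None;
        }
        result
    }
}
```

After the first failing call the slot is empty, so every later call on the same handle reports `InvalidHandle`. This mirrors why `abort` on an already finished operation is treated as routine rather than logged as an error.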

impl binder::Interface for KeystoreOperation {}

impl IKeystoreOperation for KeystoreOperation {
    fn updateAad(&self, aad_input: &[u8]) -> binder::public_api::Result<()> {
        let _wp = wd::watch_millis("IKeystoreOperation::updateAad", 500);
        map_or_log_err(
            self.with_locked_operation(
                |op| op.update_aad(aad_input).context("In KeystoreOperation::updateAad"),
                false,
            ),
            Ok,
        )
    }

    fn update(&self, input: &[u8]) -> binder::public_api::Result<Option<Vec<u8>>> {
        let _wp = wd::watch_millis("IKeystoreOperation::update", 500);
        map_or_log_err(
            self.with_locked_operation(
                |op| op.update(input).context("In KeystoreOperation::update"),
                false,
            ),
            Ok,
        )
    }

    fn finish(
        &self,
        input: Option<&[u8]>,
        signature: Option<&[u8]>,
    ) -> binder::public_api::Result<Option<Vec<u8>>> {
        let _wp = wd::watch_millis("IKeystoreOperation::finish", 500);
        map_or_log_err(
            self.with_locked_operation(
                |op| op.finish(input, signature).context("In KeystoreOperation::finish"),
                true,
            ),
            Ok,
        )
    }

    fn abort(&self) -> binder::public_api::Result<()> {
        let _wp = wd::watch_millis("IKeystoreOperation::abort", 500);
        map_err_with(
            self.with_locked_operation(
                |op| op.abort(Outcome::Abort).context("In KeystoreOperation::abort"),
                true,
            ),
            |e| {
                match e.root_cause().downcast_ref::<Error>() {
                    // Calling abort on expired operations is very common.
                    // There is no reason to clutter the log with it. It is never the
                    // cause of a real problem.
                    Some(Error::Km(ErrorCode::INVALID_OPERATION_HANDLE)) => {}
                    _ => log::error!("{:?}", e),
                };
                e
            },

Janis Danisevskis