Abstractions for security protocol verification

Abstract

We present a large class of security protocol abstractions with the aim of improving the scope and efficiency of verification tools. We propose abstractions that transform a term’s structure based on its type as well as abstractions that remove atomic messages, variables, and redundant terms. Our theory improves on previous work by supporting rewrite theories with the finite-variant property, user-defined types, and untyped variables to cover type flaw attacks. We prove soundness results for an expressive property language that includes secrecy and authentication. Applying our abstractions to realistic IETF protocol models, we achieve dramatic speedups and extend the scope of several modern security protocol analyzers.

Keywords

Security protocols, formal verification, abstraction technique

1. Introduction

Security protocols play a central role in today’s networked applications. Past experience has amply shown that informal arguments justifying the security of such protocols are insufficient. This makes security protocols prime candidates for formal verification. In the last two decades, research in formal security protocol verification has made enormous progress, which is reflected in many state-of-the-art tools including AVANTSSAR [5], ProVerif [9], Maude-NPA [21], Scyther [14], and Tamarin [34]. These tools can verify small to medium-sized protocols in a few seconds or less, sometimes for an unbounded number of sessions. Despite this success, they can still be challenged when verifying real-world protocols such as those defined in standards and deployed on the internet (e.g., TLS, IKE, and ISO/IEC 9798). Such protocols typically have messages with numerous fields, support many alternatives (e.g., cryptographic setups), and may be composed from more basic protocols (e.g., IKEv2-EAP).

Abstraction [10] is a standard technique to over-approximate complex systems by simpler ones to make verification more efficient or feasible. Sound abstractions preserve counterexamples (or attacks in security terms) from concrete to abstracted systems. In the context of security protocols, abstractions are extensively used. Here, we only mention a few examples. First, the Dolev–Yao model [19] is a standard (but not necessarily sound) abstraction of cryptography. Second, many tools encode the verification problem in the formalism of an efficient solver or reasoner. These encodings often involve abstraction as well. Therefore, we call these back-end abstractions. For example, ProVerif [9] translates models in the applied pi calculus to a set of Horn clauses, SATMC [6] reduces protocol verification to SAT solving, and Paulson [40] models protocols as inductively defined trace sets. Finally, some abstractions aim at speeding up automated analysis by simplifying protocols within a given protocol model before feeding them to verifiers [28,37]. Our work belongs to this class of front-end abstractions.

Extending Hui and Lowe’s work [28], we proposed in [37] a rich class of protocol abstractions and proved its soundness for a wide range of security properties. We used a type system to uniformly transform all terms of a given type (e.g., a pattern in a protocol role and its instances during execution) whereas [28] only covers ground terms. Our work [37] exhibits several limitations: (1) the theory is limited to the free algebra over a fixed signature; (2) all variables have strict (possibly structured) types, hence we cannot precisely model ticket forwarding or Diffie–Hellman exchanges. While the type system enables fine-grained control over abstractions (e.g., by discerning different nonces), it may eliminate realistic attacks such as type flaw attacks; (3) some soundness conditions involving quantifiers are hard to check in practice; and (4) it only presents experimental results for a single tool (SATMC) using abstractions that are crafted manually.

In this work, we address all the limitations above. First, we work with rewrite theories with the finite-variant property modulo a set of axioms to model cryptographic operations. Second, we support untyped variables, user-defined types, and subtyping. User-defined types enable the grouping of similar atomic types (e.g., session keys) and adjusting the granularity of matching in message abstraction. Furthermore, we have separated the removal of variables, atomic messages, and redundancies, from the transformation of the message structure. This separation simplifies the specifications and soundness proof of the abstractions that transform the message structure. Third, we provide effectively checkable syntactic criteria for the conditions of the soundness theorems. Finally, we extended Scyther [14] with fully automated support for our abstraction methodology. The resulting tool is available online [36]. We validated our approach on an extensive set of realistic case studies drawn from the IKEv1, IKEv2, ISO/IEC 9798, and PANA-AKA standard proposals. Our abstractions result in very substantial performance gains. We have also obtained positive results for several other state-of-the-art verifiers (ProVerif, CL-Atse, OFMC, and SATMC) with manually produced abstractions.

This article is based on the conference paper [38] from which it differs mainly as follows. On the theoretical side, we have generalized the class of supported rewrite systems from a subclass of shallow subterm-convergent ones to all those with the finite-variant property. Using the finite-variant property, we have significantly simplified the condition needed for equality preservation (Theorem 4.23). On the practical side, we provide additional details of the abstraction heuristics and the implementation. We have also extended the Scyther implementation with a check for spurious attacks. Moreover, we have performed several additional case studies.

Due to space constraints, most proofs are moved to the full version of the paper [39]. Table 1 gives an overview of the rest of the paper.

Table 1
Structure of paper

Topic	Main description
Motivating example: IKE	Section 2
Modeling security protocols	Section 3
Abstraction theory	Section 4
Abstraction generation algorithm	Section 5
Algorithm implementation in Scyther	Section 6.1
Experimental results	Section 6.2

2. Motivating example: An IKE protocol

The Internet Key Exchange (IKE) family of protocols is part of the IPsec protocol suite for securing Internet Protocol (IP) communication. IKE establishes a shared key, which is later used for securing IP packets, realizes mutual authentication, and offers identity protection as an option. Its first version (IKEv1) dates back to 1998 [27]. The second version (IKEv2) [30] significantly simplifies the first one. However, the protocols in this family are still complex and contain a large number of fields.

Concrete protocol. As our running example, we present a member of the IKEv2 family, called IKEv2-mac (or $IK E_{m}^{}$ for short), which sets up a session key using a Diffie–Hellman (DH) key exchange, provides mutual authentication based on MACs, and also offers identity protection. We use Cremers’ models of IKE [15] as a basis for our presentation and experiments (see Section 6.2). Our starting point is the following concrete $IK E_{m}^{}$ protocol between an initiator A and a responder B, where we write ${| m |}_{k}$ to denote the symmetric encryption of m with key k. $\begin{matrix} IK E_{m}^{} (1) . & A \to B : SPIa, o, sA 1, g^{x}, Na \\ IK E_{m}^{} (2) . & B \to A : SPIa, SPIb, sA 1, g^{y}, Nb \\ IK E_{m}^{} (3) . & A \to B : SPIa, SPIb, {| A, B, AUTHa, sA 2, tSa, tSb |}_{SK} \\ IK E_{m}^{} (4) . & B \to A : SPIa, SPIb, {| B, AUTHb, sA 2, tSa, tSb |}_{SK} \end{matrix}$ Here, $SPIa$ and $SPIb$ denote the Security Parameter Indices (two unique values that together identify a connection), o is a constant number, $sA 1$ and $sA 2$ are Security Associations (a group of security parameters that the parties will agree on, such as the used cryptographic algorithms), g is the DH group generator, x and y are secret DH exponents, $Na$ and $Nb$ are nonces, and $tSa$ and $tSb$ denote Traffic Selectors specifying certain IP parameters. $AUTHa$ and $AUTHb$ denote the authenticators of A and B and $SK$ the session key derived from the DH key $g^{x y}$ . These are defined as follows. $\begin{matrix} SK = kdf (Na, Nb, g^{x y}, SPIa, SPIb) \\ AUTHa = mac (sh (A, B), SPIa, o, sA 1, g^{x}, Na, Nb, prf (SK, A)) \\ AUTHb = mac (sh (B, A), SPIa, SPIb, sA 1, g^{y}, Nb, Na, prf (SK, B)) \end{matrix}$ We model the functions $mac$ , $kdf$ , and $prf$ as hash functions and use $sh (A, B)$ and $sh (B, A)$ to refer to the (single) long-term symmetric key shared by A and B as part of the cryptographic setup.

We consider the following security properties:

(P1) the secrecy of the DH key $g^{x y}$, and

(P2) mutual non-injective agreement on the nonces $Na$ and $Nb$ and the DH half-keys $g^{x}$ and $g^{y}$.

The DH key serves as the master secret for $SK$. We could also consider the secrecy of $SK$, but for the running example we only consider the simpler property.

Abstraction. Our theory supports the construction of abstract models by removing inessential fields and operations using a range of abstractions. Typically, we use abstractions in a first step to remove selected cryptographic operations, remove fields under hashes, and to pull fields outside other cryptographic operations like encryptions or signatures. The types enable a fine-grained selection of the messages to be abstracted. In a second step, we remove inessential top-level (i.e., unprotected) fields and redundancies.

Let us apply these two steps to the $IK E_{m}^{}$ protocol. In the first step, we remove: (i) the symmetric encryptions with the session key $SK$ (providing identity protection), (ii) from the session key: all fields under $kdf$ except the DH key $g^{x y}$ , and (iii) from the authenticators: the fields $SPIa$ , $SPIb$ , and $sA 1$ and the application of $prf$ including the agent names underneath. Here is the resulting protocol, which we call $IK E_{m}^{1}$ . $\begin{matrix} IK E_{m}^{1} (1) . & A \to B : SPIa, o, sA 1, g^{x}, Na \\ IK E_{m}^{1} (2) . & B \to A : SPIa, SPIb, sA 1, g^{y}, Nb \\ IK E_{m}^{1} (3) . & A \to B : SPIa, SPIb, A, B, {AUTHa}^{'}, sA 2, tSa, tSb \\ IK E_{m}^{1} (4) . & B \to A : SPIa, SPIb, B, {AUTHb}^{'}, sA 2, tSa, tSb \end{matrix}$ where ${SK}^{'} = kdf (g^{x y})$ and $\begin{matrix} {AUTHa}^{'} = mac (sh (A, B), o, g^{x}, Na, Nb, {SK}^{'}) \\ {AUTHb}^{'} = mac (sh (B, A), g^{y}, Nb, Na, {SK}^{'}) . \end{matrix}$ Note that we keep the field o in ${AUTHa}^{'}$ to prevent its unifiability with ${AUTHb}^{'}$ and hence the potential introduction of spurious attacks. Here, the type system plays an essential role in that it allows us to distinguish $AUTHa$ (with constant o as its third field under the $mac$ ) from $AUTHb$ (where $SPIb$ is the third field under the $mac$ , which we model as a nonce) and transform them in different ways resulting in ${AUTHa}^{'}$ and ${AUTHb}^{'}$ .

In a second step, we use abstractions to remove the fields o, A, B, $SPIa$ , $SPIb$ , $sA 1$ , $sA 2$ , $tSa$ , and $tSb$ in unprotected positions. The resulting protocol is $IK E_{m}^{2}$ : $\begin{matrix} IK E_{m}^{2} (1) . & A \to B : g^{x}, Na \\ IK E_{m}^{2} (2) . & B \to A : g^{y}, Nb \\ IK E_{m}^{2} (3) . & A \to B : {AUTHa}^{'} \\ IK E_{m}^{2} (4) . & B \to A : {AUTHb}^{'} \end{matrix}$

Scyther verifies the properties (P1) and (P2) in 8.7 s on the concrete and in 1.7 s on an automatically generated abstract protocol (which differs somewhat from the one presented here). Our soundness results imply that the original protocol $IK E_{m}^{}$ also enjoys these properties. We chose the protocol $IK E_{m}^{}$ as running example for its relative simplicity compared to the other protocols in our case studies. In many of our experiments (Section 6.2), our abstractions (i) result in much more substantial speedups, or (ii) enable the successful unbounded verification of a protocol where it times out or exhausts memory on the original protocol.

3. Security protocol model

We define a term algebra $T_{Σ} (V)$ over a signature Σ and a set of variables V in the standard way. Let $Σ^{n}$ denote the symbols of arity n. We call the elements of $Σ^{0}$ atoms and write $Σ^{⩾ 1}$ for the set of proper function symbols. For a fixed $Σ^{⩾ 1}$, we will vary $Σ^{0}$ to generate different sets of terms, denoted by $T (V, Σ^{0})$, including terms in protocol roles, network messages, and types. We write $subterm (t)$ for the set of subterms of t. We also define $vars (t) = subterm (t) \cap V$ and $atoms (t) = subterm (t) \cap Σ^{0}$. If $vars (t) = \emptyset$ then t is called ground. We denote the top-level symbol of a (non-variable) term t by $topsym (t)$ and the set of its function symbols in $Σ^{⩾ 1}$ by $funsym (t)$. We call a term t composed if $funsym (t)$ is non-empty. A position is a sequence of positive natural numbers denoting a path in the tree representation of a term. The size of a term t, denoted by $| t |$, is the cardinality of its set of positions. We denote the subterm of t at position p by $t |_{p}$ and write $t {[u]}_{p}$ for the term obtained by replacing $t |_{p}$ at position p by u. We also partition Σ into sets of public and private symbols, denoted by $Σ_{pub}$ and $Σ_{pri}$. We assume $Σ_{pub}$ includes pairing $⟨ \cdot, \cdot ⟩$, which associates to the right, e.g., $⟨ t, u, v ⟩ = ⟨ t, ⟨ u, v ⟩ ⟩$. We usually write, e.g., ${| t, u, v |}_{k}$ rather than ${| ⟨ t, u, v ⟩ |}_{k}$. We freely lift functions on terms to functions on sets of terms T, e.g., $funsym (T) = ⋃_{t \in T} funsym (t)$. We denote by $dom (g)$ and $ran (g)$ the domain and range of a function g. For $n \in \mathbb{N}$, $\tilde{n}$ denotes $\{1, \dots, n\}$.
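The notions above can be illustrated with a small Python sketch. It is only an illustration under assumed conventions: terms are encoded as nested tuples `(f, t1, ..., tn)`, variables and atoms are strings distinguished by the case of their first letter, and the function names as well as the symbol names `senc`, `pair`, and `exp` are hypothetical choices that do not come from the formal development.

```python
# Terms: a str is a variable (upper-case initial) or an atom (lower-case
# initial); a tuple (f, t1, ..., tn) applies the function symbol f.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def subterms(t):
    """Set of all subterms of t, including t itself."""
    result = {t}
    if isinstance(t, tuple):
        for arg in t[1:]:
            result |= subterms(arg)
    return result

def variables(t):
    """vars(t): the subterms of t that are variables."""
    return {u for u in subterms(t) if is_var(u)}

def positions(t):
    """All positions of t, as tuples of 1-based argument indices."""
    result = [()]
    if isinstance(t, tuple):
        for i, arg in enumerate(t[1:], start=1):
            result += [(i,) + p for p in positions(arg)]
    return result

def at(t, p):
    """Subterm t|_p (index 0 of a tuple holds the function symbol)."""
    for i in p:
        t = t[i]
    return t

def replace(t, p, u):
    """Term t[u]_p obtained by replacing the subterm at position p by u."""
    if not p:
        return u
    i = p[0]
    return t[:i] + (replace(t[i], p[1:], u),) + t[i + 1:]

def pair(*ts):
    """Right-associated pairing: <t, u, v> = <t, <u, v>>."""
    return ts[0] if len(ts) == 1 else ("pair", ts[0], pair(*ts[1:]))

# The term {| na, Nb, exp(g, x) |}_K with variables Nb and K.
t = ("senc", pair("na", "Nb", ("exp", "g", "x")), "K")
size = len(positions(t))          # |t| is the number of positions
ground = replace(replace(t, (2,), "k"), (1, 2, 1), "nb")
```

Here `ground` is obtained by replacing both variables of `t`, so `variables(ground)` is empty and the term is ground in the sense defined above.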

The set of message terms is $M = T (V, A \cup F \cup C)$ , where $V$ , $A$ , $F$ , and $C$ are pairwise disjoint infinite sets of variables, agents, fresh values, and constants. We use terms in $M$ to model messages in protocol definitions which we present in Section 3.4. We partition $A$ into sets of honest and compromised agents: $A = A_{H} \cup A_{C}$ . The set $fresh (t) = subterm (t) \cap F$ denotes the fresh values in t. By convention, we use identifiers starting with upper-case and lower-case letters to denote variables and atoms, respectively.

3.1. Type system

We introduce a type system akin to [2] and extend it with subtyping. This type system is very fine-grained. For example, there are different types for different fresh values. We will subsequently restrict some abstractions to apply only to arguments of a specific type. Thus, the purpose of this fine-grained type system is to control when those abstractions are used. The subtyping allows us to adapt to different setups and tools by making types more coarse-grained. For example, we can define a type $nonce$ as a supertype for all fresh values.

We define the set of atomic types by $Y_{a t} = Y_{0} \cup \{α, msg\} \cup \{β_{n} ∣ n \in F\} \cup \{γ_{c} ∣ c \in C\}$, where α, $β_{n}$, and $γ_{c}$ are the types of agents, the fresh value n, and the constant c, respectively. Moreover, $msg$ is the type of all messages and $Y_{0}$ is a disjoint set of user-defined types. The set of all types is then defined by $Y = T (\emptyset, Y_{a t})$.

We assume that all variables have an atomic type, i.e., $V = {V_{τ}}_{τ \in Y_{a t}}$ is a family of disjoint infinite sets of variables. Define $Γ : V \to Y_{a t}$ by $Γ (X) = τ$ if and only if $X \in V_{τ}$ . We extend Γ to atoms by defining $Γ (a) = α$ , $Γ (n) = β_{n}$ , and $Γ (c) = γ_{c}$ for $a \in A$ , $n \in F$ , and $c \in C$ , and then homomorphically to all terms $t \in M$ . Note that Γ is unique. We call $τ = Γ (t)$ the type of t and sometimes also write $t : τ$ .

The subtyping relation $≼$ on types is defined by the following inference rules and by two additional rules (not shown) defining its reflexivity and transitivity.

Every type is a subtype of $msg$ by the first rule. The second rule embeds a user-defined atomic subtyping relation $≼_{0} \subseteq (Y_{a t} ∖ \{msg\}) \times Y_{0}$, which relates atomic types (except $msg$) to user-defined atomic types in $Y_{0}$. For simplicity, we require that $≼_{0}$ is a partial function. The third rule ensures that subtyping is preserved by all symbols. The set of subtypes of τ is $τ ↓ = \{τ^{'} \in Y ∣ τ^{'} ≼ τ\}$.
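As an illustration of how the three rules, together with reflexivity and transitivity, determine the subtyping relation, the following Python sketch decides $τ_{1} ≼ τ_{2}$. It is a sketch under assumptions: types are encoded as strings (atomic types) or nested tuples (composed types), the partial function $≼_{0}$ is passed as a dictionary, and the type names `agent`, `beta_na`, `beta_nb`, and `nonce` are hypothetical stand-ins for α, $β_{na}$, $β_{nb}$, and a user-defined type.

```python
def is_subtype(t1, t2, sub0):
    """Decide t1 ≼ t2, where sub0 encodes the partial function ≼_0 as a dict."""
    # First rule (every type is below msg) and reflexivity.
    if t2 == "msg" or t1 == t2:
        return True
    if isinstance(t1, str):
        # Second rule combined with transitivity: follow the ≼_0 chain upwards.
        seen = set()
        while t1 in sub0 and t1 not in seen:
            seen.add(t1)
            t1 = sub0[t1]
            if t1 == t2:
                return True
        return False
    # Third rule: same symbol and arity, with argument-wise subtyping.
    return (isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1) == len(t2)
            and all(is_subtype(a, b, sub0) for a, b in zip(t1[1:], t2[1:])))

# A user-defined type `nonce` grouping the types of two fresh values.
sub0 = {"beta_na": "nonce", "beta_nb": "nonce"}
ok = is_subtype(("pair", "beta_na", "agent"), ("pair", "nonce", "msg"), sub0)
```

Because $≼_{0}$ is required to be a partial function, each atomic type has at most one direct user-defined supertype, which is why following a single chain suffices in this sketch.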

3.2. Equational theories

An equation over a signature Σ is an unordered pair ${s, t}$ , written $s ≃ t$ , where $s, t \in T_{Σ} (V_{msg})$ . An equation presentation $E = (Σ, E)$ consists of a signature Σ and a set E of equations over Σ. The equational theory induced by $E$ is the smallest Σ-congruence, written $=_{E}$ , containing all instances of equations in E. We often identify $E$ with the induced equational theory.

A rewrite rule is an oriented pair $l \to r$ , where $vars (r) \subseteq vars (l) \subseteq V_{msg}$ . A rewrite theory is a triple $R = (Σ, A x, R)$ where Σ is a signature, $A x$ a set of Σ-equations such that $vars (s) = vars (t)$ for all $s ≃ t \in A x$ , and R a set of rewrite rules. The rewriting relation $\to_{R, A x}$ on $T_{Σ} (V)$ is defined by $t \to_{R, A x} t^{'}$ iff there exists a non-variable position p in t, a rule $l \to r \in R$ , and a substitution σ such that $t |_{p} =_{A x} l σ$ and $t^{'} = t {[r σ]}_{p}$ . If $t \to_{R, A x}^{*} t^{'}$ and $t^{'}$ is irreducible under $\to_{R, A x}$ , we call $t^{'}$ $R, A x -normal$ and also say that $t^{'}$ is a normal form of t. A substitution σ is called $R, A x -normal$ if all terms in $ran (σ)$ are.

Provided that $A x$ has a finitary and complete unification algorithm and under suitable termination, confluence, and coherence conditions (see [29] for definitions), one can decompose an equational theory $(Σ, E)$ into a rewrite theory $(Σ, A x, R)$ where $E = A x \cup R$ (reading R here as a set of equations) and, for all terms $t, u \in T_{Σ} (V)$ , we have $t =_{E} u$ if and only if $t ↓_{R, A x} =_{A x} u ↓_{R, A x}$ . Here, $t ↓_{R, A x}$ denotes any normal form of t. Well-formed rewrite theories, defined below, satisfy a few additional mild assumptions.

Definition 3.1.
A rewrite theory $(Σ, A x, R)$ is well-formed if for all $s ≃ t \in A x$ and all $l \to r \in R$, we have (i) $vars (s) = vars (t)$ and $vars (r) \subseteq vars (l)$, (ii) $topsym (s) = topsym (t)$, (iii) s, t, and l are composed and none of them is a pair, and (iv) s, t, l, and r do not contain any fresh values.

The equality $vars (s) = vars (t)$ in point (i) of this definition is a standard assumption made for rewrite theories known as regularity [22]. Such rewrite theories are adequate to model many well-known cryptographic primitives as illustrated by the examples below.
Example 3.2.
We model the protocols of our case studies (see Section 2 and Section 6.2) in the rewrite theory $R_{c s} = (Σ_{c s}, A x_{c s}, R_{c s})$ where $\begin{matrix} Σ_{c s} = \{sh, pk, pri, prf, kdf, mac, \exp, ⟨ \cdot, \cdot ⟩, π_{1}, π_{2}, {| \cdot |}_{\cdot}, {| \cdot |}_{\cdot}^{- 1}, \{\cdot\}_{\cdot}, \{\cdot\}_{\cdot}^{- 1}, {[\cdot]}_{\cdot}, ver\} \cup Σ_{c s}^{0} \end{matrix}$ contains function symbols for: shared, public, and private long-term keys (where $Σ_{pri} = \{sh, pri\}$); hash functions $prf$, $kdf$, and $mac$; exponentiation $\exp$; pairs and projections; symmetric and asymmetric encryption and decryption; and signing and verification. The set of atoms $Σ_{c s}^{0}$ is specified later. The set $R_{c s}$ consists of rewrite rules for projections, decryption, and signature verification: $\begin{matrix} π_{1} (⟨ X, Y ⟩) \to X & {| {| X |}_{K} |}_{K}^{- 1} \to X & ver ({[X]}_{pri (Y)}, pk (Y)) \to X \\ π_{2} (⟨ X, Y ⟩) \to Y & \{\{X\}_{pk (Y)}\}_{pri (Y)}^{- 1} \to X & \end{matrix}$ We have two equations in $A x_{c s}$, namely, $\exp (\exp (g, X), Y) ≃ \exp (\exp (g, Y), X)$ to model Diffie–Hellman key exchange and $sh (X, Y) ≃ sh (Y, X)$. Note that the rewrite rule for signature verification models signatures with message recovery (as, e.g., for RSA signatures). In contrast, MACs do not provide message recovery, so they have to be reconstructed for verification.
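To see how such a rewrite theory can be used to decide equality of ground terms, the following Python sketch normalizes a term innermost-first with the rules for projections, decryption, and signature verification, and then picks one representative per equivalence class of the two axioms by sorting arguments. It is a simplified illustration for ground terms only, not the procedure used by any of the tools. The tuple encoding and the ASCII symbol names `fst`, `snd`, `senc`, `sdec`, `aenc`, `adec`, and `sign` are assumptions made for this example.

```python
def is_app(t, f):
    return isinstance(t, tuple) and t[0] == f

def normalize(t):
    """Canonical normal form of a ground term (innermost rewriting)."""
    if not isinstance(t, tuple):
        return t
    f, args = t[0], [normalize(a) for a in t[1:]]
    # Rewrite rules: projections, decryption, signature verification.
    if f == "fst" and is_app(args[0], "pair"):
        return args[0][1]
    if f == "snd" and is_app(args[0], "pair"):
        return args[0][2]
    if f == "sdec" and is_app(args[0], "senc") and args[0][2] == args[1]:
        return args[0][1]
    if (f == "adec" and is_app(args[0], "aenc")
            and is_app(args[0][2], "pk") and is_app(args[1], "pri")
            and args[0][2][1] == args[1][1]):
        return args[0][1]
    if (f == "ver" and is_app(args[0], "sign")
            and is_app(args[0][2], "pri") and is_app(args[1], "pk")
            and args[0][2][1] == args[1][1]):
        return args[0][1]
    # Axioms: sh(X, Y) = sh(Y, X) and exp(exp(g, X), Y) = exp(exp(g, Y), X).
    if f == "sh":
        args = sorted(args, key=repr)
    if f == "exp" and is_app(args[0], "exp") and args[0][1] == "g":
        x, y = sorted([args[0][2], args[1]], key=repr)
        args = [("exp", "g", x), y]
    return (f, *args)

def equal_mod_theory(s, t):
    """Equality modulo the theory, by comparing canonical normal forms."""
    return normalize(s) == normalize(t)

# Both parties derive the same key from the Diffie-Hellman value.
g_xy = ("exp", ("exp", "g", "x"), "y")
g_yx = ("exp", ("exp", "g", "y"), "x")
cipher = ("senc", ("pair", "a", "b"), ("kdf", g_xy))
plain = normalize(("fst", ("sdec", cipher, ("kdf", g_yx))))
```

Since the arguments are brought into canonical form before a rule is tried, the syntactic key comparison in the decryption rules amounts to matching modulo the two axioms in this restricted setting.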
Example 3.3.
The theory of XOR is given by the following rewrite system. The rightmost rule is redundant but required to ensure coherence [29]. $\begin{matrix} X \oplus Y ≃ Y \oplus X & X \oplus 0 \to X & X \oplus X \oplus Y \to Y \\ (X \oplus Y) \oplus Z ≃ X \oplus (Y \oplus Z) & X \oplus X \to 0 & \end{matrix}$
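A compact way to picture normal forms in the XOR theory is as sets of summands: associativity and commutativity make order and bracketing irrelevant, and the cancellation rules leave exactly those summands that occur an odd number of times. The Python sketch below implements this view. It assumes a tuple encoding with the hypothetical symbol name `xor` and the atom `"0"`, and it treats every non-XOR subterm as opaque (XOR occurrences nested under other function symbols are not normalized).

```python
from collections import Counter

def xor_summands(t):
    """Summands of the normal form of t; the empty set stands for 0."""
    if t == "0":
        return frozenset()                      # X ⊕ 0 → X
    if isinstance(t, tuple) and t[0] == "xor":
        count = Counter()
        for arg in t[1:]:
            count.update(xor_summands(arg))     # flatten nested ⊕
        # X ⊕ X → 0: only summands with an odd number of occurrences survive.
        return frozenset(s for s, n in count.items() if n % 2 == 1)
    return frozenset([t])                       # variable, atom, or other term

def xor_equal(s, t):
    """Equality modulo the XOR theory for terms built from opaque summands."""
    return xor_summands(s) == xor_summands(t)

s = ("xor", "X", ("xor", "Y", "X"))             # X ⊕ Y ⊕ X
normal_s = xor_summands(s)
```

In this encoding, `normal_s` contains only `"Y"`, reflecting that the two occurrences of X cancel.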

We have used the AProVE termination tool [23] and Maude’s Church-Rosser and coherence checker [20] to verify the termination, confluence, and coherence properties that are required for decomposing the equational theories of our case studies.

Finally, we define well-typed substitutions, which are substitutions that respect subtyping.
Definition 3.4 (Well-typed substitutions).

A substitution θ is well-typed if $Γ ((X θ) ↓_{R, A x}) ≼ Γ (X)$ for all $X \in dom (θ)$ .

Since the type of any variable is atomic, this definition is independent of the representative of the $A x$ -equivalence class chosen for the normalized term. Hence, it is well-defined.

3.3. The finite variant property

The finite variant property simplifies equality checking and unification in equational theories. Given an equational theory $E = (Σ, E)$ and a term t, an $E$-variant of t is a pair $(t^{'}, θ)$ such that $t θ =_{E} t^{'}$. A decomposition $R = (Σ, A x, R)$ of $E$ (and hence $E$) has the finite variant property if for all terms $t \in T_{Σ} (V)$, there is a finite set $\{(t_{1}, θ_{1}), \dots, (t_{n}, θ_{n})\}$ of $E$-variants of t such that $t_{i}$ is $R, A x -normal$ and $dom (θ_{i}) \subseteq vars (t)$ for all $i \in \tilde{n}$, and for all substitutions σ, there are a substitution η and $i \in \tilde{n}$ such that

$(t σ) ↓_{R, A x} =_{A x} t_{i} η$, and

$(X σ) ↓_{R, A x} =_{A x} (X θ_{i}) η$ for all $X \in vars (t)$.

We also call $R$ a finite-variant decomposition of $E$. Given such a decomposition, the algorithm in [22], based on the folding-variant narrowing strategy, computes a finite, complete, and minimal set of $R, A x$-variants of a given term t, denoted by ${⟦ t ⟧}_{R, A x}$. This set is unique up to $=_{A x}$-equality.

Example 3.5.
Consider the XOR theory from Example 3.3 and the terms $s = X \oplus Y \oplus X$ and $t = X \oplus Y$. Then, with $i d$ denoting the identity substitution, the complete and minimal sets of $R, A x$-variants of these terms are ${⟦ s ⟧}_{R, A x} = \{(Y, i d)\}$ and $\begin{matrix} {⟦ t ⟧}_{R, A x} = \{ & (X \oplus Y, i d), \\ & (Z, \{X \mapsto 0, Y \mapsto Z\}), \\ & (Z, \{X \mapsto Z, Y \mapsto 0\}), \\ & (Z, \{X \mapsto Z \oplus U, Y \mapsto U\}), \\ & (Z, \{X \mapsto U, Y \mapsto Z \oplus U\}), \\ & (0, \{X \mapsto U, Y \mapsto U\}), \\ & (Z_{1} \oplus Z_{2}, \{X \mapsto U \oplus Z_{1}, Y \mapsto U \oplus Z_{2}\}) \} . \end{matrix}$
Assumption 3.6.
For our theoretical development, we consider an arbitrary but fixed equational theory $E = (Σ, E)$ with a well-formed finite-variant decomposition $R = (Σ, A x, R)$ . We also assume that $R$ includes function symbols and rewrite rules for pairing and projections.

3.4. Protocols

We specify a security protocol as a partial function from agent variables to roles. A role is a sequence of events. We distinguish three types of events: send events, receive events, and signal events. A send event $send (t)$ indicates the transmission of a message that is an instance of the term t. Likewise, a receive event $recv (t)$ indicates the reception of a message that matches t. We assume a fixed set $Sig$ of signal events disjoint from $\{send, recv\}$. A signal event $s i g \in Sig$ marks a progressive stage of an agent playing a role, i.e., it tells how far the agent has been executing. We use signal events to specify security properties. Past research has also employed signal events to express various authentication properties [43,44].

Given a set of terms T, we define the set of events $Evt (T) = \{send (t), recv (t) ∣ t \in T\} \cup Sig$. We also define $term (e v (t)) = t$ for event $e v \in \{send, recv\}$ and leave it undefined for signals. A role is a sequence of events from $Evt (M)$. We lift $term (\cdot)$ in the obvious way to sets and sequences of events.

Definition 3.7 (Protocol).

A protocol is a partial function $P : V_{α} ⇀ Evt {(M)}^{*}$ mapping agent variables to roles. Let $M_{P} = term (ran (P))$ be the set of protocol terms appearing in the roles of P, and let $V_{P}$ , $A_{P}$ , $F_{P}$ , and $C_{P}$ denote the sets of variables, agents, fresh values, and constants in $M_{P}$ .

Example 3.8 ( $IK E_{m}^{}$ protocol).

We formalize the $IK E_{m}^{}$ protocol from Section 2 in the rewrite theory of Example 3.2 as follows. The atoms $Σ_{c s}^{0}$ are composed of constants $C = {g, o, sA 1, sA 2, tSa, tSb}$ and fresh values $F = {na, nb, x, y, sPIa, sPIb}$ . The variables and their types are $A, B : α$ , $Ga, Gb : msg$ , $SPIa, SPIb, Na, Nb : nonce$ where $nonce$ is a user-defined type that satisfies $β_{n} ≼_{0} nonce$ for all $n \in F$ . We model $mac$ , $kdf$ , and $prf$ as hash functions. We also assume a set of signal events $Sig = {Running, Commit, Secret}$ . We later use $Running$ and $Commit$ to specify authentication properties and $Secret$ to specify secrecy properties (see Example 3.11). We formulate the initiator role A and the responder role B as follows. $\begin{array}{l} \begin{matrix} IK E_{m}^{} (A) & = send (sPIa, o, sA 1, \exp (g, x), na) \cdot recv (sPIa, SPIb, sA 1, Gb, Nb) \cdot Running \cdot \\ send (sPIa, SPIb, {| A, B, AUTHaa, sA 2, tSa, tSb |}_{SKa}) \cdot \\ recv (sPIa, SPIb, {| B, AUTHba, sA 2, tSa, tSb |}_{SKa}) \cdot Secret \cdot Commit \end{matrix} \\ \begin{matrix} IK E_{m}^{} (B) & = recv (SPIa, o, sA 1, Ga, Na) \cdot send (SPIa, sPIb, sA 1, \exp (g, y), nb) \cdot \\ recv (SPIa, sPIb, {| A, B, AUTHab, sA 2, tSa, tSb |}_{SKb}) \cdot Running \cdot \\ send (SPIa, sPIb, {| B, AUTHbb, sA 2, tSa, tSb |}_{SKb}) \cdot Secret \cdot Commit \end{matrix} \end{array}$ where the terms $\begin{array}{l} SKa = kdf (na, Nb, \exp (Gb, x), sPIa, SPIb) \\ SKb = kdf (Na, nb, \exp (Ga, y), SPIa, sPIb) \\ AUTHaa = mac (sh (A, B), sPIa, o, sA 1, \exp (g, x), na, Nb, prf (SKa, A)) \\ AUTHab = mac (sh (B, A), SPIa, o, sA 1, Ga, Na, nb, prf (SKb, A)) \\ AUTHba = mac (sh (A, B), sPIa, SPIb, sA 1, Gb, Nb, na, prf (SKa, B)) \\ AUTHbb = mac (sh (B, A), SPIa, sPIb, sA 1, \exp (g, y), nb, Na, prf (SKb, B)) \end{array}$ respectively represent the initiator A and the responder B’s view of the session key $SK$ and of the authenticators $AUTHa$ and $AUTHb$ .

3.5. Operational semantics

In this section, we introduce an operational semantics for security protocols. This semantics specifies the dynamic behaviour of the protocol roles when their events are executed. The protocol messages are sent to and received from the adversary, whom we identify with the network as usual.

Fig. 1. Intruder deduction rules (where $Σ_{pub}^{⩾ 1} = Σ^{⩾ 1} \cap Σ_{pub}$).

We use a Dolev–Yao adversary model parametrized by an equational theory E. Its judgements are of the form $T ⊢_{E} t$ meaning that the intruder can derive term t from the set of terms T. The derivable judgements are defined in a standard way by the three deduction rules in Fig. 1.
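As a rough illustration of what a derivability check involves, the following Python sketch decides $T ⊢ t$ in a deliberately restricted setting: ground terms over a free algebra with pairing and symmetric encryption as the only invertible operations. It uses the classic analysis and synthesis closure rather than the deduction rules of Fig. 1, which are parametrized by an equational theory, so it should be read as a simplified substitute and not as the adversary model of this paper. The tuple encoding, the function names, and the chosen set `PUBLIC` of public symbols are assumptions.

```python
# Public function symbols the intruder may apply (private ones such as sh
# and pri are deliberately left out).
PUBLIC = frozenset({"pair", "senc", "mac", "kdf", "prf", "exp"})

def derivable(t, known, public=PUBLIC):
    """Synthesis: t is known or built from derivable terms with a public symbol."""
    if t in known:
        return True
    return (isinstance(t, tuple) and t[0] in public
            and all(derivable(a, known, public) for a in t[1:]))

def analyze(knowledge):
    """Analysis: close a set of ground terms under projection and decryption."""
    known = set(knowledge)
    changed = True
    while changed:
        changed = False
        for t in list(known):
            if not isinstance(t, tuple):
                continue
            if t[0] == "pair":
                parts = t[1:]                   # both components of a pair
            elif t[0] == "senc" and derivable(t[2], known):
                parts = (t[1],)                 # plaintext, if the key is derivable
            else:
                parts = ()
            for p in parts:
                if p not in known:
                    known.add(p)
                    changed = True
    return known

def intruder_derives(knowledge, t):
    return derivable(t, analyze(knowledge))

# The intruder observes a pair containing a ciphertext and also knows the key k.
observed = {("pair", "na", ("senc", "secret", "k")), "k"}
can_forge = intruder_derives(observed, ("mac", "na", "secret"))
```

The fixpoint loop in `analyze` is needed because a decryption key may itself become derivable only after other messages have been taken apart.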

When a protocol is executed, each of its roles can be executed an arbitrary number of times by possibly different agents in parallel. Such a single execution of a role is called a thread. We distinguish between different threads by associating each thread with a unique thread identifier. We index variables and fresh values with the thread identifier i to syntactically distinguish them from those of other threads. This ensures the uniqueness of fresh values.

Let $TID$ be a countably infinite set of thread identifiers. We define the indexing of a term t with $i \in TID$ as the term $t^{i}$ where every variable or fresh value u is replaced by $u^{i}$. Constants and agents remain unchanged. For a set of messages $M \subseteq \mathcal{M}$, we define by $M^{TID} = \{t^{i} ∣ t \in M \land i \in TID\}$ the corresponding set of indexed terms. We assume that $V \cap V^{TID} = \emptyset$ and $F \cap F^{TID} = \emptyset$. For variables and fresh values u, we define $Γ (u^{i}) = Γ (u)$. Hence, indexing a term does not affect its type, i.e., we have $Γ (t^{i}) = Γ (t)$. We extend indexing to (send and receive) events by applying it to the terms they contain. We also define the set of intruder-generated fresh values as $F^{∙} = \{n_{k}^{∙} ∣ n \in F \land k \in \mathbb{N}\}$ with types $Γ (n_{k}^{∙}) = Γ (n) = β_{n}$.
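Indexing can be pictured as a simple recursive renaming. The Python sketch below is an illustration under assumed conventions: terms are nested tuples, variables are strings with an upper-case initial, the set of fresh values is passed explicitly, and an indexed name is written as the string `name^i`. None of these encoding choices come from the formal model.

```python
def index(t, i, fresh):
    """Return t^i: tag every variable and fresh value of t with thread id i."""
    if isinstance(t, tuple):
        # Function symbols are kept; only the arguments are indexed.
        return (t[0],) + tuple(index(a, i, fresh) for a in t[1:])
    if t[:1].isupper() or t in fresh:
        return f"{t}^{i}"                       # variable or fresh value
    return t                                    # constants and agents unchanged

fresh = {"na", "x"}
msg = ("pair", "o", ("pair", ("exp", "g", "x"), ("pair", "na", "Nb")))
msg_1 = index(msg, 1, fresh)
msg_2 = index(msg, 2, fresh)
```

Indexing the same role term with two different thread identifiers, as in `msg_1` and `msg_2`, yields distinct fresh values, which is how the uniqueness of fresh values across threads is obtained.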

For example, suppose that thread i plays role A and is owned by $alice$. Hence, the agent variable $A^{i}$ is bound to $alice$. Suppose thread i contains a receive event $recv (\{na, Nb\}_{pk (A)})$, meaning that it expects a message of the form $\{{na}^{i}, m\}_{pk (alice)}$ for some message m, which is bound to the variable ${Nb}^{i}$. Such a message might originate from some thread j (e.g., with $m = {nb}^{j}$ a nonce generated by thread j) or from the adversary (e.g., with $m = n_{0}^{∙}$ a nonce generated by the adversary).

We thus define the set of network messages exchanged during protocol executions by $\begin{matrix} N = T (V^{TID}, A \cup C \cup F^{TID} \cup F^{∙}) . \end{matrix}$ Note that $M^{TID} \subseteq N$.

Given a protocol P, we define a transition system with states $(t r, t h, σ)$ , where

$t r \in {(TID \times Evt (M_{P}))}^{*}$ is a trace consisting of a sequence of pairs of thread identifiers and events,

$t h : TID ⇀ dom (P) \times Evt {(M_{P})}^{*}$ are threads, each executing some protocol role, and

$σ : V^{TID} ⇀ N$ is a well-typed ground substitution mapping instantiated protocol variables to network messages.

The trace $t r$ as well as the executing role are symbolic (with terms in $M_{P}$). The substitution σ instantiates these messages to (ground) network messages as follows. The ground trace $t r σ \in Evt (N)$ associated with such a state is recursively defined by

$\begin{matrix} ϵ σ = ϵ \quad \text{and} \quad ((i, e) \cdot t r) σ = (i, e^{i} σ) \cdot t r σ, \end{matrix}$

where ϵ denotes the empty sequence. The set ${Init}_{P}$ of initial states is defined by

$\begin{matrix} {Init}_{P} = \{(ϵ, t h, σ) ∣ \forall i \in dom (t h) . \exists R \in dom (P) . t h (i) = (R, P (R)) \land V_{P}^{TID} \subseteq dom (σ)\} . \end{matrix}$

Fig. 2. Operational semantics.

The rules in Fig. 2 define the transitions. The first premise of each rule respectively states that a send, receive, or signal event heads thread i’s role. This event is removed and added together with the thread identifier i to the trace $t r$. The second premise of $RECV$ requires that the network message $t^{i} σ$ matching the term t in the receive event is derivable from the intruder’s (ground) knowledge $IK (t r) σ \cup {IK}_{0}$. Here, $IK (t r)$ denotes the (symbolic) intruder knowledge derived from a trace $t r$ as the set of terms in the send events on $t r$, instantiated with the respective thread id, i.e., $\begin{matrix} IK (t r) = \{t^{i} ∣ (i, send (t)) \in t r\} \end{matrix}$ and ${IK}_{0}$ denotes the intruder’s (ground) initial knowledge. Note that the $SEND$ rule thus implicitly updates the intruder knowledge. The rule $SIGNAL$ expresses that the signal events’ only effect is to record a signal $s \in Sig$ in the trace. Note that transitions do not change the substitution σ; it is fixed with the (non-deterministic) choice of the initial state.

Finally, we define the semantics of a protocol P with respect to the intruder’s initial knowledge ${IK}_{0}$ as the set of states reachable from the initial states: $\begin{matrix} reach (P, {IK}_{0}) = \{(t r, t h, σ) ∣ \exists s_{0} \in {Init}_{P} . s_{0} \to^{*} (t r, t h, σ)\} \end{matrix}$ where $\to^{*}$ is the reflexive-transitive closure of the transition relation →. Note that these relations depend on ${IK}_{0}$ due to the rule $RECV$. Later, we will use several sets representing the intruder’s initial knowledge for which we state the following global assumption.

Assumption 3.9 (Intruder’s initial knowledge).

We assume that the intruder’s initial knowledge ${IK}_{0}$ is a set of $R, A x -normal$ ground network messages that contains all constants, agents, and intruder-generated fresh values, but no fresh values generated by the protocol, i.e., $C \cup A \cup F^{∙} \subseteq {IK}_{0}$ and $F^{TID} \cap {IK}_{0} = \emptyset$ .

This assumption specifies the minimal requirements. The attacker usually also knows the long-term shared and private keys of the compromised agents and the public keys of all agents, i.e., the keys in $sh (A_{C}, A), sh (A, A_{C}), pri (A_{C})$ , and $pk (A)$ . However, since our proofs do not rely on these keys being included in ${IK}_{0}$ , they do not appear in our assumption.

Example 3.10 (Example trace).

We provide an example trace of a partial honest execution. In this trace, Alice performs a partial session with Bob, up to the point of Bob’s $Secret$ . Consider the initial state $s_{0} = (ϵ, t h, σ)$ where $t h$ at least contains $\begin{array}{l} t h (1) = (A, P (A)) \\ t h (2) = (B, P (B)) \end{array}$ and where σ meets the condition $\begin{matrix} σ & \supseteq {A^{1} \mapsto alice, A^{2} \mapsto alice, \\ B^{1} \mapsto bob, B^{2} \mapsto bob, \\ {Gb}^{1} \mapsto \exp (g, y^{2}), {Ga}^{2} \mapsto \exp (g, x^{1}), \\ {SPIb}^{1} \mapsto {sPIb}^{2}, {SPIa}^{2} \mapsto {sPIa}^{1}, \\ {Nb}^{1} \mapsto {nb}^{2}, {Na}^{2} \mapsto {na}^{1}} \end{matrix}$ In this case, one reachable state $(t r, t h^{'}, σ)$ has the trace: $\begin{matrix} t r & = (1, send (sPIa, o, sA 1, \exp (g, x), na)) \cdot \\ (2, recv (SPIa, o, sA 1, Ga, Na)) \cdot \\ (2, send (SPIa, sPIb, sA 1, \exp (g, y), nb)) \cdot \\ (1, recv (sPIa, SPIb, sA 1, Gb, Nb)) \cdot \\ (1, Running) \cdot \\ (1, send (sPIa, SPIb, {| A, B, AUTHaa, sA 2, tSa, tSb |}_{SKa})) \cdot \\ (2, recv (SPIa, sPIb, {| A, B, AUTHab, sA 2, tSa, tSb |}_{SKb})) \cdot \\ (2, Running) \cdot \\ (2, send (SPIa, sPIb, {| B, AUTHbb, sA 2, tSa, tSb |}_{SKb})) \cdot \\ (2, Secret) \end{matrix}$ where $t h^{'}$ denotes the threads after executing these events and $SKa$ , $SKb$ , $AUTHaa$ , $AUTHab$ , and $AUTHbb$ are as defined in Example 3.8.

In this trace, the adversary does not interfere. There are also traces in which he does interfere, e.g., traces in which the adversary sends the first message. In such traces, the first event could be a responder receive, for a suitable choice of σ in the initial state.

3.6. Property language

Meier et al. [33] define a predicate-based security property language. In this language, many security properties such as those from [13,16,32] can be specified. In this section, we introduce a specification language for security properties based on [33]. Our language is similar to the languages used in [1,21,26].

Syntax. Our property specification language is an instance of first-order logic with formulas in negation normal form (i.e., only atomic formulas can be negated). Let $X$ be a set of thread identifier variables disjoint from $V$ . The language consists of the following formulas over atomic predicates Q defined below. Explicit quantification is allowed only over thread identifier variables. $\begin{matrix} ϕ : : = Q ∣ \neg Q ∣ ϕ_{1} \land ϕ_{2} ∣ ϕ_{1} \lor ϕ_{2} ∣ \forall ι . ϕ^{'} ∣ \exists ι . ϕ^{'} \end{matrix}$

The atomic predicates and their informal meaning are as follows, where $ι, κ \in X$ are thread-id variables, $t, u \in M$ are messages, $R \in V_{α}$ is a role name, and $e, e^{'} \in Evt (M)$ are events. $\begin{matrix} Q & : : = & ι = κ & thread ι and thread κ are equal \\ ∣ & eq (ι, κ, t, u) & message t in thread ι ’s view equals message u in thread κ ’s view \\ ∣ & secret (ι, t) & the intruder does not know message t as seen by thread ι \\ ∣ & honest (ι, R) & the agent playing role R in thread ι ’s view is honest \\ ∣ & role (ι, R) & thread ι executes role R \\ ∣ & steps (ι, e) & thread ι has executed event e \\ ∣ & (ι, e) ≺ (κ, e^{'}) & thread ι has executed event e before thread κ has executed event e^{'} \end{matrix}$ We use some syntactic sugar and write $t^{@ ι} = u^{@ κ}$ for $eq (ι, κ, t, u)$ . An atomic predicate or negated atomic predicate is called a literal. We say that an atomic predicate Q occurs positively (negatively) in a formula ϕ if there is a non-negated (negated) occurrence of Q in ϕ. To achieve attack preservation, we focus on the fragment of this logic where the predicate $secret (ι, t)$ only occurs positively. We call this language $L_{P}$ . A property is a formula of $L_{P}$ in which all thread-id variables appear in the scope of a quantifier. In examples, we freely use standard abbreviations (e.g., for implication) in formulas if there is an equivalent negation normal form in $L_{P}$ . We also write $honest (ι, {A_{1}, \dots, A_{n}})$ as an abbreviation for $⋀_{k = 1}^{n} honest (ι, A_{k})$ .

Semantics. We define the semantics of our language $L_{P}$ . Recall that $A_{H}$ denotes the set of honest agents. For a trace $t r$ , we define a total ordering $≺_{t r}$ over events occurring in $t r$ such that $a ≺_{t r} b$ if $t r = t r_{1} \cdot a \cdot t r_{2} \cdot b \cdot t r_{3}$ for some $t r_{1}$ , $t r_{2}$ , and $t r_{3}$ . The relation $≺_{t r}$ is crucial to express security properties that impose strong ordering constraints between events such as synchronization [16] (see also Section 6.1).

Let $s = (t r, t h, σ)$ be a state of the protocol P and let ϑ be a substitution interpreting thread-id variables from $X$ as thread identifiers in $dom (t h)$ . Given an equational theory E, we define formula satisfaction, $(s, ϑ) ⊨_{E} ϕ$ , as follows: $\begin{matrix} (s, ϑ) ⊨_{E} ι = κ & iff & ϑ (ι) = ϑ (κ) \\ (s, ϑ) ⊨_{E} t^{@ ι} = u^{@ κ} & iff & t^{ϑ (ι)} σ =_{E} u^{ϑ (κ)} σ \\ (s, ϑ) ⊨_{E} secret (ι, t) & iff & IK (t r) σ \cup {IK}_{0} ⊬_{E} t^{ϑ (ι)} σ \\ (s, ϑ) ⊨_{E} honest (ι, R) & iff & R^{ϑ (ι)} σ \in A_{H} \\ (s, ϑ) ⊨_{E} role (ι, R) & iff & π_{1} (t h (ϑ (ι))) = R \\ (s, ϑ) ⊨_{E} steps (ι, e) & iff & (ϑ (ι), e) \in t r \\ (s, ϑ) ⊨_{E} (ι, e) ≺ (κ, e^{'}) & iff & (ϑ (ι), e) ≺_{t r} (ϑ (κ), e^{'}) \\ (s, ϑ) ⊨_{E} \neg Q & iff & not (s, ϑ) ⊨_{E} Q \\ (s, ϑ) ⊨_{E} ϕ_{1} \land ϕ_{2} & iff & (s, ϑ) ⊨_{E} ϕ_{1} and (s, ϑ) ⊨_{E} ϕ_{2} \\ (s, ϑ) ⊨_{E} ϕ_{1} \lor ϕ_{2} & iff & (s, ϑ) ⊨_{E} ϕ_{1} or (s, ϑ) ⊨_{E} ϕ_{2} \\ (s, ϑ) ⊨_{E} \forall ι . ϕ^{'} & iff & (s, ϑ [ι \mapsto i]) ⊨_{E} ϕ^{'} for all i \in d o m (t h) \\ (s, ϑ) ⊨_{E} \exists ι . ϕ^{'} & iff & (s, ϑ [ι \mapsto i]) ⊨_{E} ϕ^{'} for some i \in d o m (t h) \end{matrix}$ Here, $⊬_{E}$ means that the term on the right is not derivable from the set on the left.

For properties ϕ, we write $s ⊨_{E} ϕ$ instead of $(s, ϑ) ⊨_{E} ϕ$ . A protocol P satisfies a property ϕ if $s ⊨_{E} ϕ$ holds for all reachable states s of P. We write $s ⊭_{E} ϕ$ if $s ⊨_{E} ϕ$ does not hold. We call a reachable state s of P an attack on ϕ if $s ⊭_{E} ϕ$ .
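To make the satisfaction relation more tangible, the following Python sketch evaluates the term-free fragment of the language on a state. It is an illustration under stated assumptions, not part of the formal development: formulas are encoded as hypothetical tagged tuples, events are plain strings, and the predicates $eq$, $secret$, and $honest$ are omitted because they require an equational theory and a deduction procedure.

```python
def holds(state, theta, phi):
    """Evaluate (s, theta) |= phi for the fragment without eq, secret, and honest.

    state -- (trace, threads, sigma); trace is a list of (thread id, event),
             threads maps a thread id to (role, remaining events)
    theta -- dict interpreting thread-id variables as thread ids
    phi   -- formula as a tagged tuple (hypothetical encoding)
    """
    tr, th, _sigma = state
    tag = phi[0]
    if tag == "tid_eq":                       # iota = kappa
        return theta[phi[1]] == theta[phi[2]]
    if tag == "role":                         # role(iota, R)
        return th[theta[phi[1]]][0] == phi[2]
    if tag == "steps":                        # steps(iota, e)
        return (theta[phi[1]], phi[2]) in tr
    if tag == "before":                       # (iota, e) < (kappa, e')
        a = (theta[phi[1][0]], phi[1][1])
        b = (theta[phi[2][0]], phi[2][1])
        return any(x == a and b in tr[p + 1:] for p, x in enumerate(tr))
    if tag == "not":                          # applied to atomic predicates only
        return not holds(state, theta, phi[1])
    if tag == "and":
        return holds(state, theta, phi[1]) and holds(state, theta, phi[2])
    if tag == "or":
        return holds(state, theta, phi[1]) or holds(state, theta, phi[2])
    if tag in ("forall", "exists"):           # quantify over dom(th)
        results = (holds(state, {**theta, phi[1]: i}, phi[2]) for i in th)
        return all(results) if tag == "forall" else any(results)
    raise ValueError(f"unknown formula tag: {tag}")


# Signal events of the trace in Example 3.10 (send and receive events elided).
state = ([(1, "Running"), (2, "Running"), (2, "Secret")],
         {1: ("A", ()), 2: ("B", ())}, {})

# forall i. (role(i,B) and steps(i,Secret)) => exists k. role(k,A) and (k,Running) < (i,Secret)
phi = ("forall", "i",
       ("or", ("not", ("role", "i", "B")),
        ("or", ("not", ("steps", "i", "Secret")),
         ("exists", "k",
          ("and", ("role", "k", "A"),
           ("before", ("k", "Running"), ("i", "Secret")))))))
result = holds(state, {}, phi)
```

The implication is written in negation normal form, as required by the syntax of $L_{P}$.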

In the following example, we present our formalizations of secrecy and authentication properties for the $IK E_{m}^{}$ protocol. Additional examples of properties are given in Section 6.1.

Example 3.11 (Properties of $IK E_{m}^{}$ ).

We express the secrecy of the Diffie–Hellman key $\exp (Gb, x)$ for role A of the protocol $IK E_{m}^{}$ of Example 3.8 as follows. $\begin{matrix} ϕ_{\sec} = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Secret)) \Rightarrow secret (ι, \exp (Gb, x)) . \end{matrix}$ Intuitively, $ϕ_{\sec}$ states that whenever an agent a playing role A completes his thread with another agent b playing role B and both a and b are honest, the key $\exp (Gb, x)$ is secret. In this protocol, the completion of the thread coincides with the presence of the $Secret$ signal event in a trace.

We formalize non-injective agreement of A with B [32] on the nonces $na$ and $nb$ and the Diffie–Hellman half-keys $\exp (g, x)$ and $\exp (g, y)$ by $\begin{matrix} ϕ_{auth} & = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Commit)) \\ \Rightarrow (\exists κ . role (κ, B) \land steps (κ, Running) \land \\ {⟨ A, B, na, Nb, \exp (g, x), Gb ⟩}^{@ ι} = {⟨ A, B, Na, nb, Ga, \exp (g, y) ⟩}^{@ κ}) . \end{matrix}$ The formula $ϕ_{auth}$ states that whenever an agent a playing role A completes his thread with another agent b playing role B and both agents are honest, then b has previously been running the protocol with a. Moreover, a and b agree on $na$ and $nb$ and the Diffie–Hellman half-keys $\exp (g, x)$ and $\exp (g, y)$ . The authentication on these values is formulated by the equality in the formula, which also includes the agreement on the participating agents and their roles.

Note that our property language does not allow expressing the general notion of injective agreement as defined by Lowe [32], which amounts to counting the numbers of $Commit$ and $Running$ signals occurring in the trace. However, we can express a stronger version of injective agreement as an agreement where there is at most one $Commit$ signal for a given message to be agreed on. This trivially implies the injectiveness of the agreement. This property is suitable for protocols where the role emitting the $Commit$ signal contributes a fresh value to the message to be agreed on, in which case the two definitions coincide. For instance, we formalize the injective agreement of role A with role B on the Diffie–Hellman half-keys $\exp (g, x)$ and $\exp (g, y)$ by $\begin{matrix} ϕ_{i a u t h} & = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Commit)) \\ \Rightarrow (\exists κ . role (κ, B) \land steps (κ, Running) \land \\ {⟨ A, B, \exp (g, x), Gb ⟩}^{@ ι} = {⟨ A, B, Ga, \exp (g, y) ⟩}^{@ κ}) \land \\ (\forall λ . (role (λ, A) \land steps (λ, Commit) \land \\ {⟨ A, B, \exp (g, x), Gb ⟩}^{@ λ} = {⟨ A, B, \exp (g, x), Gb ⟩}^{@ ι}) \Rightarrow λ = ι) \end{matrix}$

Remark 3.12.
An alternative formulation of our protocol semantics and property language, suggested by one of the reviewers, is obtained by viewing each variable and fresh value as a unary function symbol and keeping the thread identifier variables as the only variables of the property language. The set of network messages would thus become $N^{alt} = T_{Σ \cup V_{P} \cup F_{P}} (X, A \cup C \cup F^{∙} \cup TID)$ . We briefly discuss what such a setup could look like and how it compares to ours.

The substitutions $σ : V^{TID} ⇀ N$ in the states would be replaced by first-order structures $σ : V \to (TID ⇀ N^{alt})$ interpreting the function symbols associated with the protocol variables as type-respecting functions mapping thread identifiers to network messages. We would leave the function symbols in $F_{P}$ uninterpreted as a simple way to model the uniqueness of fresh values. More precisely, the interpretation $‖ t ‖_{(σ, ϑ)}$ of a network message t would be $‖ V (ι) ‖_{(σ, ϑ)} = σ (V) (ϑ (ι))$ for $V \in V_{P}$ , $‖ n (ι) ‖_{(σ, ϑ)} = n (ϑ (ι))$ for $n \in F_{P}$ and extended homomorphically to all terms. Note that this interpretation is isomorphic to ours if we use thread variables and identifiers in $N^{alt}$ only as arguments of the function symbols in $V_{P}$ and $F_{P}$ .

We see two possibilities for dealing with protocol specifications in such an approach. The first possibility is to keep protocol specifications unchanged, i.e., using messages from $M$ , but replace the indexing of variables and fresh values in network messages by function application, i.e., we would have $V^{ι} = V (ι)$ and $n^{ι} = n (ι)$ for variables and fresh values and extend this to all terms as expected. One could keep the syntax of the property language, but would adapt its interpretation. For example, term equations would still be written $t^{@ ι} = u^{@ κ}$ for $t, u \in M$ , but the semantics would become $‖ t^{ι} ‖_{(σ, ϑ)} =_{E} ‖ u^{κ} ‖_{(σ, ϑ)}$ . While remaining very close to our formulation, the disadvantage of this approach is the non-uniform treatment of messages in specifications (with variables and substitutions as before) and network messages (with the new interpretation). This would complicate the development of our abstraction theory, as it applies abstractions to both protocol messages in specifications and network messages in traces.

The second possibility is to also use messages from $N^{alt}$ in protocol specifications. In this approach, every protocol role would be parametrized by a thread-id variable ι, which is used as an argument of all function symbols in $V_{P}$ and $F_{P}$ in the role. This variable would be instantiated with some $i \in TID$ to create the actual thread i (cf. definition of initial state). Indexing would no longer be needed. We could adapt the property language accordingly. For example, term equations would be written as $t = u$ for $t, u \in N^{alt}$ and interpreted as $‖ t ‖_{(σ, ϑ)} =_{E} ‖ u ‖_{(σ, ϑ)}$ . This would yield a more uniform picture again at the price of cluttering all variables and fresh values in protocol specifications with thread-id variables.

In both cases, the operational semantics and numerous details would have to be carefully adapted. We believe that our setup strikes a good balance between an economic notation for protocol specifications and a uniform treatment of different kinds of messages in our abstraction theory.

4. Security protocols abstractions

In this section, we present two kinds of protocol abstractions:

Typed abstractions transform a term’s structure by removing or reordering fields and by removing or splitting cryptographic operations. The types enable a fine-grained selection of the transformation to apply. The same transformation is applied to all terms of a given type and its subtypes.

Untyped abstractions complement typed ones with two additional kinds of simplifications: atom/variable removal abstractions and redundancy removal abstractions. The former remove unprotected atoms or variables while the latter remove terms that the intruder can derive.

Typically, we will use typed abstractions to simplify the cryptographic structure of terms followed by untyped abstractions to remove atoms and variables as well as redundancies.

In Section 4.1, we give an overview of the different kinds of abstractions and their combined use. We then proceed with the formal definitions and results for our protocol abstractions that we will apply in the following sections. Our main results are soundness theorems for the typed and untyped abstractions. They ensure that any attack on a given property of the original protocol translates to an attack on the abstracted protocol. Similar to [28], we follow a modular approach for proving this property. We first define a general notion of protocol abstraction for which we prove a general soundness theorem under certain conditions (Section 4.2). These conditions concern the preservation of intruder deducibility as well as of equalities and disequalities. We then go on to define each concrete kind of abstraction and prove its soundness (Sections 4.3–4.5). For these soundness proofs it suffices to establish the conditions of the general soundness theorem. As we will see, each such soundness result in turn imposes certain conditions, which we will introduce and motivate by examples. We illustrate the usefulness of our definitions on our running example.

Upon first reading, readers may choose to skip the remainder of this section after reading the following overview and proceed to the next sections to get an impression of how we will use the abstractions.

4.1. Overview

Typed abstractions are our main mechanism to simplify the cryptographic structure of terms by removing protections that are not required to achieve a given property. We specify typed abstractions by a list of recursive equations. The following example illustrates a range of typical forms of defining equations. Messages are transformed according to the first matching pattern. If no pattern matches then the top-level symbol is transformed homomorphically. Typed abstractions leave atoms and variables untouched.

Example 4.1 (Typed abstractions).

Consider a simplified variant of the $IK E_{m}^{}$ protocol from Section 2 and Example 3.8, where in the first two messages each role sends the constant $sA 1$ , its Diffie–Hellman half-key, and a nonce and we authenticate the final two messages using signatures instead of MACs. We focus here on the final two events of the initiator A: $\begin{array}{l} send ({| A, B, sA 2, {[m 3, sA 1, na, Nb, \exp (g, x), SKa]}_{pri (A)} |}_{SKa}) \cdot \\ recv ({| B, sA 2, {[m 4, sA 1, na, Nb, Gb, SKa]}_{pri (B)} |}_{SKa}) \end{array}$ where $SKa = kdf (\exp (Gb, x), na, Nb)$ and $m 3$ and $m 4$ are tagging constants distinguishing the two messages. Suppose our goal is to verify that the initiator non-injectively agrees with the responder on $na$ , $\exp (g, x)$ , and $Gb$ . For this purpose, we aim at simplifying these events as follows: $\begin{array}{l} send ({[m 3, \exp (g, x), kdf (\exp (Gb, x))]}_{pri (A)}) \cdot \\ recv ({[m 4, na, Gb, kdf (\exp (Gb, x))]}_{pri (B)}) \end{array}$ Note that we drop $na$ and $Nb$ from the third message and $Nb$ from the fourth. To achieve this, we first specify a typed abstraction using the following four equations: $\begin{array}{l} f ({| X |}_{Y}) = f (X) \\ f (kdf (X, Y)) = kdf (f (X)) \\ f ({[T_{3}, S, N_{1}, N_{2}, Y]}_{pri (Z)}) = ⟨ {[f (T_{3}), f (Y)]}_{pri (f (Z))}, f (S), f (N_{1}), f (N_{2}) ⟩ \\ f ({[T_{4}, S, N_{1}, N_{2}, Y]}_{pri (Z)}) = ⟨ {[f (T_{4}), f (N_{1}), f (Y)]}_{pri (f (Z))}, f (S), f (N_{2}) ⟩ \end{array}$ where all variables have type $msg$ except for $T_{3} : γ_{m 3}$ and $T_{4} : γ_{m 4}$ . The equation for symmetric encryption simply drops the encryption and the one for the key derivation function $kdf$ drops the second component of the pair underneath it. There are also two equations for signatures. They both pull the tuple components S and $N_{2}$ out of the signature. The first equation additionally pulls out $N_{1}$ . 
Note that, when transforming the protocol’s events, the variable Y will match the pair of messages consisting of the Diffie–Hellman half-key and the session key. Note also that the patterns on the left-hand side of these two equations are identical except for the types of the variables $T_{3}$ and $T_{4}$ , which respectively match the tags $m 3$ and $m 4$ . Only the types allow us to distinguish the third and the fourth protocol messages and to transform them in different ways. Sound typed abstractions cannot remove arbitrary fields: while the equation for $kdf$ removes Y, those for the signatures pull some tuple components out of the signature instead of removing them (as we would like to).

Let us now apply this typed abstraction to the fourth message in A’s role. We elide the application of f to atoms and variables (where f is the identity) and to pairs, $pri$ and $\exp$ (where f behaves homomorphically). $\begin{matrix} f ({| B, sA 2, {[m 4, sA 1, na, Nb, Gb, SKa]}_{pri (B)} |}_{SKa}) \\ = ⟨ B, sA 2, f ({[m 4, sA 1, na, Nb, Gb, SKa]}_{pri (B)}) ⟩ \\ = ⟨ B, sA 2, {[m 4, na, Gb, f (SKa)]}_{pri (B)}, sA 1, Nb ⟩ \\ = ⟨ B, sA 2, {[m 4, na, Gb, f (kdf (\exp (Gb, x), na, Nb))]}_{pri (B)}, sA 1, Nb ⟩ \\ = ⟨ B, sA 2, {[m 4, na, Gb, kdf (\exp (Gb, x))]}_{pri (B)}, sA 1, Nb ⟩ \end{matrix}$ The third message, which is signed with $pri (A)$ , abstracts to $⟨ A, B, sA 2, {[m 3, \exp (g, x), kdf (\exp (Gb, x))]}_{pri (A)}, sA 1, na, Nb ⟩$ .
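The four equations of this example can be replayed mechanically. The Python sketch below is an illustration under several assumptions that are not part of the formal development: atoms and variables are strings, compound terms are tuples `(symbol, args...)`, n-ary tuples are right-nested pairs, $kdf$ takes a single tuple argument, and the types $γ_{m 3}$ and $γ_{m 4}$ are approximated by comparing the tag with the constants `"m3"` and `"m4"`. Since the signing key is carried over via $f (Z)$ , the third message stays signed with $pri (A)$ in this encoding.

```python
def pair(a, b):
    return ("pair", a, b)


def tup(*xs):
    """Right-nested tuple <x1, ..., xn>, with the convention <x> = x."""
    return xs[0] if len(xs) == 1 else pair(xs[0], tup(*xs[1:]))


def fields(t, n):
    """Match t against the pattern <X1, ..., Xn>; return the components or None."""
    out = []
    for _ in range(n - 1):
        if not (isinstance(t, tuple) and t[0] == "pair"):
            return None
        out.append(t[1])
        t = t[2]
    return out + [t]


def f(t):
    if isinstance(t, str):                      # atoms and variables are untouched
        return t
    sym = t[0]
    if sym == "senc":                           # f({|X|}_Y) = f(X)
        return f(t[1])
    if sym == "kdf":                            # f(kdf(X, Y)) = kdf(f(X))
        parts = fields(t[1], 2)
        if parts:
            return ("kdf", f(parts[0]))
    if sym == "sign" and isinstance(t[2], tuple) and t[2][0] == "pri":
        parts = fields(t[1], 5)                 # pattern [T, S, N1, N2, Y]
        if parts:
            tag, s, n1, n2, y = parts
            key = ("pri", f(t[2][1]))
            if tag == "m3":                     # pulls out S, N1, and N2
                return tup(("sign", tup(f(tag), f(y)), key), f(s), f(n1), f(n2))
            if tag == "m4":                     # pulls out S and N2 only
                return tup(("sign", tup(f(tag), f(n1), f(y)), key), f(s), f(n2))
    return (sym,) + tuple(f(a) for a in t[1:])  # homomorphic default


SKa = ("kdf", tup(("exp", "Gb", "x"), "na", "Nb"))
k = ("kdf", ("exp", "Gb", "x"))
msg3 = ("senc", tup("A", "B", "sA2",
                    ("sign", tup("m3", "sA1", "na", "Nb", ("exp", "g", "x"), SKa),
                     ("pri", "A"))), SKa)
msg4 = ("senc", tup("B", "sA2",
                    ("sign", tup("m4", "sA1", "na", "Nb", "Gb", SKa),
                     ("pri", "B"))), SKa)
abs3, abs4 = f(msg3), f(msg4)
```

The order of the `if` branches plays the role of the first-matching-pattern rule, and the final line corresponds to the homomorphic fallback equations.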

Generally speaking, in order to preserve the deducibility of messages, typed abstractions cannot remove fields that are extractable (e.g., by projection, decryption, or signature verification), whereas removing the non-extractable fields under a hash-type function such as $kdf$ poses no problems. We use untyped abstractions to remove redundant or unprotected message elements including those we have pulled out of cryptographic operations using typed abstractions.

Example 4.2 (Atom-and-variable removal).

Applying the typed abstraction above to the fourth protocol message yielded $t = ⟨ B, sA 2, {[m 4, na, Gb, kdf (\exp (Gb, x))]}_{pri (B)}, sA 1, Nb ⟩$ . To obtain the desired result $t^{'} = {[m 4, na, Gb, kdf (\exp (Gb, x))]}_{pri (B)}$ , we want to remove the fields B, $sA 1$ , $sA 2$ , and $Nb$ . The atom-and-variable removal abstraction ${rem}_{T}$ is parametrized by a set T of atoms and variables and removes all cryptographically unprotected occurrences of the elements of T from a message (i.e., those visible within t without any decrypting). Soundness requires that the transformed messages do not contain any protected occurrence of the elements of T. In our case, we set $T = {A, B, sA 1, sA 2, nb, Nb}$ to obtain ${rem}_{T} (t) = t^{'}$ and we observe that the soundness condition is satisfied for this choice. Applying ${rem}_{T}$ to the abstracted third protocol message yields $⟨ {[m 3, \exp (g, x), kdf (\exp (Gb, x))]}_{pri (A)}, na ⟩$ . The soundness condition forbids the inclusion of the nonce $na$ in T, since $na$ also occurs protected by the signature in $t^{'}$ .
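The effect of ${rem}_{T}$ on this example can be sketched as follows. This Python fragment is a simplified reading of the informal description above and not the formal definition: it assumes the same hypothetical term encoding (strings for atoms and variables, right-nested `("pair", a, b)` tuples), treats only tuple fields as unprotected positions, uses `None` for a removed message, and does not check the soundness condition.

```python
NIL = None  # stands for a message that has been removed entirely


def pair(a, b):
    return ("pair", a, b)


def tup(*xs):
    """Right-nested tuple <x1, ..., xn>, with the convention <x> = x."""
    return xs[0] if len(xs) == 1 else pair(xs[0], tup(*xs[1:]))


def rem(T, t):
    """Remove the elements of T that occur as unprotected tuple fields of t."""
    if isinstance(t, tuple) and t[0] == "pair":
        left, right = rem(T, t[1]), rem(T, t[2])
        if left is NIL:
            return right
        if right is NIL:
            return left
        return pair(left, right)
    if isinstance(t, str) and t in T:
        return NIL
    return t  # other compound terms are left untouched in this sketch


T = {"A", "B", "sA1", "sA2", "nb", "Nb"}
k = ("kdf", ("exp", "Gb", "x"))
sig4 = ("sign", tup("m4", "na", "Gb", k), ("pri", "B"))
sig3 = ("sign", tup("m3", ("exp", "g", "x"), k), ("pri", "A"))

t4 = tup("B", "sA2", sig4, "sA1", "Nb")             # abstracted fourth message
t3 = tup("A", "B", "sA2", sig3, "sA1", "na", "Nb")  # abstracted third message
r4, r3 = rem(T, t4), rem(T, t3)
```

With this choice of T, the fourth message reduces to the bare signature, while the third keeps the nonce `"na"` next to its signature.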

Example 4.3 (Redundancy removal).

Since $na$ is sent in the clear along with the Diffie–Hellman half-key in the first protocol message, we use a redundancy removal abstraction to remove the redundant occurrence of $na$ in the abstracted third message. Redundancy removal abstractions are functions on messages that return a special value $nil$ for removed messages. They can remove message elements from a role that the intruder can deduce from his initial knowledge or from elements that he has learned earlier from the same role. Since $na$ and $Na$ already occur in the first message of the initiator or responder roles, this condition holds for the function that removes $na$ from $⟨ {[m 3, \exp (g, x), kdf (\exp (Gb, x))]}_{pri (A)}, na ⟩$ and $Na$ from role B’s third message while leaving all other messages unchanged. As an alternative to the atom-and-variable removal in Example 4.2, we could also remove the elements of T using a redundancy removal abstraction that removes all occurrences of A, B, $sA 1$ , and $sA 2$ (we assume the intruder knows all agents and constants) and all but the first occurrences of $nb$ and $Nb$ (similar to what we did with $na$ and $Na$ above).

We have chosen to factor out the removal of atoms and variables as well as redundancies from the typed abstractions, since this substantially simplifies their definition and soundness proofs.

4.2. General soundness theorem for protocol abstractions

We start by defining a general form of protocol abstraction that encompasses all of our concrete abstractions. We then prove a general soundness theorem for these abstractions, which we later instantiate to obtain concrete soundness results.

4.2.1. General protocol abstractions

A general protocol abstraction consists of two functions. The first function transforms the terms in the protocol definition and in protocol executions, while the second one transforms properties. For some but not all concrete abstractions these functions will coincide. We introduce the set $T = M \cup N$ , which includes all terms that may occur in protocol specifications, properties, symbolic traces, or ground traces. In the definition below, we use the special symbol $nil$ to mark messages that are removed.

Definition 4.4 (General protocol abstraction).

A (general) protocol abstraction is a pair $G = (g_{p r o t}, g_{p r o p})$ where $g_{p r o t} : T \to T \cup {nil}$ and $g_{p r o p} : T \to T \cup {nil}$ . We define the application of $G$ to events, traces, and protocols by applying the appropriate component of $G$ to the terms they contain as follows.

For events: $G (s i g) = s i g$ for $s i g \in S i g$ and, for $e v \in {send, recv}$ , $G (e v (t)) = nil$ if $g_{p r o t} (t) = nil$ and $G (e v (t)) = e v (g_{p r o t} (t))$ otherwise.

For event sequences: $G (ϵ) = ϵ$ and $G (e \cdot t l) = G (t l)$ if $G (e) = nil$ and $G (e \cdot t l) = G (e) \cdot G (t l)$ otherwise; this is extended to traces and threads in the expected way.

$G (P) = {(R, G (P (R))) ∣ R \in dom (P) \land G (P (R)) \neq ϵ}$ for protocols P.

For the atomic predicates of our property language: $\begin{array}{ll} G (ι = κ) = (ι = κ) & G (role (ι, A)) = role (ι, A) \\ G (t^{@ ι} = u^{@ κ}) = (g_{p r o p} {(t)}^{@ ι} = g_{p r o p} {(u)}^{@ κ}) & G (steps (ι, e)) = steps (ι, G (e)) \\ G (secret (ι, t)) = secret (ι, g_{p r o p} (t)) & G ((ι, e) ≺ (κ, e^{'})) = (ι, G (e)) ≺ (κ, G (e^{'})) \\ G (honest (ι, A)) = honest (ι, A) & \end{array}$ We extend this mapping homomorphically to all formulas. Note that the terms in a formula’s events are abstracted by $g_{p r o t}$ , while those in equations and secrecy predicates are abstracted using $g_{p r o p}$ .
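The way a protocol abstraction acts on events, event sequences, and protocols can be sketched in a few lines of Python. The encoding is an assumption made for illustration only: signals are strings, send and receive events are pairs `(kind, term)`, `None` stands for $nil$ , and `g_prot` is an arbitrary caller-supplied function on terms.

```python
NIL = None  # stands for the special symbol nil


def abstract_event(g_prot, event):
    """Signals are unchanged; a send or receive is dropped if its term maps to nil."""
    if isinstance(event, str):          # signal event
        return event
    kind, term = event
    new_term = g_prot(term)
    return NIL if new_term is NIL else (kind, new_term)


def abstract_events(g_prot, events):
    """Apply the abstraction to a sequence of events, skipping removed ones."""
    result = []
    for event in events:
        abstracted = abstract_event(g_prot, event)
        if abstracted is not NIL:
            result.append(abstracted)
    return result


def abstract_protocol(g_prot, protocol):
    """Abstract every role and drop roles whose event sequence becomes empty."""
    result = {}
    for role, events in protocol.items():
        abstracted = abstract_events(g_prot, events)
        if abstracted:
            result[role] = abstracted
    return result


# Hypothetical abstraction: remove the message "hello", keep everything else.
drop_hello = lambda t: NIL if t == "hello" else t
protocol = {
    "A": [("send", "hello"), ("recv", "key"), "Commit"],
    "B": [("recv", "hello")],
}
abstracted = abstract_protocol(drop_hello, protocol)
```

In this toy instance, role `"B"` disappears from the abstracted protocol because its only event is removed.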

Although general protocol abstractions have two independent components, our concrete typed and untyped abstractions will use only special forms. For typed abstractions and atom-and-variable removal abstractions, we will have $g_{p r o t} = g_{p r o p}$ and for redundancy removal abstractions $g_{p r o p} = i d$ (the identity function).

4.2.2. Soundness of general protocol abstractions

To justify the soundness of our abstractions $G$ , we show that any attack on a property ϕ of the original protocol P is reflected as an attack on the property $G (ϕ)$ of the abstracted protocol $G (P)$ . We decompose this into reachability preservation (RP) and attack preservation (AP) as follows. We require that, for all reachable states $(t r, t h, σ)$ of P, there is a substitution $σ^{'}$ such that

$(G (t r), G (t h), σ^{'})$ is a reachable state of $G (P)$ , and

$(t r, t h, σ) ⊭ ϕ$ implies $(G (t r), G (t h), σ^{'}) ⊭ G (ϕ)$ .

We will define the substitution $σ^{'}$ by $σ^{'} = g (σ) = g \circ σ$ for some function $g : N \to N$ on network messages. These two properties will require some assumptions about P, ϕ, and $G$ . We start by defining and explaining the conditions on formulas. We first introduce some auxiliary sets of elements of a formula ϕ. Let

${Sec}_{ϕ}$ be the set of all terms t that occur in formulas $secret (ι, t)$ in ϕ,

${Eq}_{ϕ}$ be the set of tuples $(ι, κ, t, u)$ such that the equation $t^{@ ι} = u^{@ κ}$ occurs in ϕ and let ${EqTerm}_{ϕ} = {t, u ∣ \exists ι, κ . (ι, κ, t, u) \in {Eq}_{ϕ}}$ be the set of underlying terms, and

${Evt}_{ϕ}$ be the set of events occurring in ϕ.

Let ${Eq}_{ϕ}^{+}$ and ${Eq}_{ϕ}^{-}$ respectively be the sets of tuples representing equations with a positive and a negative occurrence in ϕ and let ${EqTerm}_{ϕ}^{+}$ and ${EqTerm}_{ϕ}^{-}$ be the corresponding sets of terms. Similarly, we define the subset ${Evt}_{ϕ}^{+}$ of elements of ${Evt}_{ϕ}$ with a positive occurrence in ϕ.

Definition 4.5 (Safe formulas).

Let $g : N \to N$ be a function on network messages. We define ϕ to be safe for P and $(G, g)$ if, for all well-typed ground substitutions σ, the following conditions hold:

(a) $nil \notin G ({Sec}_{ϕ} \cup {EqTerm}_{ϕ} \cup {Evt}_{ϕ})$ ,

(b) g is the identity function on $A$ ,

(c) for all $(ι, κ, t, u) \in {Eq}_{ϕ}^{-}$ and thread-id interpretations ϑ, we have that $\begin{matrix} t^{ϑ (ι)} σ =_{E} u^{ϑ (κ)} σ implies g_{p r o p} (t^{ϑ (ι)}) g (σ) =_{E} g_{p r o p} (u^{ϑ (κ)}) g (σ), \end{matrix}$

(d) for all $(ι, κ, t, u) \in {Eq}_{ϕ}^{+}$ and thread-id interpretations ϑ, we have that $\begin{matrix} g_{p r o p} (t^{ϑ (ι)}) g (σ) =_{E} g_{p r o p} (u^{ϑ (κ)}) g (σ) implies t^{ϑ (ι)} σ =_{E} u^{ϑ (κ)} σ, \end{matrix}$

(e) for all $e (t) \in {Evt}_{ϕ}^{+}$ and $e (u) \in Evt (M_{P})$ , we have $g_{p r o t} (t) = g_{p r o t} (u)$ implies $t = u$ .

Condition (a) ensures that $nil$ does not occur in the abstracted formula. Condition (b) ensures that the two substitutions agree on agent variables. Condition (c) requires equality preservation for negatively occurring equations. Condition (d) expresses the injectivity of the abstraction on the terms in positively occurring equalities. This condition is required to preserve attacks on agreement properties. In other words, it prevents abstractions from fixing attacks on agreement by identifying two terms that differ in the original protocol. Finally, condition (e) is required for properties involving event orderings and $steps$ predicates. It states that the abstraction must not identify an event occurring positively in the property with a distinct protocol event.

We now state the soundness theorem for the general abstractions.

Theorem 4.6 (General soundness theorem).

Let P be a protocol, ϕ a property, $G = (g_{p r o t}, g_{p r o p})$ a protocol abstraction, and g a function on network messages. Suppose the following conditions hold:

(i) For all states $(t r, t h, σ) \in reach (P, {IK}_{0})$ , thread id’s i, agent variables R, role suffixes $t l$ , and terms t such that $t h (i) = (R, recv (t) \cdot t l)$ and $g_{p r o t} (t) \neq nil$ , we have $\begin{matrix} IK (t r) σ, {IK}_{0} ⊢_{E} t^{i} σ implies IK (G (t r)) g (σ), {IK}_{0}^{'} ⊢_{E} g_{p r o t} (t^{i}) g (σ), \end{matrix}$

(ii) For all states $(t r, t h, σ) \in reach (P, {IK}_{0})$ , thread id’s i, and terms $t \in {Sec}_{ϕ}$ such that $g_{p r o p} (t) \neq nil$ we have $\begin{matrix} IK (t r) σ, {IK}_{0} ⊢_{E} t^{i} σ implies IK (G (t r)) g (σ), {IK}_{0}^{'} ⊢_{E} g_{p r o p} (t^{i}) g (σ), and \end{matrix}$

(iii) ϕ is safe for P and $(G, g)$ .

Then for all states $(t r, t h, σ) \in reach (P, {IK}_{0})$ we have

1. $(G (t r), G (t h), g (σ)) \in reach (G (P), {IK}_{0}^{'})$ , and

2. $(t r, t h, σ) ⊭ ϕ$ implies $(G (t r), G (t h), g (σ)) ⊭ G (ϕ)$ .

Condition (i) ensures that derivability is preserved for received messages. Similarly, condition (ii) ensures the deducibility preservation for claimed secrets. Condition (i) is needed to establish conclusion 1 and conditions (ii) and (iii) are required for conclusion 2. Below we sketch the proof of this theorem. The full proof can be found in the full version [39].

Proof Sketch.
To show point 1 (reachability preservation), let $(t r, t h, σ) \in reach (P, {IK}_{0})$ . We establish $(G (t r), G (t h), g (σ)) \in reach (G (P), {IK}_{0}^{'})$ by induction on the number n of transitions leading to the state $(t r, t h, σ)$ . The base case ( $n = 0$ ) is straightforward. For the inductive case, assume $(t r^{'}, t h^{'}, σ)$ is reachable in k steps and there is a transition $(t r^{'}, t h^{'}, σ) \to (t r, t h, σ)$ . By the induction hypothesis, we know that $(G (t r^{'}), G (t h^{'}), g (σ)) \in reach (G (P), {IK}_{0}^{'})$ . Let t be the term of the event executed in this transition, if any. If $g_{p r o t} (t) = nil$ then we have $G (t r) = G (t r^{'})$ and $G (t h) = G (t h^{'})$ and hence $(G (t r), G (t h), g (σ)) \in reach (G (P), {IK}_{0}^{'})$ . Otherwise, we have $g_{p r o t} (t) \neq nil$ . We consider three cases according to the rule r that has been applied in step $k + 1$ . The cases for the rules $SEND$ and $SIGNAL$ are straightforward. For the remaining case $r = RECV$ , we know by the rule’s premises that $t h^{'} (i) = (R, recv (t) \cdot t l)$ and $IK (t r^{'}) σ, {IK}_{0} ⊢_{E} t^{i} σ$ for some i, R, t, and $t l$ . Using assumption (i) with the induction hypothesis, we establish the two premises of rule $RECV$ required to obtain $(G (t r^{'}), G (t h^{'}), g (σ)) \to (G (t r), G (t h), g (σ))$ . This implies the conclusion for this case. Hence, we have established point 1.

To show point 2 (attack preservation), we proceed by induction on the structure of ϕ and use assumptions (ii) and (iii). □

In the following subsections, we discuss each kind of protocol abstraction and the associated soundness result. For these proofs, it suffices to define the function g and to establish the conditions (i)–(iii) of the theorem above. In each case we introduce the assumptions that are needed for this purpose and motivate them by examples.

4.3. Typed protocol abstractions

Our typed abstractions are specified by a list of recursive equations subject to some conditions on their shape. We define their semantics in terms of a simple Haskell-style functional program. We use both pattern matching on terms and subtyping on types to select the equation to be applied to a given term. This ensures that terms of related types are transformed in a uniform manner.

4.3.1. Syntax and semantics

Let $W = {W_{τ}}_{τ \in Y}$ be a family of pattern variables disjoint from $V$ . We define the set of patterns by $P = T (W, \emptyset)$ . A pattern $p \in P$ is called linear if each (pattern) variable occurs at most once in p. We extend the typing function Γ to patterns by setting $Γ (X) = τ$ if and only if $X \in W_{τ}$ and then lifting it homomorphically to all patterns. Our typed message abstractions are instances of the following recursive function specifications.

Definition 4.7.
A function specification $F_{f} = (f, E_{f})$ consists of a unary function symbol $f \notin Σ^{1}$ and a list of equations $\begin{matrix} E_{f} = [f (p_{1}) = u_{1}, \dots, f (p_{n}) = u_{n}], \end{matrix}$ where each $p_{i} \in P$ is a linear pattern such that $u_{i} \in T_{Σ^{⩾ 1} \cup \{f\}} (vars (p_{i}))$ for all $i \in \tilde{n}$, i.e., $u_{i}$ consists of variables from $p_{i}$ and function symbols from $Σ^{⩾ 1} \cup \{f\}$.
Definition 4.8.
For $c \in Σ^{⩾ 1}$ , an equation $f (c (p_{1}, \dots, p_{n})) = u$ of $E_{f}$ is called a c-equation and it is called homomorphic if $u = c (f (Z_{1}), \dots, f (Z_{n}))$ and $p_{i} = Z_{i}$ are variables of type $msg$ . We say that $F_{f}$ is homomorphic for $c \in Σ^{⩾ 1}$ if all c-equations in $E_{f}$ are homomorphic.

The function specification $F_{f}^{0} = (f, E_{f}^{0})$ consists of a homomorphic equation for each $c \in Σ^{⩾ 1}$ and the final equation $f (Z) = Z$ with $Z : msg$ .

We use vectors (lists) of terms $\overline{t} = [t_{1}, \dots, t_{n}]$ for $n > 0$ . We define $set (\overline{t}) = {t_{1}, \dots, t_{n}}$ and $\hat{f} (\overline{t}) = ⟨ f (t_{1}), \dots, f (t_{n}) ⟩$ , the elementwise application of a function f to a vector where the result is converted to a tuple (with the convention $⟨ t ⟩ = t$ ). We define the splitting function by $split (⟨ t, u ⟩) = split (t) \cup split (u)$ on pairs and $split (t) = {t}$ on other terms t. We call the elements of $split (t)$ the fields of t. We extend $split$ to vectors by $split (\overline{t}) = split (set (\overline{t}))$ .
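To make these auxiliary functions concrete, the following Python sketch implements $split$, its extension to vectors, and the elementwise application $\hat{f}$. The encoding is an assumption made for illustration only: terms are nested tuples, `("pair", t, u)` stands for $⟨ t, u ⟩$, and n-tuples are taken to be right-nested pairs.

```python
# Assumed encoding: ("pair", t, u) is the pair <t, u>; every other
# string or tuple is a non-pair term.

def split(t):
    """Fields of t: split(<t, u>) = split(t) | split(u), split(t) = {t} otherwise."""
    if isinstance(t, tuple) and t[0] == "pair":
        return split(t[1]) | split(t[2])
    return {t}

def split_vec(ts):
    """Extension of split to a vector: the union of split over set(ts)."""
    fields = set()
    for t in set(ts):
        fields |= split(t)
    return fields

def tuple_of(ts):
    """Build <t1, ..., tn> for n > 0 as right-nested pairs, with <t> = t."""
    if len(ts) == 1:
        return ts[0]
    return ("pair", ts[0], tuple_of(ts[1:]))

def hat(f, ts):
    """Elementwise application of f to a vector, converted to a tuple."""
    return tuple_of([f(t) for t in ts])
```

For instance, `split(("pair", "a", ("pair", "b", "c")))` yields the three fields `a`, `b`, and `c`, while an encryption such as `("senc", "m", "k")` is its own single field.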
Definition 4.9 (Typed abstraction).

A typed abstraction is a function specification of the form $F_{f} = (f, E_{f}^{+})$ where $E_{f}^{+} = E_{f} \cdot E_{f}^{0}$ and each equation in $E_{f}$ has the form $\begin{matrix} (⋆) & f (c (p_{1}, \dots, p_{n})) = ⟨ e_{1}, \dots, e_{d} ⟩ \end{matrix}$ where for each $i \in \tilde{d}$ we have either

(a) $e_{i} = f (q)$ such that $q \in split (p_{j})$ for some $j \in \tilde{n}$, or

(b) $e_{i} = c (\hat{f} (\overline{q_{1}}), \dots, \hat{f} (\overline{q_{n}}))$ with $c \neq ⟨ \cdot, \cdot ⟩$ such that, for all $j \in \tilde{n}$, we have $set (\overline{q_{j}}) \subseteq split (p_{j})$ and, whenever $p_{j}$ is not a pair, we have $\overline{q_{j}} = [p_{j}]$, i.e., $\hat{f} (\overline{q_{j}}) = f (p_{j})$.

The concatenation of $E_{f}$ with $E_{f}^{0}$ ensures the totality of typed abstractions. The shape of the terms $e_{i}$ in equation (⋆) ensures that the abstractions can only weaken the cryptographic protection of terms but never strengthen it. Each defining equation maps a term with top-level symbol c to a tuple whose components have the form (a) or (b). In both forms, we can only apply f recursively on fields of the patterns $p_{i}$ . Form (a) allows us to pull fields out of the scope of c, hence removing c’s protection. Using form (b) we can reorder, duplicate, or remove fields in each argument of c. We cannot however turn a non-pair argument of c into a pair such as in $f (c (x)) = c (⟨ f (x), f (x) ⟩)$ . Furthermore, for the case where c is pairing we have to use form (a) to obtain the simple shape $f (⟨ p_{1}, p_{2} ⟩) = \hat{f} (\overline{q})$ with $set (\overline{q}) \subseteq split (⟨ p_{1}, p_{2} ⟩)$ .

Example 4.10.
We present a typed abstraction $F_{f} = (f, E_{f} \cdot E_{f}^{0})$ illustrating a representative selection of the possible message transformations. Suppose $X : γ_{c}$, $Y : nonce$, and $Z, U, V : msg$ and let $E_{f}$ consist of the following three equations: $\begin{array}{l} f (kdf (X, U, V)) = kdf (f (X), f (U)) \\ f ({[Y, Z]}_{pri (U)}) = ⟨ f (Y), f (Z) ⟩ \\ f ({| X, Y, Z |}_{U}) = ⟨ {| f (X), f (Z) |}_{f (U)}, f (Y) ⟩ \end{array}$ The patterns’ types filter the matching terms: X matches only the constant c and Y matches only nonces. The first equation removes the field V from a $kdf$ hash. The second equation removes the signature. The third one pulls the field Y out of an encryption. These are typical examples of typed abstractions that are generated by our abstraction heuristics described in Section 5. Our theory also supports other forms of typed abstractions such as the following two: $\begin{array}{l} f (⟨ X, Y, Z ⟩) = ⟨ f (Y), f (X), f (Z) ⟩ \\ f ({| X, Y, Z |}_{U}) = ⟨ {| f (X), f (Y) |}_{f (U)}, {| f (Z) |}_{f (U)} ⟩ \end{array}$ The first equation swaps the first two fields in n-tuples for $n ⩾ 3$. In practice, such a re-ordering abstraction is useful to avoid type confusions, which may lead to spurious attacks. The second one splits an encryption: the pair $⟨ f (X), f (Y) ⟩$ and $f (Z)$ are encrypted separately with the key $f (U)$.

Program 1.
Program f resulting from $F_{f} = (f, E_{f})$ , where $[f (p_{1}) = u_{1}, \dots, f (p_{k}) = u_{k}] = E_{f}^{+}$ .

The semantics of a typed abstraction $F_{f}$ is given by the Haskell-style functional program f (Program 1). We are overloading the symbol f here: we use it as a function symbol in $E_{f}^{+}$ as well as the name of the functional program constructed from the equations in $E_{f}^{+}$ . The $case$ statement has a clause $\begin{matrix} p ∣ Γ (t) ≼ Γ (p) \Rightarrow u \end{matrix}$ for each equation $f (p) = u$ of $E_{f}^{+}$ . Note that occurrences of f in u correspond to recursive calls of the program f. Such a clause is enabled if
the term t matches the pattern p, i.e., $t = p θ$ for some substitution θ, and

its type $Γ (t)$ is a subtype of $Γ (p)$ .
The first enabled clause is executed. Hence, the equations $E_{f}^{0}$ serve as fall-back clauses, which cover the terms not handled by $E_{f}$ . In particular, the last clause $f (Z) = Z$ handles exactly the atoms and variables.
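As a rough executable reading of these semantics, the following Python sketch builds a program f from a list of equations and tries the clauses in order, falling back to the homomorphic clauses and to $f (Z) = Z$. It is not a reproduction of Program 1: the tuple encoding of terms, patterns, and types, the `SUPERTYPE` table for atomic subtyping, and all function names are assumptions made for illustration.

```python
# Assumed encoding. Atoms and variables: ("atom", name, type).
# Pattern variables: ("pvar", name, type). Composed terms, patterns, and
# types: (symbol, arg1, ..., argn). Atomic types are strings; "msg" is top.
SUPERTYPE = {"beta_n": "nonce", "beta_k": "nonce"}  # assumed atomic subtyping

def is_leaf(t):
    return t[0] in ("atom", "pvar")

def gamma(t):
    """Typing function, lifted homomorphically to composed terms and patterns."""
    return t[2] if is_leaf(t) else (t[0],) + tuple(gamma(a) for a in t[1:])

def subtype(s, t):
    if t == "msg" or s == t:
        return True
    if isinstance(s, str):
        return isinstance(t, str) and SUPERTYPE.get(s) == t
    return (isinstance(t, tuple) and s[0] == t[0] and len(s) == len(t)
            and all(subtype(a, b) for a, b in zip(s[1:], t[1:])))

def match(p, t, theta):
    """Syntactic matching t = p theta for a linear pattern p."""
    if p[0] == "pvar":
        theta[p[1]] = t
        return True
    if is_leaf(t) or t[0] != p[0] or len(t) != len(p):
        return False
    return all(match(pi, ti, theta) for pi, ti in zip(p[1:], t[1:]))

def make_program(equations):
    """Build f from the equations of E_f; fall-back clauses are tried last."""
    def inst(u, theta):
        if isinstance(u, str):            # reference to a pattern variable
            return theta[u]
        if u[0] == "f":                   # recursive call of the program f
            return f(inst(u[1], theta))
        return (u[0],) + tuple(inst(a, theta) for a in u[1:])

    def f(t):
        for p, u in equations:
            theta = {}
            if match(p, t, theta) and subtype(gamma(t), gamma(p)):
                return inst(u, theta)     # first enabled clause wins
        if is_leaf(t):                    # final clause f(Z) = Z
            return t
        return (t[0],) + tuple(f(a) for a in t[1:])  # homomorphic clause
    return f

def pv(name, ty="msg"):
    return ("pvar", name, ty)

def tup(*ts):
    return ts[0] if len(ts) == 1 else ("pair", ts[0], tup(*ts[1:]))

# First and third equations of Example 4.10 (kdf is encoded as ternary here).
E_f = [
    (("kdf", pv("X", "gamma_c"), pv("U"), pv("V")),
     ("kdf", ("f", "X"), ("f", "U"))),
    (("senc", tup(pv("X", "gamma_c"), pv("Y", "nonce"), pv("Z")), pv("U")),
     tup(("senc", tup(("f", "X"), ("f", "Z")), ("f", "U")), ("f", "Y"))),
]
f = make_program(E_f)
c, d = ("atom", "c", "gamma_c"), ("atom", "d", "gamma_d")
n, k = ("atom", "n", "beta_n"), ("atom", "k", "beta_k")
W, A = ("atom", "W", "msg"), ("atom", "A", "alpha")
t = ("senc", tup(c, n, W), ("kdf", c, k, A))
u = ("senc", tup(n, c, W), ("kdf", d, k, A))
```

Under these assumptions, `f(t)` pulls the nonce out of the encryption and drops the last `kdf` argument, whereas `f(u)` leaves `u` unchanged because neither subtyping condition holds and only the fall-back clauses apply.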

We will often identify the typed abstraction $F_{f} = (f, E_{f})$ and the functional program f. The corresponding general protocol abstraction according to Definition 4.4 is then simply $G = (f, f)$ .
Example 4.11.
Consider the typed abstraction given by the first three equations from Example 4.10, including the types of the variables. Suppose we would like to use the associated program f (as specified in Program 1) to abstract the term $t = {| c, n, W |}_{kdf (c, k, A)}$, which is composed of the constant $c : γ_{c}$, the nonces $n, k : nonce$, the message variable $W : msg$, and the agent variable $A : α$. The resulting reduction sequence and the corresponding subtyping conditions are as follows: $\begin{array}{lll} f (t) & = f ({| c, n, W |}_{kdf (c, k, A)}) & {| γ_{c}, β_{n}, msg |}_{kdf (γ_{c}, β_{k}, α)} ≼ {| γ_{c}, nonce, msg |}_{msg} \\ & = ⟨ {| c, W |}_{f (kdf (c, k, A))}, n ⟩ & kdf (γ_{c}, β_{k}, α) ≼ kdf (γ_{c}, msg, msg) \\ & = ⟨ {| c, W |}_{kdf (c, k)}, n ⟩ & \end{array}$ Note that we have elided the reduction steps for pairs and for atomic messages, which use the corresponding fallback equations in $E_{f}^{0}$. Both subtyping conditions clearly hold. However, for the slightly different term $u = {| n, c, W |}_{kdf (d, k, A)}$ for $d : γ_{d}$ we obtain $f (u) = u$, since in this case the two corresponding subtyping conditions do not hold. Therefore, only the homomorphic fallback equations in $E_{f}^{0}$ apply, which have trivial subtyping conditions.
4.3.2. Finding abstractions

Finding abstractions is fully automated by our tool using a heuristic that we will describe in Section 5. To show a concrete application of typed abstractions to our running example while giving a first idea of our heuristics, we use here the following simplified abstraction strategy: We start by identifying the terms that appear in the $secret (\cdot, \cdot)$ predicates and equations of the desired properties. Then we determine the cryptographic operations that are essential to achieve these properties and try to remove all other terms and operations. In this process, we have to be careful not to over-abstract the protocol, since this may easily introduce false negatives (i.e., spurious attacks). Therefore, apart from preserving the necessary cryptographic operations, we also avoid the introduction of new pairs of unifiable protocol terms.

Example 4.12 (from $IKE_{m}$ to $IKE_{m}^{1}$).

To preserve the secrecy of the DH key $\exp (\exp (g, x), y)$ and the agreement on $na$, $nb$, $\exp (g, x)$, and $\exp (g, y)$, we have to keep either the $mac$ or the symmetric encryption with $SK$ (see Examples 3.8 and 3.11). We want to remove as many other fields and operations as possible (e.g., $prf$). We choose to remove the encryption as this allows us to later remove additional fields (e.g., $sA 2$) using untyped abstractions. We keep o in $AUTHa$ to prevent unifiability with $AUTHb$ and hence potential false negatives. This leads us to the typed abstraction $F_{f_{1}} = (f_{1}, E_{f_{1}})$ where $E_{f_{1}}$ is defined by the equations $\begin{array}{ll} f_{1} ({| X, Y |}_{Z}) = ⟨ f_{1} (X), f_{1} (Y) ⟩ & X : α \\ f_{1} (mac (X_{1}, \dots, X_{8})) = mac (\hat{f_{1}} ([X_{1}, X_{3}, X_{5}, X_{6}, X_{7}, X_{8}])) & X_{3} : γ_{o} \\ f_{1} (mac (Y_{1}, \dots, Y_{8})) = mac (\hat{f_{1}} ([Y_{1}, Y_{5}, Y_{6}, Y_{7}, Y_{8}])) & Y_{3} : nonce \\ f_{1} (kdf (Z_{1}, \dots, Z_{5})) = kdf (f_{1} (Z_{3})) & \\ f_{1} (prf (U, Z)) = f_{1} (U) & U : kdf (msg) \end{array}$ where we omitted the homomorphic clauses for the symbols $\exp$, $sh$, and $⟨ \cdot, \cdot ⟩$. The types of some pattern variables are indicated on the right-hand side. All the remaining variables are of type $msg$. Applying $f_{1}$ to $IKE_{m}$ we obtain $IKE_{m}^{1}$. Here is the abstracted initiator role. $\begin{array}{rl} S_{IKE_{m}^{1}} (A) = & send (sPIa, o, sA 1, \exp (g, x), na) \cdot \\ & recv (sPIa, SPIb, sA 1, Gb, Nb) \cdot Running \cdot \\ & send (sPIa, SPIb, A, B, {AUTHaa}^{'}, sA 2, tSa, tSb) \cdot \\ & recv (sPIa, SPIb, B, {AUTHba}^{'}, sA 2, tSa, tSb) \cdot Secret \cdot Commit \end{array}$ where ${SKa}^{'} = kdf (\exp (Gb, x))$ is the session key and the authenticators are defined by ${AUTHaa}^{'} = mac (sh (A, B), o, \exp (g, x), na, Nb, {SKa}^{'})$ and ${AUTHba}^{'} = mac (sh (A, B), Gb, Nb, na, {SKa}^{'})$.
In a second step, we will remove most fields in the roles of $IK E_{m}^{1}$ using untyped abstractions.

4.3.3. Soundness of typed abstractions

We now turn to showing the soundness of the typed abstractions. We do this by establishing conditions (i)–(iii) of our general soundness theorem (Theorem 4.6). The main ingredients that we need for this purpose are the preservation of intruder deduction, equalities, and disequalities. These properties will not hold without restrictions on the protocol, the property, and the typed abstraction. We first formulate these properties, their scope, and introduce these restrictions informally. We then state our soundness theorem. We defer the detailed motivation and formal definitions of the restrictions to the subsequent subsections.

Remark 4.13.
For the correct interpretation of the properties of typed abstractions, it is important to remark that, given a term $t \in T$ , the expression $f (t)$ denotes the term in $T$ obtained by evaluating the functional program f on t. This is in contrast to a purely syntactical reading of $f (t)$ such as in the equations $E_{f}$ . Note that the term $f (t)$ itself is not an element of $T$ , since $f \notin Σ$ .

Suppose σ is the substitution component of a concrete state, $T \subseteq M^{TID}$ is a set of terms, and $t, u \in M^{TID}$ are terms.
Deducibility preservation. Here, we require that $\begin{matrix} (P1) & T σ ⊢_{E} t σ ⟹ f (T) f (σ) ⊢_{E} f (t) f (σ) \end{matrix}$ This is needed to simulate the execution of receive events in the abstract protocol (condition (i) of Theorem 4.6) and for the preservation of secrecy (condition (ii) of Theorem 4.6). This property holds for typed abstractions f that are compatible with the rewrite theory (or $R, A x$ -compatible for short). This requires, for example, that f cannot remove fields that are extractable from a constructor using a rewrite rule (such as in decryption). We will discuss this property in more detail in Section 4.3.6.

Equality preservation. This means that $\begin{matrix} (P2) & t σ =_{E} u σ ⟹ f (t) f (σ) =_{E} f (u) f (σ) \end{matrix}$ This property is needed for proving deducibility preservation and for the preservation of equalities in protocol properties (condition (c) of Definition 4.5 needed in condition (iii) of Theorem 4.6). This property holds if f is compatible with the axioms $A x$ and with the variants ${⟦ t ⟧}_{R, A x}$ of the term t, i.e., f preserves axioms and the equality associated with each variant of t. We denote by $cdom (F_{f})$ the set of terms for which f is variant-compatible. Equality preservation is the topic of Section 4.3.5.

Disequality preservation. This can be formulated as the reverse direction of equality preservation: $\begin{matrix} (P3) & f (t) f (σ) =_{E} f (u) f (σ) ⟹ t σ =_{E} u σ \end{matrix}$ Disequality preservation is needed to prevent abstractions from “fixing” attacks on agreement properties (condition (d) of Definition 4.5 needed in condition (iii) of Theorem 4.6). In Section 4.3.7 and the full version [39], we present syntactic criteria for this property.
To establish these properties, we will use the following substitution property, which we will discuss in detail in Section 4.3.4. For terms t and well-typed and $R, A x -normal$ substitutions θ: $\begin{matrix} (P4) & f (t θ) = f (t) f (θ) \end{matrix}$ This property requires that t is in the uniform domain of f, written $t \in udom (F_{f})$ . This ensures that a term t and its instances $t θ$ are uniformly transformed using the same equations of $E_{f}$ .

Finally, we can state our soundness result for typed abstractions.
Theorem 4.14 (Soundness of typed abstractions).

Let $F_{f}$ be a $R, A x$ -compatible typed abstraction. Assume further that

(i) $f ({IK}_{0}) \subseteq {IK}_{0}^{'}$,

(ii) $M_{P} \cup {Sec}_{ϕ} \cup {EqTerm}_{ϕ}^{-} \subseteq udom (F_{f}) \cap cdom (F_{f})$,

(iii) $f (t^{ϑ (ι)}) f (σ) =_{E} f (u^{ϑ (κ)}) f (σ)$ implies $t^{ϑ (ι)} σ =_{E} u^{ϑ (κ)} σ$ for all $(ι, κ, t, u) \in {Eq}_{ϕ}^{+}$, thread-id interpretations ϑ, and $R, A x -normal$ well-typed ground substitutions σ, and

(iv) $f (t) = f (u)$ implies $t = u$, for all $e (t) \in {Evt}_{ϕ}^{+}$ and $e (u) \in Evt (M_{P})$.

Then for all states $(t r, t h, σ) \in reach (P, {IK}_{0})$, we have

$(f (t r), f (t h), f (σ)) \in reach (f (P), {IK}_{0}^{'})$ , and

$(t r, t h, σ) ⊭ ϕ$ implies $(f (t r), f (t h), f (σ)) ⊭ f (ϕ)$ .

Proof.
It suffices to establish conditions (i)–(iii) of Theorem 4.6 for $G = (f, f)$ and $g = f$ . Let $(t r, t h, σ) \in reach (P, {IK}_{0})$ . We can assume without loss of generality that σ is $R, A x -normal$ .

Let $t \in M_{P} \cup {Sec}_{ϕ}$. Using assumptions (i)–(ii) and property (P1) (formalized in Corollary 4.33 below), we derive that $IK (t r) σ, {IK}_{0} ⊢_{E} t^{i} σ$ implies $f (IK (t r)) f (σ), {IK}_{0}^{'} ⊢_{E} f (t^{i}) f (σ)$. Since $f (IK (t r)) = IK (f (t r))$, conditions (i) and (ii) of Theorem 4.6 hold.

To prove that condition (iii) of Theorem 4.6 is satisfied, we have to establish conditions (a)–(e) in Definition 4.5. We look at each of these conditions in turn.
Condition (a): holds trivially since $nil \notin ran (f)$ .

Condition (b): clearly holds since σ is well-typed and f is the identity on atoms.

Condition (c): Here, $f (t^{ϑ (ι)}) f (σ) =_{E} f (u^{ϑ (κ)}) f (σ)$ follows from $t^{ϑ (ι)} σ =_{E} u^{ϑ (κ)} σ$ by assumption (ii) and properties (P2) and (P4) (formalized in Theorems 4.23 and 4.18 below).

Condition (d): holds by assumption (iii).

Condition (e): holds by assumption (iv).
This completes the proof of the theorem. □

In the following, we discuss each of the properties (P1)–(P4) in more detail. We give examples motivating the restrictions under which they hold and we formally define these restrictions. We then establish that the properties hold under the respective restrictions. We start our discussion with the substitution property. Readers who wish to first get an overview of our abstractions before delving into the technical details may want to skip to Section 4.4.
4.3.4. Substitution Property (P4)

The following example shows that the substitution property does not hold unconditionally.

Example 4.15.
Let $F_{f} = (f, E_{f})$ be a typed abstraction such that $E_{f}$ consists of the two equations $f (h (X : γ_{c})) = f (X)$ and $f (h (Y : msg)) = h (f (Y))$ where c is a constant and we have annotated the variables X and Y with their types for convenience. Let $t = h (Z)$ and $θ = {Z \mapsto c}$ where $Z : msg$ . Then we have $f (t θ) = f (h (c)) = c \neq h (c) = h (Z θ) = f (t) f (θ)$ .

The problem in this example is caused by the terms t and $t θ$ being transformed by two distinct clauses. To avoid this, we must ensure that t and all its instances $t θ$ are transformed uniformly, i.e., match the same clauses of $E_{f}$. We therefore require that
the patterns in $E_{f}$ do not overlap (pattern disjointness), and

all recursive calls of f on composed terms during the transformation of t are handled by the clauses of $E_{f}$ , without recourse to the fall-back clauses in $E_{f}^{0}$ .
This is formalized in the following two definitions.
Definition 4.16.
A function specification $F_{f} = (f, E_{f})$ , where $E_{f} = [f (p_{1}) = u_{1}, \dots, f (p_{n}) = u_{n}]$ , is pattern-disjoint if the types in $Π_{f}$ are pairwise disjoint, i.e., $Γ (p_{i}) ↓ \cap Γ (p_{j}) ↓ = \emptyset$ for all $i, j \in \tilde{n}$ such that $i \neq j$ .

Note that the abstractions defined in Examples 4.10 and 4.12 are pattern-disjoint, while the one in Example 4.15 is not. Let $Π_{f} = Π (E_{f})$ , where $Π (L) = {Γ (p) ∣ (f (p) = u) \in L}$ denotes the set of pattern types of a list of equations L.
Definition 4.17 (Uniform domain).

We define the uniform domain of $F_{f}$ by $\begin{matrix} udom (F_{f}) = {t \in T ∣ Γ (Rec (F_{f}, t)) \subseteq Π_{f} ↓ \cup Y_{a t}} \end{matrix}$ where $Rec (F_{f}, t)$ is the set of terms u such that $f (u)$ is called in the computation of $f (t)$ .

We will require that the protocol terms $t \in M_{P}$ belong to $udom (F_{f})$ , which ensures that their instances $t θ$ with $R, A x -normal$ substitutions θ are transformed uniformly. Since our protocol and property semantics do not distinguish states with $=_{E}$ -equal substitutions, we can assume without loss of generality that σ is $R, A x -normal$ for all reachable states $(t r, t h, σ)$ of the protocol P.

Theorem 4.18 (Substitution property).

Suppose that $F_{f}$ is pattern-disjoint. Let $t \in udom (F_{f})$ and θ be a well-typed and $R, A x -normal$ substitution. Then $f (t θ) = f (t) f (θ)$ .

We henceforth assume that $F_{f}$ is pattern-disjoint. This concludes our discussion of (P4) and we now turn our attention to equality preservation.

4.3.5. Equality preservation (P2)

Using the substitution property, we can reduce (P2) to the property stating that $t σ =_{E} u σ$ implies $f (t σ) =_{E} f (u σ)$ for well-typed and $R, A x -normal$ substitutions σ. Using the decomposition of the equational theory $(Σ, E)$ into $(Σ, R, A x)$ , we further reduce this to the following two properties:

(P2.a) If $t =_{A x} u$ then $f (t) =_{A x} f (u)$ for all terms t and u.

(P2.b) $f (t σ) =_{E} f ((t σ) ↓_{R, A x})$ for all terms t and well-typed $R, A x -normal$ substitutions σ.

Neither of these properties holds in this generality (recall Remark 4.13). The following example illustrates a violation of (P2.a).

Example 4.19.
Let $na$ and $nb$ be nonces and let $F_{f} = (f, E_{f})$ be a typed abstraction such that $E_{f} = [f (h (X)) = f (X)]$ with $X : \exp (msg, β_{na})$ . We consider two terms $t = h (\exp (\exp (g, na), nb))$ and $u = h (\exp (\exp (g, nb), na))$ . Then we have $f (t) = h (\exp (\exp (g, na), nb))$ and $f (u) = \exp (\exp (g, nb), na)$ . Hence, $t =_{A x} u$ but $f (t) \neq_{A x} f (u)$ . The reason is that t and u are not transformed uniformly. In particular, t is transformed by a clause in $E_{f}^{0}$ which keeps t unchanged, while u is transformed by the clause in $E_{f}$ which removes the hash function $h$ .

To solve the problem described in Example 4.19, we introduce the notion of $A x$ -closedness, which requires that $F_{f}$ is homomorphic for the constructors in $funsym (A x)$ and that top-level constructors of axioms must not occur strictly inside any patterns’ type. This is sufficient to prove property (P2.a).
Definition 4.20 ( $A x$ -closedness).

$F_{f}$ is $A x$ -closed if it is homomorphic for $funsym (A x)$ and, for all equations $f (p) = u$ of $E_{f}$ , we have $topsym (Γ (subterm (p) ∖ {p})) \cap topsym (A x) = \emptyset$ .

Note that the abstraction $F_{f_{1}}$ from Example 4.12 is $A x_{c s}$ -closed, since it is homomorphic for the only constructors $\exp$ and $sh$ occurring in $A x_{c s}$ and these constructors occur at most in the top position of any of $F_{f_{1}}$ ’s pattern types. We henceforth assume that $F_{f}$ is $A x$ -closed. The following example exhibits a violation of (P2.b).

Example 4.21.
Let $F_{f} = (f, E_{f})$ be a typed abstraction which drops all symmetric encryptions, i.e., $E_{f} = [f ({| X |}_{K}) = f (X)]$ , and let $t = {| {| m |}_{k} |}_{k}^{- 1}$ for atomic terms m and k. Then $f (t) = {| m |}_{k}^{- 1}$ , but $f (t ↓_{R, A x}) = f (m) = m$ . Clearly, we have that $f (t) \neq_{E} f (t ↓_{R, A x})$ .

To establish (P2.b) for a term t, we make use of the finite variant property of our rewrite theory.
Definition 4.22 (Variant-compatibility).

We say that $F_{f}$ is variant-compatible for t if, for all $(t^{'}, θ) \in {⟦ t ⟧}_{R, A x}$, we have (i) $t^{'}, t θ \in udom (F_{f})$ and (ii) $f (t θ) =_{E} f (t^{'})$. We denote by $cdom (F_{f})$ the set of terms for which f is variant-compatible.

For terms $t \in cdom (F_{f})$ , we can show (P2.b) using the substitution property. Note that variant-compatibility for t is checkable since ${⟦ t ⟧}_{R, A x}$ is finite due to the finite variant property. The theory including Diffie–Hellman exponentiation from Example 3.2 and the XOR theory in Example 3.3 both have the finite variant property.

Theorem 4.23 (Equality preservation).

Suppose that $F_{f}$ is pattern-disjoint and $A x$ -closed. Let $t, u \in cdom (F_{f})$ and σ be a well-typed $R, A x -normal$ substitution. Then $t σ =_{E} u σ$ implies $f (t σ) =_{E} f (u σ)$ .

This concludes our treatment of (P2). We proceed with deducibility preservation.

4.3.6. Deducibility preservation (P1)

To preserve reachability and secrecy properties, our typed protocol abstractions need to preserve term deducibility, i.e., whenever a term t is deducible from a set of terms T then $f (t)$ is also deducible from $f (T)$. The following series of examples illustrates the main issues involved in the proof of deducibility preservation. The proof assumes that T and t are $R, A x -normal$ and, without loss of generality, that terms derived using the composition rule are immediately normalized. Thus, the interesting case is when a composition creates a term to which a rewrite rule $l \to r$ is applicable.

Example 4.24 (Preserving decryption).

Consider the composition rule $Comp$ instantiated for asymmetric decryption, which derives $T ⊢_{E} \{X\}_{K}^{- 1}$ from $T ⊢_{E} X$ and $T ⊢_{E} K$. We have to make sure that, for all instances of this rule, we can preserve this deduction under f. The most interesting instance is $X = \{u\}_{pk (a)}$ and $K = pri (a)$, in which case the conclusion can be reduced using the rewriting rule $\{\{u\}_{pk (a)}\}_{pri (a)}^{- 1} \to u$ modeling decryption. In this case, we can produce the following derived standard rule for asymmetric decryption, which we call $Adec$. $\begin{matrix} \dfrac{\dfrac{T ⊢_{E} \{u\}_{pk (a)} \quad T ⊢_{E} pri (a)}{T ⊢_{E} \{\{u\}_{pk (a)}\}_{pri (a)}^{- 1}} \, Comp \quad \{\{u\}_{pk (a)}\}_{pri (a)}^{- 1} =_{E} u}{T ⊢_{E} u} \, Eq \end{matrix}$ To preserve this derived rule under f, we have to show that we can derive $f (T) ⊢_{E} f (u)$ from $f (T) ⊢_{E} f (\{u\}_{pk (a)})$ and $f (T) ⊢_{E} f (pri (a))$. This clearly works if $F_{f}$ is homomorphic for all four constructors on the left-hand side of the decryption rewrite rule. Let us consider the more interesting case where $u = ⟨ u_{1}, u_{2} ⟩$ and f pulls $u_{2}$ outside the encryption: $\begin{matrix} f (\{u_{1}, u_{2}\}_{pk (a)}) = ⟨ \{f (u_{1})\}_{f (pk (a))}, f (u_{2}) ⟩ . \end{matrix}$ By further assuming that f transforms decryptions, pairs, $pk$, and $pri$ homomorphically, we obtain the required derivation as follows. $\begin{matrix} \dfrac{\dfrac{\dfrac{f (T) ⊢_{E} f (\{u_{1}, u_{2}\}_{pk (a)})}{f (T) ⊢_{E} \{f (u_{1})\}_{pk (f (a))}} \, {Proj}_{1} \quad f (T) ⊢_{E} pri (f (a))}{f (T) ⊢_{E} f (u_{1})} \, Adec \quad \dfrac{f (T) ⊢_{E} f (\{u_{1}, u_{2}\}_{pk (a)})}{f (T) ⊢_{E} f (u_{2})} \, {Proj}_{2}}{f (T) ⊢_{E} f (⟨ u_{1}, u_{2} ⟩)} \, Comp \end{matrix}$ Here, the derived rules ${Proj}_{i}$ are used for projection. These are formed by applying the composition rule ($Comp$) followed by a reduction ($Eq$).

Generally speaking, we have to ensure that if a composed term $t = d (t_{1}, \dots, t_{n})$ can be reduced to a term u then (the fields of) $f (u)$ can still be derived from $f (t_{1}), \dots, f (t_{n})$ . The next examples illustrate that we must impose on f further restrictions related to the rewrite theory (in addition to $A x$ -closedness).

Example 4.25 (Dropping fields).

Consider the derivation of rule $Adec$ in Example 4.24 and $u = ⟨ u_{1}, u_{2} ⟩$. Suppose f is now modified to drop $u_{2}$ from the encryption, i.e., $f (\{u_{1}, u_{2}\}_{pk (a)}) = \{f (u_{1})\}_{f (pk (a))}$. Since $f (u_{2})$ is lost, this clearly prevents us from deriving $f (T) ⊢_{E} f (u)$ in general.

This example shows that we cannot drop fields from argument positions of a constructor that can be extracted by a rewrite rule (here decryption).

Example 4.26 (Transforming non-enclosing constructors).

Suppose f transforms asymmetric encryptions and $pk$ homomorphically, but drops the private key constructor $pri$, i.e., $f (pri (X)) = f (X)$. Clearly, we cannot extract $f (u)$ from $\{f (u)\}_{pk (f (a))}$ using the key $f (a)$, since $\{\{f (u)\}_{pk (f (a))}\}_{f (a)}^{- 1}$ is irreducible.

The problem here is that the decryption rewrite rule is no longer applicable. This can be avoided by requiring that f is homomorphic for the constructors of the left-hand side l of the rewrite rule other than those enclosing the extracted term in l (here, the key constructors $pk$ and $pri$ ).

Example 4.27 (Non-linear variables).

This example illustrates another way to destroy the applicability of a rewrite rule by abstraction. Consider the rule $X \oplus (X \oplus Y) \to Y$ of the theory of XOR from Example 3.3. Suppose that $E_{f}$ includes the following $\oplus$ -equation, which drops the second component of a pair in the first argument of XOR if the second argument is also an XOR: $\begin{matrix} f (⟨ U, V ⟩ \oplus (W \oplus X)) = f (U) \oplus f (W \oplus X) . \end{matrix}$ Also suppose that $f (X \oplus Y) = f (X) \oplus f (Y)$ for all other cases. Let $t = ⟨ k_{1}, k_{2} ⟩ \oplus (⟨ k_{1}, k_{2} ⟩ \oplus m)$ . Clearly, t is reducible to m, but this is not the case for $f (t) = k_{1} \oplus (⟨ k_{1}, k_{2} ⟩ \oplus m)$ .

In this case, the problem is that the two instances of X in the rewrite rule are transformed differently, which destroys the matching. This suggests that if a constructor c enclosing the extracted term in l has a non-linear variable at its ith argument position then the equations of f must not split the ith argument of c.

The examples above (partly) motivate the following definitions.

Definition 4.28.
We call a typed abstraction $F_{f} = (f, E_{f})$ :
field-preserving for position i of c if, for all equations of $F_{f}$ of the form $f (c (p_{1}, \dots, p_{n})) = ⟨ e_{1}, \dots, e_{d} ⟩$ and all $q \in split (p_{i})$ , there is a $j \in \tilde{d}$ such that either $e_{j} = f (q)$ or $e_{j} = c (\dots, \hat{f} (\overline{q_{i}}), \dots)$ and $q \in set (\overline{q_{i}})$ .

non-splitting for position i of c if $p_{i}$ is not a pair for all equations of $F_{f}$ of the form $f (c (p_{1}, \dots, p_{n})) = ⟨ e_{1}, \dots, e_{d} ⟩$ .

Note that if $F_{f}$ is non-splitting for position i of c then it is (trivially) field-preserving for position i of c. Moreover, if $F_{f}$ is homomorphic for c then it is non-splitting for all argument positions i of c.
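Both conditions are purely syntactic and can be checked by inspecting the equations. The following Python sketch does so for a list of equations given as (pattern, list of right-hand-side components). The encoding is an assumption: pattern variables are strings, `("f", q)` stands for the recursive call $f (q)$, and only the equations of $E_{f}$ are passed in, since the fall-back equations satisfy both conditions trivially.

```python
# Assumed encoding: (symbol, arg1, ..., argn) for composed patterns and
# right-hand sides, strings for pattern variables, ("pair", t, u) for pairs.

def is_pair(t):
    return isinstance(t, tuple) and t[0] == "pair"

def split(t):
    return split(t[1]) | split(t[2]) if is_pair(t) else {t}

def field_preserving(equations, c, i):
    """Every field of argument i (1-based) of c survives in some component e_j."""
    for p, es in equations:
        if p[0] != c:
            continue
        for q in split(p[i]):
            kept = any(e == ("f", q) or (e[0] == c and ("f", q) in split(e[i]))
                       for e in es)
            if not kept:
                return False
    return True

def non_splitting(equations, c, i):
    """Argument i (1-based) of c is never a pair in any c-equation."""
    return all(not is_pair(p[i]) for p, es in equations if p[0] == c)

# Pulling the second field out of an asymmetric encryption (as in Example 4.24).
pull_out = [(("aenc", ("pair", "U1", "U2"), "K"),
             [("aenc", ("f", "U1"), ("f", "K")), ("f", "U2")])]
# Dropping the second field (as in Example 4.25).
drop = [(("aenc", ("pair", "U1", "U2"), "K"),
         [("aenc", ("f", "U1"), ("f", "K"))])]
```

With these inputs, `pull_out` is field-preserving for the cleartext position 1 of `aenc` while `drop` is not, and both are non-splitting only for the key position 2.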
Definition 4.29 (Extractable position).

We say that a rewrite rule $l \to r \in R$ extracts position $i \in \tilde{n}$ of $c \in Σ^{n}$ if there are terms $t_{1}, \dots, t_{n}$ such that $c (t_{1}, \dots, t_{n}) \in subterm (l)$ and $r = t_{i}$ . We call i an extractable position of c if there is a rewrite rule $l \to r \in R$ that extracts position i from c.

For example, the projection rewrite rule $π_{1} (⟨ X, Y ⟩) \to X$ extracts position 1 of pairs.
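Extractable positions can be read off the rewrite rules directly. The Python sketch below enumerates, for a rule $l \to r$, all pairs of a constructor c and a position i such that the rule extracts position i of c. The tuple encoding of terms and the symbol names (`fst`, `sdec`, `senc`, `xor`) are assumptions made for illustration.

```python
# Assumed encoding: (symbol, arg1, ..., argn) for composed terms,
# strings for variables and constants.

def subterms(t):
    """Enumerate all subterms of t, including t itself."""
    yield t
    if isinstance(t, tuple):
        for a in t[1:]:
            yield from subterms(a)

def extracted_positions(l, r):
    """Pairs (c, i) such that c(t1, ..., tn) is a subterm of l and r = ti."""
    return {(s[0], i)
            for s in subterms(l) if isinstance(s, tuple)
            for i, a in enumerate(s[1:], start=1) if a == r}

proj1 = (("fst", ("pair", "X", "Y")), "X")           # pi_1(<X, Y>) -> X
sdec = (("sdec", ("senc", "X", "K"), "K"), "X")      # symmetric decryption
xor = (("xor", "X", ("xor", "X", "Y")), "Y")         # X + (X + Y) -> Y
```

For the three rules above, the sketch reports position 1 of pairs, position 1 of symmetric encryption, and position 2 of XOR, respectively.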

Definition 4.30 (Compatibility with rewrite theory).

A typed abstraction $F_{f}$ is compatible with a rewrite rule $l \to r$ if one of the following conditions holds:

(C1) $l = c (u_{1}, \dots, u_{n})$ and $r = u_{i}$ for some $i \in \tilde{n}$ such that $c \notin topsym (A x)$,

(C2) $l = d (u_{1}, \dots, u_{j - 1}, c (v_{1}, \dots, v_{n}), u_{j + 1}, \dots, u_{m})$ and $r = v_{i}$ for some $j \in \tilde{m}$ and $i \in \tilde{n}$ such that $c \notin topsym (A x)$, none of the $v_{i}$ ’s is a pair, and the following conditions hold:

(C2.a) $F_{f}$ is field-preserving for the extracted position i of c,

(C2.b) $F_{f}$ is non-splitting for all positions i of c such that $v_{i}$ is a non-linear variable of l, and

(C2.c) $F_{f}$ is homomorphic for all $c^{'} \in funsym (\{u_{1}, \dots, u_{j - 1}, v_{1}, \dots, v_{n}, u_{j + 1}, \dots, u_{m}\})$.

(C3) l has an arbitrary shape and either

(C3.a) r is a constant,

(C3.b) $l \in cdom (F_{f})$ and $F_{f}$ is homomorphic for $topsym (l)$, or

(C3.c) $r \in cdom (F_{f})$ and $F_{f}$ is homomorphic for $funsym (l, r)$.

We say that $F_{f}$ is compatible with the rewrite theory $(Σ, A x, R)$, or $R, A x$ -compatible for short, if $F_{f}$ is pattern-disjoint, $A x$ -closed, and compatible with all rewrite rules in R.

We illustrate this definition with an example.

Example 4.31.
Let us check that the typed abstraction $F_{f_{1}} = (f_{1}, E_{f_{1}})$ from Example 4.12 is compatible with the rewrite theory $R_{c s} = (Σ_{c s}, A x_{c s}, R_{c s})$ from Example 3.2. As stated above, $F_{f_{1}}$ is pattern-disjoint and $A x_{c s}$ -closed. It remains to check that it is compatible with all rewrite rules in $R_{c s}$.

Let us consider the symmetric decryption rule ${| {| X |}_{K} |}_{K}^{- 1} \to X$. We check that this rule satisfies condition (C2). We have $d = {| \cdot |}_{\cdot}^{- 1}$, $u_{1} = {| X |}_{K}$, $u_{2} = K$, $c = {| \cdot |}_{\cdot}$, $v_{1} = X$, and $v_{2} = K$. First, we confirm that there are no symmetric encryptions at the top-level of any axiom and that neither $v_{1}$ nor $v_{2}$ is a pair. Second, we check conditions (C2.a–c) in turn. Condition (C2.a) holds, since the only relevant equation of $F_{f_{1}}$ is $f_{1} ({| X, Y |}_{Z}) = ⟨ f_{1} (X), f_{1} (Y) ⟩$, which is clearly field-preserving for the cleartext position 1 extracted by the rewrite rule. Furthermore, the only non-linear variable in l is K and $F_{f_{1}}$ is non-splitting for the relevant key position 2 of symmetric encryption. Hence, (C2.b) also holds. Condition (C2.c) holds vacuously, since the set of function symbols $funsym (\{v_{1}, v_{2}, u_{2}\})$ is empty.

Next, we verify that $F_{f_{1}}$ is compatible with the signature verification rule $ver ({[X]}_{pri (Y)}, pk (Y)) \to X$ . Since $F_{f_{1}}$ is homomorphic for all constructors occurring in this rule and its right-hand side is a variable, it immediately follows that this rule satisfies condition (C3.c). Alternatively, we can show that it satisfies (C2). The compatibility of $F_{f_{1}}$ with the asymmetric decryption and projection rules is justified similarly.

We first establish a version of deducibility preservation without substitutions.
Theorem 4.32 (Deducibility preservation).

Let $F_{f}$ be a $R, A x$ -compatible typed abstraction and let $T \cup {t}$ be a set of $R, A x -normal$ terms such that T contains all constants, i.e., $C \subseteq T$ . Then we have $T ⊢_{E} t$ implies $f (T) ⊢_{E} f (t)$ .

By combining this theorem with Theorems 4.23 and 4.18, we can now derive (P1) which we formalize as the following corollary.

Corollary 4.33 (Deducibility preservation with substitutions).

Let $F_{f}$ be a $R, A x$ -compatible typed abstraction. Suppose σ is a $R, A x -normal$ well-typed ground substitution and $T \cup {u}$ is a set of terms such that (i) $f ({IK}_{0}) \subseteq {IK}_{0}^{'}$ and (ii) $T \cup {u} \subseteq udom (F_{f}) \cap cdom (F_{f})$ . Then we have that $T σ, {IK}_{0} ⊢_{E} u σ$ implies $f (T) f (σ), {IK}_{0}^{'} ⊢_{E} f (u) f (σ)$ .

This completes our discussion of (P1). Next, we discuss syntactic criteria for the disequality preservation in condition (iii) of Theorem 4.14.

4.3.7. Syntactic criteria for disequality preservation (P3)

Condition (iii) of Theorem 4.14 requires that, for all $(ι, κ, t, u) \in {Eq}_{ϕ}^{+}$ , all thread-id interpretations ϑ, and all $R, A x -normal$ well-typed ground substitutions σ, we have $\begin{matrix} (I) & f (t^{ϑ (ι)}) f (σ) =_{E} f (u^{ϑ (κ)}) f (σ) ⟹ t^{ϑ (ι)} σ =_{E} u^{ϑ (κ)} σ . \end{matrix}$ Since the universal quantification over substitutions makes this condition hard to check in practice, we propose syntactic criteria for its verification.

Here, we present such a criterion that is applicable if t and u do not contain any message variables. Assuming that $f (t) = t$ and $f (u) = u$ , we can derive $t^{ϑ (ι)} f (σ) =_{E} u^{ϑ (κ)} f (σ)$ from the premise of condition ( I ). Since we have that $f (X σ) = X σ$ for all non-message variables $X \in dom (σ)$ , we obtain $t^{ϑ (ι)} σ =_{E} u^{ϑ (κ)} σ$ as required. Hence, we have just proved the following simple syntactic criterion.

Proposition 4.34.
Let $(ι, κ, t, u) \in {Eq}_{ϕ}^{+}$ such that (i) $(vars (t) \cup vars (u)) \cap V_{msg} = \emptyset$ , and (ii) $f (t) = t$ and $f (u) = u$ . Then, for all thread-id interpretations ϑ and well-typed ground substitutions σ, we have that $f (t^{ϑ (ι)}) f (σ) =_{E} f (u^{ϑ (κ)}) f (σ)$ implies $t^{ϑ (ι)} σ =_{E} u^{ϑ (κ)} σ$ .

Note that for this criterion to be applicable, we require that f is the identity for the terms in positively occurring equations. This is often the case, as these terms typically have a simple structure, e.g., nonces or timestamps. However, this criterion cannot be used to justify the soundness of the typed abstraction from Example 4.12 with respect to the property $ϕ_{auth}$ from Example 3.11. Although we can expand the equality of the two tuples in that example into a conjunction of six simpler equations, we can only apply the criterion above to the first four of these. The last two contain message variables and require a more general syntactic criterion for condition ( I ). In the full version [39], we present such a criterion, which covers the case where message variables may occur on one side of the equation.
4.4. Atom-and-variable removal abstractions

Typed abstractions offer a wide range of possibilities to transform cryptographic operations including subterm removal, splitting, and pulling fields outside of such an operation. We complement these abstractions with two kinds of untyped abstractions. The first type, discussed here, allows us to remove unprotected atoms and variables of any type. The second type removes redundancy in the form of intruder-derivable terms and is discussed in the next subsection.

4.4.1. Specification of atom-and-variable removal

We first present the formal definition of atom-and-variable removal abstractions, then we motivate some restrictions needed for soundness, and finally we illustrate the application of atom-and-variable removal on our running example.

An atom-and-variable removal abstraction does not remove all occurrences of an atom or a variable from a given term t, but only those that are fields of t. Intuitively, these unprotected atoms and variables do not themselves provide any security properties and can therefore safely be removed. This intuition is most obvious for atom removal: the intruder already knows all constants and agent names and he can replace unprotected fresh values by his own values of the same type. In the following definition, we formulate atom-and-variable removal abstractions, where we use the abbreviation $av (t) = atoms (t) \cup vars (t)$ .

Definition 4.35.
An atom-and-variable removal abstraction is a general abstraction $G = ({rem}_{T}, {rem}_{T})$ , where $T \subseteq av (M_{P})$ is a parameter denoting the set of atoms and variables to be removed and ${rem}_{T} : T \to T \cup {nil}$ is defined by

(i) ${rem}_{T} (u) = nil$ if $u \in T \cup T^{TID}$ ,

(ii) ${rem}_{T} (⟨ t_{1}, t_{2} ⟩) = \begin{cases} {rem}_{T} (t_{1}) & \text{if } {rem}_{T} (t_{2}) = nil \\ {rem}_{T} (t_{2}) & \text{if } {rem}_{T} (t_{1}) = nil \\ ⟨ {rem}_{T} (t_{1}), {rem}_{T} (t_{2}) ⟩ & \text{otherwise,} \end{cases}$

(iii) ${rem}_{T} (t) = t$ for all other terms.

By point (i) any term in $T \cup T^{TID}$ is removed. Note that this covers unindexed terms in protocol specifications and security properties and indexed terms during execution. Point (ii) allows us to remove pairs or their components. Point (iii) ensures that all other terms remain unchanged. Note that, for all terms t, ${rem}_{T} (t)$ either does not contain $nil$ or equals $nil$ . Hence by Definition 4.4, $nil$ does not occur in abstracted roles and traces and therefore ${rem}_{T} (P)$ is a protocol (see Definition 3.7).
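To make the three cases of Definition 4.35 concrete, the following Python sketch implements ${rem}_{T}$ on a toy term representation. The encoding is an assumption made purely for illustration: atoms and variables are strings, composed terms are tuples headed by a constructor name, `None` stands for $nil$, and the hypothetical helper `tup` builds right-nested pairs. Thread-indexed terms ($T^{TID}$) are not modelled.

```python
NIL = None  # stands for the special term nil


def tup(*xs):
    """Build a right-nested pair <x1, <x2, ... xn>> (hypothetical helper)."""
    t = xs[-1]
    for x in reversed(xs[:-1]):
        t = ("pair", x, t)
    return t


def rem(T, t):
    """Atom-and-variable removal, following points (i)-(iii) of Definition 4.35."""
    if t in T:                                   # (i) removed atom or variable
        return NIL
    if isinstance(t, tuple) and t[0] == "pair":  # (ii) pairs and their components
        left, right = rem(T, t[1]), rem(T, t[2])
        if right is NIL:
            return left
        if left is NIL:
            return right
        return ("pair", left, right)
    return t                                     # (iii) all other terms are unchanged


# A message with unprotected fields sPIa and sA1 (names chosen for illustration).
T = {"sA1", "sA2", "tSa", "tSb", "sPIa", "sPIb", "SPIa", "SPIb"}
msg = tup("sPIa", "o", "sA1", ("exp", "g", "x"), "na")
abstracted = rem(T, msg)  # the tuple <o, exp(g, x), na>
```

Because of case (iii), an atom below a non-pair constructor is left alone: with this encoding, `rem({"na"}, ("senc", "na", "k"))` returns the term unchanged.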

Due to point (iii) of Definition 4.35, atom-and-variable removal abstractions cannot remove an atom or variable from a non-pair term. It is even unclear how to define this in general. Let us attempt to define a hypothetical variant ${rem}_{T}^{'}$ of ${rem}_{T}$ . Consider a non-pair composed term $t = c (a_{1}, \dots, a_{n})$ and suppose ${rem}_{T}^{'}$ maps some but not all arguments of t to $nil$ . One may think of two possible definitions for ${rem}_{T}^{'} (t)$ : (1) ${rem}_{T}^{'} (t) = nil$ or (2) ${rem}_{T}^{'} (t)$ is the tuple consisting of the non- $nil$ arguments of c. The following two examples consider each of the two definitions in turn and show that neither of them preserves deducibility.
Example 4.36.
Consider the terms $t = ⟨ na, {| nb |}_{na} ⟩$ and $u = nb$ containing the nonces $na$ and $nb$ . Let $T = {na}$ . Then, we have ${rem}_{T}^{'} (t) = nil$ and ${rem}_{T}^{'} (u) = nb$ . Moreover, we also have $t ⊢_{E} u$ , but ${rem}_{T}^{'} (t) ⊢_{E} {rem}_{T}^{'} (u)$ does not hold, as $nb$ is not deducible from $nil$ .
Example 4.37.
Suppose that $h_{1}, h_{2} \in Σ^{2}$ are binary hash functions and $A x$ contains the following axiom: $\begin{matrix} h_{1} (h_{2} (X, Y), Z) ≃ h_{1} (h_{2} (X, Z), Y) . \end{matrix}$ Consider the two terms $t = h_{1} (h_{2} (n_{1}, n_{2}), n_{3})$ and $u = h_{1} (h_{2} (n_{1}, n_{3}), n_{2})$ where $n_{1}$ , $n_{2}$ and $n_{3}$ are nonces and let $T = {n_{3}}$ . Then we have ${rem}_{T}^{'} (t) = h_{2} (n_{1}, n_{2})$ , and ${rem}_{T}^{'} (u) = h_{1} (n_{1}, n_{2})$ . Moreover, we have $t =_{A x} u$ , but ${rem}_{T}^{'} (t) =_{A x} {rem}_{T}^{'} (u)$ fails to hold. Hence, t and u are derivable from each other, while neither of ${rem}_{T}^{'} (t)$ and ${rem}_{T}^{'} (u)$ is derivable from the other.

Similar counterexamples can also be constructed if variable removal abstractions are considered. This highlights the necessity of point (iii) in Definition 4.35.

The following example shows that the soundness of ${rem}_{T}$ calls for a restriction of the occurrences of the removed atoms and variables. Namely, they may occur exclusively as fields of a term, i.e., we cannot remove an atom or variable that also occurs under a cryptographic operation in the same term.
Example 4.38.
Consider terms $t = ⟨ na, h_{1} (na) ⟩$ and $u = h_{2} (na)$ , where $h_{1}$ and $h_{2}$ are hash function symbols and $na$ is a nonce. With $T = {na}$ we have ${rem}_{T} (t) = h_{1} (na)$ and ${rem}_{T} (u) = h_{2} (na)$ . Moreover, we also have $t ⊢_{E} u$ , but ${rem}_{T} (t) ⊢_{E} {rem}_{T} (u)$ fails to hold.

This example motivates the following definition.
Definition 4.39 (Clear terms).

A term u is clear in a term t if $u \notin subterm (split (t) ∖ {u})$ , i.e., u occurs at most as a field in t. For sets of terms T and U, we say that T is clear in a term t if every term in T is clear in t and that T is clear in a set of terms U if T is clear in every term in U.

Note that u is also clear in t if it does not appear at all in t. Our soundness result requires that all variables and atoms in T are clear in the terms to which ${rem}_{T}$ is applied. Moreover, it requires that the elements of T do not appear in the properties of interest.
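The clearness check of Definition 4.39 is easy to mechanize. The Python sketch below assumes that $split (t)$ returns the set of fields obtained by recursively splitting pairs, and it reuses an illustrative term encoding (strings for atoms and variables, tuples headed by a constructor name for composed terms); both the encoding and the function names are assumptions, not part of the formal development.

```python
def split(t):
    """Set of fields of t, obtained by recursively splitting pairs (assumed reading)."""
    if isinstance(t, tuple) and t[0] == "pair":
        return split(t[1]) | split(t[2])
    return {t}


def subterms(t):
    """Yield t and all of its subterms."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1:]:
            yield from subterms(arg)


def is_clear(u, t):
    """u is clear in t if u is not a subterm of any field of t other than u itself."""
    return all(u != s for field in split(t) - {u} for s in subterms(field))


# The term t = <na, h1(na)> from Example 4.38: na also occurs under h1.
t = ("pair", "na", ("h1", "na"))
na_is_clear = is_clear("na", t)
```

In this sketch `na_is_clear` is `False`, which is exactly why removing $na$ from the term of Example 4.38 is excluded by the soundness conditions.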

In the following example, we illustrate the use of atom-and-variable removal abstractions to transform $IK E_{m}^{1}$ into $IK E_{m}^{2}$ .

Example 4.40 ( $IK E_{m}^{1}$ to $IK E_{m}^{2}$ ).

We use atom-and-variable removal to simplify the protocol $IK E_{m}^{1}$ . First, we recall the specification of (the initiator role of) $IK E_{m}^{1}$ . $\begin{matrix} S_{IK E_{m}^{1}} (A) & = send (\underline{sPIa}, o, \underline{sA 1}, \exp (g, x), na) \cdot \\ recv (\underline{sPIa}, \underline{SPIb}, \underline{sA 1}, Gb, Nb) \cdot Running \cdot \\ send (\underline{sPIa}, \underline{SPIb}, A, B, {AUTHaa}^{'}, \underline{sA 2}, \underline{tSa}, \underline{tSb}) \cdot \\ recv (\underline{sPIa}, \underline{SPIb}, B, {AUTHba}^{'}, \underline{sA 2}, \underline{tSa}, \underline{tSb}) \cdot Secret \cdot Commit \end{matrix}$ To highlight the changes in this abstraction step, we have underlined the terms to be removed from $IK E_{m}^{1}$ : the constants $sA 1$ , $sA 2$ , $tSa$ , and $tSb$ , the fresh values $sPIa$ and $sPIb$ , and the variables $SPIa$ and $SPIb$ . We use the atom-and-variable removal abstraction ${rem}_{T}$ with parameter $T = {sA 1, sA 2, tSa, tSb, sPIa, sPIb, SPIa, SPIb}$ . Note that we can neither remove the constant o nor the variables A and B, since these terms are not clear in the authenticators ${AUTHaa}^{'}$ and ${AUTHab}^{'}$ . Applying ${rem}_{T}$ to $IK E_{m}^{1}$ , we obtain the (initiator role of the) protocol $IK E_{m}^{2}$ as given below. $\begin{matrix} S_{IK E_{m}^{2}} (A) & = send (o, \exp (g, x), na) \cdot recv (Gb, Nb) \cdot Running \cdot \\ send (A, B, {AUTHaa}^{'}) \cdot recv (B, {AUTHba}^{'}) \cdot Secret \cdot Commit \end{matrix}$ Note that the session keys and the authenticators are non-pair composed terms and hence remain untouched. We later use a redundancy removal to further simplify $IK E_{m}^{2}$ by removing intruder-derivable occurrences of the constant o and the agent variables A and B from the role descriptions.

4.4.2. Soundness for atom-and-variable removal abstractions

We now turn our attention to the soundness result for atom-and-variable removal abstraction. This result requires that we restrict our attention to well-formed protocols. To define this predicate on protocols, we first introduce the notion of accessible variables.

Definition 4.41 (Accessible variables).

We say that a variable X is accessible in a term t if either

$t = X$ or

$t = c (t_{1}, \dots, t_{n})$ for some $c \in Σ^{n}$ , some position $i \in \tilde{n}$ of c is extractable, and X is accessible in $t_{i}$ .

Intuitively, a variable X is accessible in a term t if there is a path from t’s root to an occurrence of X consisting of only extractable positions. This is to ensure that if X is accessible then it is potentially deducible. For example, X is accessible in ${| X |}_{k}$ since an agent can derive X from ${| X |}_{k}$ using the rewrite rule ${| {| X |}_{K} |}_{K}^{- 1} \to X$ , of course provided it also knows k. In contrast, X is not accessible in $h (X)$ since there is no way to deduce X from $h (X)$ . However, X is accessible in $⟨ X, h (X) ⟩$ since it is accessible using the first projection. We now give the formal definition of well-formed protocols.

Definition 4.42.
A protocol P is well-formed if all non-agent variables first occur in receive events, i.e., for all roles $R \in dom (P)$ and all send and receive events $e v (t)$ in role $P (R)$ and all non-agent variables $X \in vars (t) ∖ V_{α}$ , there is an event $recv (u)$ in $P (R)$ such that $recv (u)$ equals or precedes $e v (t)$ in $P (R)$ and X is accessible in u.

Well-formedness captures the intuition that an agent must know what he sends and that the elements he receives into variables are accessible, e.g., by decrypting a ciphertext. Our notion of well-formedness is a weaker form of executability, which would additionally require that the agent also knows the relevant keying material. Hence, all practical protocols satisfy this condition.
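Definitions 4.41 and 4.42 can be sketched as two small recursive checks. The Python code below is illustrative only and rests on several assumptions: the table of extractable positions is a guess for a typical signature, variables are the strings starting with an upper-case letter, and a role is a list of events of the form `("send", t)`, `("recv", t)`, or `("signal", name)`.

```python
# Extractable argument positions per constructor (assumed table, 1-based).
EXTRACTABLE = {"pair": (1, 2), "senc": (1,), "aenc": (1,), "sign": (1,)}


def accessible(X, t):
    """X is accessible in t: t = X, or X is accessible below an extractable position."""
    if t == X:
        return True
    if isinstance(t, tuple):
        return any(accessible(X, t[i]) for i in EXTRACTABLE.get(t[0], ()))
    return False


def variables(t):
    """Variables of t; by assumption these are strings starting in upper case."""
    if isinstance(t, str):
        return {t} if t[:1].isupper() else set()
    found = set()
    for arg in t[1:]:
        found |= variables(arg)
    return found


def well_formed(role, agent_vars):
    """Every non-agent variable must be accessible in the same or an earlier receive."""
    received = []
    for event in role:
        if event[0] not in ("send", "recv"):
            continue  # signal events are ignored
        kind, t = event
        if kind == "recv":
            received.append(t)
        for X in variables(t) - agent_vars:
            if not any(accessible(X, u) for u in received):
                return False
    return True


good = [("send", ("pair", "o", ("exp", "g", "x"))),
        ("recv", ("pair", "Gb", "Nb")),
        ("signal", "Running"),
        ("send", ("pair", "A", ("h", "Gb", "Nb")))]
bad = [("recv", ("h", "X")), ("send", "X")]
```

Here `well_formed(good, {"A", "B"})` holds, whereas `bad` is rejected because X is only received under a hash and is therefore not accessible.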

Our soundness result for atom-and-variable removal abstractions is stated in the following theorem.
Theorem 4.43 (Soundness for atom-and-variable removal abstractions).

Let P be a well-formed protocol, $ϕ \in L_{P}$ a property, and $T \subseteq av (M_{P})$ a set of atoms and variables such that

(i) T is clear in $M_{P}$ ,

(ii) $T \cap av ({EqTerm}_{ϕ}) = \emptyset$ ,

(iii) $nil \notin {rem}_{T} ({Sec}_{ϕ} \cup {Evt}_{ϕ})$ ,

(iv) ${IK}_{0} \subseteq {IK}_{0}^{'}$ , and

(v) for all $e (t) \in {Evt}_{ϕ}^{+}$ and $e (u) \in Evt (M_{P})$ , we have ${rem}_{T} (t) = {rem}_{T} (u)$ implies $t = u$ .

Then for all states $(t r, t h, σ) \in reach (P, {IK}_{0})$ , there is a ground substitution $σ^{'}$ such that

$({rem}_{T} (t r), {rem}_{T} (t h), σ^{'}) \in reach ({rem}_{T} (P), {IK}_{0}^{'})$ , and

$(t r, t h, σ) ⊭ ϕ$ implies $({rem}_{T} (t r), {rem}_{T} (t h), σ^{'}) ⊭ {rem}_{T} (ϕ)$ .

To preserve attacks, condition (i) ensures that the removed atoms and variables are clear in all protocol terms. Condition (ii) requires that no removed atom or variable occurs in the property’s equalities. Together with condition (iii) it implies condition (a) of the definition of safe formulas (Definition 4.5). Condition (iv) requires that the initial knowledge of the intruder in the abstract protocol subsumes that in the original protocol. Finally, condition (v) reflects condition (e) of the definition of safe formulas (Definition 4.5).

We prove Theorem 4.43 by composing two separate soundness results for atom removal and for variable removal abstractions, respectively. Their statements and proofs appear in the full version [39].

4.5. Redundancy removal abstractions

The second kind of untyped abstractions are redundancy removal abstractions. A redundancy removal abstraction $r d$ enables the elimination of redundancies within each role of a protocol. Intuitively, a protocol term t appearing in a role r can be abstracted to $r d (t)$ if t and $r d (t)$ are derivable from each other under the intruder knowledge T containing the terms preceding t in r and the initial knowledge ${IK}_{0}$ . For example, we can simplify $r = send (t) \cdot recv (⟨ t, u ⟩)$ to $send (t) \cdot recv (u)$ . In contrast to atom-and-variable removal, redundancy removal can also remove composed terms. It is therefore a very effective ingredient for automatic abstraction, which we describe in Section 6.

4.5.1. Specification of redundancy removal abstractions

We now formally define our class of redundancy removal abstractions.

Definition 4.44.
A redundancy removal abstraction for a protocol P is a general abstraction $G = (rd, i d)$ where $i d$ is the identity function on $T$ and the function $rd : T \to T \cup {nil}$ satisfies two conditions:
for all $R \in dom (P)$ , we have that ${RD}_{rd} ({IK}_{0}, P (R))$ holds, where the predicate ${RD}_{rd} (T, S)$ is inductively defined by the following three rules:

Note that in these rules, $rd (t)$ is removed from the deducibility conditions if it equals $nil$ . We also define $rd (t^{i}) = rd {(t)}^{i}$ for all $i \in TID$ and $t \in M_{P}$ .

for all terms $t \notin M_{P} \cup M_{P}^{TID}$ , we have $rd (t) = t$ .

Intuitively, the predicate ${RD}_{rd} (T, ev (t) \cdot r)$ ensures for a protocol message t that the intruder is able to derive t from $rd (t)$ and his knowledge T, and vice versa. The first rule says that ${RD}_{rd} (T, ϵ)$ always holds. This captures the intuition that any redundancy removal works for the empty role description. The second rule allows us to ignore all signal events, as they do not affect the intruder’s knowledge. In the last rule, the first premise requires that the predicate holds for T plus the term t in the first element of the event sequence, and the tail r. By adding t to T, we capture the fact that the intruder learns t after the event $e v (t)$ has been executed. The second premise ensures that t is derivable from T, $V_{α}$ , and $rd (t)$ . The set of agent variables $V_{α}$ is added to T to symbolically represent the intruder knowledge of all agents. The third premise is the same as the second one, except that the roles of t and $rd (t)$ are swapped. We will usually identify the pair $(rd, i d)$ with its first, non-trivial component $rd$ .
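The three rules referred to in Definition 4.44 can be read off the description above. The following LaTeX sketch is a reconstruction from that prose, not a verbatim copy of the definition: the metavariable $s$ for signal events is an assumption, $ev$ ranges over send and receive events, and $rd (t)$ is dropped from a premise whenever it equals $nil$.

```latex
\[
\frac{}{RD_{rd}(T, \epsilon)}
\qquad
\frac{RD_{rd}(T, r)}{RD_{rd}(T, s \cdot r)}
\qquad
\frac{RD_{rd}(T \cup \{t\}, r)
      \quad T, V_{\alpha}, rd(t) \vdash_{E} t
      \quad T, V_{\alpha}, t \vdash_{E} rd(t)}
     {RD_{rd}(T, ev(t) \cdot r)}
\]
```

For instance, for $r = send (t) \cdot recv (⟨ t, u ⟩)$ and $rd (⟨ t, u ⟩) = u$ , the third rule is applied to the receive event with knowledge ${IK}_{0} \cup {t}$ : the intruder obtains $⟨ t, u ⟩$ from t and u by pairing, and u from $⟨ t, u ⟩$ by projection.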

In the following example, we illustrate the use of redundancy removal abstractions to further simplify the protocol $IK E_{m}^{2}$ .
Example 4.45.
First, we recall $IK E_{m}^{2}$ whose role descriptions are given below, where the authenticator terms ${AUTHxx}^{'}$ correspond to abstractions of the corresponding $AUTHxx$ terms, resulting from the first abstraction step described in Example 4.12. $\begin{array}{l} \begin{matrix} S_{IK E_{m}^{2}} (A) & = send (\underline{o}, \exp (g, x), na) \cdot recv (Gb, Nb) \cdot Running \cdot \\ send (\underline{A}, \underline{B}, {AUTHaa}^{'}) \cdot recv (\underline{B}, {AUTHba}^{'}) \cdot Commit \end{matrix} \\ \begin{matrix} S_{IK E_{m}^{2}} (B) & = recv (\underline{o}, Ga, Na) \cdot send (\exp (g, y), nb) \cdot \\ recv (\underline{A}, \underline{B}, {AUTHab}^{'}) \cdot Running \cdot send (\underline{B}, {AUTHbb}^{'}) \cdot Commit \end{matrix} \end{array}$ To remove the underlined terms, we use the following redundancy removal abstraction $rd$ : $\begin{array}{l} rd (⟨ \underline{o}, \exp (g, x), na ⟩) = ⟨ \exp (g, x), na ⟩ \\ rd (⟨ \underline{o}, Ga, Na ⟩) = ⟨ Ga, Na ⟩ \\ rd (⟨ \underline{A}, \underline{B}, {AUTHaa}^{'} ⟩) = {AUTHaa}^{'} \\ rd (⟨ \underline{A}, \underline{B}, {AUTHab}^{'} ⟩) = {AUTHab}^{'} \\ rd (⟨ \underline{B}, {AUTHba}^{'} ⟩) = {AUTHba}^{'} \\ rd (⟨ \underline{B}, {AUTHbb}^{'} ⟩) = {AUTHbb}^{'} \\ rd (t) = t for all other messages t \end{array}$ It is not difficult to see that $rd$ satisfies the conditions of Definition 4.44. Applying $rd$ to $IK E_{m}^{2}$ , we obtain the protocol $IK E_{m}^{3}$ specified as follows. $\begin{array}{l} \begin{matrix} S_{IK E_{m}^{3}} (A) & = send (\exp (g, x), na) \cdot recv (Gb, Nb) \cdot Running \cdot \\ send ({AUTHaa}^{'}) \cdot recv ({AUTHba}^{'}) \cdot Commit \end{matrix} \\ \begin{matrix} S_{IK E_{m}^{3}} (B) & = recv (Ga, Na) \cdot send (\exp (g, y), nb) \cdot \\ recv ({AUTHab}^{'}) \cdot Running \cdot send ({AUTHbb}^{'}) \cdot Commit \end{matrix} \end{array}$ In Fig. 3, we depict the message sequence chart of this protocol with all abbreviations expanded.
Fig. 3.
The $IK E_{m}^{3}$ protocol.

4.5.2. Soundness for redundancy removal abstractions

The soundness result for redundancy removal abstractions is stated in the following theorem.

Theorem 4.46 (Soundness for redundancy removal abstractions).

Let P be a protocol, $ϕ \in L_{P}$ a property, and $rd \in {RD}_{P}$ a redundancy removal abstraction. Suppose that

${IK}_{0} \subseteq {IK}_{0}^{'}$ ,

$nil \notin rd ({Evt}_{ϕ})$ ,

for all $e (t) \in {Evt}_{ϕ}^{+}$ and $e (u) \in Evt (M_{P})$ , we have $rd (t) = rd (u)$ implies $t = u$ .

Then for all states $(t r, t h, σ) \in reach (P, {IK}_{0})$ , we have

$(rd (t r), rd (t h), σ) \in reach (rd (P), {IK}_{0}^{'})$ , and

$(t r, t h, σ) ⊭ ϕ$ implies $(rd (t r), rd (t h), σ) ⊭ ϕ$ .

4.6. Well-formedness preservation for protocol abstractions

In this section, we present well-formedness preservation results for our three types of protocol abstractions. These results are required for the composition of typed abstractions, atom-and-variable removal abstractions, and redundancy removal abstractions to transform well-formed protocols.

Proposition 4.47.
Let $F_{f} = (f, E_{f})$ be a typed abstraction. If P is well-formed then so is $f (P)$ .
Proposition 4.48.
Let T be a set of atoms and variables such that T is clear in $M_{P}$ . If P is well-formed, then so is ${rem}_{T} (P)$ .
Proposition 4.49.
Let $rd$ be a redundancy removal abstraction and P be a well-formed protocol. Assume that for all non-agent variables $X \in V_{P}$ and all receive events $recv (t)$ in which X first occurs, we have that X is accessible in $rd (t)$ . Then $rd (P)$ is well-formed.

The proofs of these propositions can be found in the full version [39].
5. Using protocol abstractions for efficient verification

Fig. 4.

The abstraction workflow for the analysis of security protocols.

Recall that our aim is to make protocol verification more efficient. Given a protocol and a property, our high-level idea is to construct a simpler version of the protocol and the property that is easier to verify. In particular, if the simpler version is a sound abstraction of the original, then we can conclude that the original also satisfies its property.

In the previous section, we gave sufficient conditions for abstractions to be sound. However, not all sound abstractions are useful for verification. In particular, if an abstraction is vulnerable to an attack that does not apply to the original, then we might waste verification time to find this attack, without being able to draw any conclusion about the original. Ideally, abstractions for verification extract the “core” of the cryptographic protocol, i.e., those parts of the protocol that are instrumental in achieving the property, and omit all other constructions. In this ideal case, the abstractions would have exactly the same properties as the original.

In this section, we describe an algorithm for efficient protocol verification based on such abstractions. Because we do not have a direct construction algorithm for sound abstractions, we use heuristics to generate reasonable abstractions and then check if they meet the soundness conditions. The workflow of our algorithm is described in Fig. 4: we first generate a stack of successively more abstract protocols and properties, with the original at the bottom and, at the top, an abstract protocol that we hope represents the core of the protocol required to establish the property.

We then verify the protocols and the properties in this stack top-down, based on the assumption that it is more efficient to analyze a more abstract protocol. We provide empirical evidence for this in the next section. If we can successfully verify a protocol from the stack, we know the original protocol meets its property, and we can stop the analysis. If we find an attack, we try to reconstruct the attack on the original. If this is possible, we know the original protocol does not satisfy the property. If not, the attack is spurious, and we proceed to the next protocol on the stack, which is less abstract than the previous one.
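The top-down traversal of the abstraction stack described above can be sketched as a short loop. This is a minimal Python sketch under stated assumptions: `verify` and `reconstruct` are hypothetical callbacks standing in for the verifier and for the spurious-attack check, models are opaque values, and verifier non-termination or timeouts are ignored.

```python
def verify_with_abstractions(stack, verify, reconstruct):
    """Analyze models from the most abstract one down to the original.

    stack:       models ordered from most abstract to the original (last element)
    verify:      verify(model) returns None if the property holds, else an attack
    reconstruct: reconstruct(attack, original) returns a real attack on the
                 original, or None if the abstract attack is spurious
    """
    if not stack:
        raise ValueError("stack must contain at least the original protocol")
    original = stack[-1]
    for model in stack:
        attack = verify(model)
        if attack is None:
            # By soundness, the original protocol satisfies its property too.
            return ("verified", model)
        if model is original:
            return ("attack", attack)
        real_attack = reconstruct(attack, original)
        if real_attack is not None:
            return ("attack", real_attack)
        # Spurious attack: fall through to the next, less abstract model.
    raise AssertionError("unreachable: the loop returns at the original at the latest")


# Toy run: the most abstract model P3 has a spurious attack, P2 verifies.
result = verify_with_abstractions(
    ["P3", "P2", "P"],
    verify=lambda m: "attack-on-P3" if m == "P3" else None,
    reconstruct=lambda attack, original: None,
)
```

In the toy run the analysis stops at `P2`, so the original `P` never has to be analyzed directly.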

We describe in Section 5.1 how we generate abstractions and in Section 5.2 how we check for spurious attacks.

5.1. Generating abstractions for verification

Our heuristic for generating abstractions uses three strategies, corresponding to our three types of abstractions, which we apply in order. After applying a strategy, we check if the resulting abstraction is sound. We discuss the three strategies in turn.

5.1.1. Simplifying or removing constructors that might not be needed to establish the property

Fig. 5.

Structure of u.

Many protocols use (cryptographic) constructors that are needed to guarantee at most some (but not all) of their desired properties. To see this, consider the following example.

Example 5.1 (The purpose of cryptographic constructors).

Let k be a session key and t an arbitrary term. Let u be defined as $({| t |}_{k}, {| {| A, t |}_{sh (A, B)} |}_{k})$ . In Fig. 5 we give a graphical representation of the structure of u.

If the security property encodes that t needs to be authenticated, we look for the strongest mechanism that could guarantee this. Within u, this would be the symmetric encryption with the long-term key $sh (A, B)$ , since we do not need to rely on the secrecy of the session key. Thus, within u, authentication of t can be guaranteed by this constructor only. If we are only interested in authentication of t, we can consider removing t from the protection of all other constructors, which in this case are the encryptions with k.

If the security property encodes that t needs to be secret, the situation changes, since secrecy needs to be guaranteed for all occurrences of t, and not just one. Thus, in the left branch, secrecy of t is guaranteed on the basis of the session key k, whereas in the right branch, secrecy is guaranteed on the basis of both constructors. Thus, within u, t’s secrecy depends on the secrecy of another term, and not just the long-term key. When we want to abstract the term u sent in a protocol without introducing new attacks, we need to ensure we do not make the situation worse. Thus, we would not modify the left branch. However, in the right branch we could remove t from the protection of either one of the constructors, since the overall guarantee within u would still be the same.

We will exploit this intuition by first determining which (sub)terms are relevant for establishing the desired property. We represent this by assigning security labels to each of them. In a second step, we give an algorithm that moves subterms out of their encapsulating constructor as long as their security labels are not decreased.

For the first step, we first define which constructors serve which purpose. For example, a hash function does not authenticate its subterms, but it does not reveal its subterms either, and hence may be used in the context of secrecy. We differentiate between two main objectives (authentication and confidentiality) and assign one of three labels for each.

Security labels. We define the set of (security) labels $Label = {NO, MAYBE, YES}$ , with a total order $⩽_{l b}$ such that $NO ⩽_{l b} MAYBE ⩽_{l b} YES$ . The lowest label $NO$ encodes that the property is not met, the highest label $YES$ that it can be met, and the middle label $MAYBE$ that it depends on the properties of another term (e.g., a session key).

The labels for constructors (i.e., the guarantees they establish for their subterms) are specified by the functions $ℓ_{a}$ and $ℓ_{c}$ defined in Table 2. When extending these labels to a complete protocol, the simplest case occurs for authentication, where we simply determine the label of the strongest constructor that provides authenticity for the target term t. Intuitively, t needs to be authenticated only once in the protocol.

We define an auxiliary function $pathmax$ that takes a term x, a position p, and a labelling function f, and returns the maximum of f applied to all subterms from the root along the path to p. Formally, we define $pathmax (x, p, f) = max ({f (x |_{p_{1}}) ∣ \exists p_{2} . p_{2} \neq ϵ \land p_{1} \cdot p_{2} = p})$ . We will use $pathmax$ to take the maximum of f over all constructors within x that might authenticate $x |_{p}$ or keep it confidential.

Table 2
Security labels for different cryptographic operations, encoding what they might achieve for their strict subterms

| Top-level constructor of t | Confidentiality $ℓ_{c} (t)$ | Authentication $ℓ_{a} (t)$ |
| --- | --- | --- |
| symmetric encryptions or MACs with long-term keys | $YES$ | $YES$ |
| MACs with session keys | $YES$ | $MAYBE$ |
| symmetric encryptions with session keys | $MAYBE$ | $MAYBE$ |
| public-key encryptions or hashes | $YES$ | $NO$ |
| signatures | $NO$ | $YES$ |
| others | $NO$ | $NO$ |

Definition 5.2 (Protocol authentication label).

Let P be a protocol, ϕ a property, and t a term. We define the protocol authentication label $authlabel (P, ϕ, t)$ as follows:

$authlabel (P, ϕ, t) = NO$ , if ${IK}_{0}, V_{α} ⊢_{E} t$ or $t \notin subterm (M_{P}) \cap subterm ({EqTerm}_{ϕ})$ , and

$authlabel (P, ϕ, t) = max ({pathmax (u, p, ℓ_{a}) ∣ u \in M_{P} \land u |_{p} = t})$ , otherwise.

For confidentiality, we cannot take the maximum over all positions, since we need to ensure that all occurrences of t are protected. Thus, we consider the labels of all paths on which t occurs, and take the minimum.

Definition 5.3 (Protocol confidentiality label).

Let P be a protocol, ϕ a property, and t a term. We define the protocol confidentiality label $conflabel (P, ϕ, t)$ as follows:

$conflabel (P, ϕ, t) = NO$ , if ${IK}_{0}, V_{α} ⊢_{E} t$ or $t \notin subterm (M_{P}) \cap subterm ({Sec}_{ϕ})$ , and

$conflabel (P, ϕ, t) = min ({pathmax (u, p, ℓ_{c}) ∣ u \in M_{P} \land u |_{p} = t})$ , otherwise.

Example 5.4.
Let us consider the terms u and t in Fig. 5. Suppose that $u \in M_{P}$ , $t \in subterm ({Sec}_{ϕ} \cap {EqTerm}_{ϕ})$ , and ${IK}_{0}, V_{α} ⊬_{E} t$ . Let P be a protocol such that (a) u occurs in $M_{P}$ and (b) all occurrences of t in $M_{P}$ are within u. Then we have $authlabel (P, ϕ, t) = YES$ and $conflabel (P, ϕ, t) = MAYBE$ .
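The label computation of Definitions 5.2 and 5.3 can be replayed on the term u of Example 5.1. The Python sketch below is illustrative and makes several assumptions: terms are encoded as nested tuples headed by hypothetical constructor names (`senc`, `mac`, `aenc`, `hash`, `sign`, `pair`), long-term keys are exactly the terms $sh (A, B)$, positions are tuples of 1-based argument indices, and the first case of both definitions (deducibility of t and membership in the property's terms) is assumed to have been checked separately. An empty path is given the label $NO$, which is also an assumption.

```python
from enum import IntEnum


class Label(IntEnum):
    NO = 0
    MAYBE = 1
    YES = 2


def is_long_term(key):
    # Assumption: long-term keys are exactly the terms of the form sh(A, B).
    return isinstance(key, tuple) and key[0] == "sh"


def table2(t):
    """(confidentiality, authentication) labels of t's top-level constructor (Table 2)."""
    if isinstance(t, tuple):
        head = t[0]
        if head in ("senc", "mac"):
            if is_long_term(t[2]):
                return (Label.YES, Label.YES)
            if head == "mac":
                return (Label.YES, Label.MAYBE)
            return (Label.MAYBE, Label.MAYBE)
        if head in ("aenc", "hash"):
            return (Label.YES, Label.NO)
        if head == "sign":
            return (Label.NO, Label.YES)
    return (Label.NO, Label.NO)


def l_conf(t):
    return table2(t)[0]


def l_auth(t):
    return table2(t)[1]


def subterm_at(x, p):
    """The subterm x|_p for a position p given as a tuple of argument indices."""
    for i in p:
        x = x[i]
    return x


def positions_of(x, t, p=()):
    """Yield every position p with x|_p = t."""
    if x == t:
        yield p
    if isinstance(x, tuple):
        for i in range(1, len(x)):
            yield from positions_of(x[i], t, p + (i,))


def pathmax(x, p, f):
    """Maximum of f over x|_p1 for all strict prefixes p1 of p."""
    return max((f(subterm_at(x, p[:k])) for k in range(len(p))), default=Label.NO)


def authlabel(messages, t):
    return max((pathmax(u, p, l_auth) for u in messages for p in positions_of(u, t)),
               default=Label.NO)


def conflabel(messages, t):
    return min((pathmax(u, p, l_conf) for u in messages for p in positions_of(u, t)),
               default=Label.NO)


# The term u from Example 5.1, with t represented by the atom "t".
u = ("pair",
     ("senc", "t", "k"),
     ("senc", ("senc", ("pair", "A", "t"), ("sh", "A", "B")), "k"))
print(authlabel([u], "t").name, conflabel([u], "t").name)  # prints "YES MAYBE"
```

The left occurrence of t is protected only by the session key (label $MAYBE$ ), while the right one also sits under the long-term key (label $YES$ ); taking the maximum for authentication and the minimum for confidentiality gives the values stated in Example 5.4.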

We use the label definitions to construct an abstraction in the following way. First, we compute the authentication and confidentiality labels for all terms in the protocol and property. Second, we construct candidate abstractions in which we pull subterms out of their constructors (e.g., abstracting ${| x_{1}, x_{2}, x_{3} |}_{k}$ to $⟨ x_{2}, {| x_{1}, x_{3} |}_{k} ⟩$ ). For each candidate, we compute the new labels. Our main criterion for applying an abstraction is that, for each term, the labels in the candidate abstraction are not lower than those of the corresponding terms in the original. Additionally, we can remove a constructor entirely, if all its arguments can be pulled out. To prevent the introduction of spurious attacks, we do not perform abstractions that turn two non-unifiable subterms into unifiable ones. In the full version [39], we discuss in more detail how to generate an abstraction based on security labels.
5.1.2. Removing atoms or variables that might not be needed to establish the property

In many cases, there are atoms or variables that occur in the protocol messages but that do not occur in the security property ϕ. They might therefore be redundant and we generate an abstraction in which they are removed from the protocol messages. Such simplifications can be achieved by atom-and-variable removal abstractions. In the full version [39], we present an algorithm that identifies unnecessary atoms and variables, and removes them from the protocol messages.

5.1.3. Removing redundant terms based on preceding intruder knowledge

A somewhat related case occurs for terms in a protocol message m that the intruder can derive from his previous knowledge. A sufficient condition for this is that they can be derived from the combination of the initial intruder knowledge and the messages sent before m in the same role. As before, they might be redundant and we generate an abstraction in which they are removed from the protocol messages. In the full version [39], we explain how to eliminate such redundancies using redundancy removal abstractions.

5.2. Checking for spurious attacks

Our abstractions are sound, but not complete. Therefore, we may encounter false negatives, i.e., spurious attacks. To check whether an attack on a security property ϕ in an abstract model corresponds to a real attack in the original one, we perform the following steps. First, for each thread in the attack trace, we construct a (symbolic) trace whose events correspond to those occurring in the abstract thread. Then, we ask the verifier to search for an attack in the original protocol that contains only the threads computed in the previous step. Formally, let $(t r, t h, σ)$ be the state corresponding to the attack found in the abstract model and $ID \subseteq TID$ be the set of thread identifiers in $t r$ . For each $i \in ID$ , let $e_{i}$ be the last event of thread i in $t r$ , $e_{i}^{'}$ be the corresponding event in the original protocol description, and let ${tr}_{i}$ be the symbolic trace such that ${tr}_{i} = (i, e v_{1}) \cdot (i, e v_{2}) \dots (i, e v_{m})$ , where $e v_{j}$ is the j-th event in the role $P (π_{1} (t h (i)))$ of the original protocol P and $e v_{m} = e_{i}^{'}$ . Intuitively, ${tr}_{i}$ is the original symbolic trace corresponding to the abstract trace obtained by projecting the attack trace $t r$ to thread i’s events. The verifier checks whether there exists a concrete attack consisting only of the events in the traces ${tr}_{i}$ for $i \in ID$ .
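The construction of the traces ${tr}_{i}$ can be sketched as follows. The Python code is a rough illustration under assumptions that are not fixed by the text: the attack trace is a list of (thread identifier, abstract event) pairs, `role_of[i]` plays the part of $π_{1} (t h (i))$ , and the hypothetical callback `origin` returns the index, in the original role, of the event $e_{i}^{'}$ corresponding to an abstract event.

```python
def original_prefixes(attack_trace, role_of, original_roles, origin):
    """Compute, for each thread i in the attack trace, the symbolic trace tr_i.

    attack_trace:   list of (thread id, abstract event) pairs, in trace order
    role_of:        dict mapping a thread id to the name of the role it executes
    original_roles: dict mapping a role name to its event list in the original protocol
    origin:         origin(role, abstract_event) -> 0-based index of the corresponding
                    event in original_roles[role] (hypothetical helper)
    """
    last_event = {}
    for i, event in attack_trace:
        last_event[i] = event  # later events of the same thread overwrite earlier ones
    prefixes = {}
    for i, event in last_event.items():
        role = role_of[i]
        m = origin(role, event)
        # tr_i = (i, ev_1) . (i, ev_2) ... (i, ev_m), with ev_m the event e_i'
        prefixes[i] = [(i, ev) for ev in original_roles[role][: m + 1]]
    return prefixes


# Toy data: two threads of role A; thread 1 executed two abstract events.
roles = {"A": ["send_1", "recv_2", "send_3", "recv_4"]}
index = {"a1": 0, "a2": 1, "a3": 2}
prefixes = original_prefixes(
    [(1, "a1"), (2, "a1"), (1, "a2")],
    role_of={1: "A", 2: "A"},
    original_roles=roles,
    origin=lambda role, event: index[event],
)
```

The verifier would then be restricted to attacks built from the events in `prefixes[1]` and `prefixes[2]` only.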

6. Implementation and case studies

In this section, we explain how we have implemented our abstraction mechanism for the Scyther tool. The resulting tool is available online [36]. We then validate the effectiveness of our method on a large number of real-world case studies.

6.1. Implementation for the Scyther tool

Scyther [14] is a leading automated security protocol verification tool. It supports verification for both a bounded and an unbounded number of threads. It also supports multi-protocol analysis, i.e., verifying a composition of multiple protocols. Scyther takes as input a security protocol description specified by a set of linear role scripts, which include the intended security properties. The tool supports both user-defined types and hash functions. These features match our setting very well.

In this section, we first present the correspondence between claim events in Scyther and our security property formulas. Then, we describe an extension of the labeling mechanism and the abstraction heuristics. In the full version [39], we demonstrate the application of our abstraction heuristics on an example.

6.1.1. Claim events and security properties

In Scyther, security properties are specified by means of claim events, which are integrated into protocol role specifications. Intuitively, claim events express the intended security goal that an agent executing a given protocol role expects to achieve. For our implementation, we consider the following types of claim events that are used to express secrecy and various forms of authentication properties. We adopt the definitions of these properties from [13,14,32]. All these properties include the additional premise that both the agent owning the thread executing the claim and its (intended) communication partner are honest, which we do not repeat below.

$claim (A, Secret, t)$ expresses the secrecy of a term t for role A, i.e., whenever an agent a executes a role A thread up to the claim event, term t cannot be derived by the adversary.

$claim (A, Alive)$ expresses the aliveness property for role A, i.e., whenever an agent a executes a role A thread up to the claim event, apparently with an agent b, then b has previously been running a protocol thread.

Note that this property still holds even when b was running the protocol with someone else (not a). Strengthening aliveness leads us to the notion of weak agreement.

$claim (A, Weakagree)$ expresses the weak agreement property for role A, i.e., whenever an agent a executes a role A thread up to the claim event, apparently with an agent b, then b has previously been running a protocol thread, apparently with a.

Neither aliveness nor weak agreement guarantees that agents agree on their respective roles or on any data exchanged. This additional requirement is captured by non-injective agreement.

$claim (A, Commit, B, m)$ and $claim (B, Running, A, m^{'})$ are used to formalize non-injective agreement as defined by Lowe [32]. We say that a protocol guarantees non-injective agreement for role A with role B on a message m if, whenever a executes a role A thread up to the $Commit$ claim event, apparently with b in role B, then b has previously run a role B thread (at least) up to the $Running$ claim, apparently with a in role A, and the instances of m and $m^{'}$ agree according to the local views of these two agents’ threads.

$claim (A, Niagree)$ expresses another form of non-injective agreement stating that role A satisfies non-injective agreement if for each role A thread reaching the claim in some trace, there exist threads for all other roles of the protocol, such that all events causally preceding the claim (according to the protocol specification) must have occurred before the claim (in the trace) and each pair of matching send and receive events agree on the messages they contain.

$claim (A, Nisynch)$ expresses the non-injective synchronization property. This claim strengthens $claim (A, Niagree)$ by additionally requiring that the order of the events preceding $claim (A, Nisynch)$ must be correct as found in the protocol description, i.e., the send events occur before the corresponding receive events.

Note that non-injective agreement specified by $claim (A, Niagree)$ is different from that specified by the $Running$ and $Commit$ signals. The property does not require agreement on a specified set of data values. Instead, it requires agreement on the messages exchanged between the agents, which implies agreement on the data contained in those messages.

We now explain how to formalize these properties in our security property language using an example.

Example 6.1.
Consider the Needham–Schroeder public-key (NSPK) protocol from [35]. We mimic the claim events by introducing the corresponding signal events with the following set of signals: $\begin{matrix} Sig = {Create, Secret, Alive, Weakagree, Commit, Running, Niagree, Nisynch} . \end{matrix}$ The signal event $Create$ models the creation of a new protocol thread, which mimics the semantics of the $Create$ event defined in [13, page 27]. The remaining signals represent the corresponding claim events. Our formalization of the Needham–Schroeder public-key protocol is now given as follows. $\begin{array}{l} \begin{matrix} NS (A) & = Create \cdot send ({A, na}_{pk (B)}) \cdot recv ({na, Nb}_{pk (A)}) \cdot Running \cdot send ({Nb}_{pk (B)}) \cdot \\ Commit \cdot Secret \cdot Alive \cdot Weakagree \cdot Niagree \cdot Nisynch \end{matrix} \\ \begin{matrix} NS (B) & = Create \cdot recv ({A, Na}_{pk (B)}) \cdot Running \cdot send ({Na, nb}_{pk (A)}) \cdot recv ({nb}_{pk (B)}) \cdot \\ Commit \cdot Secret \cdot Alive \cdot Weakagree \cdot Niagree \cdot Nisynch \end{matrix} \end{array}$ We formalize the secrecy, aliveness, weak agreement, non-injective agreement, and non-injective synchronization properties for role A as follows.
Secrecy of $na$ : $\begin{matrix} ϕ_{s e c}^{NS} & = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Secret)) \\ \Rightarrow secret (ι, na) \end{matrix}$

Aliveness: $\begin{matrix} ϕ_{alive}^{NS} & = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Alive)) \\ \Rightarrow (\exists κ . steps (κ, Create) \land \\ ((role (κ, A) \land A^{@ κ} = B^{@ ι}) \lor \\ (role (κ, B) \land B^{@ κ} = B^{@ ι}))) \end{matrix}$

Weak agreement: $\begin{matrix} ϕ_{wagree}^{NS} & = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Weakagree)) \\ \Rightarrow (\exists κ . steps (κ, Create) \land \\ ((role (κ, A) \land A^{@ κ} = B^{@ ι} \land B^{@ κ} = A^{@ ι}) \lor \\ (role (κ, B) \land B^{@ κ} = B^{@ ι} \land A^{@ κ} = A^{@ ι}))) \end{matrix}$

Non-injective agreement (on $na$ and $nb$ ) based on Running and Commit claims: $\begin{matrix} ϕ_{cm}^{NS} & = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Commit)) \\ \Rightarrow (\exists κ . role (κ, B) \land steps (κ, Running) \land \\ {⟨ A, B, na, Nb ⟩}^{@ ι} = {⟨ A, B, Na, nb ⟩}^{@ κ}) \end{matrix}$

Non-injective agreement specified by $claim (A, Niagree)$ : $\begin{matrix} ϕ_{niagree}^{NS} & = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Niagree)) \\ \Rightarrow (\exists κ . role (κ, B) \land \\ steps (κ, recv ({A, Na}_{pk (B)})) ≺ steps (ι, Niagree) \land \\ steps (κ, send ({Na, nb}_{pk (A)})) ≺ steps (ι, Niagree) \land \\ {⟨ A, B ⟩}^{@ ι} = {⟨ A, B ⟩}^{@ κ} \land \\ {({A, na}_{pk (B)})}^{@ ι} = {({A, Na}_{pk (B)})}^{@ κ} \land \\ {({na, Nb}_{pk (A)})}^{@ ι} = {({Na, nb}_{pk (A)})}^{@ κ}) \end{matrix}$

Non-injective synchronization: $\begin{matrix} ϕ_{nisyn}^{NS} & = \forall ι . (role (ι, A) \land honest (ι, {A, B}) \land steps (ι, Nisynch)) \\ \Rightarrow (\exists κ . role (κ, B) \land \\ steps (ι, send ({A, na}_{pk (B)})) ≺ steps (κ, recv ({A, Na}_{pk (B)})) \land \\ steps (κ, send ({Na, nb}_{pk (A)})) ≺ steps (ι, recv ({na, Nb}_{pk (A)})) \land \\ {⟨ A, B ⟩}^{@ ι} = {⟨ A, B ⟩}^{@ κ} \land \\ {({A, na}_{pk (B)})}^{@ ι} = {({A, Na}_{pk (B)})}^{@ κ} \land \\ {({na, Nb}_{pk (A)})}^{@ ι} = {({Na, nb}_{pk (A)})}^{@ κ}) \end{matrix}$
The last two properties are obtained by instantiating the general definitions from [13] for the A role of the Needham–Schroeder public-key protocol. To see that $ϕ_{nisyn}^{NS}$ strengthens $ϕ_{niagree}^{NS}$ , note that the event ordering predicates in the latter formula are implied by those in the former together with event orderings within roles A and B, which always hold.
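To make the difference between $ϕ_{alive}^{NS}$ and $ϕ_{wagree}^{NS}$ concrete, the following sketch evaluates both formulas on a finite set of thread records. It is a simplification for illustration only: the record type `Thread` and the two checker functions are hypothetical, and a thread's executed signals are given as a set rather than derived from a trace.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thread:
    role: str            # "A" or "B"
    A: str               # agent this thread assigns to role variable A
    B: str               # agent this thread assigns to role variable B
    signals: frozenset   # signals the thread has executed so far


def premise(t, signal, honest):
    """role(i, A), honest(i, {A, B}) and steps(i, signal)."""
    return t.role == "A" and {t.A, t.B} <= honest and signal in t.signals


def alive(threads, honest):
    return all(
        any(
            "Create" in k.signals
            and ((k.role == "A" and k.A == i.B) or (k.role == "B" and k.B == i.B))
            for k in threads
        )
        for i in threads
        if premise(i, "Alive", honest)
    )


def weak_agree(threads, honest):
    return all(
        any(
            "Create" in k.signals
            and (
                (k.role == "A" and k.A == i.B and k.B == i.A)
                or (k.role == "B" and k.B == i.B and k.A == i.A)
            )
            for k in threads
        )
        for i in threads
        if premise(i, "Weakagree", honest)
    )


# alice runs role A with bob, but bob's only thread is a run with carol.
honest = {"alice", "bob", "carol"}
state = [
    Thread("A", "alice", "bob", frozenset({"Create", "Alive", "Weakagree"})),
    Thread("B", "carol", "bob", frozenset({"Create"})),
]
```

In this hypothetical state, aliveness holds because bob has created a thread, whereas weak agreement fails because bob's thread is not a run with alice.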

6.1.2. An extension of the labeling mechanism and the abstraction heuristics

In practice, it turns out that the labeling mechanism previously described is not sufficient to achieve good abstractions. There are protocols that employ cryptographic primitives in particular ways to achieve certain security goals, even though these primitives do not provide the desired properties themselves. In such cases, the heuristic may assign security labels to terms incorrectly, or accidentally remove elements that are important to achieve these properties.

Example 6.2.
Let us come back to the NSPK protocol, specified (without signals) as: $\begin{array}{l} NS (A) = send ({A, na}_{pk (B)}) \cdot recv ({na, Nb}_{pk (A)}) \cdot send ({Nb}_{pk (B)}) \\ NS (B) = recv ({A, Na}_{pk (B)}) \cdot send ({Na, nb}_{pk (A)}) \cdot recv ({nb}_{pk (B)}) \end{array}$ Suppose that we are interested in non-injective agreement for an agent in role A with an agent in role B on the nonce $na$ . The agent variable A in the first sent message is crucial to achieve this property. However, our heuristic may pull A out of the messages ${A, na}_{pk (B)}$ and ${A, Na}_{pk (B)}$ , as this abstraction preserves the label $NO$ for authentication and confidentiality of A. It is not hard to see that the resulting abstracted protocol no longer provides the desired property. Furthermore, the heuristic incorrectly decides that $na$ has authentication label $NO$ . Thus, we may also pull $na$ out of the encryptions in the first two events of role A, as this abstraction clearly preserves the security label of $na$ . However, no authentication is guaranteed for the abstracted protocol.

To deal with this issue, we enable the heuristic to detect such a pattern, i.e., an asymmetric encryption that includes an agent identity which is different from the one indicated in the encryption key. In this case, at least one occurrence of the identity must be kept, and the encryption is associated with authentication label $YES$ . Similarly, we must also keep agent identities that occur in symmetric encryptions.
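The pattern for asymmetric encryptions can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the implementation in our tool: the term constructors are hypothetical, only the top-level fields of the payload are inspected, and `None` stands for "the pattern does not apply, use the regular labeling".

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Agent:
    name: str


@dataclass(frozen=True)
class Nonce:
    name: str


@dataclass(frozen=True)
class Pk:
    owner: Agent


@dataclass(frozen=True)
class AEnc:
    fields: Tuple[Union[Agent, Nonce], ...]
    key: Pk


def identities_to_keep(term):
    """Agent identities inside an asymmetric encryption that differ from
    the identity indicated in the encryption key."""
    if not isinstance(term, AEnc):
        return set()
    return {f for f in term.fields if isinstance(f, Agent) and f != term.key.owner}


def pattern_label(term):
    """Authentication label forced by the pattern, or None if it does not apply."""
    return "YES" if identities_to_keep(term) else None


A, B = Agent("A"), Agent("B")
msg1 = AEnc((A, Nonce("na")), Pk(B))            # {A, na}_pk(B)
msg2 = AEnc((Nonce("na"), Nonce("Nb")), Pk(A))  # {na, Nb}_pk(A)
```

For the first NSPK message, the sketch reports that the identity A must be kept and forces the label $YES$; for the second message, the pattern does not apply.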
6.2. Experimental results

We have validated the effectiveness of our abstractions on a total of 24 members of the IKE and ISO/IEC 9798 protocol families and on the PANA-AKA protocol [4] and the KSL protocol. We verify these protocols using five tools based on four different techniques: Scyther [14], CL-Atse [47], OFMC [8], SATMC [6], and ProVerif [9]. Only Scyther and ProVerif support verification of an unbounded number of threads. In Table 3, we present a selection of the experimental results for Scyther and refer to the full version [39] for a complete account, including results for the other tools for which we used hand-crafted abstractions. While our execution model closely fits Scyther’s, there are subtle differences with the execution models and specification languages of the other tools. However, our initial results suggest that our techniques can be formally adapted to increase the efficiency of those tools as well. Our models of the IKE and ISO/IEC 9798 protocols are based on Cremers’ [11,12]. Since Scyther uses a fixed signature with standard cryptographic primitives and no equational theories, the IKE models approximate the DH equational theory by oracle roles.

For our case studies, we verify several security properties including secrecy, aliveness, weak agreement, and non-injective agreement. We mark verified properties by ✓ and falsified ones by ×. An entry $✓ / \times$ means the property holds for one role but not for the other. Each row consists of two lines, corresponding to the analysis time without (line 1) and with (line 2) abstraction for 3-8 or unboundedly many (∞) threads. The times were measured on a cluster of 12-core AMD Opteron 6174 processors with 64 GB RAM each. They include computing the abstractions (4-20 ms) and the verification itself.

Table 3
Experimental results for Scyther. The time is in seconds. No: Number of abstractions. Properties: Secrecy, Aliveness, Weak agreement, and Non-injective agreement

| Protocol | No | S | A | W | N | Model | 3 | 4 | 5 | 6 | 7 | 8 | ∞ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| IKE | | | | | | | | | | | | | |
| IKEv1-pk2-a2 | 1 | ✓ | | | ✓ | original | 40.25 | 302.21 | 1679.69 | 9947.75 | TO | TO | TO |
| | | | | | | abstract | 6.12 | 26.40 | 154.26 | 959.02 | 6412.25 | TO | TO |
| IKEv1-pk-a22 | 1 | ✓ | | | ✓ | original | 15.14 | 80.80 | 244.45 | 530.94 | 979.88 | 1677.69 | TO |
| | | | | | | abstract | 0.95 | 1.44 | 2.36 | 4.00 | 7.54 | 10.37 | TO |
| IKEv2-eap | 5 | ✓ | | | ✓ | original | TO | TO | TO | TO | TO | TO | TO |
| | | | | | | abstract | 78.94 | 773.49 | 4345.58 | 18572.70 | TO | TO | TO |
| IKEv2-mac | 4 | ✓ | | | ✓ | original | 1.82 | 5.13 | 6.21 | 7.52 | 8.30 | 8.59 | 8.69 |
| | | | | | | abstract | 0.70 | 1.58 | 1.72 | 1.72 | 1.72 | 1.71 | 1.72 |
| IKEv2-mactosig | 6 | ✓ | | | ✓ | original | 13.29 | 135.64 | 1076.56 | 7389.01 | TO | TO | TO |
| | | | | | | abstract | 2.68 | 12.38 | 24.54 | 38.68 | 53.36 | 65.07 | 77.68 |
| IKEv2-sigtomac | 6 | ✓ | | | ✓ | original | 6.11 | 26.18 | 65.61 | 137.53 | 165.84 | 206.29 | 238.28 |
| | | | | | | abstract | 1.70 | 7.78 | 28.44 | 44.44 | 55.11 | 66.97 | 67.15 |
| IKEv1-pk-m | 2 | | | | × | original | 48.62 | 269.92 | 507.40 | 869.23 | 16254.80 | TO | TO |
| | | | | | | abstract | 0.16 | 0.22 | 0.37 | 0.66 | 1.19 | 2.05 | TO |
| IKEv1-pk-m2 | 2 | | | | $✓ / \times$ | original | 12.94 | 178.49 | 2198.81 | TO | TO | TO | TO |
| | | | | | | abstract | 0.21 | 0.30 | 0.26 | 0.28 | 0.30 | 0.35 | TO |
| IKEv1-sig-m | 2 | | | | × | original | 0.35 | 0.45 | 0.45 | 0.45 | 0.45 | 0.46 | 0.45 |
| | | | | | | abstract | 0.35 | 0.33 | 0.34 | 0.34 | 0.34 | 0.35 | 0.39 |
| IKEv1-sig-m-perlman | 2 | | | | × | original | 3.55 | 14.11 | 47.16 | 67.61 | 72.20 | 72.15 | 73.83 |
| | | | | | | abstract | 17.59 | 17.61 | 17.53 | 17.53 | 17.59 | 17.53 | 17.58 |
| IKEv2-sig-child | 6 | | ✓ | ✓ | $✓ / \times$ | original | 235.11 | 11274.66 | TO | TO | TO | TO | TO |
| | | | | | | abstract | 38.04 | 462.53 | 874.21 | 17713.06 | TO | TO | TO |
| ISO/IEC | | | | | | | | | | | | | |
| ISO/IEC 9798-2-5 | 1 | ✓ | | | | original | 0.79 | 9.12 | 72.75 | 557.77 | 4260.57 | TO | TO |
| | | | | | | abstract | 0.07 | 0.11 | 0.12 | 0.11 | 0.11 | 0.11 | 0.11 |
| ISO/IEC 9798-2-6 | 1 | ✓ | | | | original | 0.59 | 3.82 | 18.84 | 67.38 | 197.42 | 575.42 | 21254.67 |
| | | | | | | abstract | 0.05 | 0.04 | 0.05 | 0.05 | 0.05 | 0.05 | 0.05 |
| ISO/IEC 9798-3-6-1 | 2 | | ✓ | | ✓ | original | 42.68 | 795.11 | 8915.40 | ME | ME | ME | ME |
| | | | | | | abstract | 0.14 | 0.20 | 0.21 | 0.21 | 0.21 | 0.21 | 0.21 |
| ISO/IEC 9798-3-6-2 | 1 | | ✓ | | ✓ | original | 2.47 | 8.66 | 19.48 | 33.94 | 48.26 | 60.05 | 70.81 |
| | | | | | | abstract | 0.12 | 0.15 | 0.15 | 0.15 | 0.15 | 0.15 | 0.15 |
| ISO/IEC 9798-3-7-1 | 2 | | ✓ | | ✓ | original | 41.63 | 752.82 | 7769.87 | 15863.97 | ME | ME | ME |
| | | | | | | abstract | 0.15 | 0.20 | 0.21 | 0.21 | 0.21 | 0.21 | 0.21 |
| ISO/IEC 9798-3-7-2 | 1 | | ✓ | | ✓ | original | 2.46 | 7.97 | 16.93 | 26.41 | 34.67 | 50.30 | TO |
| | | | | | | abstract | 0.21 | 0.30 | 0.31 | 0.31 | 0.31 | 0.31 | 0.31 |
| Others | | | | | | | | | | | | | |
| PANA-AKA | 7 | ✓ | ✓ | ✓ | ✓ | original | 5762.53 | TO | TO | TO | TO | TO | TO |
| | | | | | | abstract | 0.23 | 0.22 | 0.23 | 0.23 | 0.23 | 0.23 | 0.23 |
| KSL | 1 | ✓ | | | | original | 17.81 | 1272.50 | TO | TO | TO | TO | TO |
| | | | | | | abstract | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 |

Verification. For 13 of the 19 original protocols that are analyzed, an unbounded verification attempt results in a timeout (TO = 8 h CPU time) or memory exhaustion (ME). For 7 of these, our abstractions enable the unbounded verification of all properties: six complete in less than 0.4 seconds and one (IKEv2-mactosig) in about 78 seconds. However, for the first three protocols, we still get a timeout. For the large majority of the bounded verification tasks, we significantly push the bound on the number of threads and achieve massive speedups. For example, our abstractions enable the verification of the complex nested protocols IKEv2-eap and PANA-AKA. Scyther verifies an abstraction of IKEv2-eap for up to 6 threads and, more strikingly, completes the unbounded verification of the simplified PANA-AKA in under 0.3 seconds, whereas it can handle only 3 threads of the original version.
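The counts for the unbounded case can be recomputed from the ∞ column of Table 3. The short script below is merely a convenience sketch for readers who wish to check the tally; the dictionary transcribes the table, and the variable names are ours.

```python
# Unbounded (∞) column of Table 3: protocol -> (original, abstract).
# Times are in seconds; "TO" is a timeout and "ME" is memory exhaustion.
UNBOUNDED = {
    "IKEv1-pk2-a2": ("TO", "TO"),
    "IKEv1-pk-a22": ("TO", "TO"),
    "IKEv2-eap": ("TO", "TO"),
    "IKEv2-mac": (8.69, 1.72),
    "IKEv2-mactosig": ("TO", 77.68),
    "IKEv2-sigtomac": (238.28, 67.15),
    "IKEv1-pk-m": ("TO", "TO"),
    "IKEv1-pk-m2": ("TO", "TO"),
    "IKEv1-sig-m": (0.45, 0.39),
    "IKEv1-sig-m-perlman": (73.83, 17.58),
    "IKEv2-sig-child": ("TO", "TO"),
    "ISO/IEC 9798-2-5": ("TO", 0.11),
    "ISO/IEC 9798-2-6": (21254.67, 0.05),
    "ISO/IEC 9798-3-6-1": ("ME", 0.21),
    "ISO/IEC 9798-3-6-2": (70.81, 0.15),
    "ISO/IEC 9798-3-7-1": ("ME", 0.21),
    "ISO/IEC 9798-3-7-2": ("TO", 0.31),
    "PANA-AKA": ("TO", 0.23),
    "KSL": ("TO", 0.03),
}

FAIL = ("TO", "ME")

# Originals whose unbounded analysis does not complete.
failed = [p for p, (orig, _) in UNBOUNDED.items() if orig in FAIL]
# Among those, protocols whose abstraction completes.
rescued = [p for p in failed if UNBOUNDED[p][1] not in FAIL]
# Among those, abstractions that complete in under 0.4 seconds.
fast = [p for p in rescued if UNBOUNDED[p][1] < 0.4]

print(len(UNBOUNDED), len(failed), len(rescued), len(fast))
```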

For these protocols, our tool aggressively simplifies the original models by removing unnecessary cryptographic protections and redundant fields. The IKEv2-eap protocol consists of two roles exchanging 8 messages. The messages are large and contain up to 5 layers of cryptographic operations (such as encryptions, signatures, and hashes). However, the most abstract model generated by our tool only exchanges 5 messages (i.e., 3 messages are completely removed by untyped abstractions). The most deeply nested messages contain only 3 layers of cryptographic operations. The PANA-AKA protocol exhibits a similar complexity. It employs up to 6 layers of cryptographic operations. Even though the most abstract model for PANA-AKA still exchanges 7 messages, the messages are substantially smaller than those of the original model and use at most 3 layers of cryptographic operations. We also achieve dramatic speedups for many other protocols, most notably for IKEv1-pk-a22, ISO/IEC 9798-2-6, and ISO/IEC 9798-3-6-2. This shows that our abstractions work particularly well for protocols that have complex message structures or large numbers of exchanged messages, as these features can significantly deteriorate the performance of protocol verifiers.

More interestingly, our abstractions also perform very well on another class of protocols, which have simple message structures but still render verification challenging. For example, the ISO/IEC 9798-3-6-1, ISO/IEC 9798-3-7-1, and KSL protocols contain relatively small messages with at most one layer of encryption. However, the verification attempts for the original versions of the ISO/IEC 9798-3-6-1 and ISO/IEC 9798-3-7-1 protocols result in memory exhaustion from 6 and 7 threads onward, respectively. Similarly, the verification of KSL already times out for 5 threads. We attribute this difficulty to the presence of untyped variables, i.e., variables of type $msg$ in our type system, in clear texts. As there is no constraint on the shapes of the messages that can be used to instantiate these variables, protocol verifiers typically need to consider all possible forms of instantiations, which potentially results in performance degradation. By removing occurrences of untyped variables that are unnecessary with respect to the security properties of interest, our abstractions enable the verification of KSL for an unbounded number of threads in only 0.03 seconds. Analogously, the tool successfully verifies ISO/IEC 9798-3-6-1 and ISO/IEC 9798-3-7-1 for an unbounded number of threads in 0.21 seconds.

In contrast to these enormous performance gains, the speedup is more modest for a few protocols, e.g., IKEv1-pk2-a2, IKEv2-sigtomac, and IKEv2-mac. These protocols have simple message structures, e.g., using at most 3 layers of cryptographic operations and only up to 4 exchanged messages. Moreover, they use untyped variables only in protected positions, i.e., as arguments of a hash or an encryption. They therefore do not leave much room for abstractions. In fact, although the generated abstract models for these protocols have smaller message sizes, they have similar message structures compared to the original ones. Nevertheless, our abstractions reduce the verification time by an order of magnitude in some cases, e.g., for the IKEv1-pk2-a2 protocol.

Additionally, we observe that the verification time for many abstracted protocols increases much more slowly than for their originals as the number of threads increases. We obtain almost constant verification times for the six ISO/IEC 9798 protocols, whereas the time significantly increases on some originals, e.g., for the ISO/IEC 9798-3-6-1 protocol.

Falsification. For rows marked by ×, the second line corresponds to the falsification time for the most abstract model, which is much lower than for the original one in most cases. For example, for 8 threads of the IKEv1-pk-m protocol, we reduce the falsification time from a timeout to 2.05 seconds. Note that for falsification, a check for spurious attacks is needed, which our tool performs automatically. This subroutine renders the performance gains less substantial than those for verification. For instance, in the unbounded case, the speedup factors are 1.15 for IKEv1-sig-m and 4.20 for IKEv1-sig-m-perlman. Interestingly, all attacks found in the most abstract protocols are real, suggesting that our measures to prevent spurious attacks are effective.
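The speedup factors quoted for the unbounded case are simply the ratios of the two ∞ entries of the respective rows in Table 3. As a small worked check (the helper `speedup` is ours and only for illustration):

```python
def speedup(original_seconds, abstract_seconds):
    """Ratio of the analysis time without abstraction to the time with it."""
    return original_seconds / abstract_seconds


# ∞ column of Table 3: (original, abstract) times in seconds.
sig_m = speedup(0.45, 0.39)
sig_m_perlman = speedup(73.83, 17.58)

print(f"IKEv1-sig-m: {sig_m:.2f}")
print(f"IKEv1-sig-m-perlman: {sig_m_perlman:.2f}")
```

Both abstract times already include the check for spurious attacks, which is why these factors are small compared with those obtained for verification.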

Combination. For the IKEv1-pk-m2 and IKEv2-sig-child protocols, the tool verifies non-injective agreement for one role and falsifies it for the other one. Analogous to other case studies, we obtain a remarkable speedup for these protocols. Our abstractions raise the feasibility bound by 2 to 3 additional threads.

7. Related work

Hui and Lowe [28] define several kinds of abstractions similar to ours with the aim of improving the performance of the CASPER/FDR verifier. They establish soundness only for ground messages and encryption with atomic keys. We work in a more general model, cover additional properties, and treat the non-trivial issue of abstracting the open terms in protocol specifications. Other works [17,18,41] also propose a set of syntactic transformations, however without formally establishing their soundness. Using our results, we can, for instance, justify the soundness of the refinements in [18, Section 3.3].

Backes et al. [7] study the abstraction of authentication protocols formalized in the ρ-spi calculus. They propose a static analysis for authentication protocols by abstracting challenge-response messages into non-cryptographic versions expressed in a different language, called the CR calculus. Their abstraction method is based on non-increasing security labels similar to those of our heuristics. However, there are several differences with our work. First, since their sound abstractions map protocol specifications to a different language, the abstract protocols cannot be further abstracted. In our setting, protocol specifications and abstract protocols are expressed in the same language and abstractions can be composed. Second, the construction of the abstractions requires the identification of challenge-response components of a protocol, for which they do not give an algorithm. Third, since they designed a specialized technique for proving authentication properties, they cannot employ existing protocol verification tools to verify the abstract protocols. In contrast, our abstractions are composable, computed automatically by our tool, and can be verified using standard protocol verifiers. Finally, their method is restricted to agreement properties, while ours supports an expressive property specification language, which covers secrecy and a variety of authentication properties.

Guttman [24,25] studies the preservation of security properties for a rich class of protocol transformations in the strand space model. His approach to property preservation is based on the simulation of protocol analysis steps instead of execution steps. Each such analysis step explains the origin of a message. Apart from this different approach to soundness, there are other differences with our work. First, instead of working at the level of protocol messages, his protocol transformations are applied to strand space nodes and then lifted to protocol specifications and security properties. In contrast to our work, his approach does not restrict the shape of the transformed protocol message with respect to the original message. In his theory, one can, for instance, transform a hash of a message X and a key K into an encryption of X with K. We do not support such general transformations. Second, his protocol transformations are required to preserve the origination of values and the plaintext subterms of messages. The former condition means that if a value x first occurs in a transmission node then it also occurs first in the corresponding transformed node. Our soundness results do not require such conditions. For example, we can completely remove fresh values that occur in the clear or as fields in a hash. Third, since his primary focus was to set up a general framework to express and justify security protocol transformations, he does not provide syntactic soundness conditions, guidance for the choice of appropriate abstractions, or automated verification. It might be possible to identify a subset of his transformations for which this is possible, but this would require additional work. In contrast, our tool automatically determines suitable abstractions and checks their soundness.

Refinement is abstraction viewed in the reverse direction, i.e., from abstract to concrete. Sprenger et al. [31,45,46] have proposed a hierarchical development method for security protocols based on stepwise refinement that spans several levels of abstraction. Each development starts from abstract models of security properties and proceeds down to cryptographic protocols secure against a Dolev–Yao intruder. The development process traverses intermediate levels of abstraction based on message-less protocols and communication channels with authenticity and confidentiality properties. Security properties, once proved for a given model, are preserved by further refinements. They have applied their method to develop families of authentication and key transport protocols. The abstractions in the present paper belong to their most concrete level of cryptographic protocols. They have embedded their approach in the Isabelle/HOL theorem prover, but each refinement step essentially requires a separate soundness proof.

8. Conclusions

In this work, we propose a set of syntactic protocol transformations that allows us to abstract realistic protocols and capture a large class of attacks. Unlike previous work [28,37], our theory and soundness results accommodate equational theories and a fine-grained type system that supports untyped variables, user-defined types, and subtyping. These features allow us to accurately model protocols, capture type-flaw attacks, and adapt to different verification tools, e.g., those supporting equational theories such as ProVerif and CL-Atse. We have extended Scyther with an abstraction module, which we validated on various IKE and ISO/IEC 9798 protocols and others. We also tested our technique (with manually produced abstractions) on ProVerif, CL-Atse, OFMC, and SATMC. Our experiments show that modern protocol verifiers can substantially benefit from our abstractions, which often either enable previously infeasible verification tasks or lead to dramatic speedups. Our abstraction tool supports checking for spurious attacks, which allows us to not only verify but also falsify security protocols efficiently.

As for future work, we plan to extend our soundness results to more expressive security protocol models such as multiset rewriting. This would allow us to cover more security protocols, for instance, protocols involving loops such as the TESLA protocol [42] or non-monotonic states such as contract signing protocols [3], as well as more security properties and adversary capabilities such as perfect forward secrecy, key compromise impersonation, and adversaries capable of revealing the local state of agents. We believe that our soundness results can also be extended to support else-branches in such theories by additionally establishing preservation theorems for disequality tests. Another direction for future research could be to generalize the tool and support more protocol verifiers. Possible improvements might be gained from applying techniques from the field of counter-example guided refinement: when a spurious attack is found, it might be possible to extract information from it to guide the exploration of the generated abstractions.

Acknowledgments

We thank Mathieu Turuani and Michael Rusinowitch for our fruitful technical discussions on the topic of this paper. We are also grateful to David Basin, Ognjen Maric, Ralf Sasse, and the anonymous reviewers for their careful proof-reading and helpful suggestions. This work was partially supported by the Air Force Office of Scientific Research, grant number FA9550-17-1-0206, and the EU FP7-ICT-2009 Project No. 256980, NESSoS: Network of Excellence on Engineering Secure Future Internet Software Services and Systems.

References

Almousa,

S.A.

Mödersheim,

Modesti and

Viganò, Typing and compositionality for security protocols: A generalization to the geometric fragment, in: ESORICS, Lecture Notes in Computer Science, Springer, 2015.

Arapinis and

Duflot, Bounding messages for free in security protocols, in: FSTTCS, 2007, pp. 376–387.

Arapinis,

Ritter and

M.D.

Ryan, StatVerif: Verification of stateful processes, in: Proceedings of the 24th IEEE Computer Security Foundations Symposium, CSF 2011, Cernay-la-Ville, France, 27–29 June, 2011, IEEE Computer Society, 2011, pp. 33–47. doi:10.1109/CSF.2011.10.

Arkko and

Haverinen, RFC 4187: Extensible authentication protocol method for 3rd generation authentication and key agreement (EAP-AKA), 2006, http://www.ietf.org/rfc/rfc4187.

Armando,

Arsac,

Avanesov,

Barletta,

Calvi,

Cappai,

Carbone,

Chevalier,

Compagna,

Cuéllar,

Erzse,

Frau,

Minea,

Mödersheim,

von Oheimb,

Pellegrino,

S.E.

Ponta,

Rocchetto,

Rusinowitch,

M.T.

Dashti,

Turuani and

Viganò, The AVANTSSAR platform for the automated validation of trust and security of service-oriented architectures, in: TACAS, 2012, pp. 267–282.

Armando and

Compagna, SAT-based model-checking for security protocols analysis, International Journal of Information Security 7(1) (2008), 3–32. doi:10.1007/s10207-007-0041-y.

Backes,

Cortesi,

Focardi and

Maffei, A calculus of challenges and responses, in: Proceedings of the 2007 ACM Workshop on Formal Methods in Security Engineering, FMSE ’07, ACM, New York, NY, USA, 2007, pp. 51–60. ISBN 978-1-59593-887-9. doi:10.1145/1314436.1314444.

D.A.

Basin,

Mödersheim and

Viganò, OFMC: A symbolic model checker for security protocols, Int. J. Inf. Sec. 4(3) (2005), 181–208. doi:10.1007/s10207-004-0055-7.

Blanchet, An efficient cryptographic protocol verifier based on prolog rules, in: 14th IEEE Computer Security Foundations Workshop (CSFW-14 2001), Cape Breton, Nova Scotia, Canada, 11–13 June 2001, IEEE Computer Seciety, 2001, pp. 82–96. doi:10.1109/CSFW.2001.930138.

10.

Cousot and

Cousot, Abstract interpretation: A unified lattice model for static analysis of programs by construction or approximation of fixpoints, in: POPL, 1977, pp. 238–252.

11.

Cremers, IKEv1 and IKEv2 protocol suites, 2011, https://github.com/cascremers/scyther/tree/master/gui/Protocols/IKE.

12.

Cremers, ISO/IEC 9798 authentication protocols, 2012, https://github.com/cascremers/scyther/tree/master/gui/Protocols/ISO-9798.

13. Cremers and Mauw, Operational Semantics and Verification of Security Protocols, Information Security and Cryptography, Springer, 2012. ISBN 978-3-540-78636-8. doi:10.1007/978-3-540-78636-8.

14. C.J.F. Cremers, The Scyther tool: Verification, falsification, and analysis of security protocols, in: CAV, 2008, pp. 414–418.

15. C.J.F. Cremers, Key exchange in IPsec revisited: Formal analysis of IKEv1 and IKEv2, in: ESORICS, 2011, pp. 315–334.

16. C.J.F. Cremers, Mauw and E.P. de Vink, Injective synchronisation: An extension of the authentication hierarchy, Theor. Comput. Sci. 367(1–2) (2006), 139–161. doi:10.1016/j.tcs.2006.08.034.

17. Datta, Derek, J.C. Mitchell and Pavlovic, Abstraction and refinement in protocol derivation, in: Proc. 17th IEEE Computer Security Foundations Workshop (CSFW), 2004.

18. Datta, Derek, J.C. Mitchell and Pavlovic, A derivation system and compositional logic for security protocols, Journal of Computer Security 13 (2005), 423–482. doi:10.3233/JCS-2005-13304.

19. Dolev and A.C. Yao, On the security of public key protocols, IEEE Transactions on Information Theory 29(2) (1983), 198–207. doi:10.1109/TIT.1983.1056650.

20. Durán and Meseguer, A Church-Rosser checker tool for conditional order-sorted equational Maude specifications, in: Rewriting Logic and Its Applications – 8th International Workshop, WRLA 2010, Held as a Satellite Event of ETAPS 2010, Revised Selected Papers, Paphos, Cyprus, March 20–21, 2010, pp. 69–85. doi:10.1007/978-3-642-16310-4_6.

21. Escobar, Meadows and Meseguer, Maude-NPA: Cryptographic protocol analysis modulo equational properties, in: FOSAD, 2007, pp. 1–50.

22. Escobar, Sasse and Meseguer, Folding variant narrowing and optimal variant termination, J. Log. Algebr. Program. 81(7–8) (2012), 898–928. doi:10.1016/j.jlap.2012.01.002.

23. Giesl, Schneider-Kamp and Thiemann, Automatic termination proofs in the dependency pair framework, in: Automated Reasoning, Third International Joint Conference, IJCAR 2006, Proceedings, Seattle, WA, USA, August 17–20, 2006, 2006, pp. 281–286. doi:10.1007/11814771_24.

24. J.D. Guttman, Transformations between cryptographic protocols, in: ARSPA-WITS, 2009, pp. 107–123.

25. J.D. Guttman, Security goals and protocol transformations, in: Theory of Security and Applications (TOSCA), an ETAPS Associated Event, LNCS, Vol. 6993, Springer, 2011.

26. J.D. Guttman, Establishing and preserving protocol security goals, Journal of Computer Security 22(2) (2014), 203–268. doi:10.3233/JCS-140499.

27. Harkins and Carrel, The Internet key exchange (IKE), IETF RFC 2409 (proposed standard), 1998, Obsoleted by RFC 4306, updated by RFC 4109, http://www.ietf.org/rfc/rfc2409.txt.

28. M.L. Hui and Lowe, Fault-preserving simplifying transformations for security protocols, Journal of Computer Security 9(1/2) (2001), 3–46. doi:10.3233/JCS-2001-91-202.

29. J.-P. Jouannaud and Kirchner, Completion of a set of rules modulo a set of equations, SIAM J. Comput. 15(4) (1986), 1155–1194. doi:10.1137/0215084.

30. Kaufman, Hoffman, Nir and Eronen, Internet key exchange protocol version 2 (IKEv2), IETF RFC 5996, 2010, http://tools.ietf.org/html/rfc5996.

31. Lallemand, D.A. Basin and Sprenger, Refining authenticated key agreement with strong adversaries, in: 2017 IEEE European Symposium on Security and Privacy, EuroS&P 2017, Paris, France, April 26–28, 2017, pp. 92–107. doi:10.1109/EuroSP.2017.22.

32. Lowe, A hierarchy of authentication specifications, in: IEEE Computer Security Foundations Workshop, IEEE Computer Society, Los Alamitos, CA, USA, 1997, pp. 31–43. doi:10.1109/CSFW.1997.596782.

33. Meier, C.J.F. Cremers and D.A. Basin, Strong invariants for the efficient construction of machine-checked protocol security proofs, in: Proceedings of the 23rd IEEE Computer Security Foundations Symposium, CSF 2010, Edinburgh, United Kingdom, July 17–19, 2010, IEEE Computer Society, 2010, pp. 231–245. doi:10.1109/CSF.2010.23.

34. Meier, Schmidt, Cremers and D.A. Basin, The TAMARIN prover for the symbolic analysis of security protocols, in: CAV, 2013, pp. 696–701.

35. R.M. Needham and M.D. Schroeder, Using encryption for authentication in large networks of computers, Commun. ACM 21(12) (1978), 993–999. doi:10.1145/359657.359659.

36. B.T. Nguyen, The Scyther-abstraction tool, 2018, https://github.com/binhnguyen1984/scyther-abstraction.

37. B.T. Nguyen and Sprenger, Sound security protocol transformations, in: POST, 2013, pp. 83–104.

38. B.T. Nguyen and Sprenger, Abstractions for security protocol verification, in: Principles of Security and Trust – 4th International Conference, POST 2015, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2015, Proceedings, London, UK, April 11–18, 2015, Focardi and A.C. Myers, eds, Lecture Notes in Computer Science, Vol. 9036, Springer, 2015, pp. 196–215. doi:10.1007/978-3-662-46666-7_11.

39. B.T. Nguyen, Sprenger and Cremers, Abstractions for security protocol verification, Technical report, Department of Computer Science, ETH Zurich, 2018. doi:10.3929/ethz-b-000266360.

40. Paulson, The inductive approach to verifying cryptographic protocols, J. Computer Security 6 (1998), 85–128. doi:10.3233/JCS-1998-61-205.

41. Pavlovic and Meadows, Deriving secrecy in key establishment protocols, in: Proc. 11th European Symposium on Research in Computer Security (ESORICS), 2006, pp. 384–403.

42. Perrig, J.D. Tygar, Song and Canetti, Efficient authentication and signing of multicast streams over lossy channels, in: Proceedings of the 2000 IEEE Symposium on Security and Privacy, SP ’00, IEEE Computer Society, Washington, DC, USA, 2000, p. 56. ISBN 0-7695-0665-8.

43. Schneider, Verifying authentication protocols with CSP, in: 10th Computer Security Foundations Workshop (CSFW ’97), Rockport, Massachusetts, USA, June 10–12, 1997, 1997, pp. 3–17. doi:10.1109/CSFW.1997.596775.

44. S.A. Shaikh, V.J. Bush and S.A. Schneider, Specifying authentication using signal events in CSP, Computers & Security 28(5) (2009), 310–324. doi:10.1016/j.cose.2008.10.001.

45. Sprenger and Basin, Developing security protocols by refinement, in: Proc. 17th ACM Conference on Computer and Communications Security (CCS), 2010, pp. 361–374. doi:10.1145/1866307.1866349.

46. Sprenger and Basin, Refining key establishment, in: Proc. 25th IEEE Computer Security Foundations Symposium (CSF), 2012, pp. 230–246.

47. Turuani, The CL-Atse protocol analyser, in: RTA, 2006, pp. 277–286.

Top-level constructor of $t$	Confidentiality $\ell_{c}(t)$	Authentication $\ell_{a}(t)$
symmetric encryptions or MACs with long-term keys	YES	YES
MACs with session keys	YES	MAYBE
symmetric encryptions with session keys	MAYBE	MAYBE
public-key encryptions or hashes	YES	NO
signatures	NO	YES
others	NO	NO
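
The table above can be read as a lookup from the top-level constructor of a term to a pair of labels. The following is a minimal Python sketch of that reading only. The category names and the function `labels` are hypothetical, chosen for illustration, and are not taken from the paper or from the Scyther implementation.

```python
from typing import Dict, Tuple

# The three label values that appear in the table.
YES, MAYBE, NO = "YES", "MAYBE", "NO"

# Hypothetical category names for the table's rows. Each maps to the pair
# (confidentiality label, authentication label) from the same row.
LABELS: Dict[str, Tuple[str, str]] = {
    "sym_enc_long_term_key": (YES, YES),
    "mac_long_term_key": (YES, YES),
    "mac_session_key": (YES, MAYBE),
    "sym_enc_session_key": (MAYBE, MAYBE),
    "pub_enc": (YES, NO),
    "hash": (YES, NO),
    "signature": (NO, YES),
}


def labels(constructor: str) -> Tuple[str, str]:
    """Return the (confidentiality, authentication) labels for a category.

    Any category not listed above falls into the table's "others" row.
    """
    return LABELS.get(constructor, (NO, NO))


if __name__ == "__main__":
    for kind in ("mac_session_key", "signature", "pair"):
        l_c, l_a = labels(kind)
        print(f"{kind}: confidentiality={l_c}, authentication={l_a}")
```

Keeping the "others" row as the default of the lookup means that constructors the table does not mention never need to be enumerated.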

Table 1
Structure of paper

Topic	Main description
Motivating example: IKE	Section 2
Modeling security protocols	Section 3
Abstraction theory	Section 4
Abstraction generation algorithm	Section 5
Algorithm implementation in Scyther	Section 6.1
Experimental results	Section 6.2
