Abstract
This paper describes an approach for improving cyber resilience through the synthesis of optimal microsegmentation policy for a network. By leveraging microsegmentation security architecture, we can reason about fine-grained policy rules that enforce access for given combinations of source address, destination address, destination port, and protocol. Our approach determines microsegmentation policy rules that limit adversarial movement within a network according to assumed attack scenarios and mission availability needs. For this problem, we formulate a novel optimization objective function that balances cyberattack risks against accessibility to critical network resources. Given the application of a particular set of policy rules as a candidate optimal solution, this objective function estimates the adversary effort for carrying out a particular attack scenario, which it balances against the extent to which the solution restricts access to mission-critical services. We then apply artificial intelligence techniques (evolutionary programming) to learn microsegmentation policy rules that optimize this objective function.
1. Introduction
In modern military doctrine, armed conflict involves “multiple layers of stand-off” in all operational domains, including cyberspace. 1 Effective layering of defenses in cyberspace requires addressing all phases of the cyberattack lifecycle. Given increasingly complex networked systems and advanced cyber threats, there is growing recognition of the need for cyber resilience, that is, the ability to continue to operate in spite of ongoing cyberattacks.2–4 For optimizing cyber resilience, a key challenge is being able to assess various candidate security policies under given mission and threat circumstances.
Assessment of security policy must consider not only potential impact from adversarial activities, but also any restricted availability of mission-critical services due to security hardening. This is especially true inside network perimeters, since systems and services that can be exploited by adversaries already inside a network are likely to be more critical (vs outside facing ones) for mission operations. Given indications of likely adversarial avenues of approach (or indicators of actual compromise) and measures of mission criticality for allowed access to network resources, policy rules can be optimized to account for that information. This requires fine granularity in how such rules are expressed and enforced.
A way of accomplishing the kind of fine-grained control over access policy needed for optimal cyber resilience is through network microsegmentation. 5 Microsegmentation is a granular approach for workload isolation and security. More traditional methods of network segmentation secure traffic in the north-south (outside vs inside) orientation. Microsegmentation provides greater control over east-west (lateral) traffic inside a network, for example, for limiting lateral movement by adversaries who have breached perimeter defenses. With implementation through virtualization technology, microsegmentation supports flexible and adaptive security policy in response to changing mission requirements and threat situations. Examples of such virtualization in modern weapon systems include the US Army’s Command Post Computing Environment (CPCE) software deployed on Tactical Server Infrastructure (TSI). 6
This paper describes an approach for optimizing network microsegmentation policy for maximum resilience. Through microsegmentation security architecture for enforcement, we consider sets of fine-grained policy rules that allow access for given combinations of source address, destination address, destination port, and protocol. Our approach analyzes the efficacy of candidate policy solutions with respect to (1) minimizing the risks of a particular threat situation while (2) maximizing accessibility to critical network resources. We formulate this as a novel optimization problem that quantifies attacker reachability in terms of multi-step exploitation. Unlike previous approaches, for example, Noel and colleagues7,8 that require all exploitable paths to be broken for a given threat situation, we compute the numbers of potential attack walks of given lengths, which allows for quantitative comparisons among candidate policies for resiliency tradeoff analysis.
The numbers of attack walks provide an estimate of adversary effort resulting from the application of a candidate policy. The optimization problem is then a joint maximization of adversary effort and mission availability. In this way, we relax the brittle binary cut-set blocking of previous approaches. Instead, we measure to what extent a candidate policy solution blocks shorter (least adversary effort) exploitation paths, to allow tunable tradeoffs between security and usability.
We then apply artificial intelligence (AI) techniques (evolutionary algorithms) to learn the optimal microsegmentation policy according to the optimization objective, as part of the MITRE Adaptive Resiliency Experimentation System (ARES). ARES employs off-the-shelf cybersecurity tools and custom AI-powered components for optimizing cyber resilience, including microsegmentation, authentication, policy generalization, redundancy, deception, and zero-trust architecture. This paper focuses on the optimization of microsegmentation policy (allowed source/destination address, destination port, and protocol) in ARES, enforced through Amazon Web Service (AWS) security groups (virtual host-based firewalls).
The next section describes our approach for optimizing microsegmentation policy. Section 3 formulates our optimization problem in terms of the tradeoff between maximizing adversary effort and maximizing access to mission-critical resources. Section 4 describes key experimental results for our approach. Section 5 then summarizes our approach.
2. Technical approach
Figure 1 shows a high-level overview of our approach. In Data Collection, various host and network sensors forward data to a central repository, where the data elements are correlated across the sensor types. In Model Building, correlated network data are ingested and mapped to a model representing the network environment and mission/threat situation. In Policy Optimization, a set of microsegmentation rules is synthesized that provide optimal resiliency for the network as defined by the optimization objectives (maximizing adversary effort and mission availability).

Overview of approach.
Policy optimization simulates potential multi-step lateral movement through the network according to a particular threat situation. The threat situation includes any presumed or detected adversarial presence in the network and identifies any mission-critical hosts that are to be prioritized for protection from the adversary. For a given candidate policy solution, the policy is applied to the network environment, and scored according to the optimization objective defined in section 3.
In general, the effects of microsegmentation rules on threats are combinatorial, so that the rules that comprise a candidate solution cannot be considered independently of one another. We apply evolutionary programming for searching the combinatorial space of rules to learn the optimal solution to this non-deterministic polynomial-time (NP)-hard optimization problem. In this way, we synthesize optimized policy that balances cyberattack risks and mission needs.
3. Problem formulation
A key result of ARES model building (Figure 1) is an inferred initial state of allowed connections, based on generalization of observed traffic via unsupervised learning (clustering). We treat this as a baseline policy for microsegmentation. This baseline policy is a set of rules, with each rule defined as an allowed combination of source address, destination address, destination port, and protocol. Our policy optimization then considers changes to the baseline policy, that is, a set of denied rules (with respect to the baseline set) that best meets an optimization fitness function.
In ARES, the baseline policy is represented as a graph, with host IP addresses as nodes and combinations of destination port and protocol as edges (a multigraph, since multiple destination port and protocol combinations are possible for a given source/destination IP pair). Nodes and edges also have various properties assigned to them, which encode information such as vulnerabilities that can be exploited by an adversary and mission criticality of network connections.
In our problem formulation, we define certain subgraphs of the baseline policy graph (casting multigraphs as simple graphs), which we apply for policy optimization. We represent these graphs as adjacency matrices, with computations needed for optimization expressed as matrix operations. This provides a convenient mathematical notation, as well as efficient implementation via established software and hardware such as sparse matrix processing and parallel processing with graphics processing units (GPUs).
Table 1 lists key terms that are introduced in this section.
Key terms.
In section 3.1, we define foundational mathematical models that capture salient aspects of cyber resilience. In section 3.2, we start with ideal assumptions (constraints) about the problem space to define a restricted form of optimization. In section 3.3, we relax certain of those constraints to obtain a more realistic and meaningful problem formulation. In section 3.4, we relax a final constraint (mission-impact budget), yielding a multi-objective optimization problem that allows a Pareto-optimal tradeoff between security (thwarting an assumed attack scenario) and mission needs (minimizing impact from blocked services).
3.1. Mathematical foundations
For each optimization problem in this section, assume that we are given graphs that represent two kinds of host-to-host relationships: an attack graph
We express attack graph
Here, binary element
On the mission side, element
In some of our optimization problem formulations, the elements of
For a given instance of an optimization problem, a graph of candidate policy rules
That is, element
Here, the ° symbol denotes the Hadamard (elementwise) product,
9
defined by
The graphs
We can apply an optimal policy rules graph
Similarly, we can apply an optimal policy rules graph
Leveraging the adjacency matrix representation for graphs, we can evaluate the existence of attack walks of a given length through matrix multiplication.10,11 For a square
Here, matrix multiplication is in the usual sense,
12
that is, an element of
The matching of rows and columns in matrix multiplication corresponds to matching path steps of an attack graph, and the summation counts the numbers of matching steps. The elements of

Graph adjacency matrix and its powers.
To determine attack reachability over any length of walk (attack depth), we can form the transitive closure of the attack graph. This expresses whether a given host is reachable (through any depth of walk) from another given host. We can write the transitive closure
Here, each matrix power
This has the same form as Equation (9), except the multiplication and addition operations are logical (Boolean) rather than arithmetic, that is, ∧ denotes the conjunction (logical AND) and ∨ denotes the disjunction (logical OR). Thus, Boolean matrix multiplication (and transitive closure) represents the presence of walks rather than the number of walks between a given pair of hosts. That is, element
We can also apply matrix multiplication to determine attack reachability (numbers of walks of a given length) starting from a particular set of hosts. This provides additional information (beyond transitivity) for evaluating candidate policy solutions. For representing attack walks from each starting point using matrix multiplication, it is convenient to transpose the attack adjacency matrix:
Recall our definitions of (attack and mission) adjacency matrices, that is, Equations (1) and (2). Here, elements of row
We define an
This yields the
We can also define an
We can apply this kind of matrix multiplication for determining the number of attack walks (from start to goal) of a given length. We use the attack adjacency matrix of power
Using the transitive closure adjacency matrix, we can determine the existence of walks of any depth from attack start to goal:
For optimization fitness evaluation, we can determine the number of attack walks of length
Depending on the instance of an optimization problem, there might be some policy rules (or combination of rules) that are infeasible as elements of a solution (e.g., that do not obey problem constraints). Or there might be security solutions that operate on rules in a certain way.
For example, a particular application might require certain hosts and services to be connected for the application to function properly, so that all those connections should be considered as a single “safety set” unit (rather than individually) in optimization. Another example (not employing microsegmentation) is that network-based firewalls are only able to enforce policy for traffic that is routed through them, so that optimization needs to consider only those vulnerable connections from one network segment to another (separated by a firewall).
To represent such aggregations of individual security rules, we define a
Figure 3 summarizes the relationships among security settings, policy rules, attack vectors (original and hardened), and mission needs (original and hardened). This figure also includes an optimization objective function

Elements of policy optimization problem space.
In our problem formulation, policy rules are determined by the security settings. We generally assume that
3.2. Ideal assumptions
We begin with a set of ideal assumptions, which form optimization constraints in terms of blocked attack paths and unblocked mission edges. This leads to the definition of a constrained optimization problem that minimizes the total number of blocked edges (that block attack paths and do not block mission edges).
Thus, we define the following constrained optimization problem:
Optimization Problem P1
(Objective O1) Minimize the number of blocked (denied) host-to-host edges such that (Constraint C1.1) no mission host pair is blocked and (Constraint C1.2) there exists no attack path from given attack start host(s) to given attack goal host(s).
Recall that for an attack graph with adjacency matrix
In Problem P1, we make the ideal assumptions that it is possible to block all attack paths from attack start(s) to attack goal(s) (i.e., Constraint C1.2) while simultaneously ensuring that no mission host pair (graph edge) is blocked (Constraint C1.1). Here, the sets of attack starting hosts and attack goal hosts are disjoint (non-overlapping); otherwise, we have a degenerate case in which the attack goal hosts are already compromised by the adversary.
For a policy rules graph
Under Constraint C1.1, all mission edges are to be allowed via the policy rules
With the hardened attack graph
Here,
In the following sections, we examine illustrative examples of Optimization Problem P1. In section 3.2.1, we examine a particular instance of P1, in which there is a single optimal solution to this problem. In section 3.2.2, we examine a different instance of P1, in which the solution to this problem is non-unique (more than one solution yields the same value of the objective function).
3.2.1. Unique solution
This section examines a particular instance of Optimization Problem P1. As we show, this instance of P1 has a unique solution that optimizes Objective O1 while obeying Constraints C1.1 and C1.2.
As shown in Figure 4, this instance of P1 has four network hosts. The graph nodes 1,…,4 represent these hosts. The blue (solid) graph edges represent host-to-host connectivity needed for the organizational mission. The red edges (dashed) represent the potential attacks from one host to another. The figure includes adjacency lists for the mission and attack edges (respectively).

Example 1 (mission and attack edges).
As depicted in the figure, we assume that an attacker starts from Host 4, and seeks to compromise Host 2. That is, for Constraint C1.2 for this instance of P1, Host 4 is the attack start host
Expressing the attack graph as a
Expressing the mission graph as a
Optimization Problem P1 seeks a policy that minimizes the number of blocked edges (maximum number of unblocked edges). This is consistent with the constraint that no mission edges are blocked. Furthermore, only blocking edges in the attack graph will impact the existence of paths from Host 4 to Host 2 (the other constraint of P1). In the example of Figure 4, these attack-graph edges are
Taking these constraints into consideration, we have:
This shows that there are only 22 = 4 combinations (binary values of
So, since
We next examine deeper paths, which could potentially lead from Host 4 to Host 2. Since
Figure 5 summarizes the optimization outcomes for each potential solution of this problem instance. For each combination of the values of

Candidate solutions for Example 1.
The combinations
Figure 6 shows the optimal solution for this instance of Optimization Problem P1. For implementation (e.g., as an access control list), this corresponds to a “allow by default” (default binary ones of

Optimal solution for Example 1.
3.2.2. Non-unique solutions
In the previous section, there is a single optimal solution for the given instance of Problem P1. In this section, we describe another instance of P1 that has more than one optimal solution, illustrating that optimal solutions are not necessarily unique.
Consider the instance of Optimization Problem P1 in Figure 7. In comparison to Figure 4, this instance of P1 has two additional attack-graph edges:

Example 2 (mission and attack edges).
The mission graph adjacency matrix
The attack graph adjacency matrix
As described in the previous section, the policy rules (blocked edges)
This shows that there are now 24 = 16 combinations (binary values of
To determine attack paths from the starting host (Host 4), we define the column vector
Applying Equation (13) to
This shows that there are two length-1 attack walks from Host 4, ending at Host 2 and Host 3. Next, computing
Applying Equation (13), substituting
This shows that there is one length-2 attack walk from Host 4, which ends at Host 2. Repeating this for
Applying Equation (10), we find the transitive closure (binary reachability through walks of any length) for the attack graph
We can then apply Equation (13), substituting
Figure 8 summarizes the optimization’ outcomes for each potential solution of this problem instance.

Candidate solutions for Example 2.
For each combination of the values of
Figure 9 shows the optimal solutions for this instance of Optimization Problem P1. In this case, there is no unique optimal solution, that is, there are two solutions that are both feasible and have the minimum value of blocked edges (two). These optimal solutions are

Optimal solutions for Example 2.
3.3. More realistic assumptions
Next, we relax the ideal assumptions from the previous section to more realistic ones. Rather than insisting that no mission edges are blocked by policy rules, assume that we are given a budget for an allowed amount of mission impact. We also relax the constraint of blocking all attack paths to an optimization objective, that is, maximizing the number of blocked shortest paths. From these relaxed assumptions, we formulate a more realistic optimization problem, which maximizes resilience in terms of blocked attack paths within a given mission-impact budget.
We then define the following constrained optimization problem:
Optimization Problem P2
(Objective O2) In priority order, (Sub-Objective O2.1) maximizes the blocked shortest attack paths from attack start(s) to goal(s), (Sub-Objective O2.2) with minimum impact on the mission, (Sub-Objective O2.3) using the least number of blocked (deny) policy edges such that (Constraint C2) the mission impact is within a given budget.
For both attack and mission graphs, we consider the possibility that graph edges (representing host-to-host connectivity) are weighted. Edge weights are intended to represent the following:
For a mission graph, a weight for an edge represents the value of the edge to the organizational mission. For example, if an organization values high traffic volume as an indicator of mission need, the relative volume of traffic between hosts could be the mission edge weights.
For an attack graph, a weight for an edge represents the value of the edge in helping to thwart attacks. For example, the expected time to compromise one host from another could be the attack edge weights.
Sub-Objective O2.1 is a kind of shortest-paths problem. However, we are not merely interested in policies that only block the single shortest path from attack start to goal—that would leave other (longer) paths available for the attacker. At the other extreme, enumerating all paths is prohibitively expensive, since counting all graph paths from a start node to a goal node (the s–t paths problem) is #P complete (the analogue of NP completeness for counting problems). 13 Also, while it is possible to estimate the number of s–t paths in a graph, 14 we require the actual paths (not simply the number of paths) for evaluating O2.1.
We therefore define Sub-Objective O2.1 in terms of the k-shortest-paths problem.
15
This problem is about finding the k-shortest paths from a start node
For a given shortest path
Sub-Objective O2.2 seeks to minimize impact on the mission. For a weighted mission graph (adjacency matrix)
Sub-Objective O2.3 minimizes the number of blocked (deny) edges in the policy graph, independent of the attack and mission graphs. For a policy rules graph
The overall Objective O2 is defined in terms of dominance relations among Sub-Objectives O2.1, O2.2, and O2.3, that is:
This denotes that for Objective O2, Sub-Objective O2.1 (shortest attack paths blocked) dominates the other two sub-objectives, and that Sub-Objective O2.2 (minimum mission impact) dominates Sub-Objective O2.3 (minimum policy edges blocked). So, for example, if Solution A is better than Solution B in terms of O2.1, then Solution A is more optimal, regardless of O2.2 and O2.3 for either solution. Or, if Solutions A and B have the same O2.1 optimality, and Solution A is better than Solution B in terms of O2.2, then Solution A is more optimal.
These dominance relations are depicted in Figure 10. Here, we assume that attack path edges have unit weight. Thus, blocked shortest paths of unit length dominate shortest paths of length 2, and so on.

Dominance relations among optimization objectives (Problem P2).
Consider the instance of Optimization Problem P2 in Figure 11. In this example, we are given a budget of 10 units for allowed mission impact (Constraint C2).

Example 3 (mission and attack edges).
Here is the weighted mission graph adjacency matrix
Here is the attack graph adjacency matrix
The policy rules (blocked edges)

Candidate solutions for Example 3.
For each of these 32 policy combinations, Figure 12 shows the blocked shortest attack paths (Sub-Objective O2.1), mission impact (Sub-Objective O2.2), and blocked policy edges (Sub-Objective O2.3). Mission-impact values that exceed the budget of 10 units are marked in red (infeasible). The resulting optimal solution is marked in blue.
In this problem instance, here are the attack paths to be blocked, in dominance order (length of attack path):
Length-1 path (1): Host 4 → Host 2.
Length-2 paths (2): Host 4 → Host 1 → Host 2. Host 4 → Host 3 → Host 2.
Length-3 path (1): Host 4 → Host 3 → Host 1 → Host 2.
In Figure 12, there is a column for the numbers of shortest paths of each length that are blocked by a given policy combination. Since the attack-graph edges are assigned unit (equal) weights, blocking attack paths of equal length have equal optimality.
Figure 13 shows the optimal solution for this instance of Optimization Problem P2. This solution is

Optimal solution for Example 3.
3.4. Multi-objective optimization
Next, we relax the mission-impact budget constraint, and cast it as a second optimization objective. The resulting multi-objective optimization problem allows a Pareto-optimal tradeoff between security (attack resilience) and mission needs (impact from blocked hosts).
This leads to the following constrained multi-objective optimization problem:
Optimization Problem P3
(Objective O3.1) Maximize (Sub-Objective O3.1.1) the blocked shortest paths from attack start(s) to attack goal(s), (Sub-Objective O3.1.2) using the least number of blocked (deny) policy edges. (Objective O3.2) Minimize the impact on the mission.
As for Problem P2, we assume that the (host-to-host) edges of the attack and mission graphs are weighted (for thwarting attacks and supporting the mission, respectively). Sub-Objective O3.1.1 is the same as Sub-Objective O2.1 (section 3.3), that is, maximizing blocked shortest attack paths:
Sub-Objective O3.1.2 is the same as Sub-Objective O2.3 (section 3.3), that is, minimizing blocked edges (maximizing unblocked edges) in the policy graph, independent of the attack and mission graphs:
The overall Objective O3.1 is defined in terms of dominance relations among Sub-Objectives O3.1.1 and O3.1.2:
This denotes that for Objective O3.1, Sub-Objective O3.1.1 (shortest attack paths blocked) dominates Sub-Objective O3.1.2 (minimum policy edges blocked). So, for example, if Solution A is better than Solution B in terms of O3.1.1, then Solution A is more optimal, regardless of O3.1.2 for either solution. Or, if Solutions A and B have the same O3.1.1 optimality, and Solution A is better than Solution B in terms of O3.1.2, then Solution A is more optimal.
Objective O3.2 is the same as Sub-Objective O2.2 (section 3.3), that is, minimizing mission impact from the application of a policy:
Figure 14 shows the dominance relations between O3.1.1 and O3.1.2 (for Objective O3.1). Objective O3.2 is a separate objective in Problem P3, which shows as non-dominated in the partial order in Figure 14. In the figure, it is assumed that attack path edges have unit weight. Thus, blocked shortest paths of unit length dominate shortest paths of length 2, and so on.

Dominance relations among optimization objectives (Problem P3).
Figure 15 depicts how the dominance relations among elements of Sub-Objectives O3.1.1 and O3.1.2 can be implemented as numerical ranges, so that Objective O3.1 can be weighted along with O3.1. The scheme in Figure 15 maps the numbers of allowed shortest paths from attack starts to attack goals to a particular range for each path length.

Mapping dominance relations of optimization objectives to numerical ranges.
For a given candidate solution, the number of blocked paths (or walks, if computed via matrix multiplication) of length 1 is mapped to the range of
This pattern of mapping numbers of progressively longer walks to progressively smaller numerical ranges yields overall objective fitness numbers that rank candidate solutions according to the priority (dominance relation) defined in Objective O3.1.1. The last part of this dominance-relation mapping is to map the total number of unblocked edges to the remaining range (Objective O3.1.2), that is, for a maximum walk length of 4, from zero (no edges unblocked) to
We now examine an instance of Optimization Problem P3, using Example 3 in Figure 11 of section 3.3. Since Problem P3 does not have a mission-impact constraint (budget), the budget shown in Figure 11 does not apply.
As before, the blocked edges in policy
Figure 16 evaluates five of these six attack-graph edge combinations (i.e.,

Candidate solutions of Problem P3 for Example 3.
In multi-objective optimization, a solution is called Pareto optimal if none of its objectives can be improved without worsening some of its other objectives. 17 A Pareto front is a set of solutions that are Pareto optimal. In terms of dominance relations, a Pareto front is a set of non-dominated solutions, that is, for which no objective can be improved without sacrificing at least one other objective. A Pareto front identifies set of candidate solutions (e.g., from among a larger set of dominated solutions) for analyzing tradeoffs among conflicting objectives. Figure 16 marks the Pareto front resulting from evaluating candidate solutions to this instance of Problem P3. The candidate solutions evaluated in Figure 16 are plotted in Figure 17.

Candidate solutions of Problem P3 for Example 3 (Pareto front).
In Figures 16 and 17, each solution is written in terms of its values for mission impact, blocked shortest paths, and blocked edges. In these figures, the candidate solutions are plotted in two dimensions, that is, Objective 3.1 (maximize blocked shortest paths, minimize blocked edges) and Objective 3.2 (minimize total mission impact).
Figure 18 shows the application of the three Pareto-optimal solutions in Figure 17 to the mission and attack graphs.

Pareto-optimal solutions of Problem P3 for Example 3.
4. Experimentation
In our approach, data from a network to be defended are collected, correlated, and used to build a graph-based model for potential multi-step lateral movement through the network. Figure 19(a) shows such a model, built from observed network traffic for a baseline (non-optimized) representative enterprise network within the ARES testbed.

Baseline policy model for testbed network, including (a) the full graph from observed network traffic and (b) its vulnerable subgraph.
In Figure 19, graph nodes are network hosts and edges represent the set of all network flows for a given source and destination host. Edges are colored dark (black) if they include at least one (known) vulnerable service on the destination host; otherwise, edges are light colored (blue). Figure 19(b) shows the vulnerable subgraph for this network, that is, only those nodes and edges with at least one vulnerable service reachable from source host to destination host.
In the network of Figure 19, an adversary group (the ARES red team) has an initial presence on three hosts (red rectangles) in a validation exercise. In this exercise, the threat situation is that the red team starts from their initial presence, that is, the hosts marked threat source in Figure 19(b), and then moves laterally through the network until reaching the two hosts marked crown jewels.
In Figure 20(a), the graph is constrained to include only those vulnerable edges that lie between the threat sources (attack starts) and the crown jewels (attack goals). This represents the potential adversary lateral movement (attack paths) for this threat situation. Note that for this attack graph, there are paths from only two of the three attack starts that lead to the goals.

Optimizing microsegmentation policy over a threat/mission situation, including (a) the vulnerable graph that lies between assumed attack starts and goals and (b) genetic algorithm convergence to an optimal solution.
In Figure 20(a), the graph edges are also labeled with numbers indicating the total mission criticality for all network flows represented by each edge. There are various approaches, for example, Rebovich et al., 18 Musman et al., 19 Heinbockel et al. 20 and Schulz et al. 21 and others for analyzing and quantifying such mission dependency/criticality for network assets. In this experiment, we apply these mission criticality weights to the corresponding terms (denied edges) in the mission-impact summation defined for Sub-Objective O3.1.2 (section 3.4).
The network model and threat/mission situation in Figure 20(a) then forms the input for our optimization of microsegmentation policy. We apply evolutionary programming in the form of a genetic algorithm 22 to learn the optimal policy. In the genetic algorithm, each individual in a population represents a candidate solution to the optimization problem. Each candidate solution is a particular combination of allowed or denied edges in the network model. At each step of simulated evolution, the genetic algorithm selects individuals for reproduction based on how well they meet the objective (fitness) defined in section 3.4, that is, maximizing a given level of tradeoff between adversary effort and access to mission resources for the given threat/mission situation.
Figure 20(b) shows statistics for fitness values over time as the genetic algorithm population evolves to an optimal solution. In this case, there are 88 vulnerable edges (exploitable from attack starts to attack goals) in the attack graph. Thus, the overall search space of combinations of allowed/denied edges (policy instances) for this attack graph is 1026. The genetic algorithm execution time in this case is 14 s, for evolution over 100 generations, with a population size of 400, and tournament selection with a tournament size of two.
Figure 21 shows the resulting optimal policies computed by the genetic algorithm, for six different relative weights between maximizing adversary effort (Objective O3.1) and maximizing mission accessibility (Objective O3.2) in the fitness function. In the figure, the red (thicker) edges indicate that the vulnerable connections from source to destination are denied in the policy. The gray (thinner) edges indicate that all connections are allowed from source to destination, even the vulnerable ones.

Optimal policies for different usability vs security tradeoffs.
In Figure 21, when mission accessibility is weighted 100%, the optimal policy is to allow all edges. This is because blocking an edge makes it unavailable for the organizational mission, but no emphasis is placed on how the blocked edge contributes to increasing the adversary effort. Then, as less emphasis is placed on mission accessibility (e.g., as the threat becomes more severe), there is an optimal tradeoff for a given relative weighting between usability and security, with additional emphasis on maximizing adversary effort resulting in additional blocked edges. In each case, edges that support shorter exploitation paths (from attack starts to goals) and have lower mission criticality are preferentially blocked versus other edges. Then, at the other extreme, when the emphasis is fully on maximizing adversary effort, the optimal policy is to block all paths from attack starts to attack goals, using the minimum number of blocked edges (Objective O3.1.2 from section 3.4), since mission accessibility (Objective O3.2) has no impact on policy optimization.
Figure 22 shows the number of denied vulnerable edges for optimal policies via our genetic algorithm, sampling across the range of relative weights between maximizing adversary effort (Objective O3.1) and maximizing mission accessibility (Objective O3.2). This shows the discrete nature of this problem, that is, how particular optimal solutions span a range of weight values, switching to another solution when the fitness function crosses a particular threshold value.

Full range of tradeoff between usability and security.
Operationally, our synthesized microsegmentation policies are enforced via AWS security groups. This acts as a virtual host-based firewall for each AWS Elastic Compute Cloud (EC2) instance. 23 A deployed policy is comprised of security group rules controlling inbound and outbound traffic for each EC2 instance. This provides access control at the network transport layer (source/destination IP addresses, source/destination ports, and protocols). We then conduct ongoing red teaming exercises within a testbed environment to assess the efficacy of synthesized policies.
Figure 23 shows run times for our microsegmentation policy optimization for networks of various sizes. The run times are for a MacBook Pro with a 2.6 GHz 6-Core Intel Core i7 processor and 32 GB 2400 MHz DDR4 memory.

Performance for computing optimal microsegmentation policy.
For the results in Figure 23, the data for each network are synthesized via a generative model that learns statistical distributions from our live testbed environment (comprised of 74 hosts). While the details are omitted for brevity, we apply a genetic algorithm that learns the parameters of a scale-free generative model that provides the best fit to our live network data, as measured by triadic census. This generative model provides the ability to generate data sets of arbitrary scale for such performance testing. Figure 24 shows traffic flows for the three synthesized network data sets used as inputs for the optimization run times in Figure 23.

Synthesized networks for optimization performance testing.
In Figure 24, the host nodes (as blue text) represent the host IP addresses. Each edge (gray line) represents the presence of one or more flows from a source address to a destination address in the generated flow data. In these synthesized networks, based on the statistical model learned from our live network, there is a core region with densely interconnect hosts, with sparsely connected hosts on the periphery.
The visualizations in Figure 24 represent all flow data synthesized by our generative model (for a given optimization performance test). In the synthesized data sets, certain edges (from source to destination IP address) have at least one flow to a vulnerable service on the destination. In the optimization fitness function, such edges represent potential lateral movement by an adversary. Figure 25 shows the vulnerable subset of edges (shown as red lines) for the 500-host synthesized data set.

Vulnerable edges (potential lateral movement) for 500-node network.
The fitness function also considers a given attack scenario, in which a set of hosts is defined as potential attack starting points and another set of hosts is defined as attack goals (e.g., critical hosts prioritized for protection). In Figure 25, those hosts are marked with Threat Actor (red) and Victim Target (for attack start and goal hosts, respectively) icons from the Structured Threat Information eXpression (STIX™) language. Figure 26 shows the paths through the paths of potential lateral movement through the vulnerable subgraph of Figure 25, that is, leading from attack start host to attack goal host.

Potential lateral movement for assumed attack scenario (500-node network).
5. Summary
This paper describes an approach for synthesizing optimal network microsegmentation policy, with tunable tradeoffs between security and usability. For this, we formulate a novel optimization objective function that balances the need to maximize adversary effort (minimize cyberattack risks) while maximizing accessibility to critical network resources. We then apply this objective function for learning the optimal microsegmentation policy for a network through AI techniques (genetic algorithms).
Our objective function quantifies attacker reachability in terms of multi-step exploitation (lateral movement) through a network. We estimate the effort needed for an adversary to carry out a particular attack scenario, given the application of a particular candidate microsegmentation policy. This measures the extent to which a candidate policy solution blocks particular paths of exploitation, with a bias toward blocking shorter (less adversary effort) paths. We also measure the extent to which a given policy solution imposes restrictions on access to mission-critical services. These two measures (adversary effort and mission accessibility) support the balance between security and usability in our overall policy optimization objective function.
Our approach to cyber resiliency optimization leverages the fine level of granularity provided by microsegmentation, for limiting potential adversarial movement within a network. Our optimized microsegmentation policies are part of the MITRE Adaptive Resiliency Experimentation System (ARES), which jointly optimizes microsegmentation, authentication, policy generalization, redundancy, deception, and zero-trust architecture for adaptive intelligent cyber resiliency.
Footnotes
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
