POMDP + Information-Decay: Incorporating Defender's Behaviour in Autonomous Penetration Testing

Jonathon Schwartz, Hanna Kurniawati, Edwin El-Mahassni

PosterID: 12

Penetration testing (pen-testing) aims to assess vulnerabilities in a computer network by emulating possible attacks. Autonomous pen-testing allows pen-testing to be performed frequently and regularly, which is increasingly necessary as networks become larger and more complex. Autonomous pen-testing is essentially a planning-under-uncertainty problem, where the uncertainty is caused by the unreliability of attack tools, partial observability of the network, and possible changes in the network that are triggered by the network administrator (the defender). Autonomous pen-testing approaches that account for the first two causes of uncertainty have been developed based on the Partially Observable Markov Decision Process (POMDP), a mathematically principled framework. However, they do not account for the third cause of uncertainty. Conversely, work that accounts for the defender’s actions does not account for partial observability. This paper proposes a POMDP-based autonomous pen-testing framework that accounts for the defender’s behaviour. Key to our model is the observation that the defender’s actions can be abstracted into two types: network analysis, which does not alter the network, and counter-attacks, which alter the network. This observation enables us to represent the defender’s behaviour as a single variable: an information decay factor. This decay factor represents the expected time for the defender to move from analysing the network to performing counter-attack(s), and therefore represents the decay of a pen-tester’s understanding of the network. We propose D-PenTesting, which assumes that the decay factor is known prior to execution, and LD-PenTesting, which learns the decay factor as it attempts to break into the network. LD-PenTesting adopts the Bayesian Reinforcement Learning framework and casts the problem as another POMDP. Simulation tests on two benchmark scenarios indicate that D-PenTesting and LD-PenTesting outperform autonomous pen-testers that do not account for the defender’s behaviour and are more robust than those that assume the defender is optimal.
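To make the idea of an information decay factor concrete, the following is a minimal illustrative sketch in Python. It is not the paper's formulation. It assumes that an expected time of d steps before a counter-attack corresponds to a per-step change probability of 1/d, that a change resets the pen-tester's knowledge to a uniform distribution, and that an unknown change probability can be tracked with a Beta posterior. All function names are hypothetical.

```python
# Illustrative sketch only: every modelling choice here is an assumption,
# not the model used by D-PenTesting or LD-PenTesting.

def change_prob_from_expected_time(expected_steps):
    """Assumed mapping: if the defender alters the network after
    `expected_steps` steps on average (geometric waiting time),
    the per-step change probability is 1 / expected_steps."""
    if expected_steps < 1:
        raise ValueError("expected_steps must be at least 1")
    return 1.0 / expected_steps


def decay_belief(belief, change_prob):
    """Decay a belief over possible network configurations by one step.

    With probability `change_prob` the network is assumed to have been
    altered, in which case the pen-tester falls back to a uniform belief;
    otherwise the current belief is kept."""
    n = len(belief)
    return [(1.0 - change_prob) * b + change_prob / n for b in belief]


def update_change_posterior(alpha, beta, changed):
    """Beta-Bernoulli update for an unknown per-step change probability.

    `alpha` counts steps in which a change was detected, `beta` counts
    steps in which no change was detected."""
    if changed:
        return (alpha + 1, beta)
    return (alpha, beta + 1)


def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)
```

Under these assumptions, a pen-tester that is certain of a configuration gradually loses that certainty as `decay_belief` is applied at each step, and faster decay corresponds to a defender that moves to counter-attacks sooner. The Beta update stands in for the learning variant only in spirit: the abstract states that LD-PenTesting uses Bayesian Reinforcement Learning cast as a POMDP, which is considerably richer than this counter.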

Session Aus1: Non-deterministic & Probabilistic Planning
Canb 10/27/2020, 20:00 – 21:00; 10/30/2020, 11:00 – 12:00
Paris 10/27/2020, 10:00 – 11:00; 10/30/2020, 01:00 – 02:00
NYC 10/27/2020, 05:00 – 06:00; 10/29/2020, 20:00 – 21:00
LA 10/27/2020, 02:00 – 03:00; 10/29/2020, 17:00 – 18:00