Please use this identifier to cite or link to this item:
https://hdl.handle.net/2440/139311
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Goel, D. | - |
dc.contributor.author | Neumann, A. | - |
dc.contributor.author | Neumann, F. | - |
dc.contributor.author | Nguyen, H. | - |
dc.contributor.author | Guo, M. | - |
dc.contributor.editor | Paquete, L. | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Proceedings of the Genetic and Evolutionary Computation Conference (GECCO '23), 2023 / Paquete, L. (ed./s), pp.1348-1356 | - |
dc.identifier.isbn | 9798400701191 | - |
dc.identifier.uri | https://hdl.handle.net/2440/139311 | - |
dc.description.abstract | We study a Stackelberg game between one attacker and one defender in a configurable environment. The defender picks a specific environment configuration; the attacker observes the configuration and attacks via reinforcement learning (RL), trained against the observed environment. The defender's goal is to find the configuration that minimizes the attacker's achievable reward. We apply Evolutionary Diversity Optimization (EDO) to generate a diverse population of environments for training. Environments in which the attacker clearly achieves high rewards are killed off and replaced by new offspring, to avoid wasting training time. Diversity not only improves training quality but also fits our RL scenario well: RL agents tend to improve gradually, so an environment that looks slightly worse early on may turn out better later. We demonstrate the effectiveness of our approach on a specific application, Active Directory (AD). AD is the default security management system for Windows domain networks. An AD environment is described by an attack graph, in which nodes represent computers, accounts, etc., and edges represent accesses. The attacker aims to find the best attack path to reach the highest-privilege node. The defender can change the graph by removing a limited number of edges (revoking accesses). Our approach generates better defensive plans than the existing approach and scales better. | - |
dc.description.statementofresponsibility | Diksha Goel, Aneta Neumann, Frank Neumann, Hung Nguyen, Mingyu Guo | - |
dc.language.iso | en | - |
dc.publisher | Association for Computing Machinery | - |
dc.rights | © 2023 by the Association for Computing Machinery, Inc. (ACM). | - |
dc.source.uri | https://dl.acm.org/doi/proceedings/10.1145/3583131 | - |
dc.subject | Active directory; reinforcement learning; evolutionary diversity optimization; attack graph | - |
dc.title | Evolving Reinforcement Learning Environment to Minimize Learner's Achievable Reward: An Application on Hardening Active Directory Systems | - |
dc.type | Conference paper | - |
dc.contributor.conference | Genetic and Evolutionary Computation Conference (GECCO 2023) (15 Jul 2023 - 15 Jul 2023 : Lisbon, Portugal) | - |
dc.identifier.doi | 10.1145/3583131.3590436 | - |
dc.publisher.place | New York, NY | - |
dc.relation.grant | http://purl.org/au-research/grants/arc/DP190103894 | - |
dc.relation.grant | http://purl.org/au-research/grants/arc/FT200100536 | - |
pubs.publication-status | Published | - |
dc.identifier.orcid | Goel, D. [0000-0001-8212-8793] | - |
dc.identifier.orcid | Neumann, A. [0000-0002-0036-4782] | - |
dc.identifier.orcid | Neumann, F. [0000-0002-2721-3618] | - |
dc.identifier.orcid | Nguyen, H. [0000-0003-1028-920X] | - |
dc.identifier.orcid | Guo, M. [0000-0002-3478-9201] | - |
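The abstract's defender loop — evolve a population of environments, kill off those where the attacker clearly achieves a high reward, and replace them with offspring — can be sketched as follows. This is a hypothetical toy illustration, not the paper's method: `attacker_reward` is a trivial surrogate standing in for the trained RL attacker, and `EDGES`, `BUDGET`, `mutate`, and `edo_defense` are made-up names; the paper's actual EDO operates on AD attack graphs with an RL-based evaluation.

```python
import random

EDGES = list(range(20))   # candidate edges of a toy attack graph
BUDGET = 5                # max edges the defender may remove
POP_SIZE = 8              # environments kept alive at once

def attacker_reward(removed):
    # Toy surrogate for the RL attacker's achievable reward: the fewer
    # "critical" low-index edges the defender removed, the higher it is.
    return sum(1 for e in range(BUDGET) if e not in removed)

def mutate(removed):
    # Offspring: swap one removed edge for one currently-kept edge,
    # preserving the defender's removal budget.
    child = set(removed)
    child.discard(random.choice(sorted(child)))
    child.add(random.choice([e for e in EDGES if e not in child]))
    return frozenset(child)

def edo_defense(generations=50, seed=0):
    random.seed(seed)
    pop = [frozenset(random.sample(EDGES, BUDGET)) for _ in range(POP_SIZE)]
    for _ in range(generations):
        # Kill off the environment where the attacker does best ...
        worst = max(pop, key=attacker_reward)
        pop.remove(worst)
        # ... and replace it with an offspring of a surviving environment.
        pop.append(mutate(random.choice(pop)))
    # Return the environment with the lowest attacker reward found.
    return min(pop, key=attacker_reward)
```

The sketch keeps only the population-level selection pressure described in the abstract; it omits the diversity measure and the interleaved RL training that make the real approach work.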
Appears in Collections: | Computer Science publications |
Files in This Item:
There are no files associated with this item.
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.