The University of Passau (represented by the Chair for Reliable Distributed Systems and the Chair for Data Science), Innowerk-IT GmbH and G DATA CyberDefense AG have launched a research project to advance the state of the art in in-memory analysis on Windows and Linux. The project "Synthesizing ML training data in the IT security domain for VMI-based attack detection and analysis" (SmartVMI) is funded by the Federal Ministry of Education and Research (BMBF) and coordinated by the German Aerospace Center (DLR).
The SmartVMI project is dedicated to improving artificial intelligence (AI)-based attack detection, enabling attack defense, analysis, and digital forensics by generating tailored synthetic attack patterns. This will enable the simulation of novel attack scenarios and the testing of existing attack detection and analysis mechanisms as well as the optimization of these mechanisms for new attacks.
Objectives
During the project, a Virtual Machine Introspection (VMI) engine will be developed that records system and API calls from outside the virtual machine, transparently to the guest OS. This has the advantage that potential malware cannot detect any analysis components in the system. In addition, access from outside the VM allows any memory area to be read and manipulated, regardless of the security mechanisms of the operating system. Special attention is paid to the execution speed of the VM when VMI is active. The goal is to keep the overhead as low as possible so that VMI does not artificially increase the runtime.
The behavioral data obtained with the VMI engine will be used within the SmartVMI project to train machine learning algorithms. The goal is to abstract malicious behavior and thus allow generic detection of new malware variants. G DATA will use these insights to protect its customers, and they will form a central part of detecting new threats without human interaction.
Analysis, accelerated
In addition to malware detection, methods for the automatic analysis of memory images with machine learning algorithms are being developed. The goal is to detect data structures such as processes in an arbitrary memory image of an operating system in a fully automated manner. This should help to close the semantic gap between a raw memory image and its meaningful interpretation. At present, this interpretation requires a lot of manual work and matching debug symbols for the operating system, and it has to be adjusted for each new operating system version.
All tools developed in the course of the SmartVMI project will be made available under an open-source license, and the validated training data obtained will be published as a public data set. The code will be released on GitHub.
G DATA CyberDefense will use the outcome of the research project to improve its machine-learning-based NextGen detection technologies and to further increase the protection of its customers against novel attacks.
BMBF support code: 01IS21063A-C
Project Runtime: 01.10.2021 - 30.09.2023