Label Sanitization against Label Flipping Poisoning Attacks

Andrea Paudice, Luis Muñoz-González, Emil C. Lupu. 2018. Label Sanitization against Label Flipping Poisoning Attacks. arXiv preprint arXiv:1803.00992.

Many machine learning systems rely on data collected in the wild from untrusted sources, exposing the learning algorithms to data poisoning. Attackers can inject malicious data in the training dataset to subvert the learning process, compromising the performance of the algorithm producing errors in a targeted or an indiscriminate way. Label flipping attacks are a special case of data poisoning, where the attacker can control the labels assigned to a fraction of the training points. Even if the capabilities of the attacker are constrained, these attacks have been shown to be effective to significantly degrade the performance of the system. In this paper we propose an efficient algorithm to perform optimal label flipping poisoning attacks and a mechanism to detect and relabel suspicious data points, mitigating the effect of such poisoning attacks.

Ensuring the resilience of WSN to Malicious Data Injections through Measurements Inspection

Vitorio Paolo Illiano, PhD Thesis

Malicious data injections pose a severe threat to the systems based on Wireless Sensor Networks (WSNs) since they give the attacker control over the measurements, and on the system’s status and response in turn. Malicious measurements are particularly threatening when used to spoof or mask events of interest, thus eliciting or preventing desirable responses. Spoofing and masking attacks are particularly difficult to detect since they depict plausible behaviours, especially if multiple sensors have been compromised and collude to inject a coherent set of malicious measurements. Previous work has tackled the problem through measurements inspection, which analyses the inter-measurements correlations induced by the physical phenomena. However, these techniques consider simplistic attacks and are not robust to collusion. Moreover, they assume highly predictable patterns in the measurements distribution, which are invalidated by the unpredictability of events. We design a set of techniques that effectively detect malicious data injections in the presence of sophisticated collusion strategies, when one or more events manifest. Moreover, we build a methodology to characterise the likely compromised sensors. We also design diagnosis criteria that allow us to distinguish anomalies arising from malicious interference and faults. In contrast with previous work, we test the robustness of our methodology with automated and sophisticated attacks, where the attacker aims to evade detection. We conclude that our approach outperforms state-of-the-a
rt approaches. Moreover, we estimate quantitatively the WSN degree of resilience and provide a methodology to give a WSN owner an assured degree of resilience by automatically designing the WSN deployment. To deal also with the extreme scenario where the attacker has compromised most of the WSN, we propose a combination with software attestation techniques, which are more reliable when malicious data is originated by a compromised software, but also more expensive, and achieve an excellent trade-off between cost and resilience.

Download Thesis from here. 

Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection

Andrea Paudice, Luis Muñoz-González, Andras Gyorgy, Emil C. Lupu. 2018. Detection of Adversarial Training Examples in Poisoning Attacks through Anomaly Detection. arXiv preprint arXiv:1802.03041.

Machine learning has become an important component for many systems and applications including computer vision, spam filtering, malware and network intrusion detection, among others. Despite the capabilities of machine learning algorithms to extract valuable information from data and produce accurate predictions, it has been shown that these algorithms are vulnerable to attacks.

 Data poisoning is one of the most relevant security threats against machine learning systems, where attackers can subvert the learning process by injecting malicious samples in the training data. Recent work in adversarial machine learning has shown that the so-called optimal attack strategies can successfully poison linear classifiers, degrading the performance of the system dramatically after compromising a small fraction of the training dataset. In this paper we propose a defence mechanism to mitigate the effect of these optimal poisoning attacks based on outlier detection. We show empirically that the adversarial examples generated by these attack strategies are quite different from genuine points, as no detectability constrains are considered to craft the attack. Hence, they can be detected with an appropriate pre-filtering of the training dataset.

Determining Resilience Gains From Anomaly Detection for Event Integrity in Wireless Sensor Networks

Vittorio P. Illiano, Andrea Paudice, Luis Muñoz-González, and Emil C. Lupu. 2018. Determining Resilience Gains From Anomaly Detection for Event Integrity in Wireless Sensor Networks. ACM Trans. Sen. Netw. 14, 1, Article 5 (February 2018), 35 pages. DOI: https://doi.org/10.1145/3176621

Abstract: Measurements collected in a wireless sensor network (WSN) can be maliciously compromised through several attacks, but anomaly detection algorithms may provide resilience by detecting inconsistencies in the data. Anomaly detection can identify severe threats to WSN applications, provided that there is a sufficient amount of genuine information. This article presents a novel method to calculate an assurance measure for the network by estimating the maximum number of malicious measurements that can be tolerated. In previous work, the resilience of anomaly detection to malicious measurements has been tested only against arbitrary attacks, which are not necessarily sophisticated. The novel method presented here is based on an optimization algorithm, which maximizes the attack’s chance of staying undetected while causing damage to the application, thus seeking the worst-case scenario for the anomaly detection algorithm. The algorithm is tested on a wildfire monitoring WSN to estimate the benefits of anomaly detection on the system’s resilience. The algorithm also returns the measurements that the attacker needs to synthesize, which are studied to highlight the weak spots of anomaly detection. Finally, this article presents a novel methodology that takes in input the degree of resilience required and automatically designs the deployment that satisfies such a requirement.

AF-Cyber: Logic-based Attribution and Forensics in Cyber Security

Connected devices will continue to grow in volume and variety. The increase of connectivity brings a drastic impact on the increase of cyber attacks. Protecting measurements are not enough, while finding who did the attack is a crucial for preventing the escalation of cyber attacks. The impact of forensics in cyber security is becoming essential for the reduction and mitigation of attacks. Forensics and attribution forensics come along with their own challenges, like the difficulties on collecting suitable evidence, and the vastness of anti-forensics tools used by the attackers to cover their traces.

The main goal of AF-Cyber is to investigate and analyse the problem of attributing cyber attacks. We plan to construct a logic-based framework for performing attribution of cyber attacks, based on cyber forensics evidence, social science approaches and an intelligent methodology for dynamic evidence collection. AF-Cyber will relieve part of the cyberattacks problem, by supporting forensics investigation and attribution with logical-based frameworks representation, reasoning and supporting tools. AF-Cyber is multi-disciplinary and collaborative, bridging forensics in cyber attacks, theoretical computer science (logics and formal proofs), security, software engineering, and social science.

AF-Cyber received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 746667.

AF-Cyber: Logic-based Attribution and Forensics in Cyber Security

Dr Karafili is officially a Marie Curie Fellow at the Department of Computing, Imperial College. She will work on the project “AF-Cyber: Logic-based Attribution and Forensics in Cyber Security“.  The project was granted by the European Union’s Horizon 2020 Research and Innovation Programme under the Marie Sklodowska-Curie grant agreement No 746667.

 

AF-Cyber: Logic-based Attribution and Forensics in Cyber Security

The main goal of AF-Cyber is to investigate and analyse the problem of attributing cyber attacks. We plan to construct a logic-based framework for performing attribution of cyber attacks, based on cyber forensics evidence, social science approaches and an intelligent methodology for dynamic evidence collection. AF-Cyber will relieve part of the cyberattacks problem, by supporting forensics investigation and attribution with logical-based frameworks representation, reasoning and supporting tools. AF-Cyber is multi-disciplinary and collaborative, bridging forensics in cyber attacks, theoretical computer science (logics and formal proofs), security, software engineering, and social science.

Improving Data Sharing in Data Rich Environments

The paper “Improving Data Sharing in Data Rich Environments” was accepted at the IEEE Big Data International Workshop on Policy-based Autonomic Data Governance (PADG), part of the 15th IEEE International Conference on Big Data (Big Data 2017), December 11-14, 2017, Boston, MA, USA. This work was done in collaboration with our partners (BAE Systems, IBM UK and IBM US) from the DAIS International Technology Alliance (ITA). The paper can be found here.

Authors: Erisa Karafili, Emil C. Lupu, Alan Cullen, Bill Williams, Saritha Arunkumar, Seraphin Calo

Abstract: The increasing use of big data comes along with the problem of ensuring correct and secure data access. There is a need to maximise the data dissemination whilst controlling their access. Depending on the type of users different qualities and parts of data are shared. We introduce an alteration mechanism, more precisely a restriction one, based on a policy analysis language. The alteration reflects the level of trust and relations the users have, and are represented as policies inside the data sharing agreements. These agreements are attached to the data and are enforced every time the data are accessed, used or shared. We show the use of our alteration mechanism with a military use case, where different parties are involved during the missions, and they have different relations of trust and partnership.

The work was supported by EPSRC Project CIPART grant no. EP/L022729/1 and DAIS ITA (Sponsored by U.S. Army Research Laboratory and the U.K. Ministry of Defence under Agreement Number W911NF-16-3-0001).

 

Tracking the Bad Guys: An Efficient Forensic Methodology To Trace Multi-step Attacks Using Core Attack Graphs

Martín Barrère, Rodrigo Vieira Steiner, Rabih Mohsen, Emil C. Lupu, Tracking the Bad Guys: An Efficient Forensic Methodology To Trace Multi-step Attacks Using Core Attack Graphs, 13th IEEE/IFIP International Conference on Network and Service Management (CNSM’17), November 2017, in Tokyo, Japan.

In this paper, we describe an efficient methodology to guide investigators during network forensic analysis. To this end, we introduce the concept of core attack graph, a compact representation of the main routes an attacker can take towards specific network targets. Such compactness allows forensic investigators to focus their efforts on critical nodes that are more likely to be part of attack paths, thus reducing the overall number of nodes (devices, network privileges) that need to be examined. Nevertheless, core graphs also allow investigators to hierarchically explore the graph in order to retrieve different levels of summarised information. We have evaluated our approach over different network topologies varying parameters such as network size, density, and forensic evaluation threshold. Our results demonstrate that we can achieve the same level of accuracy provided by standard logical attack graphs while significantly reducing the exploration rate of the network.

Naggen: a Network Attack Graph GENeration Tool

Martin Barrere and Emil C. Lupu, Naggen: a Network Attack Graph GENeration Tool, IEEE Conference on Communications and Network Security (CNS’17), October 2017, in Las Vegas, USA.

Attack graphs constitute a powerful security tool aimed at modelling the many ways in which an attacker may compromise different assets in a network. Despite their usefulness in several security-related activities (e.g. hardening, monitoring, forensics), the complexity of these graphs can massively grow as the network becomes denser and larger, thus defying their practical usability. In this presentation, we first describe some of the problems that currently challenge the practical use of attack graphs. We then explain our approach based on core attack graphs, a novel perspective to address attack graph complexity. Finally, we present Naggen, a tool for generating, visualising and exploring core attack graphs. We use Naggen to show the advantages of our approach on different security applications.

Bayesian Attack Graphs for Security Risk Assessment

Attack graphs offer a powerful framework for security risk assessment. They provide a compact representation of the attack paths that an attacker can follow to compromise network resources from the analysis of the network topology and vulnerabilities. The uncertainty about the attacker’s behaviour makes Bayesian networks suitable to model attack graphs to perform static and dynamic security risk assessment. Thus, whilst static analysis of attack graphs considers the security posture at rest, dynamic analysis accounts for evidence of compromise at run-time, helping system administrators to react against potential threats. In this paper, we introduce a Bayesian attack graph model that allows to estimate the probabilities of an attacker compromising different resources of the network. We show how exact and approximate inference techniques can be efficiently applied on Bayesian attack graph models with thousands of nodes.

Luis Muñoz-González, Emil C. Lupu, “Bayesian Attack Graphs for Security Risk Assessment.” IST-153 NATO Workshop on Cyber Resilience, 2017.

Argumentation-based Security for Social Good

The paper “Argumentation-based Security for Social Good” presented at the AAAI Spring Symposia 2017 is now available at the AAAI Technical Report.

Title: Argumentation-Based Security for Social Good

Authors: Erisa Karafili, Antonis C. Kakas, Nikolaos I. Spanoudakis, Emil C. Lupu

Abstract: The increase of connectivity and the impact it has in ever day life is raising new and existing security problems that are becoming important for social good. We introduce two particular problems: cyber attack attribution and regulatory data sharing. For both problems, decisions about which rules to apply, should be taken under incomplete and context dependent information. The solution we propose is based on argumentation reasoning, that is a well suited technique for implementing decision making mechanisms under conflicting and incomplete information. Our proposal permits us to identify the attacker of a cyber attack and decide the regulation rule that should be used while using and sharing data. We illustrate our solution through concrete examples.

The paper can be found in the following link: https://aaai.org/ocs/index.php/FSS/FSS17/paper/view/15928/15306

A video of the presentation can be found in the workshop page AI for Social Good and also in following link: https://youtu.be/wYg8jaHPbyw?t=33m33s

An argumentation reasoning approach for data processing

The paper “An argumentation reasoning approach for data processing” is now published in the Elsevier Journal Computers in Industry.

Title: An argumentation reasoning approach for data processing

Authors: Erisa Karafili, Konstantina Spanaki, Emil C. Lupu

Abstract: Data-intensive environments enable us to capture information and knowledge about the physical surroundings, to optimise our resources, enjoy personalised services and gain unprecedented insights into our lives. However, to obtain these endeavours extracted from the data, this data should be generated, collected and the insight should be exploited. Following an argumentation reasoning approach for data processing and building on the theoretical background of data management, we highlight the importance of data sharing agreements (DSAs) and quality attributes for the proposed data processing mechanism. The proposed approach is taking into account the DSAs and usage policies as well as the quality attributes of the data, which were previously neglected compared to existing methods in the data processing and management field. Previous research provided techniques towards this direction; however, a more intensive research approach for processing techniques should be introduced for the future to enhance the value creation from the data and new strategies should be formed around this data generated daily from various devices and sources.

This work was supported by FP7 EU-funded project Coco Cloud grant no.: 610853, and EPSRC Project CIPART grant no. EP/L022729/1.

The paper can be found in the following link as Open Access: http://www.sciencedirect.com/science/article/pii/S016636151730338X

Towards Poisoning Deep Learning Algorithms with Back-gradient Optimization

Luis Muñoz-González, Battista Biggio, Ambra Demontis, Andrea Paudice, Vasin Wongrassamee, Emil C. Lupu, Fabio Roli. “Towards Poisoning Deep Learning Algorithms with Back-gradient Optimization.” Workshop on Artificial Intelligence and Security (AISec), 2017.

A number of online services nowadays rely upon machine learning to extract valuable information from data collected in the wild. This exposes learning algorithms to the threat of data poisoning, i.e., a coordinate attack in which a fraction of the training data is controlled by the attacker and manipulated to subvert the learning process. To date, these attacks have been devised only against a limited class of binary learning algorithms, due to the inherent complexity of the gradient-based procedure used to optimize the poisoning points (a.k.a. adversarial training examples).[/ezcol_2third]

In this work, we first extend the definition of poisoning attacks to multi-class problems. We then propose a novel poisoning algorithm based on the idea of back-gradient optimization, i.e., to compute the gradient of interest through automatic differentiation, while also reversing the learning procedure to drastically reduce the attack complexity. Compared to current poisoning strategies, our approach is able to target a wider class of learning algorithms, trained with gradient-based procedures, including neural networks and deep learning architectures. We empirically evaluate its effectiveness on several application examples, including spam filtering, malware detection, and handwritten digit recognition. We finally show that, similarly to adversarial test examples, adversarial training examples can also be transferred across different learning algorithms.

This work has been done in collaboration with the PRA Lab in the University of Cagliari, Italy.

Efficient Attack Graph Analysis through Approximate Inference

Luis Muñoz-González, Daniele Sgandurra, Andrea Paudice, Emil C. Lupu. “Efficient Attack Graph Analysis through Approximate Inference.” ACM Transactions on Privacy and Security, vol. 20(3), pp. 1-30, 2017.

Attack graphs provide compact representations of the attack paths an attacker can follow to compromise network resources from the analysis of network vulnerabilities and topology. These representations are a powerful tool for security risk assessment. Bayesian inference on attack graphs enables the estimation of the risk of compromise to the system’s components given their vulnerabilities and interconnections and accounts for multi-step attacks spreading through the system. While static analysis considers the risk posture at rest, dynamic analysis also accounts for evidence of compromise, for example, from Security Information and Event Management software or forensic investigation. However, in this context, exact Bayesian inference techniques do not scale well. In this article, we show how Loopy Belief Propagation—an approximate inference technique—can be applied to attack graphs and that it scales linearly in the number of nodes for both static and dynamic analysis, making such analyses viable for larger networks. We experiment with different topologies and network clustering on synthetic Bayesian attack graphs with thousands of nodes to show that the algorithm’s accuracy is acceptable and that it converges to a stable solution. We compare sequential and parallel versions of Loopy Belief Propagation with exact inference techniques for both static and dynamic analysis, showing the advantages and gains of approximate inference techniques when scaling to larger attack graphs.

Exact Inference Techniques for the Analysis of Bayesian Attack Graphs

Luis Muñoz-González, Daniele Sgandurra, Martín Barrere, and Emil C. Lupu. “Exact Inference Techniques for the Analysis of Bayesian Attack Graphs.” IEEE Transactions on Dependable and Secure Computing (TDSC), 16(2), pp. 231-234, 2019.

Attack graphs are a powerful tool for security risk assessment by analysing network vulnerabilities and the paths attackers can use to compromise network resources. The uncertainty about the attacker’s behaviour makes Bayesian networks suitable to model attack graphs to perform static and dynamic analysis. Previous approaches have focused on the formalization of attack graphs into a Bayesian model rather than proposing mechanisms for their analysis. In this paper we propose to use efficient algorithms to make exact inference in Bayesian attack graphs, enabling the static and dynamic network risk assessments. To support the validity of our approach we have performed an extensive experimental evaluation on synthetic Bayesian attack graphs with different topologies, showing the computational advantages in terms of time and memory use of the proposed techniques when compared to existing approaches.

SECRIS: Security Risk Assessment of IoT Environments with Attack Graph Models

IoT environments are vulnerable: many devices can be accessed physically and are not designed with security in mind. It is often impractical to patch all the vulnerabilities or to eliminate all possible threats. Unlike more traditional computing systems IoT environments bring together the physical, human and cyber aspects of a system. Each can be used to compromise the other and each can contribute towards monitoring and protecting the other.

Given the complexity of possible attacks, techniques for identifying and assessing the security risk are needed. In traditional networked environments attack graphs have been proven as a powerful tool for representing the different paths through which a system can be compromised. In this project we propose to design a new generation of attack graph models capable of describing the attack surface of modern IoT infrastructures for smart buildings. We are investigating new mechanisms to reduce the complexity of the attack graph representations and efficient algorithms for their analysis.

 

Ransomware Dataset

Ransomware has become one of the most prominent threats in cyber-security and recent attacks has shown the sophistication and impact of this class of malware. In essence, ransomware aims to render the victim’s system unusable by encrypting important files, and then, ask the user to pay a ransom to revert the damage. Several ransomware include sophisticated packing techniques, and are hence difficult to statically analyse. In our previous work, we developed EldeRan, a machine learning approach to analyse and classify ransomware dynamically. EldeRan monitors a set of actions performed by applications in their first phases of installation checking for characteristics signs of ransomware.

You can download here the dataset we collected and analysed with Cuckoo sandbox, which includes 582 samples of ransomware and 942 good applications.

Further details about the dataset can be found in the paper:

Daniele Sgandurra, Luis Muñoz-González, Rabih Mohsen, Emil C. Lupu. “Automated Analysis of Ransomware: Benefits, Limitations, and use for Detection.” In arXiv preprints arXiv:1609.03020, 2016.

Please, if you use our data set don’t forget to reference our work. You can copy the BIBTEX link here.

Jukka Soikkeli

Jukka is currently a PhD student in the RISS group participating in the HiPEDS CDT. His current research interests are in ways to measure and improve the cyber-resilience of partially compromised networks, and in network risk analysis using attack graphs. Jukka holds an MSc in Computing Science from Imperial College London, and degrees in economics from the University of Oxford and the University of St Andrews.