Resilience

Identifying safety-critical attacks targeting cyber-physical systems: a systems theoretic approach

Luca Maria Castiglione

Abstract

Over the last decades, society has witnessed a sharp increase in the use of complex and interconnected computer systems to monitor and assist in several aspects of everyday life. As operators of safety-critical systems deploy network-enabled devices aiming to enhance connectivity and streamline remote operations, the attack surface of these systems has increased. Whilst more attacks are becoming possible, only some of them will impact safety. Identifying such critical attacks is our priority; unfortunately, the complexity of modern cyber-physical systems (CPSs) renders this task challenging. Sophisticated attacks often rely on the effect of apparently legitimate commands, which can trigger cascading effects within the CPS itself, rendering it vulnerable to further attacks and causing harm. To help prevent these scenarios, tools and methodologies need to be developed that support integrated safety and security analysis in the context of CPS also considering their behaviours and internal dynamics. We present Cassandra, a novel methodology to identify safety-critical threat scenarios and reason about their risk and applicable security measures in specific deployment contexts. Unlike other methodologies, Cassandra leverages existing relations between high-level threats and the system architecture to identify safety-critical attack paths. The qualitative and quantitative analysis of the paths found allows us to estimate the risk associated with safety-critical attacks, identify applicable security controls, and evaluate their effectiveness. Cassandra offers an integrated set of tools that enable the automated derivation of safety-critical sequences of threats and their respective attack paths. This provides an important step towards making integrated safety and security analyses less subjective, more reproducible and thus more suitable for applications in safety-critical contexts. 
We have applied Cassandra to analyse the safe operation of safety-critical systems in three distinct use cases: railway traffic control, power grid, and avionics. The scenarios analysed progressively increase in complexity and in how closely they mirror real-world conditions.
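
To make the notion of an attack path through the system architecture more concrete, the following is a minimal sketch and not Cassandra's actual tooling. It assumes the architecture can be written as a directed graph in which an edge means "an attacker controlling the first component can reach the second"; the component names and the `attack_paths` helper are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical architecture (assumption, not taken from Cassandra):
# an edge A -> B means an attacker who controls A can send commands to B.
ARCHITECTURE: Dict[str, List[str]] = {
    "internet": ["remote_access_gateway"],
    "remote_access_gateway": ["engineering_workstation", "historian"],
    "historian": ["engineering_workstation"],
    "engineering_workstation": ["plc"],
    "plc": ["signal_actuator"],
    "signal_actuator": [],
}


def attack_paths(graph: Dict[str, List[str]], source: str, target: str) -> List[List[str]]:
    """Enumerate all simple (cycle-free) paths from source to target."""
    paths: List[List[str]] = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # never revisit a component on the same path
                stack.append((nxt, path + [nxt]))
    # Shortest paths first, so the most direct routes are listed at the top.
    return sorted(paths, key=len)


# Paths from an external entry point to a (hypothetical) safety-critical actuator.
for path in attack_paths(ARCHITECTURE, "internet", "signal_actuator"):
    print(" -> ".join(path))
```

In a real analysis each path would additionally be annotated with the threats and controls that apply along it; the sketch only covers the enumeration step.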

RESICS : Resilience and Safety to attacks in Industrial Control and Cyber-Physical Systems

We all critically depend on and use digital systems that sense and control physical processes and environments. Electricity, gas, water, and other utilities require the continuous operation of both national and local infrastructures. Industrial processes, for example chemical manufacturing, materials production and manufacturing chains, similarly lie at this intersection of the digital and the physical. The same intersection applies to other CPSs such as robots, autonomous cars, and drones. Ensuring the resilience of such systems, that is, their survivability and continued operation when exposed to malicious threats, requires the integration of methods and processes from security analysis, safety analysis, system design and operation. These have traditionally been carried out separately, and each involves specialist skills and a significant amount of human effort. This is not only costly but also error-prone, and it delays the response to security events.

RESICS aims to significantly advance the state-of-the-art and deliver novel contributions that facilitate:

  • Risk analysis in the face of adversarial threats taking into account the impact of security events across cascading inter-dependencies
  • Characterising attacks that can have an impact on system safety and identifying the paths that make such attacks possible
  • Identifying countermeasures that can be applied to mitigate threats and contain the impact of attacks
  • Ensuring that such countermeasures can be applied whilst preserving the system’s safety and operational constraints and maximising its availability.

These contributions will be evaluated across several test beds, digital twins, a cyber range and a number of use-cases across different industry sectors.

To achieve these goals, RESICS will combine model-driven and empirical approaches across both security and safety analysis, adopting a systems-thinking approach which emphasises Security, Safety and Resilience as emergent properties of the system. RESICS leverages preliminary results in the integration of safety and security methodologies with the application of formal methods and the combination of model-based and empirical approaches to the analysis of inter-dependencies in ICSs and CPSs.

Funded by DSTL, this is a joint project between the Resilient Information Systems Security (RISS) Group at Imperial College and the Bristol Cyber Security Group. The work will be conducted in collaboration with Adelard (part of NCC Group), Airbus, Qinetiq, Reperion, Siemens and Thales as industry partners, and CMU, the University of Naples and SUTD as academic partners. The project is affiliated with the Research Institute in Trustworthy Inter-Connected Cyber-Physical Systems (RITICS).

Project Publications

  • L. M. Castiglione, S. Guerra, E. C. Lupu, Automated Identification of Safety-Critical Attacks against CPS and Generation of Assurance Case Fragments. To be presented at Safety Critical Systems Symposium SSS’25.
  • Mathuros, Kornkamon, Sarad Venugopalan, and Sridhar Adepu. “WaXAI: Explainable Anomaly Detection in Industrial Control Systems and Water Systems.” Proceedings of the 10th ACM Cyber-Physical System Security Workshop. 2024. Awarded the Best Paper Award.
  • Ruizhe Wang, Sarad Venugopalan and Sridhar Adepu. “Safety Analysis for Cyber-Physical Systems under Cyber Attacks Using Digital Twin” in IEEE Cyber Security and Resilience 2024.

Other relevant publications

Presentations

Ensuring the resilience of wireless sensor networks to malicious data injections through measurements inspection

Vittorio Illiano

Malicious data injections pose a severe threat to systems based on Wireless Sensor Networks (WSNs), since they give the attacker control over the measurements and, in turn, over the system’s status and response. Malicious measurements are particularly threatening when used to spoof or mask events of interest, thus eliciting unwarranted responses or preventing desirable ones. Spoofing and masking attacks are particularly difficult to detect since they depict plausible behaviours, especially if multiple sensors have been compromised and collude to inject a coherent set of malicious measurements. Previous work has tackled the problem through measurements inspection, which analyses the inter-measurements correlations induced by the physical phenomena. However, these techniques consider simplistic attacks and are not robust to collusion. Moreover, they assume highly predictable patterns in the measurements distribution, which are invalidated by the unpredictability of events. We design a set of techniques that effectively detect malicious data injections in the presence of sophisticated collusion strategies, when one or more events manifest. Moreover, we build a methodology to characterise the likely compromised sensors. We also design diagnosis criteria that allow us to distinguish anomalies arising from malicious interference from those arising from faults. In contrast with previous work, we test the robustness of our methodology with automated and sophisticated attacks, where the attacker aims to evade detection. We conclude that our approach outperforms state-of-the-art approaches. Moreover, we quantitatively estimate the WSN’s degree of resilience and provide a methodology to give a WSN owner an assured degree of resilience by automatically designing the WSN deployment.
To also deal with the extreme scenario where the attacker has compromised most of the WSN, we propose a combination with software attestation techniques, which are more reliable when malicious data originates from compromised software but are also more expensive, and we achieve an excellent trade-off between cost and resilience.
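
As a rough illustration of measurements inspection and of why simple variants are not robust to collusion, the sketch below flags a sensor when its reading deviates from the median of the other sensors by more than a threshold. This is a deliberately simplistic baseline and not the technique developed in the thesis; the function name `flag_suspicious`, the sensor identifiers, the readings and the threshold are all hypothetical.

```python
from statistics import median
from typing import Dict, List


def flag_suspicious(readings: Dict[str, float], threshold: float) -> List[str]:
    """Flag sensors whose reading is far from the median of all other sensors.

    Assumes (for illustration only) that all sensors observe the same
    physical quantity, so honest readings should be strongly correlated.
    """
    flagged = []
    for sensor, value in readings.items():
        others = [v for s, v in readings.items() if s != sensor]
        if abs(value - median(others)) > threshold:
            flagged.append(sensor)
    return sorted(flagged)


# A single compromised sensor (s4) stands out against the honest majority.
single_attacker = {"s1": 20.1, "s2": 19.8, "s3": 20.4, "s4": 35.0}
print(flag_suspicious(single_attacker, threshold=3.0))

# Three colluding sensors (s2, s3, s4) inject a coherent false reading:
# the only honest sensor (s1) is now the one that gets flagged.
colluding = {"s1": 20.1, "s2": 35.0, "s3": 35.2, "s4": 34.9}
print(flag_suspicious(colluding, threshold=3.0))
```

The second call shows the failure mode described above: once the colluding sensors form the majority, a majority-based check blames the honest sensor, which is why collusion-aware techniques are needed.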

Compositional behaviour and reliability models for adaptive component-based architectures

Pedro Rodrigues Fonseca

The increasing scale and distribution of modern pervasive computing and service-based platforms make manual maintenance and evolution difficult and too slow. Systems should therefore be designed to self-adapt in response to environment changes, which requires the use of on-line models and analysis. Although there has been a considerable amount of work on architectural modelling and behavioural analysis of component-based systems, there is a need for approaches that integrate the architectural, behavioural and management aspects of a system. In particular, the lack of support for composability in probabilistic behavioural models prevents their systematic use for adapting systems based on changes in their non-functional properties. Of these non-functional properties, this thesis focuses on reliability. We introduce Probabilistic Component Automata (PCA) for describing the probabilistic behaviour of those systems. Our formalism simultaneously overcomes three of the main limitations of existing work: it preserves a close correspondence between the behavioural and architectural views of a system in both abstractions and semantics; it is composable, as behavioural models of composite components are automatically obtained by combining the models of their constituent parts; and lastly it is probabilistic, thereby enabling analysis of non-functional properties. PCA also provide constructs for representing failure, failure propagation and failure handling in component-based systems in a manner that closely corresponds to the use of exceptions in programming languages. Although PCA is used throughout this thesis for reliability analysis, the model can also be seen as an abstract process algebra that may be applicable for analysis of other system properties. We further show how reliability analysis based on PCA models can be used to perform architectural adaptation on distributed component-based systems and evaluate the computational cost of decentralised adaptation decisions.
To mitigate the state-explosion problem associated with composite models, we further introduce an algorithm to reduce a component’s PCA model to one that only represents its interface behaviour. We formally show that such a reduced model preserves the properties of the original representation. By experiment, we show that the reduced models are significantly smaller than the original, achieving a reduction of more than 80% in both the number of states and the number of transitions. A further benefit of the approach is that it allows component profiling and probabilistic interface behaviour to be extracted independently for each component, thereby enabling its exchange between different organisations without revealing commercially sensitive aspects of the components’ implementations. The contributions and results of this work are evaluated both through a series of small scale examples and through a larger case study of an e-Banking application derived from Java EE training materials. Our work shows how probabilistic non-functional properties can be integrated with the architectural and behavioural models of a system in an intuitive and scalable way that enables automated architecture reconfiguration based on reliability properties using composable models.
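
To give a feel for reliability analysis on a probabilistic behavioural model, here is a minimal sketch that treats a single component as a plain absorbing Markov chain and computes the probability of eventually reaching a success state. This is a simplification and not the PCA formalism itself (there is no composition, interface reduction or failure handling construct); the state names, the probabilities and the `reach_probability` helper are hypothetical.

```python
from typing import Dict


def reach_probability(
    transitions: Dict[str, Dict[str, float]],
    start: str,
    target: str,
    iterations: int = 1000,
) -> float:
    """Probability of eventually reaching `target` from `start`.

    Solves prob[s] = sum(p * prob[t]) by repeated in-place sweeps.
    States without outgoing transitions are absorbing.
    """
    states = set(transitions) | {t for outs in transitions.values() for t in outs}
    prob = {s: 0.0 for s in states}
    prob[target] = 1.0
    for _ in range(iterations):
        for s, outs in transitions.items():
            if s != target:
                prob[s] = sum(p * prob[t] for t, p in outs.items())
    return prob[start]


# Hypothetical component: a request is processed, may be retried once it
# hits a transient error, and ends in either "done" or "failed".
COMPONENT = {
    "idle": {"processing": 1.0},
    "processing": {"done": 0.95, "retry": 0.04, "failed": 0.01},
    "retry": {"processing": 0.5, "failed": 0.5},
}

reliability = reach_probability(COMPONENT, "idle", "done")
print(f"reliability = {reliability:.4f}")
```

For this toy model the result can be checked by hand: with x the probability of success from "processing", x = 0.95 + 0.04 * 0.5 * x, so x = 0.95 / 0.98.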

Improving resilience to cyber-attacks by analysing system output impacts and costs

Jukka Soikkeli

Abstract

Cyber-attacks cost businesses millions of dollars every year, a key component of which is the cost of business disruption from system downtime. As cyber-attacks cannot all be prevented, there is a need to consider the cyber resilience of systems, i.e. the ability to withstand cyber-attacks and recover from them.

Previous works discussing system cyber resilience typically either offer generic high-level guidance on best practices, provide limited attack modelling, or apply to systems with special characteristics. There is a lack of an approach to system cyber resilience evaluation that is generally applicable yet provides a detailed consideration for the system-level impacts of cyber-attacks and defences.

We propose a methodology for evaluating the effectiveness of actions intended to improve resilience to cyber-attacks, considering their impacts on system output performance, and monetary costs. It is intended for analysing attacks that can disrupt the system function, and involves modelling attack progression, system output production, response to attacks, and costs from cyber-attacks and defensive actions.

Studies of three use cases demonstrate the implementation and usefulness of our methodology. First, in our redundancy planning study, we considered the effect of redundancy additions on mitigating the impacts of cyber-attacks on system output performance. We found that redundancy with diversity can be effective in increasing resilience, although the reduction in attack-related costs must be balanced against added maintenance costs. Second, our work on attack countermeasure selection shows that by considering system output impacts across the duration of an attack, one can find more cost-effective attack responses than without such considerations. Third, we propose an approach to mission viability analysis for multi-UAV deployments facing cyber-attacks, which can aid resource planning and determining if the mission can conclude successfully despite an attack. We provide different implementations of our model components, based on use case requirements.
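
The trade-off noted in the redundancy planning study, between reduced attack-related costs and added maintenance costs, can be illustrated with a small back-of-the-envelope sketch. It is not the model used in the thesis: it assumes that diverse replicas are compromised independently with the same probability, that an outage only occurs when all replicas are compromised, and that every figure below is hypothetical.

```python
def expected_cost(
    replicas: int,
    p_compromise: float,
    outage_cost: float,
    maintenance_per_replica: float,
) -> float:
    """Expected cost per period for a given number of diverse replicas.

    Assumption: replicas fail independently, so the outage probability
    is p_compromise ** replicas.
    """
    p_outage = p_compromise ** replicas
    return replicas * maintenance_per_replica + p_outage * outage_cost


# Hypothetical figures: 20% compromise probability per replica,
# 100,000 outage cost, 3,000 maintenance cost per replica.
costs = {n: expected_cost(n, 0.2, 100_000, 3_000) for n in range(1, 5)}
best = min(costs, key=costs.get)

for n, cost in costs.items():
    print(f"{n} replica(s): expected cost {cost:,.0f}")
print(f"cheapest configuration: {best} replica(s)")
```

With these made-up numbers the expected cost falls as replicas are added and then rises again once the maintenance cost outweighs the remaining outage risk, which is the balance the study describes.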

Analyzing the Viability of UAV Missions Facing Cyber Attacks

With advanced video and sensing capabilities, un-occupied aerial vehicles (UAVs) are increasingly being used for numerous applications that involve the collaboration and autonomous operation of teams of UAVs. Yet such vehicles can be affected by cyber attacks, impacting the viability of their missions. We propose a method to conduct mission viability analysis under cyber attacks for missions that employ a team of several UAVs that share a communication network. We apply our method to a case study of a survey mission in a wildfire firefighting scenario. Within this context, we show how our method can help quantify the expected mission performance impact from an attack and determine if the mission can remain viable under various attack situations. Our method can be used both in the planning of the mission and for decision making during mission operation. Our approach to modeling attack progression and impact analysis with Petri nets is also more broadly applicable to other settings involving multiple resources that can be used interchangeably towards the same objective.

J. Soikkeli, C. Perner and E. Lupu, “Analyzing the Viability of UAV Missions Facing Cyber Attacks,” in 2021 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), Vienna, Austria, 2021, pp. 103-112.
doi: 10.1109/EuroSPW54576.2021.00018
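
The idea of modelling attack progression with Petri nets, mentioned in the abstract above, can be sketched as follows. This is a minimal illustration and not the model from the paper: it assumes a net with two places (healthy and compromised UAVs), one transition that moves a token between them, and a mission that stays viable while a minimum number of healthy UAVs remains. The place names, the `REQUIRED_HEALTHY` threshold and the helper functions are hypothetical.

```python
from typing import Dict

Marking = Dict[str, int]

# Hypothetical transition: an attack step turns one healthy UAV into a compromised one.
COMPROMISE = {"consume": {"healthy_uav": 1}, "produce": {"compromised_uav": 1}}
REQUIRED_HEALTHY = 3  # assumed minimum team size for the mission to remain viable


def enabled(marking: Marking, transition: dict) -> bool:
    """A transition is enabled if every input place holds enough tokens."""
    return all(marking.get(p, 0) >= n for p, n in transition["consume"].items())


def fire(marking: Marking, transition: dict) -> Marking:
    """Return the marking reached by firing the transition once."""
    if not enabled(marking, transition):
        raise ValueError("transition not enabled")
    new = dict(marking)
    for p, n in transition["consume"].items():
        new[p] = new.get(p, 0) - n
    for p, n in transition["produce"].items():
        new[p] = new.get(p, 0) + n
    return new


def viable(marking: Marking) -> bool:
    return marking.get("healthy_uav", 0) >= REQUIRED_HEALTHY


marking = {"healthy_uav": 4, "compromised_uav": 0}
after_one = fire(marking, COMPROMISE)
after_two = fire(after_one, COMPROMISE)
print(after_one, "viable:", viable(after_one))
print(after_two, "viable:", viable(after_two))
```

Because the UAVs are interchangeable tokens in one place, the same structure carries over to other settings with pooled resources, as the abstract notes.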