Abstract: The frequency and harmfulness of cyber-attacks are increasing every day, and with them also the amount of data that the cyber-forensics analysts need to collect and analyze. In this paper, we propose a formal analysis process that allows an analyst to filter the enormous amount of evidence collected and either identify crucial information about the attack (e.g., when it occurred, its culprit, its target) or, at the very least, perform a pre-analysis to reduce the complexity of the problem in order to then draw conclusions more swiftly and efficiently. We introduce the Evidence Logic EL for representing simple and derived pieces of evidence from different sources. We propose a procedure, based on monotonic reasoning, that rewrites the pieces of evidence with the use of tableau rules, based on relations of trust between sources and the reasoning behind the derived evidence, and yields a consistent set of pieces of evidence. As proof of concept, we apply our analysis process to a concrete cyber-forensics case study.
Building trustworthy systems that themselves rely on, or integrate, semi-trusted information sources is a challenging aim, but doing so allows us to make good use of floods of information continuously contributed by individuals and small organisations. This paper addresses the problem of quickly and efficiently acquiring high quality meta-data from human contributors, in order to support crowdsensing applications.
Crowdsensing (or participatory sensing) applications have been used to sense, measure and map a variety of phenomena, including: individuals’ health, mobility & social status; fuel & grocery prices; air quality & pollution levels; biodiversity; transport infrastructure; and route-planning for drivers & cyclists. Crowdsensing applications have an on-going requirement to turn raw data into useful knowledge, and to achieve this, many rely on prompt human generated meta-data to support and/or validate the primary data payload. These human contributions are inherently error prone and subject to bias and inaccuracies, so multiple overlapping labels are needed to cross-validate one another. While probabilistic inference can be used to reduce the required label overlap, there is a particular need in crowdsensing to minimise the overhead and improve the accuracy of timely label collection. This paper presents three general algorithms for efficient human meta-data collection, which support different constraints on how the central authority collects contributions, and three methods to intelligently pair annotators with tasks based on formal information theoretic principles. We test our methods’ performance on challenging synthetic data-sets, based on r eal data, and show that our algorithms can significantly lower the cost and improve the accuracy of human meta-data labelling, with a corresponding increase in the average novel information content from new labels.