Dataset Characteristics

Secure Water Treatment (SWaT) Dataset

The data collected from the Secure Water Treatment (SWaT) testbed consists of 11 days of continuous operation: 7 days under normal operation and 4 days with attack scenarios*. During the data collection, all network traffic, sensor data, and actuator data were recorded.

Characteristics of dataset

  1. Network Traffic and Cyber Physical Properties: The dataset contains all the network traffic captured during the 11 days, together with the values obtained from all 51 sensors and actuators available in SWaT.
  2. Labelled: All the data acquired during this process are labelled as normal or abnormal behaviour.
  3. Attack Scenarios: The attacks generated for this dataset were derived from the attack models developed by our research team. The attack model uses the intent space of a cyber-physical system (CPS) as its basis. 36 attacks were launched during the 4 days and are described in the PDF.
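Since every record is labelled as normal or abnormal (item 2 above), a first sanity check on a requested copy might be to count records per label. The sketch below is illustrative only: the column name `Normal/Attack`, the label values, the helper `count_labels`, and the sample rows are all assumptions, not taken from the dataset documentation.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt: column names, label values, timestamps and readings
# are assumptions for illustration, not real SWaT records.
SAMPLE_CSV = """Timestamp,LIT101,Normal/Attack
2000-01-01 10:00:00,522.8,Normal
2000-01-01 10:00:01,522.9,Normal
2000-01-01 10:00:02,610.4,Attack
"""

def count_labels(csv_file, label_column="Normal/Attack"):
    """Count how many rows carry each label value."""
    reader = csv.DictReader(csv_file)
    return Counter(row[label_column].strip() for row in reader)

label_counts = count_labels(io.StringIO(SAMPLE_CSV))
print(dict(label_counts))
```

For a real file, the same function could be given an open file handle instead of the in-memory string, with `label_column` adjusted to whatever header the delivered CSV actually uses.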

Technical information on SWaT can be found here.

Click here to find out how to request the dataset.

* [Updated 24 Sep 18: Two sets of “SWaT_Dataset_Normal” – versions 0 and 1 – are provided. The datasets capture the normal state of the SWaT testbed running for seven days. In Version 0, recording started while the plant was emptying the water storage tank for 30 minutes. In an ICS environment, such drainage is generally part of maintenance, outside normal operations. As a result of this drainage, the first 30 minutes of LIT101 data show a change even though there was no water inflow or outflow. Version 1 is derived from Version 0 by removing the first 30 minutes of data.]
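The derivation of Version 1 from Version 0 described in the note above amounts to dropping every record in the first 30 minutes of the recording. A minimal sketch of that step follows; the function name `drop_initial_window`, the record layout, and the sample readings are hypothetical, and whether the sample exactly at the 30-minute mark is kept is an assumption made here.

```python
from datetime import datetime, timedelta

def drop_initial_window(records, window=timedelta(minutes=30)):
    """Return the records at least `window` after the first timestamp.

    `records` is a list of (timestamp, value) tuples sorted by timestamp.
    """
    if not records:
        return []
    cutoff = records[0][0] + window
    # Assumption: a sample exactly at the cutoff is kept.
    return [(ts, value) for ts, value in records if ts >= cutoff]

# Hypothetical readings, one every 10 minutes; the values are made up.
start = datetime(2000, 1, 1, 0, 0, 0)
version0 = [(start + timedelta(minutes=10 * i), 500.0 + i) for i in range(6)]
version1 = drop_initial_window(version0)
print(len(version0), len(version1))
```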

 *********************************************************

S317 Dataset

SUTD Security Showdown (S3) has been organised in two consecutive years since 2016. S3 has enabled researchers and practitioners to assess the effectiveness of methods and products aimed at detecting cyber attacks launched in real time on an operational water treatment plant, namely Secure Water Treatment (SWaT). In S3, independent attack teams design and launch attacks on SWaT while defence teams protect the plant passively and raise alarms upon attack detection. Attack teams are scored according to how successful they are in performing attacks based on specific intents, while defence teams are scored on how effectively their methods detect the attacks.

Characteristics of dataset

  1. Network ‘pcap’ files for three days during the S3 in 2017 (i.e. S317)
  2. Historian data for three days during S317
  3. Attack scenarios performed by the participants
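The network captures listed above are provided as ‘pcap’ files. As a rough sketch of how such a file could be inspected with the Python standard library alone, the function below counts packet records. It assumes the classic libpcap file layout with microsecond timestamps (a 24-byte global header followed by 16-byte per-packet headers), not pcapng; whether the S317 files use that layout is an assumption, and `count_packets` is a hypothetical helper name.

```python
import io
import struct

def count_packets(stream):
    """Count packet records in a classic libpcap stream (not pcapng)."""
    global_header = stream.read(24)
    if len(global_header) < 24:
        raise ValueError("truncated pcap global header")
    # The magic number reveals the byte order used by the writer.
    magic = struct.unpack("<I", global_header[:4])[0]
    if magic == 0xA1B2C3D4:
        byte_order = "<"
    elif magic == 0xD4C3B2A1:
        byte_order = ">"
    else:
        raise ValueError("unsupported capture format")
    count = 0
    while True:
        record_header = stream.read(16)
        if len(record_header) < 16:
            break  # end of file
        # Fields: ts_sec, ts_usec, captured length, original length.
        _, _, captured_len, _ = struct.unpack(byte_order + "IIII", record_header)
        if len(stream.read(captured_len)) < captured_len:
            break  # truncated final packet, do not count it
        count += 1
    return count

# Synthetic two-packet capture built in memory; the payload bytes are made up.
capture = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
for payload in (b"\x00" * 60, b"\x01" * 42):
    capture += struct.pack("<IIII", 0, 0, len(payload), len(payload)) + payload

packet_count = count_packets(io.BytesIO(capture))
print(packet_count)
```

For a file on disk, the stream would be opened in binary mode, for example `open(path, "rb")`, where `path` points at one of the delivered captures.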

Click here to find out how to request the dataset.

********************************************************* 

Water Distribution (WADI) Dataset

Similar to the SWaT dataset, the data collected from the Water Distribution testbed consists of 16 days of continuous operation, of which 14 days’ worth of data was collected under normal operation and 2 days with attack scenarios. During the data collection, sensor and actuator data were collected.

Characteristics of dataset

  1. Cyber Physical Properties: The dataset consists of all the values obtained from all the 103 sensors and actuators available in WADI.
  2. Attack Scenarios: The attacks generated for this dataset were derived from the attack models developed by our research team. The attack model uses the intent space of a cyber-physical system (CPS) as its basis. 15 attacks were launched during the 2 days.

Click here to find out how to request the dataset.

*********************************************************

BATADAL dataset *

In addition to the WADI dataset, our faculty and researchers, in collaboration with colleagues from Israel and Cyprus, organised the BATtle of Attack Detection Algorithms (BATADAL), a competition to objectively compare the performance of algorithms for the detection of cyber attacks on water distribution systems. More information on the competition can be found here.

Characteristics of dataset

  1. Training Dataset 1: This dataset was generated from a one-year long simulation. The dataset does not contain any attacks, i.e. all the data pertains to C-Town normal operations.
  2. Training Dataset 2: This dataset is around 6 months long and contains several attacks, some of which are approximately labelled.

Click here to find out how to request the dataset.

********************************************************* 

EPIC dataset

The EPIC dataset, collected from the EPIC testbed, consists of eight scenarios under normal operating conditions. Data was collected for 30 minutes of operation under each scenario. Sensor and actuator data were collected and stored as a CSV file, and the network traffic was captured in ‘pcap’ files.

Characteristics of dataset

Scenario 1

  • Synchronization without load
  • Angle difference between two generators G1 & G2 from -180 to 0 to 180 degrees

Scenario 2

  • Synchronization with 10kW resistive load
  • Angle difference between two generators G1 & G2 from -180 to 0 degrees

Scenario 3

  • Two generators G1 & G2 running
  • With 10kW resistive load

Scenario 4

  • Two generators G1 & G2 running with PV system switched on
  • With 10kW resistive load

Scenario 5

  • Two generators G1 & G2 running with PV system switched on
  • With 7kW resistive load

Scenario 6

  • Three generators G1 to G3 running
  • With 14kW resistive load

Scenario 7

  • Two generators G1 & G2 running
  • Supplying power to iTrust’s Secure Water Treatment (SWaT) testbed

Scenario 8

  • Two generators G1 & G2 running
  • Supplying power to iTrust’s SWaT and Water Distribution (WADI) testbeds
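For scripting against the eight scenarios, the list above can be transcribed into a small lookup table. This is only a sketch: the `Scenario` type and its field names are hypothetical, the load is recorded as `None` where the description names a testbed rather than a resistive load, and the total duration simply multiplies the stated 30 minutes per scenario by eight.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Scenario:
    number: int
    description: str
    generators: int
    load_kw: Optional[float]  # None where no resistive load value is stated
    pv_on: bool = False

SCENARIOS = [
    Scenario(1, "Synchronization without load", 2, 0.0),
    Scenario(2, "Synchronization with resistive load", 2, 10.0),
    Scenario(3, "Two generators running", 2, 10.0),
    Scenario(4, "Two generators running with PV", 2, 10.0, pv_on=True),
    Scenario(5, "Two generators running with PV", 2, 7.0, pv_on=True),
    Scenario(6, "Three generators running", 3, 14.0),
    Scenario(7, "Supplying power to SWaT", 2, None),
    Scenario(8, "Supplying power to SWaT and WADI", 2, None),
]

MINUTES_PER_SCENARIO = 30  # stated in the dataset description
total_minutes = MINUTES_PER_SCENARIO * len(SCENARIOS)
pv_scenarios = [s.number for s in SCENARIOS if s.pv_on]
print(total_minutes, pv_scenarios)
```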

Click here to find out how to request the dataset.

********************************************************* 

Blaq_0 Dataset

The Blaq_0 Hackathon was first organised in January 2018 for SUTD undergraduate students. Independent attack teams designed and launched attacks on the EPIC testbed and were scored according to how successful they were in performing attacks based on specific intents.

Characteristics of dataset

  1. Network ‘pcap’ files for three days during Blaq_0.

Technical information of EPIC can be found here.

Click here to find out how to request the dataset.

********************************************************* 

Complete Set of Invariants based on Design Centric and Data Centric Approaches

1. Set of Rules with antecedent 1
2. Set of Rules with antecedent 2
3. Set of Rules with antecedent 3
4. Set of Rules with antecedent 4
5. Set of Rules with antecedent 5
6. Set of Rules with antecedent 6
7. Set of Rules with antecedent 7
8. Comparison

Publication

Goh J., Adepu S., Junejo K. N., and Mathur A., “A Dataset to Support Research in the Design of Secure Water Treatment Systems,” The 11th International Conference on Critical Information Infrastructures Security (CRITIS), 2016.

* If you are using the BATADAL datasets in your work, please cite the following paper as reference:

Riccardo Taormina, Stefano Galelli, Nils Ole Tippenhauer, Elad Salomons, Avi Ostfeld, Demetrios G. Eliades, Mohsen Aghashahi, Raanju Sundararajan, Mohsen Pourahmadi, M. Katherine Banks, B. M. Brentan, Enrique Campbell, G. Lima, D. Manzi, D. Ayala-Cabrera, M. Herrera, I. Montalvo, J. Izquierdo, E. Luvizotto, Sarin E. Chandy, Amin Rasekh, Zachary A. Barker, Bruce Campbell, M. Ehsan Shafiee, Marcio Giacomoni, Nikolaos Gatsis, Ahmad Taha, Ahmed A. Abokifa, Kelsey Haddad, Cynthia S. Lo, Pratim Biswas, M. Fayzul K. Pasha, Bijay Kc, Saravanakumar Lakshmanan Somasundaram, Mashor Housh, and Ziv Ohar, “The Battle Of The Attack Detection Algorithms: Disclosing Cyber Attacks On Water Distribution Networks,” Journal of Water Resources Planning and Management, 144 (8), August 2018.

Dataset collection credits

The following personnel were responsible for the dataset collection:

  1. SWaT – Sridhar Adepu
  2. WADI – Venkata Reddy
  3. S317 – Nils Tippenhauer, Hamid Reza Ghaeini
  4. EPIC – Ding Liqun, Kandasamy Nandha Kumar, Chuadhry Mujeeb Ahmed
  5. BATADAL – Riccardo Taormina
  6. Blaq_0 – Francisco Furtado, Lauren Goh, Jonathan Heng