Dataset Characteristics

Secure Water Treatment (SWaT) Dataset

Secure Water Treatment testbed to address this need. The data collected from the testbed covers 11 days of continuous operation: 7 days under normal operation and 4 days with attack scenarios*. All network traffic, sensor and actuator data were recorded during the collection period.

Characteristics of dataset

  1. Network Traffic and Cyber Physical Properties: The dataset contains all the network traffic captured during this period, along with the values from all 51 sensors and actuators available in SWaT.
  2. Labelled: All the data acquired during this process are labelled according to normal and abnormal behaviours.
  3. Attack Scenarios: The attacks generated for this dataset were derived from the attack models developed by our research team. The attack model considers the intent space of a CPS as an attack model. 41 attacks were launched during the 4 days and are described in the PDF.
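
Because every record carries a normal/abnormal label, a quick first check on a labelled export is to count records per label. The following is a minimal sketch only: the label column name `Normal/Attack`, the `Timestamp` header and the sample rows are illustrative assumptions, so check the dataset's own documentation for the actual headers.

```python
import csv
import io
from collections import Counter


def count_labels(csv_file, label_column="Normal/Attack"):
    """Count records per label value in a labelled CSV export.

    The default label column name is an assumption, not a documented header.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Strip whitespace so " Attack" and "Attack" are counted together.
        counts[row[label_column].strip()] += 1
    return counts


# Tiny made-up sample standing in for a real export.
sample = io.StringIO(
    "Timestamp,LIT101,Normal/Attack\n"
    "t0,500.1,Normal\n"
    "t1,500.3,Normal\n"
    "t2,812.9,Attack\n"
)
label_counts = count_labels(sample)
print(dict(label_counts))
```

For a real file, pass an open file handle (opened with `newline=""`) instead of the in-memory sample.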

Technical information on SWaT is available here.

Click here to find out how to request the dataset.

* [Updated 24 Sep 18: Two sets of “SWaT_Dataset_Normal” – versions 0 and 1 – are provided. The datasets capture the normal state of the SWaT testbed running for seven days. In Version 0, we started recording the data when the plant was emptying the water storage tank for 30 minutes. In general, in an ICS environment, this is part of the maintenance outside normal operations. As a result of this drainage, the first 30 minutes of LIT101 data exhibits change even though there was no water in/outflow. Version 1 is derived from version 0 by removing the first 30 minutes of data.]
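
The relationship between the two versions can be illustrated with a short sketch that drops every record falling within the first 30 minutes of a recording. This is not the procedure used to produce Version 1; the `(timestamp, value)` row layout, the function name `drop_initial_window` and the sample readings are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def drop_initial_window(rows, minutes=30):
    """Return the rows whose timestamp is at least `minutes` after the first row.

    `rows` is a list of (timestamp, value) tuples sorted by time.
    """
    if not rows:
        return []
    cutoff = rows[0][0] + timedelta(minutes=minutes)
    return [row for row in rows if row[0] >= cutoff]


# Made-up readings at 10-minute intervals, standing in for LIT101 samples.
start = datetime(2000, 1, 1, 0, 0, 0)
version0 = [(start + timedelta(minutes=10 * i), 500.0 + i) for i in range(6)]
version1 = drop_initial_window(version0)
print(len(version0), len(version1))
```

The cutoff here is inclusive, so a record stamped exactly 30 minutes after the start is kept; whether the published Version 1 treats the boundary the same way is not stated.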

[Updated 14 Aug 19: A new SWaT dataset, collected in Jul 2019, is available for download. It includes 3 hours of SWaT running under normal operating conditions and 1 hour in which 6 attacks were carried out. Those who previously received a SWaT dataset download link can use the same link to access the new dataset.]

[Updated 23 Oct 19: We received queries on the SWaT dataset collected in Jul 2019; for example, under LS 201 the fields were recorded as “{u’IsSystem’: False, u’Name’: u’Inactive’, u’Value’: 0}”. These fields have been updated to “Active” or “Inactive” and the dataset saved as version 2. The fields’ definitions are provided in the “readme.docx” document that was shared along with the dataset. Those who were given the link to download the dataset previously can download the new files using the same link.]
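
For anyone still working with the earlier Jul 2019 files, the raw fields can be reduced to the same “Active”/“Inactive” form. The sketch below assumes the raw value is the text of a Python dictionary, as in the example quoted above, and that the `Name` entry holds the state; the helper name `status_from_raw` is hypothetical and this is not the script used to produce version 2.

```python
import ast


def status_from_raw(raw):
    """Return the 'Name' entry of a dict-like raw field, e.g. 'Inactive'.

    Values that do not look like a dictionary are returned unchanged.
    """
    text = raw.strip()
    if not text.startswith("{"):
        return raw
    # Normalise typographic quotes that may have crept in through copy-paste.
    text = text.replace("\u2018", "'").replace("\u2019", "'")
    # literal_eval only evaluates Python literals, so it is safe on data files.
    record = ast.literal_eval(text)
    return record["Name"]


raw_field = "{u'IsSystem': False, u'Name': u'Inactive', u'Value': 0}"
print(status_from_raw(raw_field))
```

Already-clean values such as `Active` pass through untouched, so the function can be applied to a whole column regardless of which version the file is.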

 *********************************************************

S317 Dataset

SUTD Security Showdown (S3) has been organised consecutively for two years since 2016. S3 has enabled researchers and practitioners to assess the effectiveness of methods and products aimed at detecting cyber attacks launched in real time on an operational water treatment plant, namely, Secure Water Treatment (SWaT). In S3, independent attack teams design and launch attacks on SWaT while defence teams protect the plant passively and raise alarms upon attack detection. Attack teams are scored according to how successfully they perform attacks based on specific intents, while defence teams are scored on how effectively their methods detect the attacks.

Characteristics of dataset

  1. Network ‘pcap’ files for three days during the S3 in 2017 (i.e. S317)
  2. Historian data for three days during S317
  3. Attack scenarios performed by the participants
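
As a first look at a capture file, the number of packets it holds can be counted with the standard library alone. The sketch below assumes the files are in the classic libpcap format (not pcapng), which is not confirmed here; the function name `count_packets` and the in-memory demo capture are illustrative.

```python
import io
import struct

# Magic numbers of the classic libpcap file format, mapped to struct byte order.
_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}


def count_packets(stream):
    """Count packet records in a classic pcap stream opened in binary mode."""
    header = stream.read(24)  # the global header is 24 bytes
    if len(header) < 24 or header[:4] not in _MAGICS:
        raise ValueError("not a classic pcap file")
    endian = _MAGICS[header[:4]]
    count = 0
    while True:
        record = stream.read(16)  # each per-packet header is 16 bytes
        if len(record) < 16:
            break
        _ts_sec, _ts_frac, incl_len, _orig_len = struct.unpack(endian + "IIII", record)
        stream.seek(incl_len, io.SEEK_CUR)  # skip the captured packet bytes
        count += 1
    return count


# Build a tiny in-memory capture with two dummy 4-byte packets.
global_header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
packet = struct.pack("<IIII", 0, 0, 4, 4) + b"\x00" * 4
demo = io.BytesIO(global_header + packet + packet)
print(count_packets(demo))
```

For a real file, pass `open(path, "rb")` instead of the demo stream. Decoding the packets themselves is better left to a dedicated tool; this only walks the record headers.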

Click here to find out how to request the dataset.

********************************************************* 

Water Distribution (WADI) Dataset

Similar to the SWaT dataset, the data collected from the Water Distribution testbed consists of 16 days of continuous operation, of which 14 days’ worth of data was collected under normal operation and 2 days with attack scenarios. During the data collection, sensor and actuator data were collected.

Characteristics of dataset

  1. Cyber Physical Properties: The dataset consists of all the values obtained from all the 103 sensors and actuators available in WADI.
  2. Attack Scenarios: The attacks generated for this dataset were derived from the attack models developed by our research team. The attack model considers the intent space of a CPS as an attack model. 15 attacks were launched during the 2 days.

Click here to find out how to request the dataset.

*********************************************************

BATADAL dataset *

In addition to the WADI dataset, our faculty and researchers, in collaboration with colleagues from Israel and Cyprus, organised the BATtle of Attack Detection Algorithms (BATADAL), a competition to objectively compare the performance of algorithms for the detection of cyber attacks on water distribution systems. More information on the competition can be found here.

Characteristics of dataset

  1. Training Dataset 1: This dataset was generated from a one-year long simulation. The dataset does not contain any attacks, i.e. all the data pertains to C-Town normal operations.
  2. Training Dataset 2: This dataset is around 6 months long and contains several attacks, some of which are approximately labelled.

Click here to find out how to request the dataset.

********************************************************* 

EPIC dataset

The EPIC Dataset was collected by operating the EPIC testbed for 30 minutes under each of eight scenarios. Sensor measurements and actuator states were recorded in a CSV file. The network traffic data was recorded in pcap files.

Characteristics of dataset

Scenario 1

  • Synchronization without load
  • Angle difference between two generators G1 & G2 from -180 to 0 to 180 degrees

Scenario 2

  • Synchronization with 10kW resistive load
  • Angle difference between two generators G1 & G2 from -180 to 0 degrees

Scenario 3

  • Two generators G1 & G2 running
  • 10kW resistive load

Scenario 4

  • Two generators G1 & G2 running with PV system switched on
  • 10kW resistive load

Scenario 5

  • Two generators G1 & G2 running with PV system switched on
  • 7kW resistive load

Scenario 6

  • Three generators G1 to G3 running
  • 14kW resistive load

Scenario 7

  • Two generators G1 & G2 running
  • Supplying power to iTrust’s Secure Water Treatment (SWaT) testbed

Scenario 8

  • Two generators G1 & G2 running
  • Supplying power to iTrust’s SWaT and Water Distribution (WADI) testbeds

Click here to find out how to request the dataset.

********************************************************* 

Blaq_0 Dataset

The Blaq_0 Hackathon was first organised in January 2018 for SUTD undergraduate students. Independent attack teams designed and launched attacks on the EPIC testbed and were scored according to how successfully they performed attacks based on specific intents.

Characteristics of dataset

  1. Network ‘pcap’ files for three days during Blaq_0.

Technical information on EPIC can be found here.

Click here to find out how to request the dataset.

********************************************************* 

Complete Set of Invariants based on Design Centric and Data Centric Approaches

1. Set of Rules with antecedent 1
2. Set of Rules with antecedent 2
3. Set of Rules with antecedent 3
4. Set of Rules with antecedent 4
5. Set of Rules with antecedent 5
6. Set of Rules with antecedent 6
7. Set of Rules with antecedent 7
8. Comparison

Publication

Goh J., Adepu S., Junejo K. N., and Mathur A., “A Dataset to Support Research in the Design of Secure Water Treatment Systems,” The 11th International Conference on Critical Information Infrastructures Security.

* If you are using the BATADAL datasets in your work, please cite the following paper as reference:

Riccardo Taormina, Stefano Galelli, Nils Ole Tippenhauer, Elad Salomons, Avi Ostfeld, Demetrios G. Eliades, Mohsen Aghashahi, Raanju Sundararajan, Mohsen Pourahmadi, M. Katherine Banks, B. M. Brentan, Enrique Campbell, G. Lima, D. Manzi, D. Ayala-Cabrera, M. Herrera, I. Montalvo, J. Izquierdo, E. Luvizotto, Sarin E. Chandy, Amin Rasekh, Zachary A. Barker, Bruce Campbell, M. Ehsan Shafiee, Marcio Giacomoni, Nikolaos Gatsis, Ahmad Taha, Ahmed A. Abokifa, Kelsey Haddad, Cynthia S. Lo, Pratim Biswas, M. Fayzul K. Pasha, Bijay Kc, Saravanakumar Lakshmanan Somasundaram, Mashor Housh, and Ziv Ohar, “The Battle Of The Attack Detection Algorithms: Disclosing Cyber Attacks On Water Distribution Networks,” Journal of Water Resources Planning and Management, 144 (8), August 2018.

Dataset collection credits

The following personnel were responsible for the dataset collection:

  1. SWaT – Sridhar Adepu
  2. WADI – Venkata Reddy
  3. S317 – Nils Tippenhauer, Hamid Reza Ghaeini
  4. EPIC – Ding Liqun, Kandasamy Nandha Kumar, Chuadhry Mujeeb Ahmed
  5. BATADAL – Riccardo Taormina
  6. Blaq_0 – Francisco Furtado, Lauren Goh, Jonathan Heng