Releasing spatiotemporal measurements of air quality from 30 indoor sites over six months
during the summer and winter seasons (89.1M samples, totaling 13,646 hours of air quality data and 3,957
activity annotations from 24 participants among 46 occupants)
Key Features
Uniqueness: This dataset offers extensive indoor air quality
data from 30 indoor sites in developing Indian communities, annotated with daily activities and
real-time pollution dynamics, filling a gap in large-scale datasets for developing nations.
- Multi-device: Contains pollution measurements
from multiple devices per site, with up to six devices in residential households, offering unique
observations of pollutant spread.
- Indoor types: Captures data from five types
of locations (residential households, studio apartments, food canteens, classrooms, research labs)
across 30 sites.
- Frequent pollutants: Includes readings for
indoor temperature, humidity, and eight pollutants (CO2, VOC, PM1, PM2.5, PM10, NO2, C2H5OH, CO).
- Human annotations: Real-time activity labels
collected via a speech-to-text app, providing necessary context for interpreting pollution
readings.
- Multi-city deployment: Data from four regions
in India, covering rural, suburban, and urban populations.
- Dataset duration: Data collected over six
months (Summer and Winter), capturing seasonal pollution dynamics and behavioral variations.
Potential Applications: The dataset supports the following
applications.
- Pollution Source Identification and Activity
Monitoring: Pairs recorded pollution patterns with specific
activities, aiding source and activity classification.
- Analysis of Spreading and Accumulation Patterns in
Different Floor Plans: Useful for studying pollutant spread and accumulation in
varied room structures.
- Healthy Home Characterization and Indoor Design
Improvement: Helps identify features to mitigate pollution spread, supporting
healthier indoor designs.
- Smart Device Control: Enables design of
control policies for ACs, exhausts, air purifiers, and other ventilation devices.
Benchmarking & ML Applications: Enables research in pollution
source detection, activity classification, and pollutant spread in varied floor plans.
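One way the activity annotations might be paired with sensor readings for source or activity classification is sketched below. This is a minimal illustration on synthetic values, not the dataset's actual loader or schema: the tuple layout, the `label_readings` helper, and the 30-minute window are all assumptions made for the example.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def label_readings(readings, annotations, max_age=timedelta(minutes=30)):
    """Attach to each reading the most recent annotation no older than max_age.

    readings:    list of (timestamp, value) tuples, in any order (hypothetical layout).
    annotations: list of (timestamp, label) tuples, in any order (hypothetical layout).
    Returns a list of (timestamp, value, label_or_None) sorted by timestamp.
    """
    annotations = sorted(annotations)
    times = [t for t, _ in annotations]
    labeled = []
    for ts, value in sorted(readings):
        # Index of the last annotation made at or before this reading.
        i = bisect_right(times, ts) - 1
        label = None
        if i >= 0 and ts - times[i] <= max_age:
            label = annotations[i][1]
        labeled.append((ts, value, label))
    return labeled


# Synthetic example: one reading every 10 minutes, one "cooking" annotation.
t0 = datetime(2024, 1, 1, 8, 0)
readings = [(t0 + timedelta(minutes=10 * k), 400.0 + 50 * k) for k in range(6)]
annotations = [(t0 + timedelta(minutes=5), "cooking")]
labeled = label_readings(readings, annotations)
labels = [label for _, _, label in labeled]
```

Readings taken before the first annotation, or more than `max_age` after the latest one, stay unlabeled (`None`). The window length is a modeling choice and would need tuning per pollutant and per activity.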
Dataset Size: Includes 89.1 million samples over six months,
with 13,646 hours of air quality data and 3,957 activity annotations.
Licensing: The dataset is free to download and may be used for non-commercial purposes under the GNU Affero General Public License. All participants signed forms consenting to the use of collected pollutant
measurements and activity labels for non-commercial research purposes. The institute’s ethical review
committee has approved the field study (Order No: IIT/SRIC/DEAN/2023, Dated July 31, 2023).
Prasenjit Karmakar
IIT Kharagpur, India
Swadhin Pradhan
Cisco Systems, US
Sandip Chakraborty
IIT Kharagpur, India
Publications
- Karmakar, P., Pradhan, S. and Chakraborty, S., Indoor Air Quality
Dataset with Activities of Daily Living in Low to Middle-income Communities. In Thirty-eighth
Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2024
- Karmakar, P., Pradhan, S. and Chakraborty, S., Exploiting
Air Quality Monitors to Perform Indoor Surveillance: Academic Setting. In Adjunct Proceedings of the
26th International Conference on Mobile Human-Computer Interaction, 2024
- Karmakar, P., Pradhan, S. and Chakraborty, S., Exploring Indoor Air
Quality Dynamics in Developing Nations: A Perspective from India. In ACM Journal on Computing and
Sustainable Societies, 2(3), 2024