Optimizing Air Quality Monitoring: Comparative Analysis of Linear Regression and Machine Learning in Low-Cost Sensor Calibration

Aerosol and Air Quality Research, 25(1-4), 2025

Abstract

Background: Low-cost sensors (LCS) are widely used for air quality monitoring, but their accuracy depends on proper calibration. This study compares linear regression (LR) and machine learning (ML) techniques, particularly random forest (RF), to determine optimal calibration strategies.

Objectives: This study aims to compare the effectiveness of LR and RF models in calibrating the Plantower PMS 3003 sensor under different environmental conditions. It also explores ways to streamline calibration efforts while maintaining accuracy.

Methods: Sensor data were collected in a controlled laboratory setting, with measurements compared against a reference monitor. LR and RF models were developed to calibrate the sensor, and their performance was evaluated based on RMSE, R², and bias. Additionally, the study examined whether using fewer sensors for training could still produce reliable calibration models.

Results: Both LR and RF models demonstrated strong calibration performance. LR models were effective for low to moderate PM2.5 concentrations and required fewer computational resources, making them suitable for large-scale monitoring with limited resources. RF models captured nonlinear relationships, showing superior accuracy at high PM concentrations and in conditions with high relative humidity. The findings suggest that LR models trained on smaller datasets can achieve practical accuracy, reducing the need for extensive individual sensor calibration.

Conclusions: The selection of a calibration model should be guided by study-specific requirements, including environmental conditions and resource availability. LR models are recommended for large-scale studies with constrained resources, while RF models may offer advantages in high-exposure environments due to their ability to model complex interactions. This study is the first to explore reducing sensor calibration efforts while maintaining accuracy, highlighting the potential for optimized strategies in resource-limited settings. Future research should validate these findings in real-world deployments to further refine calibration models for LCS applications.
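The LR calibration and the three evaluation metrics named in the Methods (RMSE, R², and bias against a reference monitor) can be illustrated with a minimal sketch. This is not the authors' code or data: the readings are synthetic, and the sensor response (slope 1.6, offset 5, Gaussian noise), the 150/50 train/test split, and the helper names `fit_linear` and `evaluate` are all assumptions chosen for illustration. Only the Python standard library is used.

```python
import math
import random


def fit_linear(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def evaluate(pred, ref):
    """Return RMSE, R^2, and mean bias of predictions against the reference."""
    n = len(ref)
    errors = [p - r for p, r in zip(pred, ref)]
    ss_res = sum(e * e for e in errors)
    mean_ref = sum(ref) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in ref)
    return {
        "rmse": math.sqrt(ss_res / n),
        "r2": 1.0 - ss_res / ss_tot,
        "bias": sum(errors) / n,
    }


# Synthetic data (assumption): the reference monitor reports the true PM2.5,
# and the low-cost sensor over-reads linearly with additive noise.
rng = random.Random(0)
reference = [rng.uniform(0.0, 150.0) for _ in range(200)]
raw = [1.6 * r + 5.0 + rng.gauss(0.0, 4.0) for r in reference]

# Hold out the last 50 points so the metrics are not computed on training data.
train_raw, test_raw = raw[:150], raw[150:]
train_ref, test_ref = reference[:150], reference[150:]

# Calibrate: regress the reference values on the raw sensor readings.
slope, intercept = fit_linear(train_raw, train_ref)
calibrated = [slope * x + intercept for x in test_raw]

before = evaluate(test_raw, test_ref)
after = evaluate(calibrated, test_ref)

print("uncalibrated:", before)
print("calibrated:  ", after)
```

An RF model could be swapped in for `fit_linear` with the same `evaluate` function, which keeps the comparison between the two approaches on identical held-out data.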

Fang, R., Collingwood, S., Zhang, Y., Stanford, J. B., Porucznik, C., & Sleeth, D. (2025). Optimizing Air Quality Monitoring: Comparative Analysis of Linear Regression and Machine Learning in Low-Cost Sensor Calibration. *Aerosol and Air Quality Research*, *25*(1-4). https://doi.org/10.1007/s44408-025-00009-x