Authors: Yaghoobian S, An J, Jeong DW, Hwang JH
Artificial intelligence (AI) and machine learning (ML) are increasingly integrated into per- and polyfluoroalkyl substances (PFAS) research; however, the field remains fragmented, with substantial variation in modeling objectives. This review provides one of the most comprehensive and detailed syntheses to date of AI/ML methods across the PFAS contamination management pipeline, comparing input features, dataset structure and scale, algorithmic choices, performance metrics, and interpretability strategies reported from 2019 to 2025. At the molecular level, advances in ML-based quantitative structure-activity relationship (QSAR) modeling, physics-informed descriptors, graph learning, transfer learning, and generative modeling for PFAS classification, toxicity screening, and chemical-space expansion are summarized. For PFAS detection and non-target identification, ML frameworks for spectral interpretation are evaluated. In source allocation, supervised and unsupervised models applied to concentration profiles across water, groundwater, and sediments are compared, highlighting how model design depends on the availability of labeled data. ML-driven prediction of PFAS occurrence and risk across diverse aqueous matrices is reviewed, including multilabel, multistage, and semi-supervised frameworks that capture cross-PFAS dependencies. PFAS removal processes are also assessed in terms of the ML models used for predicting removal efficiencies, interpreting mechanistic behavior, and optimizing operational conditions. Across all domains, tree-based ensembles and neural networks achieve superior performance, while uncertainty quantification, classifier chains, transfer learning, and generative models address challenges related to sparse labels, chemical diversity, and analytical limitations.
This review offers a practical reference for researchers and regulators and identifies priority directions for developing robust and generalizable AI/ML frameworks to support PFAS contamination management.
Keywords: Artificial intelligence; Data-driven modeling; Emerging contaminants; Machine learning; Per- and polyfluoroalkyl substances (PFAS); Water and wastewater treatment
PubMed: https://pubmed.ncbi.nlm.nih.gov/41483514/
DOI: 10.1016/j.jhazmat.2025.140934