Synthetic Data for Predictive Maintenance: A Systematic Review and Framework for Industry 4.0 Applications
Walter Nieminen et al.
Abstract
In industrial Predictive Maintenance (PdM), effective data-driven models are often limited by data scarcity, dataset imbalance, and the high cost of collecting failure data. By simulating realistic failure scenarios and enriching model training, synthetic data generation has emerged as a promising strategy to overcome these challenges. This article presents a systematic literature review of 86 peer-reviewed articles published since 2020 that focus on synthetic data applications in medium-to-heavy machinery and industrial processes. Data generation techniques fall into four key categories: (1) data augmentation, (2) generative models, (3) physics-based simulations and hybrid approaches, and (4) feature-based transformations. The review analyzes the strengths, limitations, and adoption trends of each category. Findings reveal that hybrid and physics-informed models are particularly valuable in safety-critical domains, where model transparency and adherence to physical laws are essential and where industrial contexts demand higher reliability and contextual accuracy. To address these needs, the article proposes the Synthetic Data-Enhanced PdM (SD-PdM) framework, a five-phase methodology for integrating synthetic data into maintenance strategies. The framework supports scalable, explainable, and economically viable smart maintenance solutions.
1 citation
Evidence weight
Balanced mode · F 0.40 / M 0.15 / V 0.05 / R 0.40
| Component | Score × weight = contribution |
| --- | --- |
| F · citation impact | 0.16 × 0.40 = 0.06 |
| M · momentum | 0.53 × 0.15 = 0.08 |
| V · venue signal | 0.50 × 0.05 = 0.03 |
| R · text relevance † | 0.50 × 0.40 = 0.20 |
† Text relevance is estimated at 0.50 on the detail page — for your query’s actual relevance score, open this paper from a search result.
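The arithmetic in the rows above can be reproduced with a short sketch. The weights and per-component scores are copied from the page. The page does not say how the four contributions are combined into one figure, so summing them is an assumption made here for illustration, and the names `weights`, `scores`, `contributions`, and `total` are hypothetical.

```python
# Balanced-mode weights, as listed on the page.
weights = {"F": 0.40, "M": 0.15, "V": 0.05, "R": 0.40}

# Per-component scores, as listed on the page.
# R is the 0.50 placeholder estimate noted in the footnote.
scores = {"F": 0.16, "M": 0.53, "V": 0.50, "R": 0.50}

# Contribution of each component: score times weight.
contributions = {key: scores[key] * weights[key] for key in weights}

# ASSUMPTION: the overall evidence weight is the plain sum of the
# contributions. The page shows only the individual rows.
total = sum(contributions.values())

for key, value in contributions.items():
    print(f"{key}: {scores[key]:.2f} x {weights[key]:.2f} = {value:.4f}")
print(f"sum of contributions: {total:.4f}")
```

The unrounded contributions are 0.0640, 0.0795, 0.0250, and 0.2000. Because the page rounds each one to two decimals, the displayed row values add up to slightly more than the unrounded sum.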