Generative Adversarial Network (GAN)-assisted data quality monitoring approach for out-of-distribution detection of high dimensional data
Date
2023-04-18
Author
Slater, Kent
Li, Yuxuan
Shan, Yongwei
Abstract
Data quality monitoring plays a critical role in many real-world engineering system inspection problems. Anomalous or invalid inspection data commonly arise from computer or human recording errors, sensor faults, and similar causes, so an efficient tool for detecting data anomalies is needed. Detection is challenging, however, because inspection data are high dimensional, their underlying distribution is unknown, sample sizes are insufficient, and noise levels are high. To address these challenges, an approach was developed that learns the underlying distribution of normal data and derives anomaly detection rules from it. A Generative Adversarial Network (GAN) is trained on the normal records to learn their underlying distribution and filter out noise. The trained GAN then generates points from the learned distribution, and a k-nearest neighbor-based approach defines the anomaly detection rules: the pairwise distances over all GAN-generated data points are calculated, the k nearest neighbors of every data point are determined, and the average distance from each data point to its k nearest neighbors serves as the statistic that indicates data quality and establishes a control chart. When a new record arrives, its similarity to the GAN-generated distribution is evaluated against the established control chart to determine whether the record is anomalous.
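The k-nearest neighbor step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: a Gaussian sample stands in for the GAN-generated points, the control limit is taken as an empirical quantile of the in-sample statistic (the abstract does not say how the limit is set), and all function names, `k`, and `quantile` are hypothetical.

```python
import numpy as np


def knn_mean_distance_within(points, k):
    """Average distance from each point to its k nearest neighbors among the others."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))   # pairwise Euclidean distances
    np.fill_diagonal(dists, np.inf)              # a point is not its own neighbor
    nearest = np.sort(dists, axis=1)[:, :k]      # k smallest distances per row
    return nearest.mean(axis=1)


def knn_mean_distance_to(reference, record, k):
    """Average distance from one new record to its k nearest reference points."""
    dists = np.sqrt(((reference - record) ** 2).sum(axis=1))
    return np.sort(dists)[:k].mean()


def fit_control_limit(generated, k=5, quantile=0.99):
    """Upper control limit: an empirical quantile of the statistic (an assumption)."""
    stats = knn_mean_distance_within(generated, k)
    return np.quantile(stats, quantile)


def is_anomalous(generated, record, limit, k=5):
    """Flag a record whose kNN statistic exceeds the control limit."""
    return knn_mean_distance_to(generated, record, k) > limit


# Placeholder for GAN output: 500 points in 10 dimensions (assumption).
rng = np.random.default_rng(0)
generated = rng.normal(size=(500, 10))
limit = fit_control_limit(generated, k=5, quantile=0.99)

typical_record = np.zeros(10)        # near the centre of the reference sample
distant_record = np.full(10, 8.0)    # far from every reference point

print(is_anomalous(generated, typical_record, limit))
print(is_anomalous(generated, distant_record, limit))
```

The full pairwise distance matrix grows quadratically with the number of generated points, so a larger reference set would likely call for a tree-based or approximate neighbor search instead.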
Citation
Slater, K., Li, Y., Shan, Y., & Liu, C. (2023, April 18). A Generative Adversarial Network (GAN)-assisted data quality monitoring approach for out-of-distribution detection of high dimensional data. Poster session presented at the Oklahoma State University Undergraduate Research Symposium, Stillwater, OK.