Project Details

The Challenge | Chasers of the Lost Data

Help find ways to improve the performance of machine learning and predictive models by filling in gaps in the datasets prior to model training. This entails finding methods to computationally recover or approximate data that is missing due to sensor issues or signal noise that compromises experimental data collection. This work is inspired by data collection during additive manufacturing (AM) processes where sensors capture build characteristics in-situ, but it has applications across many NASA domains.

Data Imputation with a Generative Adversarial Network for Combined Cycle Power Plant UCI Dataset

Missing data imputation on a Combined Cycle Power Plant dataset using a Generative Adversarial Network.

Galileo

I applied a recent imputation method based on a Generative Adversarial Network. The method is called GAIN and was developed by Jinsung Yoon and co-authors. I tested it on the Combined Cycle Power Plant dataset.

The goal is to predict the net hourly electrical energy output (EP) of the plant.

I used Keras and TensorFlow to evaluate how well the method works.

I took the source dataset and deleted some values at random. Next, I applied the GAIN imputation method to obtain a new dataset in which the missing values are filled with synthetic ones. I then created two regression models, the first trained on the original dataset and the second trained on the dataset with synthetic values, and compared their results.
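The random deletion step described above can be sketched as follows. This is a minimal illustration, not the code from the repository: the function name `mask_at_random`, the `miss_rate` parameter, the convention that 1 marks an observed value and 0 a missing one, and the sample rows are all assumptions made for this example.

```python
import math
import random

def mask_at_random(rows, miss_rate, seed=0):
    """Hide each value independently with probability miss_rate.

    Returns (masked_rows, mask). In the mask, 1 means the value was kept
    and 0 means it was replaced by NaN (a convention assumed here).
    """
    rng = random.Random(seed)  # seeded so the same cells are hidden on every run
    masked_rows, mask = [], []
    for row in rows:
        keep = [0 if rng.random() < miss_rate else 1 for _ in row]
        mask.append(keep)
        masked_rows.append([v if k else math.nan for v, k in zip(row, keep)])
    return masked_rows, mask

# Made-up rows with four feature columns (illustrative values only)
rows = [
    [20.0, 50.0, 1010.0, 70.0],
    [25.0, 60.0, 1015.0, 60.0],
    [10.0, 40.0, 1020.0, 90.0],
]
masked_rows, mask = mask_at_random(rows, miss_rate=0.2, seed=42)
```

Keeping the mask alongside the masked data makes it possible to check later which cells were imputed and to compare the imputed values against the originals.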

I expect the two models to reach nearly the same accuracy. If they give similar MSE and MAE on a test set, this would mean that the GAIN method works well at generating plausible values to fill the gaps in the Combined Cycle Power Plant dataset.
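The comparison described above comes down to computing the mean squared error (MSE) and mean absolute error (MAE) of both models on the same test targets. The sketch below shows that calculation in plain Python. The helper `compare_models` and the numbers passed to it are hypothetical and are not results from this project.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def compare_models(y_true, pred_original, pred_imputed):
    """Score both models against the same test targets."""
    if not (len(y_true) == len(pred_original) == len(pred_imputed)):
        raise ValueError("all inputs must have the same length")
    return {
        "mse_original": mse(y_true, pred_original),
        "mae_original": mae(y_true, pred_original),
        "mse_imputed": mse(y_true, pred_imputed),
        "mae_imputed": mae(y_true, pred_imputed),
    }

# Hypothetical test targets and predictions (not measured results)
y_true = [450.0, 460.0, 470.0]
pred_original = [451.0, 459.0, 472.0]  # model trained on the original data
pred_imputed = [452.0, 458.0, 473.0]   # model trained on the imputed data

report = compare_models(y_true, pred_original, pred_imputed)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

The closer the two MSE values and the two MAE values are to each other, the less the imputation has changed what the regression model can learn from the data.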

Source Code:

https://github.com/victorsergio/nasa-space-challenge-2019

References:

GAIN model:

Date: Jan 29th 2019

Generative Adversarial Imputation Networks (GAIN) Implementation on Spam Dataset

Reference: J. Yoon, J. Jordon, M. van der Schaar, "GAIN: Missing Data Imputation using Generative Adversarial Nets," ICML, 2018.

Paper Link: http://medianetlab.ee.ucla.edu/papers/ICML_GAIN.pd...

Appendix Link: http://medianetlab.ee.ucla.edu/papers/ICML_GAIN_Su...

Contact: jsyoon0823@g.ucla.edu


Dataset:

Pınar Tüfekci, Prediction of full load electrical power output of a base load operated combined cycle power plant using machine learning methods, International Journal of Electrical Power & Energy Systems, Volume 60, September 2014, Pages 126-140, ISSN 0142-0615.

Heysem Kaya, Pınar Tüfekci, Sadık Fikret Gürgen, Local and Global Learning Methods for Predicting Power of a Combined Gas & Steam Turbine, Proceedings of the International Conference on Emerging Trends in Computer and Electronics Engineering ICETCEE 2012, pp. 13-18 (Mar. 2012, Dubai).