Project Details

The Challenge | Chasers of the Lost Data

Help find ways to improve the performance of machine learning and predictive models by filling in gaps in the datasets prior to model training. This entails finding methods to computationally recover or approximate data that is missing due to sensor issues or signal noise that compromises experimental data collection. This work is inspired by data collection during additive manufacturing (AM) processes where sensors capture build characteristics in-situ, but it has applications across many NASA domains.

Bayesian Imputation

MCMC and Bayesian Analysis to impute missing data

Faulty surveys, noise in sensor data, malfunctioning instruments - all result in missing data. You can't make good decisions when data is missing.

The most common methods fill in missing values with the mean, median or mode of the observed data. These preserve the central tendency of the data but reduce its variance, because every gap receives the same value. This is not desirable in many cases.
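The variance loss is easy to see on synthetic data. The sketch below is only an illustration: the Normal(50, 10) "sensor" readings, the 30% missing rate and all variable names are assumptions, not values from the project.

```python
import random
import statistics

random.seed(0)

# Synthetic "sensor" readings; the distribution and sample size are
# illustrative assumptions.
full = [random.gauss(50.0, 10.0) for _ in range(1000)]

# Knock out roughly 30% of the readings at random to simulate missing data.
observed = [x if random.random() > 0.3 else None for x in full]

present = [x for x in observed if x is not None]
mean_present = statistics.fmean(present)

# Mean imputation: every gap receives the same value.
imputed = [mean_present if x is None else x for x in observed]

var_observed = statistics.variance(present)
var_imputed = statistics.variance(imputed)

print(f"mean of observed values:   {mean_present:.2f}")
print(f"mean after imputation:     {statistics.fmean(imputed):.2f}")
print(f"variance of observed:      {var_observed:.2f}")
print(f"variance after imputation: {var_imputed:.2f}")
```

The mean is unchanged, but the imputed values add no spread while increasing the sample count, so the variance of the filled dataset comes out smaller than that of the observed values.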

Our approach is to use Bayesian analysis, with MCMC sampling, to infer the distribution of the missing data. We then draw from that distribution to generate the missing values. Our code for this approach can be found on GitHub.

https://github.com/scimas/imputed_spaceapps
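For readers who want a feel for the idea before opening the repository, here is a minimal sketch of Bayesian imputation with MCMC. It is not the code from the repository. It assumes, purely for illustration, that the data follow a single Normal distribution with unknown mean and standard deviation, with flat priors on the mean and on the log of the standard deviation. The sampler is a plain random-walk Metropolis, and the function names, step sizes and synthetic data are all hypothetical.

```python
import math
import random
import statistics

random.seed(1)


def metropolis(observed, n_iter=6000, burn_in=1000, step_mu=0.5, step_ls=0.05):
    """Random-walk Metropolis over (mu, log sigma) for a Normal model.

    Assumes flat priors on mu and log(sigma); returns a list of
    (mu, sigma) posterior samples collected after burn-in.
    """
    n = len(observed)
    xbar = statistics.fmean(observed)
    ss = sum((x - xbar) ** 2 for x in observed)

    def log_post(mu, ls):
        # Log posterior up to an additive constant.
        sse = ss + n * (mu - xbar) ** 2
        return -n * ls - sse / (2.0 * math.exp(2.0 * ls))

    mu, ls = xbar, math.log(statistics.stdev(observed))
    current = log_post(mu, ls)
    samples = []
    for i in range(n_iter):
        mu_new = mu + random.gauss(0.0, step_mu)
        ls_new = ls + random.gauss(0.0, step_ls)
        proposed = log_post(mu_new, ls_new)
        # Accept with probability min(1, posterior ratio).
        if random.random() < math.exp(min(0.0, proposed - current)):
            mu, ls, current = mu_new, ls_new, proposed
        if i >= burn_in:
            samples.append((mu, math.exp(ls)))
    return samples


def impute(data, samples):
    """Fill each gap with a draw from the posterior predictive distribution."""
    filled = []
    for x in data:
        if x is None:
            mu, sigma = random.choice(samples)  # one posterior draw
            filled.append(random.gauss(mu, sigma))
        else:
            filled.append(x)
    return filled


# Synthetic data with roughly 30% of the values missing (illustrative only).
data = [random.gauss(50.0, 10.0) if random.random() > 0.3 else None
        for _ in range(1000)]
present = [x for x in data if x is not None]

samples = metropolis(present)
filled = impute(data, samples)

print(f"variance of observed values:        {statistics.variance(present):.2f}")
print(f"variance after Bayesian imputation: {statistics.variance(filled):.2f}")
```

Because each gap is filled with its own random draw rather than a single constant, the filled dataset keeps a spread close to that of the observed values. Real sensor data will usually need a richer model than one Normal distribution, for example one that accounts for correlations between variables or over time.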

Presentation