Project Details

The Challenge | Chasers of the Lost Data

Help find ways to improve the performance of machine learning and predictive models by filling in gaps in the datasets prior to model training. This entails finding methods to computationally recover or approximate data that is missing due to sensor issues or signal noise that compromises experimental data collection. This work is inspired by data collection during additive manufacturing (AM) processes where sensors capture build characteristics in-situ, but it has applications across many NASA domains.

Data Imputation

Motivation:

The efficiency and effectiveness of any AI algorithm depend on the quality of its data. High-quality data lets the training algorithm deliver its best results in terms of accuracy.

To produce reliable, complete data, we took on this challenge to apply state-of-the-art techniques to the problem and to explore new methodologies for handling it.

Solution:

The solution revolves around data imputation, using advanced statistical approaches and machine learning algorithms.


Idea:

Create a reliable, precise, and efficient solution that can be readily generalized to handle most kinds of missing data from various sources, so that the data can be used to train any machine learning algorithm.


Resources used from NASA:


Packages used:

  • mice
  • VIM
  • openxlsx
  • softImpute
  • xgboost
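To make the idea of imputation concrete, here is a minimal sketch of low-rank matrix completion in the spirit of the soft-thresholded SVD approach that softImpute is known for. It is not the project's code: the packages listed above appear to be R packages, while this illustration uses Python with NumPy only, and the function name `soft_impute` and the parameters `lam`, `n_iters`, and `tol` are assumptions chosen for the example. Missing values are assumed to be encoded as NaN, and every column is assumed to have at least one observed value.

```python
import numpy as np

def soft_impute(X, lam=0.1, n_iters=200, tol=1e-6):
    """Fill NaN entries of X with values from an iteratively refined
    low-rank approximation (illustrative sketch, hypothetical API)."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)

    # Initial guess: replace each gap with the mean of its column.
    col_means = np.nanmean(X, axis=0)
    Z = np.where(missing, col_means, X)

    for _ in range(n_iters):
        # Shrink the singular values toward zero (soft-thresholding),
        # which favors a low-rank reconstruction.
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        s = np.maximum(s - lam, 0.0)
        low_rank = (U * s) @ Vt

        # Keep observed entries as they are; update only the gaps.
        Z_new = np.where(missing, low_rank, X)

        # Stop once the filled matrix barely changes between iterations.
        change = np.linalg.norm(Z_new - Z) / max(np.linalg.norm(Z), 1e-12)
        Z = Z_new
        if change < tol:
            break
    return Z

# Example: a small sensor-style table with two missing readings.
readings = np.array([
    [1.0, 2.0, 3.0],
    [2.0, np.nan, 6.0],
    [3.0, 6.0, np.nan],
    [4.0, 8.0, 12.0],
])
completed = soft_impute(readings)
print(completed)
```

Observed readings are never overwritten; only the gaps are estimated. A larger `lam` pushes the reconstruction toward lower rank, so in practice it would need to be tuned for the dataset at hand.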


Check out:

GitHub

Description video

Presentation