Project Details

The Challenge | Chasers of the Lost Data

Help find ways to improve the performance of machine learning and predictive models by filling in gaps in the datasets prior to model training. This entails finding methods to computationally recover or approximate data that is missing due to sensor issues or signal noise that compromises experimental data collection. This work is inspired by data collection during additive manufacturing (AM) processes where sensors capture build characteristics in-situ, but it has applications across many NASA domains.

ImpleData

Nowadays, data has become a very important field for finding information that is not visible to the naked eye. Sometimes part of the data is missing; in our challenge, we analyze how to handle those missing values to improve the results of the analysis.

SpaceSWAT

We create methods to approximate the missing data and evaluate each method by building an ML model and describing the improvement in that model’s performance before and after the data recovery method is applied.
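The before/after comparison described above can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual pipeline: the data are synthetic and noise-free, the "model" is a straight-line fit with NumPy, the "before" case replaces gaps with a placeholder value of 0, and the recovery step is a simple linear interpolation over time. The names `fit_and_score`, `x_gappy`, `x_before`, and `x_after` are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between two arrays."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def fit_and_score(x_train, y_train, x_eval, y_eval):
    """Fit a straight-line model on the training data; return RMSE on the evaluation data."""
    slope, intercept = np.polyfit(x_train, y_train, 1)
    return rmse(y_eval, slope * x_eval + intercept)

# Synthetic, noise-free data (an assumption made purely for illustration).
t = np.arange(20, dtype=float)
x_true = 1.0 + 0.5 * t            # sensor reading that drifts over time
y = 2.0 * x_true + 1.0            # quantity the model should predict

x_gappy = x_true.copy()
x_gappy[[3, 8, 14]] = np.nan      # simulated sensor dropouts
known = ~np.isnan(x_gappy)

# "Before": gaps are replaced by a placeholder value of 0.
x_before = np.where(known, x_gappy, 0.0)

# "After": gaps are recovered by linear interpolation over time.
x_after = x_gappy.copy()
x_after[~known] = np.interp(t[~known], t[known], x_gappy[known])

rmse_before = fit_and_score(x_before, y, x_true, y)
rmse_after = fit_and_score(x_after, y, x_true, y)
print(f"RMSE before recovery: {rmse_before:.4f}")
print(f"RMSE after recovery:  {rmse_after:.4f}")
```

On real data, the same harness would be run with a held-out test set and whichever model and recovery method are being compared; the size of the improvement depends on both.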

To accomplish our objective, we used several machine learning models, such as KNN (K-Nearest Neighbors), to fill in geospatial data, using coordinates to establish similar features between geographic areas.
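A minimal sketch of KNN-based filling for geospatial data is shown below. It assumes each record is a coordinate pair with one measured value, uses plain Euclidean distance on the coordinates (a simplification; a great-circle distance would be more faithful over large areas), and averages the k nearest observed values. The function name `knn_impute` and the sample data are hypothetical and not taken from the project repository.

```python
import numpy as np

def knn_impute(coords, values, k=3):
    """Fill NaN entries in `values` with the mean of the k nearest observed points.

    coords: sequence of (x, y) coordinate pairs, one per record.
    values: sequence of measurements, with np.nan marking missing entries.
    """
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    known = ~np.isnan(values)
    filled = values.copy()
    for i in np.flatnonzero(~known):
        # Euclidean distance from the incomplete record to every observed record.
        dist = np.linalg.norm(coords[known] - coords[i], axis=1)
        nearest = np.argsort(dist)[:k]
        filled[i] = values[known][nearest].mean()
    return filled

# Four stations with readings and one station (the last) with a missing reading.
coords = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (0.5, 0.5)]
values = [20.0, 22.0, 21.0, 35.0, np.nan]

filled = knn_impute(coords, values, k=3)
print(filled[4])  # mean of the three closest stations: 21.0
```

The distant station at (10, 10) does not influence the estimate, which is the point of using neighbors rather than a global mean.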

Another method we used is polynomial regression, which estimates a specific value over a period of time.
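As a rough illustration of polynomial regression for this purpose, the sketch below fits a polynomial to the observed points of a series with `numpy.polyfit` and evaluates it at the missing time steps. The degree of 2, the function name `polyfit_impute`, and the synthetic series are assumptions chosen for the example; the degree used in the project is not stated.

```python
import numpy as np

def polyfit_impute(t, y, degree=2):
    """Fill NaN entries in y using a least-squares polynomial fitted to the observed points."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    known = ~np.isnan(y)
    # Fit only on observed samples, then evaluate at the missing time steps.
    coeffs = np.polyfit(t[known], y[known], degree)
    filled = y.copy()
    filled[~known] = np.polyval(coeffs, t[~known])
    return filled

# Synthetic quadratic trend with two missing samples.
t = np.arange(8, dtype=float)
truth = 0.5 * t**2 + 3.0 * t + 2.0
y = truth.copy()
y[[2, 5]] = np.nan

filled = polyfit_impute(t, y, degree=2)
print(filled)
```

A low degree keeps the fit stable; a high degree can oscillate between samples, so the degree is worth validating against held-out points.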

To estimate missing values within time series, two types of interpolation were used: linear interpolation and spline interpolation. This approach tries to find a curve that best fits the actual data.
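The two interpolation approaches can be compared with a short sketch. Linear interpolation uses `numpy.interp`; the spline variant here is a hand-written natural cubic spline (second derivative set to zero at both ends), chosen so the example depends only on NumPy. Which spline type the project used is not stated, so that choice is an assumption, as are the function names and the sample series. Gaps are assumed to lie between observed samples.

```python
import numpy as np

def fill_linear(t, y):
    """Fill NaNs with straight lines drawn between neighbouring observed samples."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    known = ~np.isnan(y)
    filled = y.copy()
    filled[~known] = np.interp(t[~known], t[known], y[known])
    return filled

def fill_natural_cubic_spline(t, y):
    """Fill NaNs with a natural cubic spline through the observed samples."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    known = ~np.isnan(y)
    xk, yk = t[known], y[known]
    n = len(xk)
    h = np.diff(xk)  # spacing between consecutive knots

    # Linear system for the second derivatives m at the knots.
    # Natural boundary conditions: m[0] = m[-1] = 0.
    A = np.zeros((n, n))
    b = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0
    for i in range(1, n - 1):
        A[i, i - 1] = h[i - 1]
        A[i, i] = 2.0 * (h[i - 1] + h[i])
        A[i, i + 1] = h[i]
        b[i] = 6.0 * ((yk[i + 1] - yk[i]) / h[i] - (yk[i] - yk[i - 1]) / h[i - 1])
    m = np.linalg.solve(A, b)

    filled = y.copy()
    for j in np.flatnonzero(~known):
        # Index of the knot interval [xk[i], xk[i + 1]] that contains t[j].
        i = int(np.clip(np.searchsorted(xk, t[j]) - 1, 0, n - 2))
        left, right = t[j] - xk[i], xk[i + 1] - t[j]
        filled[j] = (
            (m[i] * right**3 + m[i + 1] * left**3) / (6.0 * h[i])
            + (yk[i] / h[i] - m[i] * h[i] / 6.0) * right
            + (yk[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left
        )
    return filled

# Samples of y = t^2 with the value at t = 3 missing (true value: 9).
t = np.arange(7, dtype=float)
y = np.array([0.0, 1.0, 4.0, np.nan, 16.0, 25.0, 36.0])

linear = fill_linear(t, y)
spline = fill_natural_cubic_spline(t, y)
print("linear estimate at t=3:", linear[3])  # 10.0, the midpoint of 4 and 16
print("spline estimate at t=3:", spline[3])
```

On this curved series the spline estimate lands closer to the true value than the straight-line estimate. For production use, a library routine such as `scipy.interpolate.CubicSpline` would normally replace the hand-written solver.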

Resources

- Imputation of missing longitudinal data: a comparison of methods. Jean Mundahl Engels, Paula Diehr. Departments of Biostatistics and Health Services, University of Washington, 1959 Northeast Pacific Avenue, Bo

- ST-MVL: Filling Missing Values in Geo-sensory Time Series Data

Project GitHub Repository:

https://github.com/rodemore/SpaceSWAT_project