Project Details

The Challenge | Chasers of the Lost Data

Help find ways to improve the performance of machine learning and predictive models by filling in gaps in the datasets prior to model training. This entails finding methods to computationally recover or approximate data that is missing due to sensor issues or signal noise that compromises experimental data collection. This work is inspired by data collection during additive manufacturing (AM) processes where sensors capture build characteristics in-situ, but it has applications across many NASA domains.

lost_spaghetti

lost_spaghetti is our innovative, computationally light, general-purpose solution for interpolating gaps and identifying noise in data. It uses a running median to judge the quality of the data.

Lost Boys

Our team chose this challenge because we weren't familiar with the subject matter, and Space Apps was a great opportunity to get thrown out of the nest and fly into the world of interpolation and noise! lost_spaghetti is Team Lost Boys' method for estimating gaps and identifying noise in .csv data. The program identifies trends in the data to interpolate missing areas, and uses a spaghetti model to create a running median that helps identify noise. lost_spaghetti is a pure Python method that uses these non-standard open source Python libraries: numpy and sklearn for advanced math functions, pandas to work with csv files, and matplotlib for visual graphs. Desmos was also used to create simpler visuals for our presentation. Sample .csv files were provided by catalog.data.gov; we used the Meteorite Landings and Near-Earth Landings examples to test the method. We selected these csv files because they both include time, which is a traditional choice to plot on the x axis. In the future, we'd like to find more powerful graphing libraries to display our data.
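To give a feel for the two ideas described above (filling gaps, then using a running median to spot noise), here is a minimal sketch using pandas and numpy. It is not the code from our repository: the function name fill_gaps_and_flag_noise, the window size, and the threshold are all hypothetical, and it swaps in plain linear interpolation and a simple rolling median in place of the spaghetti model.

```python
import numpy as np
import pandas as pd


def fill_gaps_and_flag_noise(series, window=5, threshold=3.0):
    """Interpolate missing values, then flag points far from a running median.

    `window` and `threshold` are illustrative defaults (assumptions),
    not values taken from the lost_spaghetti project.
    """
    # Linear interpolation fills NaN gaps from the neighboring values.
    filled = series.interpolate(method="linear", limit_direction="both")

    # Centered running median; min_periods=1 keeps the edges defined.
    running_median = filled.rolling(window, center=True, min_periods=1).median()

    # Distance of each point from the running median.
    residual = (filled - running_median).abs()

    # The median of those distances gives a robust sense of "typical" spread.
    typical = residual.median()
    # Guard against a zero spread (for example, perfectly smooth data).
    scale = typical if typical > 0 else np.finfo(float).eps

    # A point is treated as noise when it sits far outside the typical spread.
    is_noise = residual > threshold * scale

    return pd.DataFrame(
        {"filled": filled, "running_median": running_median, "is_noise": is_noise}
    )


if __name__ == "__main__":
    # Toy data: two gaps (NaN) and one obvious spike (50.0).
    values = pd.Series([1.0, 2.0, np.nan, 4.0, 5.0, 50.0, 7.0, 8.0, np.nan, 10.0])
    print(fill_gaps_and_flag_noise(values))
```

On the toy data, the two gaps are filled from their neighbors and only the spike is flagged. A median is used here instead of a mean because a single wild reading barely moves it, so the spike cannot hide itself by dragging the baseline along.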

https://github.com/Caleb-Shepard/lost_spaghetti/

#machine learning #python #csv #meteorites #spaghetti #data science #planets near and far #chasers of lost data #interpolation jones