Project Details

The Challenge | Chasers of the Lost Data

Help find ways to improve the performance of machine learning and predictive models by filling in gaps in the datasets prior to model training. This entails finding methods to computationally recover or approximate data that is missing due to sensor issues or signal noise that compromises experimental data collection. This work is inspired by data collection during additive manufacturing (AM) processes where sensors capture build characteristics in-situ, but it has applications across many NASA domains.

Asteroid Builders

Our project generates synthetic Meteorite Landings data from NASA's dataset using SMOTE, an algorithm that creates new data points from a small number of existing examples. Using the Unity engine, we simulate both the real landings and the synthetic ones so they can be compared.

Asteroid Builders

We are Asteroid Builders, and our challenge is “Chasers of the Lost Data”. This challenge centers on the problem of missing data and on recreating it with machine learning. Our job is to improve the performance of the machine learning so that the new data it produces is accurate.

The area we focused on is Meteorite Landings. Many meteorite landings go unrecorded because the sites are difficult to reach, and in other cases the meteorites disintegrate or are too small to find. Many meteorites are found in Antarctica because they are easier to spot there, and as a result the data is skewed toward that region.

Our solution uses SMOTE (Synthetic Minority Over-sampling TEchnique), created by Nitesh V. Chawla, Kevin W. Bowyer, Lawrence O. Hall and W. Philip Kegelmeyer. This machine learning technique creates new data from small, underrepresented groups of examples and produces a more balanced dataset.

Our project is divided into two parts: a SMOTE Analysis and an Asteroid Simulator.

The SMOTE Analysis section contains the algorithm we used for the machine learning, written in Python with the numpy, pandas and plotly.graph_objects libraries. There you can find the results of processing NASA's Meteorite Landings data. These results are also shown in pie charts, where you can see how uneven and unbalanced the data is.
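As a rough illustration of what those pie charts summarize, the sketch below computes each category's share of the records. The function name `class_shares`, the region labels and the counts are all hypothetical and made up for this example. They are not taken from NASA's dataset or from the project's code, and the sketch uses only the standard library rather than pandas or plotly.

```python
from collections import Counter


def class_shares(labels):
    """Return each label's fraction of the total (the slices of a pie chart)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


# Hypothetical region labels with made-up counts, not real NASA data.
sample_labels = ["Antarctica"] * 8 + ["Africa"] * 1 + ["Americas"] * 1
shares = class_shares(sample_labels)

# Print the largest slice first.
for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label}: {share:.0%}")
```

A dataset in which one slice dominates like this is the kind of imbalance that SMOTE is meant to reduce.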

After that, it shows the synthetic data that was created, where you can see the new, balanced data.

You can also compare the locations of the real and synthetic landings in the graphs.

SMOTE works by finding the nearest neighbors of each data point in the underrepresented group and then placing new points at random positions between a point and one of its neighbors, until the data is more evenly distributed.
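The interpolation step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual implementation. The function name `smote`, its parameters and the sample coordinates are hypothetical. It picks base points at random instead of looping over every minority sample, and it uses plain Euclidean distance on (latitude, longitude) pairs, which is only an approximation on a sphere. It relies on the standard library alone to stay self-contained.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a base point from the underrepresented group.
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # Find its k nearest neighbours within the same group.
        others = [p for i, p in enumerate(minority) if i != base_index]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Place the new point at a random position on the segment between them.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical (latitude, longitude) pairs, not real landing sites.
minority_points = [(10.0, 20.0), (12.0, 22.0), (11.0, 19.0), (13.0, 21.0)]
new_points = smote(minority_points, n_new=5)
for point in new_points:
    print(point)
```

Because every synthetic point lies on a segment between two existing points, the generated landings stay inside the area already covered by the underrepresented group.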

The second section contains an Asteroid Simulator where you can view NASA's Meteorite Landings data on the Earth alongside the synthetic data produced by SMOTE.

The simulator runs on the Unity engine and was developed by us. It launches different types of meteorites and lands them at the Earth locations provided by NASA's data and by the SMOTE algorithm.

You can find the resources in:

https://drive.google.com/open?id=1s20B7D0UdogUMWG0...

And the project in:

https://github.com/diegofc27/asteroid-builder.git