
In the real world, problems such as sensor faults or signal noise can cause data loss. Missing data strongly affects how a dataset can be described, which is why handling missing data remains a hot topic in data science.
Traditionally, missing data is handled with statistical methods. Dropping the observations that contain missing values is the most intuitive approach, but it may discard crucial values. Replacing NaNs (blanks) with the mean value or with values predicted by a regression equation are two other common methods. However, both have disadvantages, such as smoothing out the data or adding complexity. We therefore chose to try recovering the data with a machine learning (ML) method.
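The three traditional approaches can be sketched in a few lines. This is only an illustration on made-up data, not our project code: the column names `x` and `y` and all values are hypothetical, and the regression step assumes a single predictor column.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical toy data: column "y" has two missing readings.
df = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "y": [2.0, np.nan, 6.0, 8.0, np.nan, 12.0],
})

# 1) Drop every observation that contains a missing value.
dropped = df.dropna()

# 2) Replace NaNs with the mean of the observed values in that column.
mean_filled = df.fillna({"y": df["y"].mean()})

# 3) Replace NaNs with predictions from a regression fitted on complete rows.
known = df[df["y"].notna()]
missing = df[df["y"].isna()]
reg = LinearRegression().fit(known[["x"]], known["y"])
reg_filled = df.copy()
reg_filled.loc[missing.index, "y"] = reg.predict(missing[["x"]])

print(dropped, mean_filled, reg_filled, sep="\n\n")
```

Dropping shrinks the dataset, the mean gives every gap the same value regardless of its neighbours, and the regression needs a separate model for each column with gaps.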
How well a model performs depends on how much data we feed it, and the data we had was insufficient for training a model. As a result, producing training data was the first priority of our work. We then tried different deep neural network (DNN) structures to capture the distribution of the data. The fully connected network, the simplest neural network, surprisingly beat the other architectures in stability and precision.
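One possible way to produce training data is to hide entries of complete rows at random and train a network to restore them. The sketch below is an assumption about how this could look, not a record of our pipeline: the helper `make_training_pairs`, the 20% missing rate, the synthetic data, and the layer sizes are all hypothetical, and scikit-learn's `MLPRegressor` is used here as a stand-in for a fully connected network built in TensorFlow.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_training_pairs(complete, missing_rate=0.2, seed=0):
    """Hide random entries of complete rows; return (inputs, targets, mask)."""
    rng = np.random.default_rng(seed)
    mask = rng.random(complete.shape) < missing_rate   # True = hidden entry
    corrupted = np.where(mask, 0.0, complete)          # hidden entries set to 0
    # The network sees the corrupted values plus a flag for each hidden entry.
    inputs = np.hstack([corrupted, mask.astype(float)])
    return inputs, complete, mask

# Hypothetical complete data: 500 rows of 4 correlated features.
rng = np.random.default_rng(42)
base = rng.normal(size=(500, 1))
complete = np.hstack(
    [base * k + 0.05 * rng.normal(size=(500, 1)) for k in (1.0, 2.0, -1.0, 0.5)]
)

X, y, mask = make_training_pairs(complete)

# Fully connected network: two hidden layers of 32 units each.
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=300, random_state=0)
model.fit(X, y)

recovered = model.predict(X)
error_on_hidden = np.abs(recovered - y)[mask].mean()
print("mean absolute error on hidden entries:", error_on_hidden)
```

Because the masks are generated, any number of training pairs can be drawn from a small set of complete rows by changing `seed`.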
Our aim is to build an app that helps estimate missing data, so additional steps are needed to integrate the model into a mobile phone. Using Google TensorFlow Lite and MTK NeuroPilot helped us reduce the complexity of the system significantly without losing too much precision.
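As a rough sketch of the TensorFlow Lite step, a Keras model can be converted to a compact flat buffer as shown below. The model here is a hypothetical untrained placeholder (8 inputs, 4 outputs), and enabling the default optimizations is an assumption about how the size reduction might be achieved. The NeuroPilot side is not shown.

```python
import tensorflow as tf

# Hypothetical placeholder for the trained fully connected model.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(4),
])

# Convert the Keras model to the TensorFlow Lite flat buffer format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Assumption: let the converter apply its default size/latency optimizations.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_bytes = converter.convert()

print("converted model size in bytes:", len(tflite_bytes))
```

The resulting bytes can be written to a `.tflite` file and bundled with the mobile app.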
https://data.nasa.gov/api/views/gh4g-9sfh/rows.csv?accessType=DOWNLOAD
https://data.nasa.gov/api/views/b67r-rgxc/rows.csv?accessType=DOWNLOAD
https://data.nasa.gov/api/views/mc52-syum/rows.csv?accessType=DOWNLOAD
https://data.nasa.gov/api/views/9ns5-uuif/rows.csv?accessType=DOWNLOAD
Language - Python 3, Java
Libraries - Pandas, Sklearn, IPython, TensorFlow, NumPy
API - NeuroPilot
https://github.com/howardlee1995/NASA-HACKERSON.gi...
https://drive.google.com/open?id=1dy7Wkiv1U1CnLmVB...