Project Details

The Challenge | Chasers of the Lost Data

Help find ways to improve the performance of machine learning and predictive models by filling in gaps in the datasets prior to model training. This entails finding methods to computationally recover or approximate data that is missing due to sensor issues or signal noise that compromises experimental data collection. This work is inspired by data collection during additive manufacturing (AM) processes where sensors capture build characteristics in-situ, but it has applications across many NASA domains.

Data Loss Expecting

We built an algorithm that helps data scientists predict missing features and data. It combines AI algorithms with methods such as k-nearest neighbors (KNN), linear regression, and linear imputation to predict the output.
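To make the KNN part of this idea concrete, here is a minimal sketch of nearest-neighbor imputation in plain Python. It is an illustration only, not the project's actual code: the function name `knn_impute`, the choice of `k`, the distance measure, and the sample rows are all assumptions made for the example.

```python
import math


def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance is the root-mean-square difference over the columns that both
    rows have observed (an assumption made for this sketch).
    """
    filled = [list(r) for r in rows]  # copy so the input is left untouched
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                # A neighbor must be a different row that has column j observed.
                if m == i or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if c != j and row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared)
                )
                candidates.append((dist, other[j]))
            if candidates:
                candidates.sort(key=lambda pair: pair[0])
                nearest = candidates[:k]
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled


# Hypothetical sample: the last row is missing its third feature.
sample = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 11.0],
    [5.0, 6.0, 50.0],
    [1.05, 2.05, None],
]
result = knn_impute(sample, k=2)
```

In this sample the two closest rows to the incomplete one are the first two, so the gap is filled with the mean of their third values. Features on very different scales would normally be normalized first, which this sketch leaves out.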

Chasers
  • Background:

-We had some background in building expert systems, but that approach was old-fashioned and inefficient, so we had to learn new techniques and update our knowledge to keep up with current practice.

-Resources: Hands-On Machine Learning with Scikit-Learn and TensorFlow, the Kaggle website, and MathWorks (MATLAB).

  • Dataset Resources:

https://www.kaggle.com/nasa/asteroid-impacts
http://neo.jpl.nasa.gov (nasa_asteroid)

https://catalog.data.gov/dataset/near-earth-comets-orbital-elements

https://www.kaggle.com/shrutimehta/nasa-asteroids-classification

  • Challenges:

1-Learning new AI algorithms.

2-Learning how to use MATLAB.

3-Reviewing Algebra, Calculus 1, Calculus 2, and Numerical Mathematics.

  • Idea Validation:

1-Data Set Analysis.

2-Data Set Reformation.

3-Choosing Suitable Algorithms.

4-Predict The Missing Feature.

5-Combination Between AI Methods And Algorithms.

6-Predict The Final Output.
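
Steps 1 and 4 above can be sketched in plain Python as follows. This is a hedged illustration under stated assumptions, not the team's implementation: the helper names `missing_counts`, `fit_line`, and `regression_impute`, the use of a single predictor column, and the sample rows are all hypothetical.

```python
def missing_counts(rows):
    """Step 1 (data set analysis): count missing (None) entries per column."""
    n_cols = len(rows[0])
    return [sum(1 for r in rows if r[c] is None) for c in range(n_cols)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept.

    Assumes the x values are not all identical; otherwise sxx is zero.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_impute(rows, x_col, y_col):
    """Step 4 (predict the missing feature): fill y_col from x_col."""
    # Fit only on rows where both columns are observed.
    complete = [(r[x_col], r[y_col]) for r in rows
                if r[x_col] is not None and r[y_col] is not None]
    slope, intercept = fit_line([x for x, _ in complete],
                                [y for _, y in complete])
    filled = [list(r) for r in rows]  # copy so the input is left untouched
    for r in filled:
        if r[y_col] is None and r[x_col] is not None:
            r[y_col] = slope * r[x_col] + intercept
    return filled


# Hypothetical sample where the second column follows y = 2x + 1.
sample = [[1.0, 3.0], [2.0, 5.0], [3.0, None], [4.0, 9.0]]
counts = missing_counts(sample)
completed = regression_impute(sample, x_col=0, y_col=1)
```

Counting the gaps first shows which columns need attention; the regression step then fills a gap from a column that is fully observed. Combining this estimate with a KNN estimate (step 5) could be as simple as averaging the two, though the weighting used in the project is not stated here.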



    GitHub Link: https://github.com/Omarsesa/Chasers
    Presentation Link: https://drive.google.com/open?id=13LpxxpLBJxU-9tMg...