Surface-to-Air (Quality) Mission

    The Challenge

    Your challenge is to integrate NASA data, ground-based air quality data, and citizen science data to create an air quality surface that displays the most accurate data for a location and time. Create algorithms that select or weight the best data from several sources for a specific time and location, and display that information.
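
    One possible reading of "select or weight the best data" is sketched below: each observation is weighted by its assumed measurement uncertainty and by how close it is, in space and time, to the requested location and hour. This is an illustration under stated assumptions, not a prescribed method. The Observation fields, the fuse function, the exponential decay form, and the default length and time scales (length_km, tau_hours) are all hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    source: str   # illustrative label, e.g. "satellite", "reference", "citizen"
    lat: float    # degrees
    lon: float    # degrees
    hour: float   # observation time in hours since an arbitrary epoch
    value: float  # pollutant concentration, e.g. PM2.5 in ug/m3
    sigma: float  # assumed 1-sigma uncertainty, same units as value


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(min(1.0, a)))


def fuse(observations, lat, lon, hour, length_km=10.0, tau_hours=3.0):
    """Return (weighted estimate, normalized weight per source).

    Each weight is inverse variance times an exponential decay in distance
    and in time difference. The decay scales are assumptions to be tuned.
    Returns (None, {}) when no observation carries any weight.
    """
    numerator = 0.0
    total_weight = 0.0
    weight_by_source = {}
    for ob in observations:
        distance = haversine_km(lat, lon, ob.lat, ob.lon)
        lag = abs(ob.hour - hour)
        weight = math.exp(-distance / length_km) * math.exp(-lag / tau_hours) / ob.sigma ** 2
        numerator += weight * ob.value
        total_weight += weight
        weight_by_source[ob.source] = weight_by_source.get(ob.source, 0.0) + weight
    if total_weight == 0.0:
        return None, {}
    shares = {name: w / total_weight for name, w in weight_by_source.items()}
    return numerator / total_weight, shares
```

    Returning the per-source weight shares alongside the estimate keeps the weighting logic visible to users, which supports the transparency goals listed under Potential Considerations.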

    Background

    Air pollution is associated with almost 5 million deaths worldwide each year and harms countless others through health impacts ranging from asthma and diabetes to cardiovascular disease and cancer. The World Bank estimates that air pollution costs the global economy $225 billion in lost labor income and $5 trillion in welfare losses annually.

    Yet real-time, reliable air quality data are not available in most of the world, leaving people with a significant and harmful knowledge gap. Confusion about what air quality data mean is only growing as low-cost sensors enter the market, companies create their own predictive algorithms, more privatized messaging about satellite data reaches mobile applications, and countries and companies present the data in various ways that users may not understand. A transparent model, similar to the U.S. Department of State’s global air quality monitoring system, which has stimulated research and awareness around the world, is needed to automatically synthesize data from the ground to satellites into actionable information for the public.

    While you can focus on any city, pre-packaged data are available for three cities: Addis Ababa, Ethiopia; Delhi, India; and Los Angeles, United States. Find these data in the link for “Sample Data for the Mission” in the Example Resources section of this page.

    Potential Considerations

    Specific solution requirements:

    • Ideally, information would be displayed on a map so people can understand and use it.
    • Provide Application Programming Interfaces (APIs) so people can combine your data with other data.
    • Provide cues for visualization, and documentation about data quality and the logic of the business rules, such as weighting criteria.
    • You must use data from satellites. If open data from official reference monitors, citizen science sensors, and models are available, you must either use them or justify not using them. You can also use open data from regional sensor groups.
    • Allow integration of research and new data streams.
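
    To make the API and documentation requirements above concrete, the sketch below packages one surface estimate as a GeoJSON Feature that carries its own data quality metadata. GeoJSON is only one reasonable choice of format. The function name and every property name (time_utc, pollutant, value, units, source_weights, method) are assumptions made for illustration, not a required schema.

```python
import json


def estimate_to_geojson(lat, lon, time_utc, pollutant, value, units,
                        source_weights, method):
    """Serialize one surface estimate as a GeoJSON Feature string.

    The properties expose the weighting used, so downstream users can
    judge data quality and combine the record with other data.
    """
    feature = {
        "type": "Feature",
        # GeoJSON positions are ordered longitude first, then latitude.
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "time_utc": time_utc,              # ISO 8601 timestamp string
            "pollutant": pollutant,            # e.g. "pm25"
            "value": value,                    # estimated concentration
            "units": units,                    # e.g. "ug/m3"
            "source_weights": source_weights,  # share of each data source
            "method": method,                  # short name of the fusion rule
        },
    }
    return json.dumps(feature, sort_keys=True)
```

    A record like this can be served from any HTTP endpoint and loaded directly by common mapping tools, which also helps with the map display requirement.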

    Other considerations:

    • There may be different approaches for different types of land use (e.g., urban, suburban, desert, forest, other). You can develop air quality surfaces for one or more locations or types of land use.
    • You can include one or more outdoor air pollutants.
    • How can you extrapolate predictive values from intermittent satellite data for areas that have sparse or no ground-based air quality data? How does your forecast fill or handle data gaps?
    • How can you use machine learning or artificial intelligence to enhance predictive and forecast values over time?
    • How can you use data from networks of sensors to understand air quality at a single location?
    • How could you apply land use regression techniques, since land use datasets provide more global coverage?
    • How can you ensure that the methodology for the algorithms and surface is open, transparent, and iterative?
    • How can you enhance public understanding of what air quality data mean and the differences in data quality from different sources?
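
    As one illustration of the land use regression question above, the sketch below fits an ordinary least squares model that relates measured concentrations at monitored sites to land use predictors, then applies it to an unmonitored location. It uses only the Python standard library. The function names and the choice of predictors are hypothetical, and a real model would also need validation against held-out monitors.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear or there are too few sites")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_lur(predictors, concentrations):
    """Fit concentration = b0 + b1*x1 + ... + bk*xk by least squares.

    predictors: one row per monitored site, e.g. (road density, green fraction).
    Returns the coefficients [b0, b1, ..., bk] from the normal equations.
    """
    rows = [[1.0] + list(p) for p in predictors]  # prepend the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, concentrations)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict_lur(coefficients, predictor_row):
    """Predict the concentration at a site from its land use predictors."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], predictor_row))
```

    Because the fitted coefficients are a short list of numbers, they can be published with the surface so that others can inspect and reproduce the predictions.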