Project Details

The Challenge | To Bloom or Not to Bloom

Your challenge is to solve the mystery behind algal blooms! What factors cause blooms in some water bodies but not others, and how can we better predict their occurrence to prevent harm to aquatic and human life?

Algae-rhythm

Our team created a methodology and a tool to support further research into algal blooms. The tool is software written in Java and based on machine learning algorithms.

About Algal blooms:

An algal bloom is a rapid increase or accumulation in the population of algae, such as phytoplankton, in freshwater or seawater, and it often disrupts the ecosystem. Freshwater algal blooms result from an excess of nutrients, which may originate from fertilizers applied to agricultural land. Algae are short-lived, so a bloom leaves behind a high concentration of dead organic matter, which then starts to decay. The decay process consumes oxygen in the water, resulting in hypoxic conditions. Without sufficient oxygen in the water, animals and plants may die off in large numbers. In seawater, besides causing hypoxia, algal blooms may produce neurotoxins lethal to fish, seabirds, sea turtles, and marine mammals. Human illness or death from consuming seafood or drinking water contaminated by toxic algae is also possible.

The severity of this problem can be seen in the news.

List of toxic microalgae: http://www.marinespecies.org/hab/aphia.php?p=taxlist


Our Solution

Our algorithm is based on the ID3 decision tree algorithm (invented by J. Ross Quinlan). We normalize the data by bucketing, and we combine the trees in a random forest, which is an ensemble learning method.
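
To make the two building blocks named above more concrete, here is a minimal Java sketch of bucketing a continuous measurement and scoring a bucketed factor with information gain, the split criterion ID3 uses. It is an illustration under assumptions, not our actual source code: the class name Id3Sketch, the method names, the equal-width bucketing scheme, and the sample values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Id3Sketch {

    // Equal-width bucketing (an assumed scheme): maps a continuous value
    // to a bucket index in [0, buckets - 1].
    static int bucket(double value, double min, double max, int buckets) {
        if (max <= min) return 0; // degenerate range: everything in one bucket
        int index = (int) ((value - min) / (max - min) * buckets);
        // Clamp so that out-of-range values and value == max stay in range.
        return Math.max(0, Math.min(buckets - 1, index));
    }

    // Shannon entropy (in bits) of an array of class labels.
    static double entropy(int[] labels) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int label : labels) counts.merge(label, 1, Integer::sum);
        double h = 0.0;
        for (int count : counts.values()) {
            double p = (double) count / labels.length;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Information gain obtained by splitting `labels` on the bucketed `feature`.
    static double informationGain(int[] feature, int[] labels) {
        Map<Integer, List<Integer>> groups = new HashMap<>();
        for (int i = 0; i < feature.length; i++) {
            groups.computeIfAbsent(feature[i], k -> new ArrayList<>()).add(labels[i]);
        }
        double remainder = 0.0;
        for (List<Integer> group : groups.values()) {
            int[] subset = group.stream().mapToInt(Integer::intValue).toArray();
            remainder += (double) subset.length / labels.length * entropy(subset);
        }
        return entropy(labels) - remainder;
    }

    public static void main(String[] args) {
        // Hypothetical sample values, not taken from the project's dataset.
        double[] waterTemp = {12.1, 13.4, 18.9, 19.5, 14.0, 20.2};
        int[] bloom = {0, 0, 1, 1, 0, 1}; // 1 = high cell count, 0 = low
        int[] tempBucket = new int[waterTemp.length];
        for (int i = 0; i < waterTemp.length; i++) {
            tempBucket[i] = bucket(waterTemp[i], 10.0, 22.0, 3);
        }
        System.out.println("Information gain of bucketed temperature: "
                + informationGain(tempBucket, bloom));
    }
}
```

At each node, ID3 picks the factor with the highest information gain, which is why factors near the top of a tree can be read as the most informative ones.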

Technically, our tool can predict the size of toxic phytoplankton populations from the many different factors that affect water quality. It then visualizes, in the form of decision trees (with the help of an API), which factors matter most for that prediction. It also reports the accuracy of these decision trees as a percentage.
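
As a rough illustration of how a random forest turns several decision trees into a single prediction, the Java sketch below takes a majority vote over the trees. Everything in it is an assumption made for the example: the class name ForestVoteSketch, the modeling of a tree as a function from a bucketed row to a class label, the three toy trees, and the tie-breaking rule. It does not reproduce our tool's trees.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ForestVoteSketch {

    // Majority vote: each tree maps a bucketed feature row to a class label,
    // and the label predicted by the most trees wins.
    static int predict(List<Function<int[], Integer>> trees, int[] row) {
        Map<Integer, Integer> votes = new HashMap<>();
        for (Function<int[], Integer> tree : trees) {
            votes.merge(tree.apply(row), 1, Integer::sum);
        }
        int best = -1;
        int bestVotes = -1;
        for (Map.Entry<Integer, Integer> e : votes.entrySet()) {
            int label = e.getKey();
            int count = e.getValue();
            // Ties are broken in favor of the smaller label, for determinism.
            if (count > bestVotes || (count == bestVotes && label < best)) {
                best = label;
                bestVotes = count;
            }
        }
        return best;
    }

    // Three hypothetical hand-written "trees" over a row of
    // {temperature bucket, nutrient bucket}; real trees would be learned.
    static List<Function<int[], Integer>> demoForest() {
        List<Function<int[], Integer>> trees = new ArrayList<>();
        trees.add(row -> row[0] >= 2 ? 1 : 0); // splits on temperature only
        trees.add(row -> row[1] >= 1 ? 1 : 0); // splits on nutrients only
        trees.add(row -> (row[0] >= 1 && row[1] >= 1) ? 1 : 0);
        return trees;
    }

    public static void main(String[] args) {
        int[] warmAndNutrientRich = {2, 1};
        System.out.println("Predicted class: " + predict(demoForest(), warmAndNutrientRich));
    }
}
```

Because each tree in a forest is typically trained on a different sample of rows and factors, the vote tends to be more stable than any single tree.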

We believe that analyzing data with our algorithm can reveal links between factors that have not been found before and that merit further research, especially if we regularly add new variables, even from data previously thought to be unrelated. This way, human actions that create the ideal environment for algal blooms can be detected more easily and therefore prevented. As a result, our ecosystem can be preserved more effectively.


For example, using our software written in Java, we found a direct link:

Between Akashiwo sanguinea and Phaeophytin
https://drive.google.com/file/d/1rFfIyij0Mfrjid4VD...

Between Prorocentrum spp. and Water Temperature
https://drive.google.com/file/d/1rFfIyij0Mfrjid4VD...

And between Alexandrium spp. and Water Temperature
https://drive.google.com/file/d/1K7hIWuHfjZoRbc5Kp...

The dataset contained 1930 rows and 22 columns.
The algorithm learned from one part of the dataset and predicted the other part with an accuracy greater than 80%.
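
The evaluation described above can be sketched in Java as a holdout split followed by an accuracy calculation. This is a minimal sketch under assumptions: the class name HoldoutSketch, the 80/20 split, the fixed shuffle seed, and the sample labels are all hypothetical, since the actual split we used is not specified here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class HoldoutSketch {

    // Shuffles the row indices with a fixed seed and returns two lists:
    // element 0 holds the training indices, element 1 the held-out test indices.
    static List<List<Integer>> split(int rowCount, double trainFraction, long seed) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) indices.add(i);
        Collections.shuffle(indices, new Random(seed));
        int cut = (int) Math.round(rowCount * trainFraction);
        List<List<Integer>> parts = new ArrayList<>();
        parts.add(new ArrayList<>(indices.subList(0, cut)));
        parts.add(new ArrayList<>(indices.subList(cut, rowCount)));
        return parts;
    }

    // Fraction of held-out rows whose predicted label equals the true label.
    static double accuracy(int[] predicted, int[] actual) {
        if (predicted.length != actual.length) {
            throw new IllegalArgumentException("Arrays must have the same length");
        }
        if (actual.length == 0) return 0.0;
        int correct = 0;
        for (int i = 0; i < actual.length; i++) {
            if (predicted[i] == actual[i]) correct++;
        }
        return (double) correct / actual.length;
    }

    public static void main(String[] args) {
        // 1930 rows as in the dataset; the 0.8 train fraction is an assumption.
        List<List<Integer>> parts = split(1930, 0.8, 42L);
        System.out.println("Training rows: " + parts.get(0).size());
        System.out.println("Test rows: " + parts.get(1).size());

        // Hypothetical predictions and true labels for five held-out rows.
        int[] predicted = {1, 0, 1, 1, 0};
        int[] actual = {1, 0, 0, 1, 0};
        System.out.println("Accuracy: " + accuracy(predicted, actual));
    }
}
```

Keeping the test rows out of training matters here: accuracy measured on rows the trees have already seen would overstate how well the model predicts new samples.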

The factors we tested are:

  • Seasonality, Water Temperature (°C), Location, Volume Settled for counting (mL), Depth (m)
  • Chlorophyll (mg/m3), Chlorophyll-a (mg/m3), Chlorophyll-b (mg/m3), Phaeophytin (mg/m3), Phaeophytin1 (mg/m3), Phaeophytin2 (mg/m3), Domoic Acid (ng/mL)
  • Silicate (uM), Nitrate (uM), Nitrite (uM), Ammonia (uM), Phosphate (uM)
  • Akashiwo sanguinea (cells/L), Alexandrium spp. (cells/L), Dinophysis spp. (cells/L), Lingulodinium polyedrum (cells/L), Other Diatoms (cells/L), Other Dinoflagellates (cells/L), Prorocentrum spp. (cells/L), Pseudo-nitzschia delicatissima group (cells/L), Pseudo-nitzschia seriata group (cells/L)

https://drive.google.com/file/d/1bDLfAGX9J-fahZpBl...

The dataset used in our algorithm included measurements from Cal Poly Pier, Santa Cruz Wharf, and Scripps Pier on the California coast.


Datasets retrieved from:

http://marine.copernicus.eu/services-portfolio/access-to-products/?option=com_csw&view=details&product_id=INSITU_GLO_NRT_OBSERVATIONS_013_030
https://neo.sci.gsfc.nasa.gov/view.php?datasetId=MY1DMM_CHLORA
http://www.sccoos.org/query/?project=Harmful%20Algal%20Blooms&study[]=Scripps%20Pier
https://neo.sci.gsfc.nasa.gov/view.php?datasetId=MYD28M
https://modis.gsfc.nasa.gov/data/dataprod/chlor_a.ph
https://open.canada.ca/data/en/dataset/ccbffcfc-d81d-438a-ab1c-08df0e7525d8


Other references:

https://dblp.uni-trier.de/pers/hd/q/Quinlan:J=_Ros...

https://earth.esa.int/web/guest/missions/esa-opera...

https://www.epa.gov/enviroatlas/forms/enviroatlas-...

https://www.epa.gov/water-research/cyanobacteria-a...

https://www.nasa.gov/feature/goddard/2019/nasa-hel...

https://earthdata.nasa.gov/learn/sensing-our-plane...

https://earthobservatory.nasa.gov/images/88311/blo...

https://myfwc.com/research/redtide/general/cyanobacteria/