Marine Debris: Finding the Plastic Needles

Given that a large haystack can contain over 7,000 cubic feet of hay, the aptness of the needle-in-the-haystack cliche becomes clear. Now imagine finding that proverbial needle when searching across the 139.7 million square miles of ocean that covers our planet. This was the task facing the IMPACT team working on an automated marine debris detection solution using machine learning.

An estimated eight million tons of plastic waste enters the ocean every year due to a lack of proper waste management. Runoff from land and river outflows lead to the accumulation of debris in coastal areas, while ocean currents are the main drivers of debris transport in the open ocean (Jambeck et al. 2015). Ocean plastic pollution has a devastating impact on marine biodiversity. Marine debris is the collective term for this human-created litter, which has been accidentally or intentionally released into the oceans. Once seaborne, however, the debris is usually intermixed with natural materials or vegetation such as algae, seaweed, and driftwood, which makes it harder to pick out.

“Marine Debris” NOAA’s National Ocean Service (CC BY 2.0)

Applying the robust internal feature selection of artificial intelligence (AI) and the abstraction powering deep learning to Earth observation data can revolutionize the detection of marine debris and support ongoing efforts to clean the world’s oceans. This combination can inform targeted monitoring and cleanup efforts and contribute to scientific research on marine debris transport dynamics.

Current approaches to detecting marine debris rely mostly on lower-resolution satellite imagery (which is very coarse) or aerial data (which is expensive to acquire). However, the high spatio-temporal resolution and broad coverage of small satellite imagery make it well suited to monitoring marine debris, where targets are influenced by many shifting variables such as currents, weather, and human activities. In addition, the spatial resolution of small satellite imagery is high in comparison to imagery from most open access satellites. This difference in spatial resolution may allow smaller accumulations of ocean plastic to be detected than would otherwise be feasible.

The IMPACT team first conducted an extensive literature review of marine debris and plastic detection to locate validated marine debris events. The locations and timestamps of these validated observations served as the reference for where to search for overlapping imagery. The observations came mainly from around the Bay Islands in Honduras, augmented with observations from Accra, Ghana and Mytilene, Greece (Kikaki et al. 2020, Biermann et al. 2020, Topouzelis et al. 2019).

Next, they searched the 3-meter Planet imagery archive for the dates reported in the literature and manually verified the presence of marine debris to create the labeled dataset. The combination of moderately high spatial resolution, high temporal resolution, a near-infrared channel (a part of the electromagnetic spectrum in which plastic reflects strongly), and global coverage of coastlines made this imagery advantageous for the purposes of this research.

After identifying an imagery dataset, the next step was to take inventory of the features that could feasibly be detected given the specifications of the imagery. With PlanetScope, the team anticipated the model would be capable of detecting aggregated debris flotsam as well as some mega plastics, including medium- to large-sized ghost (i.e., lost or abandoned) fishing nets. In the end, the custom labeled dataset consisted of 1,370 polygons containing marine debris. The labeled portion of each PlanetScope scene was divided into smaller square (256 x 256) tiles, which were used for training an object detection model.
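The tiling step described above can be sketched in a few lines of Python. This is only an illustration, not the team's code: it assumes the 256 x 256 tile size is measured in pixels, that labels are polygons in scene pixel coordinates, and that each polygon is reduced to its enclosing bounding box. The function names (polygon_to_box, tile_origins, clip_box_to_tile) and the sample polygon are hypothetical.

```python
from typing import List, Optional, Tuple

TILE_SIZE = 256  # tile edge length; assumed here to be in pixels

Point = Tuple[float, float]              # (x, y) in scene pixel coordinates
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def polygon_to_box(polygon: List[Point]) -> Box:
    """Smallest axis-aligned box that encloses a labeled polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def tile_origins(width: int, height: int, tile: int = TILE_SIZE) -> List[Tuple[int, int]]:
    """Top-left corners of every full tile that fits in a width x height scene."""
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]


def clip_box_to_tile(box: Box, origin: Tuple[int, int], tile: int = TILE_SIZE) -> Optional[Box]:
    """Express a scene-level box in tile-local coordinates, or None if it misses the tile."""
    ox, oy = origin
    xmin = max(box[0], ox) - ox
    ymin = max(box[1], oy) - oy
    xmax = min(box[2], ox + tile) - ox
    ymax = min(box[3], oy + tile) - oy
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


# A made-up debris polygon that straddles the boundary between two tiles.
debris = [(250, 10), (270, 12), (265, 40), (248, 35)]
box = polygon_to_box(debris)
origins = tile_origins(width=512, height=256)
per_tile = [clip_box_to_tile(box, o) for o in origins]
```

A label that crosses a tile edge ends up as one partial box in each tile it touches, which is one reason small or elongated debris patches can be harder to learn from tiled data.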

Labeled marine debris training data

To achieve automated detection of marine debris within imagery, the team needed a computer vision model capable of learning the characteristic features of marine debris in imagery captured over a marine environment. For this, they developed a deep-learning object detection model. This model requires imagery and spatial annotations for marine debris in the form of bounding boxes. During training, the model examines the imagery and annotations, learns what is and is not valid marine debris, and proposes candidates based on what it has learned. After training is complete (i.e., when the model’s error has been driven as close to zero as is practical), the model can be applied to any imagery with the same specifications to infer or predict instances of marine debris. The predicted candidates, which are bounding boxes just like the training annotations, can then be used to locate, quantify, and extract approximate measurements of marine debris in a scalable manner over time and space.
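As a rough illustration of how predicted bounding boxes could yield approximate measurements, the sketch below converts pixel boxes to meters using the roughly 3-meter pixel size of the imagery. It is not taken from the project: the Detection class, the confidence scores, the 0.5 score cutoff, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

GSD_METERS = 3.0  # approximate ground size of one pixel, per the 3-meter imagery


@dataclass
class Detection:
    box: Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax) in pixels
    score: float                            # hypothetical model confidence in [0, 1]


def box_footprint(box: Tuple[float, float, float, float],
                  gsd: float = GSD_METERS) -> Tuple[float, float, float]:
    """Width (m), height (m), and area (m^2) of the ground covered by a pixel box."""
    width_m = (box[2] - box[0]) * gsd
    height_m = (box[3] - box[1]) * gsd
    return width_m, height_m, width_m * height_m


def summarize(detections: List[Detection], min_score: float = 0.5) -> Tuple[int, float]:
    """Count detections at or above min_score and total their box area in m^2."""
    kept = [d for d in detections if d.score >= min_score]
    total_area = sum(box_footprint(d.box)[2] for d in kept)
    return len(kept), total_area


# Two made-up predictions: one confident, one below the cutoff.
detections = [
    Detection(box=(0, 10, 14, 40), score=0.9),
    Detection(box=(100, 100, 110, 105), score=0.3),
]
count, area_m2 = summarize(detections)
```

Because a bounding box encloses the debris rather than tracing it, the area computed this way is an upper bound on the true extent, which is why such figures are best treated as approximate.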

Automated marine debris predictions

The performance of the model was evaluated by computing intersection over union (IoU), precision, recall, and F1 scores of predicted bounding boxes relative to ground truth bounding boxes on a held-out test dataset. A final F1 score of 0.85 was obtained at an IoU threshold of 0.5. This score indicates that the model performed fairly well in detecting floating marine debris.
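For readers unfamiliar with these metrics, the sketch below shows one common way to compute them. It is an illustration under stated assumptions rather than the team's evaluation code: it assumes boxes are (xmin, ymin, xmax, ymax) tuples, that each ground truth box can be matched by at most one prediction, and that predictions are matched greedily in the order given. The sample boxes are invented.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection area divided by union area of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def evaluate(preds: List[Box], truths: List[Box],
             threshold: float = 0.5) -> Tuple[float, float, float]:
    """Precision, recall, and F1 with greedy one-to-one matching at an IoU threshold."""
    unmatched = list(truths)
    true_pos = 0
    for p in preds:
        # Pick the best remaining ground truth box for this prediction.
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= threshold:
            true_pos += 1
            unmatched.remove(best)
    false_pos = len(preds) - true_pos
    false_neg = len(unmatched)
    precision = true_pos / (true_pos + false_pos) if preds else 0.0
    recall = true_pos / (true_pos + false_neg) if truths else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1


# Made-up example: one good prediction, one false alarm, one missed debris patch.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(0, 0, 10, 8), (50, 50, 60, 60)]
precision, recall, f1 = evaluate(preds, truths, threshold=0.5)
```

In this toy case one of two predictions is correct and one of two ground truth boxes is found, so precision, recall, and F1 all equal 0.5. Raising the IoU threshold makes matching stricter and generally lowers all three scores.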

IMPACT team member Ankur Shah explains the broader applicability of this machine learning effort:

The primary significance of this method is that it is replicable for identifying any Earth science feature as long as there is sufficient labeled data for training the model. PlanetScope imagery has a spatial resolution of approximately 3 meters, making it ideal to detect relatively small features such as marine debris, buildings, roads, airplanes, etc. The same workflow can be repeated for object detection problems.

This machine learning solution combines multiple open source tools to produce the overall workflow. For image annotation, the solution uses ImageLabeler, which was developed and published by NASA IMPACT. For imagery tiling and bounding box annotation encoding, the solution uses Label Maker, which was developed and published by Development Seed. Data serialization, modeling, and evaluation are handled by TensorFlow, which was developed and published by Google.

Beyond the technical accomplishments, this effort has implications for the future of the planet, as team member Lilly describes:

I have spent much of my life in and around the ocean, surfing and ocean lifeguarding in southern California. As a significant component of my life and environment, our oceans are something that personally I care deeply for. I have hoped to conduct research intersecting remote sensing of the ocean and computer vision for a very long time, so I am grateful to be fulfilling that passion now. Someday I hope the results we derive will help propel solutions to a cleaner environment.

More information about IMPACT can be found at NASA Earthdata and the IMPACT project website.