IMPACT Datasets Enhance Student Learning
When planning the curriculum for the Data Analytics course at Rensselaer Polytechnic Institute (RPI) in the fall of 2020, lecturer Thilanka Munasinghe knew that access to real-world datasets was critical for maximizing student learning. Through conversations with the late RPI professor Peter Fox, then the director of the Information Technology and Web Science program, Munasinghe learned about the Earth observation datasets being generated by IMPACT researchers. He reached out to IMPACT project manager Rahul Ramachandran and requested access to datasets his students could use to practice applying advanced data analysis techniques. Ramachandran and his team welcomed this collaborative opportunity and provided the students with several datasets.
Because they harbor an abundance of data, satellite image collections provide a natural opportunity to use machine learning (ML) models to identify individual events and patterns. In particular, students sought to harness the deep learning mechanisms of neural networks and other image processing methods to classify images into categories. They used IMPACT-provided ML training datasets to select and study a variety of meteorological and atmospheric phenomena including cloud formations, dust deposition, and cyclones. The datasets were created using the ImageLabeler tool developed by IMPACT to streamline how Earth science phenomena are identified in images.
Munasinghe’s students capitalized on the chance to work with real data. Throughout the semester they trained and tested ML models and analyzed their relative success. Several student research projects generated publication-worthy results that were included in proceedings from the Institute of Electrical and Electronics Engineers (IEEE) International Conferences on Big Data in 2020 and 2021.
Student Ariane Maharaj used images with hand-curated labels created and stored in ImageLabeler as the basis for her project. IMPACT researchers originally sourced the data from NASA’s Global Imagery Browse Services (GIBS) using the Web Map Tile Service (WMTS). Studying the images allowed Maharaj to assess how high-latitude dust in parts of Alaska, Iceland, and Patagonia affects precipitation quantity and frequency. She combined Moderate Resolution Imaging Spectroradiometer (MODIS) images with the data received from IMPACT and with NASA Global Precipitation Measurement Mission (GPM) data. Her correlation and decision tree analyses indicated a possible negative correlation between dust events and precipitation.
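For readers curious how imagery is addressed through WMTS, the following is a minimal sketch of building a REST-style tile URL for GIBS. It is not the IMPACT team's retrieval code. The endpoint, the layer name, the `250m` tile matrix set, the date, and the tile indices are assumptions based on the public GIBS URL pattern, and the function name `gibs_tile_url` is hypothetical.

```python
from datetime import date

# Assumed public GIBS WMTS REST endpoint (geographic projection, "best" imagery).
GIBS_WMTS_BASE = "https://gibs.earthdata.nasa.gov/wmts/epsg4326/best"


def gibs_tile_url(layer: str, day: date, matrix_set: str,
                  zoom: int, row: int, col: int, ext: str = "jpg") -> str:
    """Build a REST-style WMTS tile URL for one layer, date, and tile index."""
    if min(zoom, row, col) < 0:
        raise ValueError("zoom, row, and col must be non-negative")
    return (f"{GIBS_WMTS_BASE}/{layer}/default/{day.isoformat()}/"
            f"{matrix_set}/{zoom}/{row}/{col}.{ext}")


# Illustrative values only: a MODIS Terra true-color layer on an arbitrary date.
example_url = gibs_tile_url(
    "MODIS_Terra_CorrectedReflectance_TrueColor",
    date(2020, 9, 1), "250m", 2, 1, 3)
print(example_url)
```

The sketch only assembles the URL string; fetching the tile and checking which layers, dates, and tile indices are actually available are left to the reader.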
A student group that included Sharmad Joshi, Jessie Ann Owens, and Shlok Shah chose to work with IMPACT’s cloud street satellite image dataset. Cloud streets are distinctive rows of cumulus clouds that form parallel to the local wind direction. To detect cloud streets in the image data, the students tested several deep learning convolutional neural network (CNN) models. Having achieved a maximum detection accuracy of over 80%, they concluded that CNNs may be a valuable tool for identifying cloud street formations. Gregory Saini and Hefu (Kevin) Pan studied a similar phenomenon called transverse cirrus band (TCB) clouds, which appear at higher altitudes and often indicate turbulent atmospheric conditions. They compared the performance of CNN models trained with original, unaltered images against models trained with modified, grayscale images and found that TCB detection accuracy was best with the original color images.
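The article does not say how the grayscale versions of the images were produced. As a rough illustration of what such a conversion discards, the sketch below applies the common ITU-R BT.601 luma weights to a tiny hand-made grid of RGB pixels. The weights, the function name `to_grayscale`, and the sample pixels are assumptions chosen for illustration, not the students' method or data.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # 8-bit (R, G, B)


def to_grayscale(image: List[List[Pixel]]) -> List[List[int]]:
    """Convert rows of RGB pixels to single-channel luma (BT.601 weights)."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in image
    ]


# Hand-made 2x2 test image: pure red, green, blue, and white pixels.
tiny = [[(255, 0, 0), (0, 255, 0)],
        [(0, 0, 255), (255, 255, 255)]]
gray = to_grayscale(tiny)
print(gray)
```

Each pixel's three color channels collapse into one brightness value, so a model trained on the grayscale input sees intensity but no hue. That loss of information is one plausible reason color images could perform better, though the article itself does not state a cause.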
Another student, Brendan Donnelly, opted to use image files provided by IMPACT from NASA Worldview to examine cyclone data. He also implemented CNN models, achieving up to 75% accuracy in determining whether true cyclones appeared in the images. Classmate Chau-Lin Charly Huang presented a literature review of advances in using neural network-based object detection to identify smoke in GOES-16 satellite images. His further research will include analyzing images from NASA’s Aqua and Terra satellites.
When asked how students benefited from access to IMPACT data, Munasinghe expressed that using the datasets augmented their learning:
“Our RPI Data Analytics course students benefited significantly from using NASA IMPACT datasets during their class projects. Students were able to use datasets that were related to real-world problems. The IMPACT team provided background information on how to use the NASA datasets for students. The guidance and mentorship from the IMPACT team helped our students tremendously to do their class projects effectively. Collaboration with the IMPACT team provided a new pedagogical environment within the classroom for our students to thrive and excel in their studies.”
NASA’s ImageLabeler tool can be accessed on the ImageLabeler website.