Greetings from the Big Geodata Newsletter!
In this issue you will find information on recent foundational models leveraging EO datasets, a new meta-learning framework, a global 30 m land-cover product, and an interesting challenge on crop yield prediction!
F. Campomanes (Enzo), L. Trento Oliveira (Lorraine), Dr. Mariana Belgiu, Dr. Monika Kuffer, and Dr. Anne M. Dijkstra share their experience in using our Geospatial Computing Platform for Efficient Training of Slum Mapping Models using Data Shapley Approximations under the SPACE4ALL project at the ITC Faculty. Don't miss the Big Geodata Story!
Happy reading!
You can access the previous issues of the newsletter on our web portal. If you find the newsletter useful, please share the subscription link below with your network.
Foundational Models in Weather and Climate
Image credits: IBM Research, 2024
Building new AI models to take advantage of the vast EO datasets available usually takes a long time and is also constrained by data storage availability. IBM Research predicts that the next wave in AI will replace task-specific models with Foundational Models (FMs) that are pre-trained on a broad set of unlabeled data and can then be adapted to specific applications with minimal fine-tuning. NASA and IBM Research collaborated on one such model last year: the Prithvi Geospatial FM. It is trained on the 250,000 TB NASA Harmonized Landsat and Sentinel-2 (HLS) dataset on the IBM Cloud Vela supercomputer. The model is a self-supervised Vision Transformer that leverages both spatial and temporal attention mechanisms to process satellite images, and it can be fine-tuned for diverse applications. The Prithvi-Weather-Climate FM is a recent update from this collaboration. Using 40 years of data from NASA's Modern-Era Retrospective analysis for Research and Applications, Version 2 (MERRA-2), the model aims to aid applications such as storm tracking, forecasting, and historical analysis. It is said to better represent small-scale physical processes in numerical weather and climate models, and to generate targeted forecasts from hyper-localized observations, such as those at wind turbine locations.
The Prithvi Geospatial FM is available on Hugging Face with examples of fine-tuned applications in flood mapping and fire hazard detection. The Prithvi-Weather-Climate FM is yet to be made public.
Meta-Learning Improves Earth Observation Analysis
Image credits: Rußwurm et al., 2024
METEOR is an innovative meta-learning framework designed to handle a broad spectrum of Earth Observation challenges. This transfer learning methodology uses model-agnostic meta-learning (MAML) to create problem-specific neural networks from a global meta-model with minimal labeled data. Key innovations include replacing transductive batch normalization with instance normalization, ensembling binary classifiers to handle varying numbers of classes, and dynamically adjusting input channels for different spectral bands. These modifications enable the meta-model to address diverse remote sensing tasks, from urban analysis to deforestation mapping. METEOR is said to outperform existing methods such as SparseMAML and TaskNorm-I on realistic, class-imbalanced problems. It is offered as a versatile, open-source tool for Earth science applications, with potential for tackling critical environmental challenges.
For detailed insights, read the full article here or check out this interesting talk on METEOR. To try it out yourself, head over to this repo.
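To illustrate why swapping transductive batch normalization for instance normalization matters, here is a minimal pure-Python sketch. It is not METEOR's code: the function names and the tiny single-channel "tiles" are made up for illustration. It only shows that instance normalization uses statistics from one sample, whereas batch statistics make a sample's output depend on whatever else is in the batch.

```python
from statistics import fmean, pvariance


def instance_norm(sample, eps=1e-5):
    """Normalize each channel of ONE sample using only that sample's pixels."""
    normalized = []
    for channel in sample:
        mean = fmean(channel)
        std = (pvariance(channel) + eps) ** 0.5
        normalized.append([(v - mean) / std for v in channel])
    return normalized


def transductive_batch_norm(batch, eps=1e-5):
    """Normalize each channel with statistics pooled over the whole batch."""
    outputs = [[] for _ in batch]
    for c in range(len(batch[0])):
        pooled = [v for sample in batch for v in sample[c]]
        mean = fmean(pooled)
        std = (pvariance(pooled) + eps) ** 0.5
        for i, sample in enumerate(batch):
            outputs[i].append([(v - mean) / std for v in sample[c]])
    return outputs


# Made-up single-channel "images", flattened to four pixels each.
tile_a = [[1.0, 2.0, 3.0, 4.0]]
tile_b = [[10.0, 20.0, 30.0, 40.0]]
tile_c = [[100.0, 200.0, 300.0, 400.0]]

# Instance normalization: tile_a's result never depends on its batch mates.
a_alone = instance_norm(tile_a)

# Batch statistics: the result for tile_a shifts with the rest of the batch.
a_with_b = transductive_batch_norm([tile_a, tile_b])[0]
a_with_c = transductive_batch_norm([tile_a, tile_c])[0]
```

In this toy setup `a_with_b` and `a_with_c` differ even though `tile_a` is identical in both batches, while `a_alone` is the same wherever `tile_a` appears. That independence is helpful when a task supplies only a handful of labeled samples.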
The FutureCrop Challenge
Image credits: AgML, 2024
Can recent historical data help forecast the impacts of climate change on agriculture? The FutureCrop Challenge seeks to answer this by inviting participants to predict maize and wheat yields based on soil and weather data under a high-emissions scenario. Participants use data from 1980 to 2020 to forecast yields from 2021 to 2100. Input data is provided as two-dimensional tables in which each row corresponds to a unique combination of latitude, longitude, year and crop. Since actual future yields are unknown, the challenge uses simulated outputs from a validated crop model, providing a realistic basis for testing predictive models against potential future scenarios.
Deadline: 7 September 2024. Don't miss the rules of the competition here. For more details and to participate, visit FutureCrop Challenge.
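As a rough sketch of the data layout described above, the snippet below reads a small table keyed by latitude, longitude, year and crop, and separates the historical period from the years to be forecast. The column names, the `mean_temp` feature and all values are assumptions made up for illustration; the actual challenge files may be organised differently.

```python
import csv
import io

TRAIN_YEARS = range(1980, 2021)     # 1980..2020 inclusive
FORECAST_YEARS = range(2021, 2101)  # 2021..2100 inclusive

# Made-up rows; the column names are assumptions, not the challenge schema.
SAMPLE_CSV = """lat,lon,year,crop,mean_temp
52.25,6.75,1980,maize,9.1
52.25,6.75,2020,maize,10.4
52.25,6.75,2021,maize,10.6
52.25,6.75,2100,wheat,13.2
"""


def load_rows(text):
    """Parse CSV text into dicts and check that each row key is unique."""
    rows = list(csv.DictReader(io.StringIO(text)))
    keys = [(r["lat"], r["lon"], r["year"], r["crop"]) for r in rows]
    if len(keys) != len(set(keys)):
        raise ValueError("duplicate (lat, lon, year, crop) combination")
    return rows


def split_by_period(rows):
    """Separate historical rows (training) from future rows (to forecast)."""
    train = [r for r in rows if int(r["year"]) in TRAIN_YEARS]
    forecast = [r for r in rows if int(r["year"]) in FORECAST_YEARS]
    return train, forecast


train_rows, forecast_rows = split_by_period(load_rows(SAMPLE_CSV))
print(len(train_rows), "training rows;", len(forecast_rows), "rows to forecast")
```

Splitting strictly by year keeps every future row out of model fitting, which mirrors the challenge setup of 41 historical years followed by 80 forecast years.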
Upcoming Meetings
- SURF Training: Introduction to Supercomputing, part I
6 August 2024, Amsterdam
- CRIB Training: Introduction to Geospatial Raster and Vector with R
21-22 August 2024, ITC, Enschede
- QGIS User Conference
9-10 September 2024, Bratislava
- National Open Science Festival
22 October 2024, Maastricht University
Recent Releases
- Open Data Cube: Open-source geospatial data management and analysis software
1.8.19 (2/7/2024)
- GeoMesa: Suite of tools that enables large-scale geospatial querying and analytics
5.0.1 (12/7/2024)
- NetCDF-Java: Java framework for reading netCDF and other file formats
5.6.0 (16/7/2024)
- XGBoost: Optimized distributed gradient boosting library
2.1.1 (30/7/2024)
The "Big" Picture
Image credits: Zhang et al., 2024
Achieving a thorough understanding and quantitative assessment of long-term changes in land cover requires a consistent dataset featuring multi-class time-series classification. The new GLC_FCS30D product is a global land-cover product that captures changes in 35 classes every 5 years from 1985 to 2000, and annually from 2001 to 2022, at a granular 30-meter scale. The availability of such a highly granular dataset, consistent across time, is of great use in studies of land change dynamics. The dataset is built using atmospherically corrected and radiometrically normalized Landsat imagery from 1984 to 2022 available on the Google Earth Engine (GEE) cloud platform. A continuous change detection algorithm was used to identify stable areas, which then served as training samples for the locally adaptive classification of changed areas. The dataset achieved an accuracy of 73-81% when validated with over 84,000 globally distributed validation samples from 2020 and two other land cover products covering the United States and the European Union.
The dataset can be freely accessed via Zenodo (Liu et al., 2023) or, alternatively, from the Awesome GEE Community catalog. It can be compared with previously available land cover products such as the ESA CCI Land Cover, MODIS Land Cover Type or the UMD GLAD LULC using the OpenLandMap global data portal. Make sure to use the ‘turn on comparison’ button!
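The temporal layout described above (five-yearly maps up to 2000, annual maps afterwards) can be enumerated with a few lines of Python. This is only an illustrative sketch: `nearest_epoch` is a hypothetical helper, and how the actual product names or packages its layers is not covered here.

```python
# Five-yearly maps for 1985-2000, then annual maps for 2001-2022.
FIVE_YEARLY = list(range(1985, 2001, 5))  # 1985, 1990, 1995, 2000
ANNUAL = list(range(2001, 2023))          # 2001..2022 inclusive
EPOCHS = FIVE_YEARLY + ANNUAL


def nearest_epoch(year):
    """Return the mapped year closest to `year` (hypothetical helper)."""
    if not EPOCHS[0] <= year <= EPOCHS[-1]:
        raise ValueError(f"{year} is outside {EPOCHS[0]}-{EPOCHS[-1]}")
    return min(EPOCHS, key=lambda epoch: abs(epoch - year))


print(len(EPOCHS))          # 26 time steps in total
print(nearest_epoch(1992))  # 1990
print(nearest_epoch(2010))  # 2010
```

A lookup like this is handy when pairing the maps with other yearly data, since years before 2001 have to be matched to the closest five-yearly map.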
Zhang, X., Zhao, T., Xu, H., Liu, W., Wang, J., Chen, X., and Liu, L.: GLC_FCS30D: The first global 30 m land-cover dynamics monitoring product with a fine classification system for the period from 1985 to 2022 generated using dense-time-series Landsat imagery and the continuous change-detection method, Earth Syst. Sci. Data, 16, 1353–1381, https://doi.org/10.5194/essd-16-1353-2024, 2024.