Significant milestone reached in Sentinel-2 Collection 3 migration

Collection 3 uses the same raw data as the previous collection, but it's processed and packaged in a more accurate and accessible way.

Last updated: 18 June 2024

Digital Earth Australia recently “pushed the big red button” to activate the large migration of Sentinel-2 data to its next collection: Sentinel-2 Collection 3.

Over many months, DEA has been laying the groundwork to shift our collection of Sentinel-2 data packages from the first-generation Collection 1 to the new Collection 3. (And that is not a typo; we’ve intentionally skipped Collection 2 so we can align our Sentinel-2 sequencing with the current Landsat data, which completed its own migration to Collection 3 twelve months ago.)

“Collection 3 offers improved metadata for users working with either Landsat or Sentinel data,” said Amanda Norton, Senior Data Quality Specialist at DEA.

“Using the same standards and metadata makes it easier for users familiar with one of the products to understand and use the other, and leads to easier comparisons and combination of the data.

“This is a significant milestone as the ability to combine data from multiple satellites enables us to improve some of Digital Earth Australia's existing products. For example, the next generation of the DEA Intertidal Elevation product will include data from both Landsat and Sentinel-2, capitalising on more frequent observations and higher pixel resolution, at 10m.”

A collection upgrade is where the same raw data used by the previous collection is processed and packaged in a more accurate and accessible way. It is not as simple as a reissue of, say, a new volume of Encyclopaedia Britannica, where a new edition might include a few new pages and some updated facts. Instead, the collection upgrade is akin to a movie franchise reboot. From a user experience point of view, it's the same overall story but it has evolved in fundamentally different ways.

The groundwork was laid throughout 2022, with the migration registering the Analysis Ready Data (ARD) component of DEA’s Sentinel-2 data collection.

Fortunately, DEA was able to use the same approach and methods it took with its Landsat collections: the same storage, the same software and the same ancillary input data for the model. By following the same path, DEA was able to align the Sentinel-2 and Landsat ARD, along with their category and catalogue numbers and their methodology.

WATCH - Before and after: Sentinel-2 animation over Seaham, New South Wales, demonstrating how DEA's ARD processing takes noisier raw Level 1C data (0-12 sec) and improves its accuracy and consistency across time (12-26 sec), allowing users to more easily gain insights from Sentinel-2 data.

Essentially, this is what DEA users would expect from the way these products are put together. Plus, the Landsat and Sentinel collections now have matching naming conventions.

With the final stages reached in January 2023, Amanda said that for DEA users who “had processes pointing to the old collection, the data has been left in place until February”. That provided more than a month’s grace to shift.

“Some of the web services such as DEA Maps were transitioned to use the Collection 3 data late last year,” she said.

“For users, this means that when accessing these services they will already be viewing the new collection, which includes the newest (Near Real Time) data and the final data all in the same collection.”

Support is available for users who encounter issues. Geoscience Australia staff can answer questions on Slack or by email, and can provide example code and documentation for using the new collection.

What comes with the new and improved Collection 3?

One of DEA’s aims is continuous improvement of data quality and accessibility, with the aim of bringing more people to Earth observation information and technology. The Sentinel-2 Collection 3 upgrade will provide data faster and cleaner than the previous collection, making it more user friendly. There’s less chaff, less noise (and hopefully fewer headaches).

The migration means DEA’s Landsat and Sentinel ARD processes will be set to a consistent standard, making it much easier for end users to move between datasets. It also means DEA can begin upgrading derivative products – such as DEA Land Cover, DEA Coastlines and DEA Water Observations – to incorporate Sentinel data where they currently only use Landsat.

Sentinel-2 Collection 3 uses the same raw data as the previous collection, but processed and packaged in a more accurate and accessible way.

Here are some key upgrades:

  • Sentinel-2A and Sentinel-2B each have a revisit frequency of 10 days, but because the two satellites work in tandem, there is a new flyover every 5 days.
  • Spatial resolution is 10 metres for optical and near-infrared.

Collection 3 has the same data as Collection 1, but it’s been processed and packaged differently with updates on:

  • Upgraded spectral response function, BRDF
  • New cloud probability layer and cloud mask
  • New metadata for filtering by cloud cover and geometric accuracy

Much of the user feedback on Sentinel-2 Collection 1 focused on poor quality cloud classifications provided by the 'Fmask' cloud mask, which made it difficult to reliably obtain cloud-free data over certain environments containing bright white features (e.g. urban areas, coastal beaches).

Sentinel-2 Collection 3 addresses these issues by providing a powerful new Sentinel-2 specific cloud mask (s2cloudless), and new consistent metadata that allows users to filter the entire Sentinel-2 archive by dataset processing levels and data quality metrics to obtain the exact subset of data required for their work.

Three levels of maturity:

  • Near Real Time (NRT) – fast but with lower quality
  • Interim
  • Final – slowest but with highest quality

The division of Sentinel-2 Collection 1 data into separate "definitive" and "Near Real Time" products also made finding and loading data overly complicated, while limited metadata for individual satellite images made it difficult to search for and obtain the high-quality data required for specific applications.

DEA Earth Observation Data Scientist Aman Chopra explained how the new collection’s levels of “maturity” were a standout feature that eliminated various processes.

“Before Collection 3, users had to load multiple products to obtain Sentinel-2 data of all maturity levels,” he said. “Now, users can filter datasets by their maturity level from within the one product.”

Datasets in the Sentinel-2 Collection 3 product are associated with one of three different levels of maturity (from lowest to highest quality): Near Real Time (NRT), Interim or Final.
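The idea of filtering one product by maturity level can be sketched in a few lines of Python. This is only an illustration: the `Dataset` class, the `maturity` field, the lowercase level names and the `filter_by_maturity` helper are hypothetical stand-ins invented for this example, not DEA's data model or API.

```python
from dataclasses import dataclass


# Hypothetical stand-in for a dataset record; not DEA's actual metadata schema.
@dataclass(frozen=True)
class Dataset:
    dataset_id: str
    maturity: str  # assumed values: "nrt", "interim" or "final"


# Maturity levels ordered from lowest to highest quality, as described above.
MATURITY_ORDER = ("nrt", "interim", "final")


def filter_by_maturity(datasets, minimum="final"):
    """Keep only the datasets whose maturity is at least `minimum`."""
    threshold = MATURITY_ORDER.index(minimum)
    return [d for d in datasets if MATURITY_ORDER.index(d.maturity) >= threshold]


# Made-up records standing in for a single product holding all maturity levels.
datasets = [
    Dataset("scene-a", "nrt"),
    Dataset("scene-b", "interim"),
    Dataset("scene-c", "final"),
]

final_only = filter_by_maturity(datasets, minimum="final")
interim_or_better = filter_by_maturity(datasets, minimum="interim")

print([d.dataset_id for d in final_only])         # ['scene-c']
print([d.dataset_id for d in interim_or_better])  # ['scene-b', 'scene-c']
```

An emergency-management workflow might accept every level to get the newest imagery, while a scientific analysis would likely keep only the Final datasets.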

NRT provides images in less than 48 hours, which is useful for emergency management. But the trade-off is lower quality images. At the other end of the scale, the more rigorously corrected Final data (available when all the supplementary data used for processing becomes available) can feed “scientific papers or operational data products”, Aman said.

Another new feature of Sentinel-2 Collection 3 is that users can filter out poorly geo-referenced scenes thanks to the extra geometric quality assessment (GQA) metadata fields.

“Roughly speaking, the GQA measures the offset in position of the satellite image against the known position of hundreds of Ground Control Points,” Aman said.

“This is particularly useful for helping to address the poor geometric accuracy of Sentinel-2 data early in the satellite record,” he said, such as images collected before the improvements to geolocation accuracy made possible by ESA's Sentinel-2 Global Reference Image (GRI).
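As a rough illustration of the idea Aman describes, the sketch below averages the positional offset between matched control points and keeps only the scenes whose mean offset falls under a tolerance. The coordinates, the scene names, the `mean_offset` helper and the `MAX_OFFSET` tolerance are all made up for this example; DEA's actual GQA algorithm and metadata field names are not shown here.

```python
import math


def mean_offset(observed, reference):
    """Mean Euclidean distance between matched observed and reference points."""
    distances = [
        math.hypot(ox - rx, oy - ry)
        for (ox, oy), (rx, ry) in zip(observed, reference)
    ]
    return sum(distances) / len(distances)


# Known control point positions (hypothetical values, arbitrary units).
reference = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0)]

# Where the same points were found in two hypothetical satellite scenes.
scenes = {
    "scene-a": [(0.3, 0.4), (100.3, 0.4), (0.3, 100.4)],  # about 0.5 units off
    "scene-b": [(3.0, 4.0), (103.0, 4.0), (3.0, 104.0)],  # about 5 units off
}

MAX_OFFSET = 1.0  # assumed tolerance, chosen only for this example

accepted = [
    name for name, points in scenes.items()
    if mean_offset(points, reference) <= MAX_OFFSET
]
print(accepted)  # ['scene-a']
```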

The move to Collection 3 has been absolutely essential based on feedback from DEA users. It also aligns DEA’s data with best practices and where the industry is headed at this moment. Collection 3 not only aligns DEA’s data with the EO community, it also brings users capabilities they have not had access to before now. Having this array of new features – high quality ARD, s2cloudless, geometric metadata, dataset maturity – together in one package is an exciting development for EO in Australia, with immense benefits to be shared by government, industry and the scientific community.

Users who encounter errors or issues are encouraged to contact Geoscience Australia staff, who can answer questions on Slack or by email and provide example code and documentation for using the new collection.

This image shows how Sentinel-2 Collection 3 has improved cloud masking with s2cloudless: (a) Original RGB image. (b) Image (a) after s2cloudless masking; masked pixels are white. (c) s2cloudless probability layer. (d) Image (a) after masking pixels where the s2cloudless probability is greater than or equal to 90%; masked pixels are black.
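The thresholding step in panel (d) can be sketched with plain Python lists. The pixel values and probabilities below are invented, and `mask_cloudy` is a hypothetical helper, not part of s2cloudless or any DEA tool; it only shows the "mask where probability is at least 90%" logic.

```python
def mask_cloudy(image, cloud_probability, threshold=0.9, fill=0):
    """Return a copy of `image` with pixels set to `fill` wherever the
    cloud probability is greater than or equal to `threshold`."""
    return [
        [fill if prob >= threshold else pixel
         for pixel, prob in zip(image_row, prob_row)]
        for image_row, prob_row in zip(image, cloud_probability)
    ]


# A tiny made-up single-band image and its matching cloud probability layer.
image = [
    [120, 130, 250],
    [110, 245, 255],
]
cloud_probability = [
    [0.05, 0.10, 0.95],
    [0.20, 0.90, 0.99],
]

# fill=0 renders masked pixels as black, as in panel (d).
masked = mask_cloudy(image, cloud_probability, threshold=0.9, fill=0)
print(masked)  # [[120, 130, 0], [110, 0, 0]]
```

Lowering the threshold masks more aggressively at the cost of discarding some clear pixels, so the right value depends on how much residual cloud an application can tolerate.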