Unlocking the Power of Satellite Imagery with SeeFar: A Comprehensive Multi-Resolution Dataset
James Lowman and Mojtaba Valipour
June 10th, 2024
Dataset Link:
Paper Link:
In the rapidly evolving field of geospatial analysis, access to high-quality satellite imagery is paramount. The SeeFar dataset represents a significant leap forward by providing a multi-resolution, satellite-agnostic dataset designed to train geospatial foundation models. This blog post delves into the creation, significance, and potential applications of the SeeFar dataset, highlighting its contributions to the geospatial community.
Introduction to SeeFar
Satellite imagery has transformed our understanding of the Earth, offering invaluable data for environmental monitoring, urban planning, disaster response, and agricultural management. Advances in satellite technology have dramatically increased the accessibility and quality of this imagery. However, challenges such as high costs and limited historical availability of commercial satellite data persist, impeding the training of comprehensive geospatial models.
The SeeFar dataset addresses these challenges by curating a collection of multi-resolution satellite images from both public and commercial sources. This dataset is designed to be satellite-agnostic, enabling the integration of historical low-resolution data with modern high-resolution imagery, thus providing greater flexibility and accuracy during model inference.
Figure 1
The Problem with Current Satellite Datasets
Existing satellite datasets face several limitations that hinder their effective integration and analysis:
  1. Multi-Format Data: The diversity in data formats makes it challenging to combine and analyze images from different sources.
  2. Inconsistent Metadata: Variability in metadata across datasets complicates tasks such as temporal analysis and multi-sensor fusion.
  3. Temporal Scale Differences: Differences in temporal resolution limit consistent time-series analysis.
  4. Non-Uniform Spectral Bands: Variations in spectral bands make data standardization difficult.
  5. Variable Resolutions: Discrepancies in spatial resolution affect the accuracy and reliability of multi-resolution analysis.
SeeFar aims to overcome these obstacles by standardizing data from diverse satellite sources, thereby enhancing its applicability and facilitating more robust and scalable research and applications.
Figure 2
Key Contributions of SeeFar
The SeeFar dataset offers several critical contributions to the field of geospatial analysis:
  1. Facilitating Foundation Model Training: By integrating images from multiple resolutions and harmonizing multispectral bands, SeeFar supports the development of satellite-agnostic foundation models. This approach enhances the adaptability of models to various satellite imagery, both public and commercial.
  2. Standardization: A robust framework was developed to standardize data from various satellite sources, ensuring consistency and enhancing interoperability across different imagery.
  3. Benchmarking: SeeFar includes well-curated training, validation, and test sets, enabling the fine-tuning and evaluation of geospatial models. These benchmarks support the development of models capable of handling diverse satellite data.
  4. Multi-Resolution and Multi-Satellite Usability: The dataset consists of images at four different resolutions (30 meters, 10 meters, 1.5 meters, and 1 meter) from satellites such as Landsat 8 and 9, Sentinel-2, NewSat IV, and SPOT 6 and 7. This multi-resolution approach supports a wide range of geospatial analysis tasks.
  5. Metadata Augmentation: Comprehensive metadata is provided, including Ground Sample Distance (GSD), spectral wavelength bounds, and georeferencing information. This metadata ensures data transparency and reliability, crucial for precise scientific analysis.
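To make the metadata fields concrete, here is a minimal sketch of what a per-patch record could look like and how GSD translates into ground coverage. The class name, field names, CRS, bounds, and wavelength values are all illustrative assumptions, not the actual SeeFar schema. Only the four GSD values and the 384-pixel patch size come from this post.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

PATCH_SIZE_PX = 384  # patch edge length in pixels, as stated in this post

# Ground Sample Distances listed in this post, in meters per pixel.
SEEFAR_GSDS_M = (30.0, 10.0, 1.5, 1.0)


@dataclass(frozen=True)
class PatchMetadata:
    """Hypothetical per-patch record; not the real SeeFar schema."""
    satellite: str
    gsd_m: float  # Ground Sample Distance, meters per pixel
    band_bounds_nm: Dict[str, Tuple[float, float]]  # band -> (lower, upper) in nm
    crs: str  # coordinate reference system identifier
    bounds: Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)

    def footprint_m(self) -> float:
        """Edge length of the square patch on the ground, in meters."""
        return PATCH_SIZE_PX * self.gsd_m


# All values below are placeholders chosen for illustration only.
example = PatchMetadata(
    satellite="Landsat 8",
    gsd_m=30.0,
    band_bounds_nm={
        "blue": (450.0, 515.0),
        "green": (525.0, 600.0),
        "red": (630.0, 680.0),
        "nir": (845.0, 885.0),
    },
    crs="EPSG:32610",
    bounds=(500000.0, 4100000.0, 511520.0, 4111520.0),
)

for gsd in SEEFAR_GSDS_M:
    print(f"GSD {gsd} m -> patch covers {PATCH_SIZE_PX * gsd} m per side")
```

Under these assumptions, a 384-pixel patch spans 11,520 m per side at 30 m GSD but only 384 m at 1 m GSD, which is why carrying GSD with every patch matters when mixing resolutions.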
Methodology Behind SeeFar
The creation of the SeeFar dataset involved several meticulous steps to ensure consistency and quality:
  1. Dataset Selection: A subset of public and commercial datasets was selected based on licensing, content richness, alignment possibilities, and unique resolutions.
  2. Raw Image Processing: Each data source was pre-processed to ensure consistent information across different formats.
  3. Patch Creation: Consistent patch sizes (384x384 pixels) were created to focus on semantic differences rather than aspect ratios or other biases.
  4. Channel Selection: Blue, Green, Red, and Near-Infrared (NIR) spectral bands were selected for consistency across datasets.
  5. Normalization: Histogram matching was used to ensure consistent color, brightness, and contrast across different samples.
  6. Metadata Augmentation: Metadata was augmented to include GSD, spectral bands, and georeferencing information.
  7. Quality Control: Automated mechanisms were used to check for consistency and quality, including cloud and no-data pixel identification.
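Several of the steps above can be sketched in a few lines of NumPy. This is a rough illustration under stated assumptions, not the pipeline used to build SeeFar: the function names, the band order, and the no-data value of 0 are hypothetical, and the histogram matching shown is the standard CDF-based technique applied to a single band. Cloud detection is not covered here.

```python
import numpy as np

PATCH_SIZE = 384  # patch edge length in pixels, as stated in this post


def select_bands(image, band_indices):
    """Keep only the requested channels of an (H, W, C) array."""
    return image[..., list(band_indices)]


def tile_patches(image, patch_size=PATCH_SIZE):
    """Split an (H, W, C) array into non-overlapping square patches.

    Partial tiles at the right and bottom edges are dropped.
    """
    height, width = image.shape[:2]
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return patches


def match_histogram(source, reference):
    """Map a single-band source onto the value distribution of reference."""
    src_values, src_inverse, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True
    )
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)
    # Empirical cumulative distribution functions of both bands.
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size
    # For each source value, find the reference value at the same quantile.
    mapped = np.interp(src_cdf, ref_cdf, ref_values)
    return mapped[src_inverse].reshape(source.shape)


def nodata_fraction(patch, nodata_value=0):
    """Fraction of pixels whose every band equals nodata_value (assumed 0)."""
    nodata_mask = np.all(patch == nodata_value, axis=-1)
    return float(nodata_mask.mean())


# Synthetic 6-band scene; the band order (blue, green, red, NIR at 1..4)
# is an assumption made for this example.
rng = np.random.default_rng(0)
scene = rng.integers(1, 1000, size=(800, 1000, 6))
bgrn = select_bands(scene, (1, 2, 3, 4))
patches = tile_patches(bgrn)
kept = [p for p in patches if nodata_fraction(p) < 0.05]
print(len(patches), len(kept), patches[0].shape)
```

Dropping partial edge tiles keeps every patch at exactly 384x384 pixels. Padding or overlapping tiles would be reasonable alternatives, and the 5% no-data threshold is likewise only a placeholder.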
Conclusion and Future Directions
The SeeFar dataset represents a significant advancement in the field of geospatial analysis, offering a comprehensive and standardized dataset that bridges the gap between different satellite sources. By providing high-quality, multi-resolution satellite data, SeeFar enables more informed decision-making and supports a broad spectrum of geospatial research and applications.
As SeeFar evolves, future updates will aim to cover a broader range of resolutions and to further improve data quality and consistency, so that the dataset remains a useful resource for researchers, policymakers, and industry professionals working on satellite imagery analysis.
In summary, SeeFar provides the geospatial community with a powerful tool for unlocking the full potential of satellite imagery, paving the way for more accurate, flexible, and scalable geospatial models and analyses.
  • Presented by Coastal Carbon AI Team.
  • AI assistants were used to improve the quality of the writing.