API Reference

MagGeo provides a comprehensive API for geomagnetic field analysis and GPS trajectory annotation.

Core Functions

annotate_gps_with_geomag(*args, **kwargs)

SwarmDataManager

Manages Swarm satellite data download, storage, and retrieval operations.

This class provides a high-level interface for working with Swarm data independently of the main MagGeo pipeline.

Functions

__init__(data_dir='swarm_data', file_format='csv', chunk_size=10, token=None)

Initialize SwarmDataManager.

Parameters

data_dir : str, default "swarm_data"
    Directory to store downloaded Swarm data.
file_format : str, default "csv"
    File format for saving data. Options: "csv", "parquet".
chunk_size : int, default 10
    Number of dates to process in each batch.
token : str, optional
    VirES token for authentication.
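To illustrate what chunk_size controls, here is a minimal standard-library sketch of splitting a list of dates into batches. The helper name batch_dates is hypothetical and is not part of MagGeo; how the library batches dates internally is an assumption.

```python
from datetime import date, timedelta
from typing import Iterator, List


def batch_dates(dates: List[date], chunk_size: int = 10) -> Iterator[List[date]]:
    """Yield consecutive batches of at most `chunk_size` dates.

    Hypothetical helper: it only illustrates the idea behind `chunk_size`.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(dates), chunk_size):
        yield dates[start:start + chunk_size]


# 25 consecutive days split with the documented default chunk_size=10
days = [date(2024, 1, 1) + timedelta(days=i) for i in range(25)]
batches = list(batch_dates(days, chunk_size=10))
batch_sizes = [len(batch) for batch in batches]  # [10, 10, 5]
```

A smaller chunk_size means more, shorter batches, which can make an interrupted download cheaper to resume.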

download_for_trajectory(gps_df, save_individual_files=True, save_concatenated=True, resume=True)

Download Swarm data for an entire GPS trajectory.

Parameters

gps_df : pd.DataFrame
    GPS trajectory data with datetime information.
save_individual_files : bool, default True
    Whether to save individual daily files.
save_concatenated : bool, default True
    Whether to save concatenated files for each satellite.
resume : bool, default True
    Whether to skip already downloaded files.

Returns

tuple
    Tuple of concatenated DataFrames for satellites A, B, C.
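Because Swarm data is saved as daily files, a trajectory download needs to know which calendar days the GPS fixes cover. The sketch below shows one way to derive them using only the standard library. The function name unique_dates is hypothetical, and how MagGeo actually extracts dates from gps_df is an assumption.

```python
from datetime import date, datetime
from typing import Iterable, List


def unique_dates(timestamps: Iterable[datetime]) -> List[date]:
    """Return the sorted calendar dates covered by the timestamps.

    Hypothetical helper, not part of the MagGeo API.
    """
    return sorted({ts.date() for ts in timestamps})


# Four GPS fixes spread over three distinct days
fixes = [
    datetime(2024, 3, 1, 23, 50),
    datetime(2024, 3, 2, 0, 10),
    datetime(2024, 3, 2, 12, 0),
    datetime(2024, 3, 4, 6, 30),
]
needed = unique_dates(fixes)
```

Here needed holds 1, 2 and 4 March 2024. Days without fixes, such as 3 March, are not requested.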

download_for_dates(dates, save_individual_files=True, save_concatenated=True, resume=True)

Download Swarm data for specific dates.

Parameters

dates : List[dt.date]
    List of dates to download data for.
save_individual_files : bool, default True
    Whether to save individual daily files.
save_concatenated : bool, default True
    Whether to save concatenated files for each satellite.
resume : bool, default True
    Whether to skip already downloaded files.

Returns

tuple
    Tuple of concatenated DataFrames for satellites A, B, C.
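The resume flag can be pictured as a check for existing daily files before each request. The following sketch makes that idea concrete. The file naming scheme and both helper names are hypothetical, since the source does not document how MagGeo names its files.

```python
from datetime import date
from pathlib import Path
from typing import Iterable, List, Tuple

SATELLITES = ("A", "B", "C")


def daily_file(data_dir: Path, satellite: str, day: date, file_format: str = "csv") -> Path:
    """Build a daily file path (hypothetical naming; real MagGeo names may differ)."""
    return data_dir / f"swarm_{satellite}_{day.isoformat()}.{file_format}"


def pending_downloads(
    data_dir: Path, days: Iterable[date], resume: bool = True
) -> List[Tuple[str, date]]:
    """List the (satellite, date) pairs that still need to be downloaded."""
    pending = []
    for day in days:
        for sat in SATELLITES:
            if resume and daily_file(data_dir, sat, day).exists():
                continue  # file already on disk, so skip it
            pending.append((sat, day))
    return pending
```

With resume=False every (satellite, date) pair is returned, which corresponds to downloading everything again.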

load_data_for_dates(dates, satellites=['A', 'B', 'C'])

Load previously downloaded Swarm data for specific dates.

Parameters

dates : List[dt.date]
    List of dates to load data for.
satellites : List[str], default ['A', 'B', 'C']
    Which satellites to load data for.

Returns

dict
    Dictionary with satellite names as keys and concatenated DataFrames as values.

load_concatenated_data(satellites=['A', 'B', 'C'])

Load previously saved concatenated Swarm data.

Parameters

satellites : List[str], default ['A', 'B', 'C']
    Which satellites to load data for.

Returns

dict
    Dictionary with satellite names as keys and DataFrames as values.

get_data_summary()

Get summary of available downloaded data.

Returns

pd.DataFrame
    Summary of available data files with metadata.

cleanup_data(older_than_days=None, quality_threshold='poor')

Clean up downloaded data files.

Parameters

older_than_days : int, optional
    Remove files older than this many days.
quality_threshold : str, default 'poor'
    Remove files with data quality below this threshold.

Returns

int
    Number of files removed.
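As a rough illustration of the older_than_days criterion, the sketch below removes files by modification time and returns the count. It is not the MagGeo implementation: the function name is hypothetical, measuring age by file modification time is an assumption, and quality_threshold is left out because the source does not define the quality levels.

```python
import time
from pathlib import Path
from typing import Optional

SECONDS_PER_DAY = 86400


def remove_old_files(data_dir: Path, older_than_days: Optional[int] = None) -> int:
    """Delete files last modified before the cutoff and return how many were removed.

    Hypothetical helper; age is measured by modification time (an assumption).
    """
    if older_than_days is None:
        return 0  # no age criterion given, so nothing is removed
    cutoff = time.time() - older_than_days * SECONDS_PER_DAY
    removed = 0
    for path in data_dir.iterdir():
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed += 1
    return removed
```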

Utility Functions

download_swarm_data_for_trajectory(gps_df, data_dir='swarm_data', file_format='csv', token=None, resume=True)

Convenience function to download Swarm data for a GPS trajectory.

Parameters

gps_df : pd.DataFrame
    GPS trajectory data.
data_dir : str, default "swarm_data"
    Directory to store data.
file_format : str, default "csv"
    File format for saving data.
token : str, optional
    VirES authentication token.
resume : bool, default True
    Whether to resume from existing downloads.

Returns

tuple
    Tuple of concatenated DataFrames for satellites A, B, C.

load_swarm_data(data_dir='swarm_data', file_format='csv', satellites=['A', 'B', 'C'])

Convenience function to load previously downloaded Swarm data.

Parameters

data_dir : str, default "swarm_data"
    Directory containing the data.
file_format : str, default "csv"
    File format of the data.
satellites : List[str], default ['A', 'B', 'C']
    Which satellites to load.

Returns

dict
    Dictionary with satellite names as keys and DataFrames as values.

Quick Reference

Core Functions

Function                            Description
annotate_gps_with_geomag            Main function for annotating GPS trajectories
download_swarm_data_for_trajectory  Download Swarm data for a specific trajectory
load_swarm_data                     Load previously downloaded Swarm data

Classes

Class             Description
SwarmDataManager  Manages Swarm data downloading and storage

Modules

Module               Description
parallel_processing  Parallel processing utilities
indices              Geomagnetic indices and calculations

Architecture Overview

GPS Trajectory → MagGeo Core → SwarmDataManager → VirES API
     ↓              ↓              ↓
Input Data → Processing Pipeline → Local Storage
     ↓              ↓              ↓
Validation → Interpolation → Persistent Files
     ↓              ↓              ↓
Quality Check → CHAOS Model → Annotated Output

Data Flow

  1. Input: GPS trajectory with coordinates and timestamps
  2. Data Acquisition: Download Swarm satellite data via VirES API
  3. Storage: Persist data locally for reuse (SwarmDataManager)
  4. Processing: Interpolate magnetic field values to GPS locations
  5. Enhancement: Add CHAOS model data and geomagnetic indices
  6. Output: Annotated trajectory with comprehensive magnetic field information
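Step 4 is the least obvious part of the flow. This page does not say which interpolation scheme MagGeo uses, so the sketch below swaps in plain spatial inverse-distance weighting as one common approach. All names are hypothetical, and the sample values are made up for illustration.

```python
import math
from typing import Iterable, Tuple

# (latitude_deg, longitude_deg, field_value_nT)
Sample = Tuple[float, float, float]

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def idw(lat: float, lon: float, samples: Iterable[Sample], power: float = 2.0) -> float:
    """Inverse-distance-weighted estimate of the field at (lat, lon)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for s_lat, s_lon, value in samples:
        distance = haversine_km(lat, lon, s_lat, s_lon)
        if distance == 0.0:
            return value  # query coincides with a sample point
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("at least one sample is required")
    return weighted_sum / weight_total


# Two illustrative samples on the equator, equally far from the query point
samples = [(0.0, -1.0, 48000.0), (0.0, 1.0, 48200.0)]
estimate = idw(0.0, 0.0, samples)
```

Since both samples are equally distant from the query point, the estimate is their mean. Nearer samples would dominate as the power parameter grows.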

Performance Considerations

  • Use SwarmDataManager for repeated analysis of the same time periods
  • Enable parallel processing for large datasets (>10,000 GPS points)
  • Choose appropriate file formats: Parquet for performance, CSV for compatibility
  • Batch process multiple trajectories when possible
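The 10,000-point guideline above can be expressed as a simple dispatch between serial and parallel execution. This sketch uses only the standard library. It does not use the parallel_processing module, whose API is not documented on this page, and the names process_points and PARALLEL_THRESHOLD are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")
R = TypeVar("R")

# Threshold taken from the guideline above (>10,000 GPS points)
PARALLEL_THRESHOLD = 10_000


def process_points(
    points: Sequence[T], annotate: Callable[[T], R], max_workers: int = 4
) -> List[R]:
    """Apply `annotate` to every point, in parallel only for large inputs."""
    if len(points) <= PARALLEL_THRESHOLD:
        return [annotate(point) for point in points]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order
        return list(pool.map(annotate, points))
```

Threads suit work dominated by I/O such as downloads. For CPU-bound interpolation a process pool would likely be the better fit.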

Error Handling

All MagGeo functions implement comprehensive error handling, but error messages are shown only in DEBUG mode. To enable it, use the --debug flag when running scripts or set the environment variable MAGGEO_DEBUG=1.

MAGGEO_DEBUG=1 python -m maggeo.annotate_gps_with_geomag --debug
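In your own scripts, the same environment variable can drive the logging level. The sketch below is an assumption about how such a switch might be read. It does not show how MagGeo handles MAGGEO_DEBUG internally, and both function names are hypothetical.

```python
import logging
import os
from typing import Mapping


def debug_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when MAGGEO_DEBUG is set to "1"."""
    return environ.get("MAGGEO_DEBUG") == "1"


def configure_logging(environ: Mapping[str, str] = os.environ) -> int:
    """Set the root logger to DEBUG in debug mode, WARNING otherwise."""
    level = logging.DEBUG if debug_enabled(environ) else logging.WARNING
    logging.basicConfig(level=level)
    return level
```

Passing the environment as a parameter keeps the helpers easy to test without modifying the real process environment.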