API Reference

MagGeo provides a comprehensive API for geomagnetic field analysis and GPS trajectory annotation.

Core Functions

`annotate_gps_with_geomag(*args, **kwargs)`

SwarmDataManager

`SwarmDataManager`

Manages Swarm satellite data download, storage, and retrieval operations.

This class provides a high-level interface for working with Swarm data independently of the main MagGeo pipeline.

Functions

`__init__(data_dir='swarm_data', file_format='csv', chunk_size=10, token=None)`

Initialize SwarmDataManager.

Parameters

data_dir : str, default "swarm_data". Directory to store downloaded Swarm data.
file_format : str, default "csv". File format for saving data. Options: "csv", "parquet".
chunk_size : int, default 10. Number of dates to process in each batch.
token : str, optional. VirES token for authentication.
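To make `chunk_size` concrete, here is a minimal sketch of splitting a list of dates into batches of at most `chunk_size` items. The helper `chunk_dates` is hypothetical and only illustrates the batching idea; it is not MagGeo's internal code.

```python
import datetime as dt
from typing import Iterator, List


def chunk_dates(dates: List[dt.date], chunk_size: int = 10) -> Iterator[List[dt.date]]:
    """Yield consecutive batches of at most `chunk_size` dates (hypothetical helper)."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(dates), chunk_size):
        yield dates[start:start + chunk_size]


# 25 consecutive days starting 2014-09-01
dates = [dt.date(2014, 9, 1) + dt.timedelta(days=i) for i in range(25)]
batches = list(chunk_dates(dates, chunk_size=10))
print([len(batch) for batch in batches])  # [10, 10, 5]
```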

`download_for_trajectory(gps_df, save_individual_files=True, save_concatenated=True, resume=True)`

Download Swarm data for an entire GPS trajectory.

Parameters

gps_df : pd.DataFrame. GPS trajectory data with datetime information.
save_individual_files : bool, default True. Whether to save individual daily files.
save_concatenated : bool, default True. Whether to save concatenated files for each satellite.
resume : bool, default True. Whether to skip already downloaded files.

Returns

tuple. Concatenated DataFrames for satellites A, B, and C.
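Because data can be saved as individual daily files, the calendar dates a trajectory covers determine what has to be fetched. The sketch below shows that reduction using plain `datetime` objects in place of the datetime column of `gps_df`. `trajectory_dates` is a hypothetical helper; how MagGeo derives dates internally (for example, whether it pads days around midnight) is not stated here.

```python
import datetime as dt
from typing import Iterable, List


def trajectory_dates(timestamps: Iterable[dt.datetime]) -> List[dt.date]:
    """Return the sorted, de-duplicated calendar dates covered by the timestamps."""
    return sorted({ts.date() for ts in timestamps})


# Four GPS fixes spread over three distinct days
fixes = [
    dt.datetime(2014, 9, 8, 23, 50),
    dt.datetime(2014, 9, 9, 0, 10),
    dt.datetime(2014, 9, 9, 12, 0),
    dt.datetime(2014, 9, 11, 6, 30),
]
needed = trajectory_dates(fixes)
print(needed)
```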

`download_for_dates(dates, save_individual_files=True, save_concatenated=True, resume=True)`

Download Swarm data for specific dates.

Parameters

dates : List[dt.date]. List of dates to download data for.
save_individual_files : bool, default True. Whether to save individual daily files.
save_concatenated : bool, default True. Whether to save concatenated files for each satellite.
resume : bool, default True. Whether to skip already downloaded files.

Returns

tuple. Concatenated DataFrames for satellites A, B, and C.
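The `dates` argument is a plain list of `datetime.date` objects. One way to build it for a contiguous period, assuming an inclusive end date (the helper name `date_range` is illustrative and not part of MagGeo):

```python
import datetime as dt
from typing import List


def date_range(start: dt.date, end: dt.date) -> List[dt.date]:
    """Return every date from start to end, both inclusive."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    n_days = (end - start).days + 1
    return [start + dt.timedelta(days=i) for i in range(n_days)]


dates = date_range(dt.date(2014, 9, 28), dt.date(2014, 10, 2))
print(len(dates))  # 5
```

The resulting list can be passed as the `dates` argument of `download_for_dates` or `load_data_for_dates`.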

`load_data_for_dates(dates, satellites=['A', 'B', 'C'])`

Load previously downloaded Swarm data for specific dates.

Parameters

dates : List[dt.date]. List of dates to load data for.
satellites : List[str], default ['A', 'B', 'C']. Which satellites to load data for.

Returns

dict. Satellite names as keys and concatenated DataFrames as values.
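The download and load methods can be combined: download once, then reload from local storage. The sketch below wires the two calls together using only the signatures documented above. To keep it runnable without MagGeo or network access, `FakeManager` is a stand-in written for this example; with the real library you would pass a `SwarmDataManager` instance instead.

```python
import datetime as dt
from typing import Dict, List, Sequence


def download_then_load(manager, dates: List[dt.date],
                       satellites: Sequence[str] = ("A", "B", "C")) -> Dict[str, object]:
    """Download data for `dates` (skipping existing files), then load it per satellite."""
    manager.download_for_dates(dates, resume=True)
    return manager.load_data_for_dates(dates, satellites=list(satellites))


class FakeManager:
    """Stand-in for this example only; mimics the documented method signatures."""

    def __init__(self):
        self.downloaded = set()

    def download_for_dates(self, dates, save_individual_files=True,
                           save_concatenated=True, resume=True):
        # With resume=True, dates that were already fetched are skipped.
        new = [d for d in dates if not (resume and d in self.downloaded)]
        self.downloaded.update(new)
        # The real method returns DataFrames; lists of dates stand in for them here.
        return tuple(list(new) for _ in "ABC")

    def load_data_for_dates(self, dates, satellites=("A", "B", "C")):
        available = [d for d in dates if d in self.downloaded]
        return {sat: list(available) for sat in satellites}


manager = FakeManager()
days = [dt.date(2014, 9, 8), dt.date(2014, 9, 9)]
data = download_then_load(manager, days, satellites=["A", "C"])
print(sorted(data))  # ['A', 'C']
```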

`load_concatenated_data(satellites=['A', 'B', 'C'])`

Load previously saved concatenated Swarm data.

Parameters

satellites : List[str], default ['A', 'B', 'C']. Which satellites to load data for.

Returns

dict. Satellite names as keys and DataFrames as values.

`get_data_summary()`

Get summary of available downloaded data.

Returns

pd.DataFrame. Summary of available data files with metadata.

`cleanup_data(older_than_days=None, quality_threshold='poor')`

Clean up downloaded data files.

Parameters

older_than_days : int, optional. Remove files older than this many days.
quality_threshold : str, default 'poor'. Remove files with data quality below this threshold.

Returns

int. Number of files removed.
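As an illustration of the `older_than_days` criterion, the sketch below removes files whose modification time is older than a cutoff and returns how many were deleted. It is a generic stand-in, not MagGeo's implementation: the function name, the use of modification time, and the `*.csv` pattern are assumptions, and the `quality_threshold` criterion is not modeled.

```python
import os
import tempfile
import time
from pathlib import Path


def remove_old_files(data_dir: Path, older_than_days: int, pattern: str = "*.csv") -> int:
    """Delete files matching `pattern` last modified more than `older_than_days` ago."""
    cutoff = time.time() - older_than_days * 86400
    removed = 0
    for path in data_dir.glob(pattern):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed += 1
    return removed


with tempfile.TemporaryDirectory() as tmp:
    tmp_dir = Path(tmp)
    old_file = tmp_dir / "old.csv"
    new_file = tmp_dir / "new.csv"
    old_file.write_text("x")
    new_file.write_text("x")
    # Backdate one file by 40 days so it falls behind the 30-day cutoff
    forty_days_ago = time.time() - 40 * 86400
    os.utime(old_file, (forty_days_ago, forty_days_ago))
    removed_count = remove_old_files(tmp_dir, older_than_days=30)
    remaining = sorted(p.name for p in tmp_dir.iterdir())

print(removed_count, remaining)  # 1 ['new.csv']
```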

Utility Functions

`download_swarm_data_for_trajectory(gps_df, data_dir='swarm_data', file_format='csv', token=None, resume=True)`

Convenience function to download Swarm data for a GPS trajectory.

Parameters

gps_df : pd.DataFrame. GPS trajectory data.
data_dir : str, default "swarm_data". Directory to store data.
file_format : str, default "csv". File format for saving data.
token : str, optional. VirES authentication token.
resume : bool, default True. Whether to resume from existing downloads.

Returns

tuple. Concatenated DataFrames for satellites A, B, and C.

`load_swarm_data(data_dir='swarm_data', file_format='csv', satellites=['A', 'B', 'C'])`

Convenience function to load previously downloaded Swarm data.

Parameters

data_dir : str, default "swarm_data". Directory containing the data.
file_format : str, default "csv". File format of the data.
satellites : List[str], default ['A', 'B', 'C']. Which satellites to load.

Returns

dict. Satellite names as keys and DataFrames as values.

Quick Reference

Core Functions

Function	Description
`annotate_gps_with_geomag`	Main function for annotating GPS trajectories
`download_swarm_data_for_trajectory`	Download Swarm data for a GPS trajectory
`load_swarm_data`	Load previously downloaded Swarm data

Classes

Class	Description
`SwarmDataManager`	Manages Swarm data downloading and storage

Modules

Module	Description
`parallel_processing`	Parallel processing utilities
`indices`	Geomagnetic indices and calculations

Architecture Overview

GPS Trajectory → MagGeo Core → SwarmDataManager → VirES API
     ↓              ↓              ↓
Input Data → Processing Pipeline → Local Storage
     ↓              ↓              ↓
Validation → Interpolation → Persistent Files
     ↓              ↓              ↓
Quality Check → CHAOS Model → Annotated Output

Data Flow

Input: GPS trajectory with coordinates and timestamps
Data Acquisition: Download Swarm satellite data via VirES API
Storage: Persist data locally for reuse (SwarmDataManager)
Processing: Interpolate magnetic field values to GPS locations
Enhancement: Add CHAOS model data and geomagnetic indices
Output: Annotated trajectory with comprehensive magnetic field information
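
The processing step estimates a field value at each GPS fix from nearby satellite measurements. This page does not specify the interpolation scheme, so the sketch below uses plain inverse-distance weighting over great-circle distance as a generic illustration. The function names, the power of 2, and the sample values are assumptions, and any weighting in time is ignored.

```python
import math
from typing import List, Tuple

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def idw(lat: float, lon: float,
        samples: List[Tuple[float, float, float]], power: float = 2.0) -> float:
    """Inverse-distance-weighted mean of (lat, lon, value) samples at a query point."""
    num = 0.0
    den = 0.0
    for s_lat, s_lon, value in samples:
        d = haversine_km(lat, lon, s_lat, s_lon)
        if d == 0.0:
            return value  # query coincides with a sample
        weight = 1.0 / d ** power
        num += weight * value
        den += weight
    if den == 0.0:
        raise ValueError("at least one sample is required")
    return num / den


# Two made-up samples at equal distance east and west of the query point,
# so the estimate is their mean.
samples = [(0.0, 1.0, 48000.0), (0.0, -1.0, 48200.0)]
estimate = idw(0.0, 0.0, samples)
print(round(estimate, 3))
```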

Performance Considerations

Use SwarmDataManager for repeated analysis of the same time periods
Enable parallel processing for large datasets (>10,000 GPS points)
Choose appropriate file formats: Parquet for performance, CSV for compatibility
Batch process multiple trajectories when possible

Error Handling

All MagGeo functions implement comprehensive error handling, but error messages are shown only in DEBUG mode. To enable it, pass the `--debug` flag when running scripts or set the environment variable `MAGGEO_DEBUG=1`:

MAGGEO_DEBUG=1 python -m maggeo.annotate_gps_with_geomag --debug
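
To launch the same command from a Python script, the variable can be set for the child process only, leaving the parent environment untouched. The helper `debug_command` is hypothetical; it only builds the command and environment, and the `subprocess.run` call is left commented out because it requires MagGeo to be installed.

```python
import os
import sys
from typing import Dict, List, Tuple


def debug_command() -> Tuple[List[str], Dict[str, str]]:
    """Build the command line shown above plus an environment with MAGGEO_DEBUG=1."""
    cmd = [sys.executable, "-m", "maggeo.annotate_gps_with_geomag", "--debug"]
    env = {**os.environ, "MAGGEO_DEBUG": "1"}  # copy, so os.environ is not modified
    return cmd, env


cmd, env = debug_command()
print(" ".join(cmd[1:]))
# With MagGeo installed, the command could then be run as:
# import subprocess
# subprocess.run(cmd, env=env, check=True)
```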
