Data Format Reference

This section documents the file formats used by OpenPlanetData datasets.

All JSON datasets share the same top-level structure:

{
  "version": "1.0.0",
  "generated_at": "2024-01-15T00:00:00Z",
  "license": "CC-BY-4.0",
  "data": [
    // Array of records
  ]
}
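
As a rough illustration of consuming this structure, the sketch below parses a document and separates the metadata keys from the records. The sample record and the helper name `load_envelope` are assumptions made for the example, not part of the OpenPlanetData tooling.

```python
import json

# Hypothetical sample document that follows the structure shown above.
SAMPLE = """
{
  "version": "1.0.0",
  "generated_at": "2024-01-15T00:00:00Z",
  "license": "CC-BY-4.0",
  "data": [
    {"alpha2": "US", "name": "United States"}
  ]
}
"""

REQUIRED_KEYS = ("version", "generated_at", "license", "data")


def load_envelope(text):
    """Parse a dataset document and return (metadata, records)."""
    doc = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in doc]
    if missing:
        raise ValueError(f"missing top-level keys: {missing}")
    if not isinstance(doc["data"], list):
        raise ValueError("'data' must be an array of records")
    # Everything except "data" is treated as metadata.
    metadata = {key: doc[key] for key in REQUIRED_KEYS if key != "data"}
    return metadata, doc["data"]


metadata, records = load_envelope(SAMPLE)
```

Checking the top-level keys before touching `data` means a truncated or unrelated file fails early with a readable error.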

CSV files include a header row with field names matching the JSON field names.

alpha2,alpha3,name,capital,region
US,USA,United States,Washington D.C.,Americas
FR,FRA,France,Paris,Europe
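
Because the header row carries the field names, the standard `csv.DictReader` can turn each row into a dictionary keyed the same way as the JSON records. The sketch below reuses the sample rows above; the helper name `read_records` is illustrative only.

```python
import csv
import io

# The sample rows shown above, inlined so the example is self-contained.
CSV_TEXT = """alpha2,alpha3,name,capital,region
US,USA,United States,Washington D.C.,Americas
FR,FRA,France,Paris,Europe
"""


def read_records(text):
    """Return one dict per data row, keyed by the header field names."""
    return list(csv.DictReader(io.StringIO(text)))


rows = read_records(CSV_TEXT)

# Index the rows by their two-letter code for quick lookups.
by_alpha2 = {row["alpha2"]: row for row in rows}
```

Every value read from CSV is a string, so numeric or boolean fields need explicit conversion after loading.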

Parquet files use the same schema as JSON with appropriate type mappings:

JSON Type        Parquet Type
string           UTF8
number (int)     INT64
number (float)   DOUBLE
boolean          BOOLEAN
array            LIST
object           STRUCT
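
The mapping can be sketched as a small lookup over parsed JSON values. This is only an illustration of the table above, not the project's conversion code: the function name `parquet_type_for` is hypothetical, and JSON `null` is rejected because the table does not say how it is handled.

```python
import json


def parquet_type_for(value):
    """Return the Parquet type name the table above lists for a JSON value."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "BOOLEAN"
    if isinstance(value, str):
        return "UTF8"
    if isinstance(value, int):
        return "INT64"
    if isinstance(value, float):
        return "DOUBLE"
    if isinstance(value, list):
        return "LIST"
    if isinstance(value, dict):
        return "STRUCT"
    raise TypeError(f"no mapping listed for {type(value).__name__}")


# Hypothetical record used only to exercise each row of the table.
record = json.loads(
    '{"name": "France", "population": 68000000, "area": 551695.5,'
    ' "landlocked": false, "languages": ["fr"], "coords": {"lat": 46.0}}'
)
schema = {field: parquet_type_for(value) for field, value in record.items()}
```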

Datasets follow semantic versioning:

  • MAJOR - Breaking changes to schema or data format
  • MINOR - New fields or data additions
  • PATCH - Bug fixes and corrections
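
A consumer can use these rules to decide how cautiously to treat an update. The sketch below compares two version strings and reports which component changed first. It assumes plain `MAJOR.MINOR.PATCH` strings without pre-release or build suffixes, and the function names are illustrative.

```python
def parse_version(version):
    """Split 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def change_kind(old, new):
    """Return 'major', 'minor', 'patch', or 'none' for a version change."""
    old_v, new_v = parse_version(old), parse_version(new)
    for label, before, after in zip(("major", "minor", "patch"), old_v, new_v):
        if before != after:
            return label
    return "none"


def is_breaking(old, new):
    """A MAJOR change signals a breaking schema or format change."""
    return change_kind(old, new) == "major"
```

For example, moving from `1.0.0` to `1.1.0` is reported as `minor`, so existing fields should keep working, while `1.4.2` to `2.0.0` is reported as `major`.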

All releases include SHA256 checksums:

# Verify download integrity
sha256sum -c checksums.txt
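
Where `sha256sum` is not available, the same check can be approximated with Python's standard library. The sketch assumes `checksums.txt` uses the usual `sha256sum` layout (hex digest, whitespace, file name, with an optional `*` before the name) and that the listed files sit next to it. The function names are illustrative.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_checksums(text):
    """Map file name -> expected digest from sha256sum-style lines."""
    expected = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, name = line.split(maxsplit=1)
        # A leading '*' marks binary mode in sha256sum output.
        expected[name.lstrip("*")] = digest.lower()
    return expected


def verify(checksum_file):
    """Return {file name: True/False} for every entry in the checksum file."""
    checksum_file = Path(checksum_file)
    expected = parse_checksums(checksum_file.read_text())
    return {
        name: sha256_of(checksum_file.parent / name) == digest
        for name, digest in expected.items()
    }
```

Calling `verify("checksums.txt")` returns a per-file result, so a single corrupted download can be re-fetched without repeating the others.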

When accessing releases through the GitHub API, GitHub's rate limits apply:

  • Unauthenticated: 60 requests/hour
  • Authenticated: 5000 requests/hour

For high-volume access, download datasets and host locally.
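
A client that does call the GitHub API can stay within these limits by sending a token and honouring the rate-limit response headers. The sketch below assumes the `x-ratelimit-remaining` and `x-ratelimit-reset` headers that GitHub's REST API returns. The helper names are hypothetical, and no network request is made here.

```python
import time


def build_headers(token=None):
    """Request headers; supplying a token selects the authenticated limit."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def seconds_until_reset(response_headers, now=None):
    """Return how long to wait before the next request (0.0 if quota remains)."""
    # Header names are case-insensitive, so normalise them first.
    headers = {key.lower(): value for key, value in response_headers.items()}
    remaining = int(headers.get("x-ratelimit-remaining", "1"))
    if remaining > 0:
        return 0.0
    now = time.time() if now is None else now
    # x-ratelimit-reset is a UTC timestamp in epoch seconds.
    reset_at = float(headers.get("x-ratelimit-reset", now))
    return max(0.0, reset_at - now)
```

Sleeping for the returned number of seconds before retrying avoids spending requests that would be rejected anyway.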