Downloads and stacks EPA air quality data for specified parameters with progress tracking.

Usage

get_epa_airdata(
  analyte,
  start_year,
  end_year,
  freq,
  output_dir = "data/",
  prompt_download = FALSE,
  archive = FALSE,
  archive_id = NULL
)

Arguments

analyte

Character string specifying the EPA analyte code (e.g., "88101" for PM2.5)

start_year

Numeric value for the starting year of data collection

end_year

Numeric value for the ending year of data collection

freq

Character string specifying the temporal frequency of the data (e.g., "daily", "hourly", "annual")

output_dir

Character string specifying the directory for downloaded files. Defaults to "data/"

prompt_download

Logical. If TRUE, the user is prompted for confirmation before downloading. Defaults to FALSE.

archive

Logical. If TRUE, the function retrieves data from the Wayback Machine (Internet Archive) rather than the live EPA AirData website. Defaults to FALSE.

archive_id

Character. The timestamp ID of the archived snapshot of the EPA AirData website on the Wayback Machine (only used if archive = TRUE). If NULL (the default), the snapshot "20250126115248" is used.
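
To make the effect of archive and archive_id more concrete, here is a minimal sketch of how a download URL might be assembled for each case. The helper build_airdata_url() is hypothetical and not part of this package, and the base URL and the "<freq>_<analyte>_<year>.zip" file pattern are assumptions made for illustration; only the Wayback Machine address layout (/web/<timestamp>/<original URL>) is standard.

```r
# Hypothetical helper, NOT part of the package. The base URL and file
# naming pattern below are assumptions made for illustration only.
build_airdata_url <- function(analyte, year, freq,
                              archive = FALSE,
                              archive_id = "20250126115248") {
  live_url <- sprintf(
    "https://aqs.epa.gov/aqsweb/airdata/%s_%s_%d.zip",
    freq, analyte, as.integer(year)
  )
  if (!archive) {
    return(live_url)
  }
  # Wayback Machine snapshots are addressed as /web/<timestamp>/<original URL>
  paste0("https://web.archive.org/web/", archive_id, "/", live_url)
}

# Live site versus archived snapshot for the same file
build_airdata_url("88101", 2020, "daily")
build_airdata_url("88101", 2020, "daily", archive = TRUE)
```

Under these assumptions the archived URL is simply the live URL prefixed with the snapshot timestamp, so the same analyte, year, and freq values apply in both modes.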

Value

A data frame containing the stacked EPA air quality data, or NULL if no data are found or the download is cancelled.

Details

When prompt_download = TRUE, the function asks for confirmation before downloading. It displays a progress bar during the download process and creates the output directory if it doesn't exist.
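
The behaviour described here (optional confirmation, directory creation, and a progress bar) can be sketched with base R alone. This is an illustrative pattern under stated assumptions, not the package's actual implementation: confirm_download() is a hypothetical helper, the temporary directory stands in for output_dir, and the loop body only pauses where a real download would happen.

```r
# Illustrative only: the package's internals may differ.

# Hypothetical helper: ask for confirmation only when requested and
# only in an interactive session; otherwise proceed.
confirm_download <- function(prompt_download) {
  if (!prompt_download || !interactive()) {
    return(TRUE)
  }
  answer <- readline("Download files? [y/n]: ")
  tolower(trimws(answer)) %in% c("y", "yes")
}

output_dir <- file.path(tempdir(), "epa_demo")  # stand-in for output_dir
years <- 2020:2023                              # start_year:end_year

if (confirm_download(prompt_download = FALSE)) {
  # Create the output directory if it does not exist yet
  if (!dir.exists(output_dir)) {
    dir.create(output_dir, recursive = TRUE)
  }

  # Advance the progress bar by one step per year
  pb <- utils::txtProgressBar(min = 0, max = length(years), style = 3)
  for (i in seq_along(years)) {
    # A real implementation would download and read the file for years[i] here
    Sys.sleep(0.05)
    utils::setTxtProgressBar(pb, i)
  }
  close(pb)
}
```

Advancing the bar once per year keeps the progress display proportional to the number of files requested, which is one reasonable design when each year corresponds to a single download.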

Examples

if (FALSE) {
# Download PM2.5 data
pm25_data <- get_epa_airdata(
  analyte = "88101",
  start_year = 2020,
  end_year = 2023,
  freq = "daily",
  output_dir = "path/to/my/data/"
)
}