USGS

The United States Geological Survey’s (USGS) National Water Information System (NWIS) provides several categories of water data for sites across the US. This includes surface water and groundwater data covering physical, chemical, and pollution variables. searvey accesses this data through the NWIS REST API via the dataretrieval package. Currently, only elevation and flow rate data are exposed in searvey.

The get_usgs_stations() function returns a list of USGS stations and supports several subsetting options.

searvey.usgs.get_usgs_stations(region=None, lon_min=None, lon_max=None, lat_min=None, lat_max=None, bbox=None, site_nos=None, include_parameter_availability=False, api_key=None)

Return USGS station metadata using the modernized Water Data API.

Three query modes:

  1. By site numbers (site_nos): Fast direct query for specific stations.

  2. By bounding box (bbox): Fast direct query for a geographic region. Format: [lon_min, lat_min, lon_max, lat_max].

  3. By region/lon/lat (legacy): Fetches all US stations (cached), then filters in-memory by region or bounding box corners.

These modes are mutually exclusive.

Note: The longitudes of the USGS stations are in the [-180, 180] range.
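Because station longitudes are reported in the [-180, 180] range, corner coordinates taken from a 0 to 360 grid need converting before they are used in a bounding box. The following is a minimal sketch of that conversion. wrap_longitude and make_bbox are hypothetical helpers, not part of searvey; only the [lon_min, lat_min, lon_max, lat_max] ordering comes from the description above.

```python
from __future__ import annotations


def wrap_longitude(lon: float) -> float:
    """Map a longitude in degrees onto [-180, 180).

    Note: +180 is mapped to -180, which denotes the same meridian.
    """
    return ((lon + 180.0) % 360.0) - 180.0


def make_bbox(lon_min: float, lat_min: float, lon_max: float, lat_max: float) -> list[float]:
    """Build a bbox list in the [lon_min, lat_min, lon_max, lat_max] order."""
    lon_min, lon_max = wrap_longitude(lon_min), wrap_longitude(lon_max)
    if lon_min >= lon_max:
        raise ValueError("lon_min must be smaller than lon_max after wrapping")
    if lat_min >= lat_max:
        raise ValueError("lat_min must be smaller than lat_max")
    return [lon_min, lat_min, lon_max, lat_max]


# Corners given on a 0..360 longitude grid (illustrative values).
bbox = make_bbox(285.0, 38.0, 286.0, 40.0)
print(bbox)  # [-75.0, 38.0, -74.0, 40.0]
```

The resulting list can be passed as the bbox argument. This sketch rejects boxes that cross the antimeridian, since the source does not say how such boxes are handled.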

Parameters:
  • region (MultiPolygon | Polygon | None) – Polygon or MultiPolygon denoting the region of interest (legacy mode).

  • lon_min (float | None) – The minimum Longitude of the Bounding Box (legacy mode).

  • lon_max (float | None) – The maximum Longitude of the Bounding Box (legacy mode).

  • lat_min (float | None) – The minimum Latitude of the Bounding Box (legacy mode).

  • lat_max (float | None) – The maximum Latitude of the Bounding Box (legacy mode).

  • bbox (list[float] | None) – Direct API bounding box as [lon_min, lat_min, lon_max, lat_max].

  • site_nos (list[str] | None) – List of USGS site numbers for direct lookup.

  • include_parameter_availability (bool) – If True, query which parameters (water_level, temperature, salinity, currents) are available at each station. Adds columns: has_water_level, has_temperature, has_salinity, has_currents. This requires additional API calls. Default False.

  • api_key (str | None) – USGS API key for higher rate limits when querying parameter availability.

Returns:

GeoDataFrame – geopandas.GeoDataFrame with the station metadata
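The three query modes can be sketched as follows. select_query_mode is a hypothetical helper that only mirrors the "mutually exclusive" rule stated above; searvey's own validation may behave differently. The site number and coordinates are placeholder values, and example_queries is not called here because it needs searvey, shapely, and network access.

```python
from __future__ import annotations


def select_query_mode(site_nos=None, bbox=None, region=None,
                      lon_min=None, lon_max=None, lat_min=None, lat_max=None) -> str:
    """Return the name of the single query mode implied by the arguments.

    Hypothetical helper, not part of searvey.
    """
    legacy_args = (region, lon_min, lon_max, lat_min, lat_max)
    requested = []
    if site_nos is not None:
        requested.append("site_nos")
    if bbox is not None:
        requested.append("bbox")
    if any(arg is not None for arg in legacy_args):
        requested.append("legacy")
    if len(requested) > 1:
        raise ValueError(f"query modes are mutually exclusive, got: {requested}")
    # Assumption: a call without any arguments falls through to legacy mode.
    return requested[0] if requested else "legacy"


def example_queries():
    """One call per query mode; requires searvey, shapely and network access."""
    from searvey import usgs
    from shapely.geometry import box

    # Mode 1: direct lookup by site number (placeholder value).
    by_site = usgs.get_usgs_stations(site_nos=["01646500"])
    # Mode 2: direct bounding-box query, [lon_min, lat_min, lon_max, lat_max].
    by_bbox = usgs.get_usgs_stations(bbox=[-75.0, 38.0, -74.0, 40.0])
    # Mode 3 (legacy): filter the cached station list by a polygon.
    by_region = usgs.get_usgs_stations(region=box(-75.0, 38.0, -74.0, 40.0))
    return by_site, by_bbox, by_region


print(select_query_mode(bbox=[-75.0, 38.0, -74.0, 40.0]))  # bbox
```

Checking the mode up front is a design choice for this sketch: it surfaces a conflicting combination of arguments before any request is sent.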

The station data can be retrieved with the get_usgs_data() function:

searvey.usgs.get_usgs_data(usgs_metadata, endtime='now', period=1, rate_limit=None, api_key=None)

Return the data of the stations specified in usgs_metadata as an xr.Dataset.

Uses the modernized Water Data API which returns continuous/instantaneous data (typically 15-minute interval measurements).

Parameters:
  • usgs_metadata (DataFrame) – A pd.DataFrame returned by get_usgs_stations.

  • endtime (str | date | datetime | Timestamp) – The end date of the requested data. Defaults to 'now'.

  • period (float) – The number of days to be requested.

  • rate_limit (RateLimit | None) – Rate limit to apply. If None, auto-configures based on API key.

  • api_key (str | None) – USGS API key for higher rate limits. Falls back to API_USGS_PAT env var.

Returns:

Dataset – xr.Dataset of station measurements
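A minimal sketch of how endtime, period, and api_key fit together is shown below. request_window and resolve_api_key are hypothetical helpers, not part of searvey. They assume that the requested window spans period days ending at endtime, and that the environment fallback simply reads API_USGS_PAT. example_fetch is not called here because it needs searvey and network access, and its coordinates and dates are placeholder values.

```python
from __future__ import annotations

import os
from datetime import datetime, timedelta


def request_window(endtime: datetime, period: float) -> tuple[datetime, datetime]:
    """Return (start, end), assuming the window covers `period` days up to `endtime`."""
    return endtime - timedelta(days=period), endtime


def resolve_api_key(api_key: str | None = None) -> str | None:
    """Prefer an explicit key, else fall back to the API_USGS_PAT env var."""
    return api_key if api_key is not None else os.environ.get("API_USGS_PAT")


def example_fetch():
    """Fetch two days of data for stations in a bbox; requires searvey and network."""
    from searvey import usgs

    stations = usgs.get_usgs_stations(bbox=[-75.0, 38.0, -74.0, 40.0])
    return usgs.get_usgs_data(
        usgs_metadata=stations,
        endtime=datetime(2024, 1, 8),
        period=2,
        api_key=resolve_api_key(),
    )


start, end = request_window(datetime(2024, 1, 8), period=1.5)
print(start, end)  # 2024-01-06 12:00:00 2024-01-08 00:00:00
```

Since period is a float, fractional days such as 1.5 are valid, which the window calculation above reflects.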