IOC
The Sea Level Station Monitoring Facility website provides operational monitoring of sea level measuring stations across the globe on behalf of the Intergovernmental Oceanographic Commission (IOC), aggregating data from more than 170 providers.
A DataFrame with the IOC station metadata can be retrieved with get_ioc_stations(),
while the station data can be fetched with fetch_ioc_station():
- searvey.get_ioc_stations(region=None, lon_min=None, lon_max=None, lat_min=None, lat_max=None)
Return IOC station metadata from: http://www.ioc-sealevelmonitoring.org/list.php?showall=all
If region is defined, then the stations outside of the region are filtered out. If the coordinates of the Bounding Box are defined, then the stations outside of the BBox are filtered out. If both region and the Bounding Box are defined, then an exception is raised.
Note: The longitudes of the IOC stations are in the [-180, 180] range.
- Parameters:
region (MultiPolygon | Polygon | None) – Polygon or MultiPolygon denoting region of interest.
lon_min (float | None) – The minimum Longitude of the Bounding Box.
lon_max (float | None) – The maximum Longitude of the Bounding Box.
lat_min (float | None) – The minimum Latitude of the Bounding Box.
lat_max (float | None) – The maximum Latitude of the Bounding Box.
- Returns:
GeoDataFrame – pandas.DataFrame with the station metadata.
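The filtering rules described above can be sketched in plain Python. This is only an illustration of the documented behavior, not searvey's implementation: `filter_stations` is a hypothetical helper, `region` is modeled as a hypothetical `(lon, lat) -> bool` callable standing in for a polygon containment test, the exception type (`ValueError`) and the inclusive bounds are assumptions, and the station records are made up.

```python
def filter_stations(stations, region=None, lon_min=None, lon_max=None,
                    lat_min=None, lat_max=None):
    """Filter station records by a region or by a bounding box, never both."""
    bbox_given = any(
        bound is not None for bound in (lon_min, lon_max, lat_min, lat_max)
    )
    # Assumption: the real function may raise a different exception type.
    if region is not None and bbox_given:
        raise ValueError("Pass either `region` or bounding-box limits, not both.")

    def inside(station):
        lon, lat = station["lon"], station["lat"]
        if region is not None:
            # Hypothetical stand-in for a Polygon/MultiPolygon containment test.
            return region(lon, lat)
        # Unset bounds are ignored; set bounds are treated as inclusive.
        return (
            (lon_min is None or lon >= lon_min)
            and (lon_max is None or lon <= lon_max)
            and (lat_min is None or lat >= lat_min)
            and (lat_max is None or lat <= lat_max)
        )

    return [station for station in stations if inside(station)]


# Made-up records for illustration only; these are not real IOC stations.
STATIONS = [
    {"ioc_code": "demo_a", "lon": 23.7, "lat": 37.9},
    {"ioc_code": "demo_b", "lon": -70.2, "lat": -33.0},
    {"ioc_code": "demo_c", "lon": 151.2, "lat": -33.9},
]

selected = filter_stations(STATIONS, lon_min=-10, lon_max=40, lat_min=30, lat_max=70)
print([station["ioc_code"] for station in selected])
```

Longitudes in the sketch follow the [-180, 180] convention noted above, so a bounding box in the western hemisphere uses negative values.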
- searvey.fetch_ioc_station(station_id, start_date=None, end_date=None, *, rate_limit=None, http_client=None, multiprocessing_executor=None, multithreading_executor=None, progress_bar=False)
Make a query to the IOC API for tide gauge data for station_id and return the results as a pandas.DataFrame.
    fetch_ioc_station("acap2")
    fetch_ioc_station("acap2", start_date="2023-01-01", end_date="2023-01-02")
start_date and end_date can be of any type that is valid for pandas.to_datetime(). If start_date or end_date are timezone-aware timestamps, they are coerced to UTC. The returned data are always in UTC.
Each query to the IOC API can request up to 30 days of data. When data are requested for larger time spans, multiple requests are made. This is where rate_limit, multiprocessing_executor and multithreading_executor come into play.
In order to make the data retrieval more efficient, a multithreading pool is spawned and the requests are executed concurrently, while adhering to the rate_limit. Parsing the JSON responses is CPU-heavy, so it is done within a multiprocessing pool.
If no arguments are specified, then sensible defaults are used, but if the pools need to be configured, an executor instance needs to be passed as an argument. For example:
    import concurrent.futures
    executor = concurrent.futures.ProcessPoolExecutor(max_workers=4)
    df = fetch_ioc_station("acap", multiprocessing_executor=executor)
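To make the 30-day limit concrete, here is a minimal sketch of how a longer time span could be split into consecutive windows, each of which would then map to one request. The helper `split_into_windows` is hypothetical and uses only the standard library; how searvey actually splits the span (for example, how it aligns window boundaries) is not stated in this page.

```python
from datetime import datetime, timedelta, timezone

MAX_DAYS_PER_REQUEST = 30  # the per-query limit mentioned in the text


def split_into_windows(start, end, max_days=MAX_DAYS_PER_REQUEST):
    """Split [start, end) into consecutive windows of at most `max_days` days."""
    if start >= end:
        raise ValueError("`start` must be earlier than `end`")
    step = timedelta(days=max_days)
    windows = []
    window_start = start
    while window_start < end:
        # The last window is shortened so it never runs past `end`.
        window_end = min(window_start + step, end)
        windows.append((window_start, window_end))
        window_start = window_end
    return windows


start = datetime(2023, 1, 1, tzinfo=timezone.utc)
end = datetime(2023, 3, 15, tzinfo=timezone.utc)
windows = split_into_windows(start, end)
for window_start, window_end in windows:
    print(window_start.date(), "->", window_end.date())
```

Each resulting window is a candidate for one concurrent request, which is why the rate limit and the executors matter only for spans longer than 30 days.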
- Parameters:
station_id (str) – The station identifier. In IOC terminology, this is called ioc_code.
start_date (str | date | Timestamp | datetime | datetime64 | None) – The starting date of the query. Defaults to 7 days ago.
end_date (str | date | Timestamp | datetime | datetime64 | None) – The finishing date of the query. Defaults to “now”.
rate_limit (RateLimit | None) – The rate limit for making requests to the IOC servers. Defaults to 5 requests/second.
http_client (Client | None) – The httpx.Client. Can be used to set up e.g. an HTTP proxy.
multiprocessing_executor (ExecutorProtocol | None) – An instance of a class implementing the concurrent.futures.Executor API.
multithreading_executor (ExecutorProtocol | None) – An instance of a class implementing the concurrent.futures.Executor API.
progress_bar (bool) – If True then a progress bar is displayed for monitoring the progress of the outgoing requests.
- Returns:
DataFrame – pandas.DataFrame with the station data.
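The "sensible defaults unless an executor is passed" pattern behind multithreading_executor can be illustrated with the standard library alone. This is a sketch of the general pattern, not searvey's code: `fake_request` and `run_requests` are hypothetical names, the default of 4 worker threads is an assumption, and rate limiting and the separate multiprocessing step for parsing are left out for brevity.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_request(window_index):
    """Stand-in for one HTTP request; returns a placeholder payload."""
    return {"window": window_index, "records": []}


def run_requests(window_indices, multithreading_executor=None):
    """Run one request per window, on the caller's executor if one is given."""
    if multithreading_executor is not None:
        # The caller owns the executor, so it is not shut down here.
        return list(multithreading_executor.map(fake_request, window_indices))
    # Assumed default: a small, short-lived thread pool.
    with ThreadPoolExecutor(max_workers=4) as default_executor:
        return list(default_executor.map(fake_request, window_indices))


default_results = run_requests(range(5))

with ThreadPoolExecutor(max_workers=2) as custom_executor:
    custom_results = run_requests(range(5), multithreading_executor=custom_executor)

print(default_results == custom_results)
```

`Executor.map` returns results in the order of its inputs, so the output order does not depend on which thread finishes first.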
Deprecated API
- searvey.get_ioc_data(ioc_metadata, endtime='now', period=1, truncate_seconds=True, rate_limit=<searvey.rate_limit.RateLimit object>, disable_progress_bar=False)
Deprecated since version 0.4.0: Use fetch_ioc_station() instead.
Return the data of the stations specified in ioc_metadata as an xr.Dataset.
truncate_seconds needs some explaining. IOC has more than 1000 stations. When you retrieve data from all (or at least most of) these stations, you end up with thousands of timestamps that only contain a single datapoint. This means that the returned xr.Dataset will contain a huge number of NaN values, which in turn means that you will need a huge amount of RAM.
In order to reduce the amount of required RAM, we reduce the number of timestamps by truncating the seconds. This is how it works:
    2014-01-03 14:53:02 -> 2014-01-03 14:53:00
    2014-01-03 14:53:32 -> 2014-01-03 14:53:00
    2014-01-03 14:53:48 -> 2014-01-03 14:53:00
    2014-01-03 14:54:09 -> 2014-01-03 14:54:00
    2014-01-03 14:54:48 -> 2014-01-03 14:54:00
Nevertheless, this approach has a downside. If a station returns multiple datapoints within the same minute, then we end up with duplicate timestamps. When this happens, we only keep the first datapoint and drop the subsequent ones. So you may not retrieve all of the available data.
If you don’t want this behavior, set truncate_seconds to False and you will retrieve the full data.
- Parameters:
ioc_metadata (DataFrame) – A pd.DataFrame returned by get_ioc_stations.
endtime (str | date | datetime | Timestamp) – The date of the “end” of the data. Defaults to datetime.date.today().
period (float) – The number of days to be requested. IOC does not support values greater than 30.
truncate_seconds (bool) – If True then timestamps are truncated to minutes (seconds are dropped).
rate_limit (RateLimit) – The default rate limit is 5 requests/second.
disable_progress_bar (bool) – If True then the progress bar is not displayed.
- Returns:
Dataset – An xr.Dataset with the station data.
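The effect of truncate_seconds, including the data loss it can cause, can be reproduced with a few lines of standard-library Python. This is a sketch of the described behavior only: `truncate_and_dedupe` is a hypothetical helper, the timestamps are the ones from the example above, and the sea level values are made up.

```python
from datetime import datetime


def truncate_and_dedupe(records):
    """Truncate timestamps to the minute and keep the first value per minute.

    `records` is an iterable of (timestamp, value) pairs, assumed to be
    sorted by timestamp.
    """
    kept = {}
    for timestamp, value in records:
        minute = timestamp.replace(second=0, microsecond=0)
        # setdefault stores the value only for the first record of each minute.
        kept.setdefault(minute, value)
    return list(kept.items())


# Timestamps from the example above; the values are made up for illustration.
records = [
    (datetime(2014, 1, 3, 14, 53, 2), 0.1),
    (datetime(2014, 1, 3, 14, 53, 32), 0.2),
    (datetime(2014, 1, 3, 14, 53, 48), 0.3),
    (datetime(2014, 1, 3, 14, 54, 9), 0.4),
    (datetime(2014, 1, 3, 14, 54, 48), 0.5),
]

truncated = truncate_and_dedupe(records)
for minute, value in truncated:
    print(minute, value)
```

Five input datapoints collapse into two, one per minute, which shows why setting truncate_seconds to False is needed when every datapoint matters.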