Access statistics at European level through the Eurostat API.
Eurostat is the statistical office of the European Union. While statistical authorities in Member States collect and analyse data, Eurostat’s role is to consolidate the data and ensure they are comparable. It provides statistics at European level that enable comparisons between countries and regions. From EU policies, economy and finance to social conditions and the environment, Eurostat is a powerful source that consolidates data using a harmonized methodology.
Eurostat: https://ec.europa.eu/eurostat/fr/about/overview
The main functions of the eurostat R package are detailed in this course, and some examples are provided.
The function get_eurostat_toc() downloads the table of contents of the Eurostat datasets.
# Load the package
library(eurostat)
library(rvest)
# Get Eurostat data listing
toc <- get_eurostat_toc()
title | code | type | last update of data | last table structure change | data start | data end | values |
---|---|---|---|---|---|---|---|
Database by themes | data | folder | NA | NA | NA | NA | NA |
General and regional statistics | general | folder | NA | NA | NA | NA | NA |
European and national indicators for short-term analysis | euroind | folder | NA | NA | NA | NA | NA |
Business and consumer surveys (source: DG ECFIN) | ei_bcs | folder | NA | NA | NA | NA | NA |
Consumer surveys (source: DG ECFIN) | ei_bcs_cs | folder | NA | NA | NA | NA | NA |
Consumers - monthly data | ei_bsco_m | dataset | 29.10.2020 | 29.10.2020 | 1980M01 | 2020M10 | NA |
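Since the table of contents is an ordinary data frame, it can be filtered with base R before anything is downloaded. The following is a minimal sketch rather than the package's own workflow: it rebuilds three of the rows shown above by hand in `toc_sample` (a hypothetical name, used so that the example runs without a network call) and keeps only the entries of type "dataset".

```r
# Hand-built stand-in for part of the output of get_eurostat_toc();
# the rows are copied from the listing above (assumption: only three
# of the columns are needed for this illustration).
toc_sample <- data.frame(
  title = c("Business and consumer surveys (source: DG ECFIN)",
            "Consumer surveys (source: DG ECFIN)",
            "Consumers - monthly data"),
  code  = c("ei_bcs", "ei_bcs_cs", "ei_bsco_m"),
  type  = c("folder", "folder", "dataset"),
  stringsAsFactors = FALSE
)

# Keep only entries of type "dataset"; folders only group other entries
datasets_only <- subset(toc_sample, type == "dataset")
print(datasets_only$code)
```

With the real table of contents, the same `subset()` call would apply to `toc` directly.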
With search_eurostat() you can search the table of contents for particular patterns, e.g. all datasets related to passenger transport. The type argument of this function restricts the search to, for instance, datasets or tables.
# info about passengers
search_eurostat("passenger transport")
title | code | type | last update of data | last table structure change | data start | data end | values |
---|---|---|---|---|---|---|---|
Volume of passenger transport relative to GDP | tran_hv_pstra | dataset | 01.09.2020 | 31.08.2020 | 1990 | 2018 | NA |
Modal split of passenger transport | tran_hv_psmod | dataset | 01.09.2020 | 31.08.2020 | 1990 | 2018 | NA |
Air passenger transport by reporting country | avia_paoc | dataset | 20.11.2020 | 09.10.2020 | 1993 | 2020Q3 | NA |
Air passenger transport by main airports in each reporting country | avia_paoa | dataset | 20.11.2020 | 09.10.2020 | 1993 | 2020Q3 | NA |
Air passenger transport between reporting countries | avia_paocc | dataset | 28.10.2020 | 09.10.2020 | 1993 | 2020Q3 | NA |
Air passenger transport between main airports in each reporting country and partner reporting countries | avia_paoac | dataset | 20.11.2020 | 09.10.2020 | 1993 | 2020Q3 | NA |
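Conceptually, this kind of search is a pattern match on the title column, optionally restricted by type. The sketch below illustrates the idea on a hand-built excerpt of the listings above. `search_toc()` and `toc_sample` are hypothetical names written for this example, not part of the eurostat package, whose own implementation may differ.

```r
# Hand-built excerpt of the table of contents (rows copied from the
# listings above), so that the example runs without a network call.
toc_sample <- data.frame(
  title = c("Modal split of passenger transport",
            "Air passenger transport by reporting country",
            "Consumers - monthly data"),
  code  = c("tran_hv_psmod", "avia_paoc", "ei_bsco_m"),
  type  = c("dataset", "dataset", "dataset"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: literal (non-regex) match on the title,
# restricted to one entry type.
search_toc <- function(toc, pattern, type = "dataset") {
  hit <- grepl(pattern, toc$title, fixed = TRUE) & toc$type == type
  toc[hit, , drop = FALSE]
}

res <- search_toc(toc_sample, "passenger transport")
print(res$code)
```

Matching literally (`fixed = TRUE`) is a design choice for this sketch: it avoids surprises when a search term contains characters that are special in regular expressions.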
Once you have found the dataset you are looking for, you can store its id in a variable of your choice.
id <- search_eurostat("Modal split of passenger transport",
type = "table")$code[1]
print(id)
[1] "t2020_rk310"
The function get_eurostat() takes the id of a dataset as input and returns the data from that dataset. The str() function lets you inspect the structure of the downloaded dataset.
dat <- get_eurostat(id)
str(dat)
tibble [2,798 × 5] (S3: tbl_df/tbl/data.frame)
$ unit : chr [1:2798] "PC" "PC" "PC" "PC" ...
$ vehicle: chr [1:2798] "BUS_TOT" "BUS_TOT" "BUS_TOT" "BUS_TOT" ...
$ geo : chr [1:2798] "AT" "BE" "CH" "DE" ...
$ time : Date[1:2798], format: "1990-01-01" ...
$ values : num [1:2798] 8.2 10.6 3.7 9.1 11.3 32.4 14.9 13.5 6 24.8 ...
unit | vehicle | geo | time | values |
---|---|---|---|---|
PC | BUS_TOT | AT | 1990-01-01 | 8.2 |
PC | BUS_TOT | BE | 1990-01-01 | 10.6 |
PC | BUS_TOT | CH | 1990-01-01 | 3.7 |
PC | BUS_TOT | DE | 1990-01-01 | 9.1 |
PC | BUS_TOT | DK | 1990-01-01 | 11.3 |
PC | BUS_TOT | EL | 1990-01-01 | 32.4 |
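Because the result is a regular data frame (a tibble), the usual R tools apply to it. As a minimal sketch, the block below rebuilds the six rows shown above in `dat_sample` (a hypothetical name; a real call to get_eurostat() would return all 2,798 rows) and selects the countries where the bus share exceeds 10 percent in 1990, sorted from highest to lowest.

```r
# Hand-built copy of the six rows displayed above (assumption: this
# small excerpt stands in for the full downloaded dataset).
dat_sample <- data.frame(
  unit    = "PC",
  vehicle = "BUS_TOT",
  geo     = c("AT", "BE", "CH", "DE", "DK", "EL"),
  time    = as.Date("1990-01-01"),
  values  = c(8.2, 10.6, 3.7, 9.1, 11.3, 32.4),
  stringsAsFactors = FALSE
)

# Keep the rows with a bus share above 10 percent
high_bus_share <- subset(dat_sample, vehicle == "BUS_TOT" & values > 10)

# Sort from the highest share to the lowest
high_bus_share <- high_bus_share[order(-high_bus_share$values), ]
print(high_bus_share[, c("geo", "values")])
```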
You can add filters to download only a specific part of the dataset.
By default, variables are returned as Eurostat codes; to get human-readable labels instead, use the type = "label" argument.
datl <- get_eurostat(id, filters = list(geo = c("EU28", "FI"),
lastTimePeriod = 1),
type = "label", time_format = "num")
unit | vehicle | geo | time | values |
---|---|---|---|---|
Percentage | Motor coaches, buses and trolley buses | European Union - 28 countries (2013-2020) | 2018 | 8.7 |
Percentage | Motor coaches, buses and trolley buses | Finland | 2018 | 10.1 |
Percentage | Passenger cars | European Union - 28 countries (2013-2020) | 2018 | 83.3 |
Percentage | Passenger cars | Finland | 2018 | 84.2 |
Percentage | Trains | European Union - 28 countries (2013-2020) | 2018 | 8.0 |
Percentage | Trains | Finland | 2018 | 5.7 |
As we can see, we now have the percentage share of each transport mode for Finland compared with the European Union (28 countries) in 2018.
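The comparison can be made explicit by putting the two areas side by side and computing the gap for each transport mode. This is a minimal sketch under stated assumptions: `datl_sample` is a hypothetical hand-built copy of the six rows in the table above, and the long EU label is shortened to "EU28" to keep the column names readable.

```r
# Hand-built copy of the filtered, labelled result shown above
# (assumption: the geo label for the EU is abbreviated to "EU28").
datl_sample <- data.frame(
  vehicle = rep(c("Motor coaches, buses and trolley buses",
                  "Passenger cars", "Trains"), each = 2),
  geo     = rep(c("EU28", "Finland"), times = 3),
  values  = c(8.7, 10.1, 83.3, 84.2, 8.0, 5.7),
  stringsAsFactors = FALSE
)

# One row per transport mode, one column of values per area
wide <- reshape(datl_sample, idvar = "vehicle", timevar = "geo",
                direction = "wide")

# Gap between Finland and the EU28, in percentage points
wide$diff_fi_eu <- round(wide$values.Finland - wide$values.EU28, 1)
print(wide[, c("vehicle", "diff_fi_eu")])
```

A positive gap means the mode has a larger share in Finland than in the EU28 as a whole, and a negative gap means a smaller one.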
# Load the packages
library(eurostat)
library(rvest)
library(knitr) # provides kable(), used below
# Get Eurostat data listing
toc <- get_eurostat_toc()
# Info about passengers
kable(head(search_eurostat("passenger transport")))
# Id of the dataset
id <- search_eurostat("Modal split of passenger transport",
type = "table")$code[1]
# Raw data
dat <- get_eurostat(id)
str(dat)
# Filtered data with human-readable labels
datl <- get_eurostat(id, filters = list(geo = c("EU28", "FI"),
lastTimePeriod = 1),
type = "label", time_format = "num")
Command | Detail |
---|---|
get_eurostat_toc() | Downloads the table of contents of the Eurostat datasets |
search_eurostat() | Searches the table of contents for particular patterns |
get_eurostat() | Reads Eurostat data for a specific dataset id |
This course uses the Eurostat Tutorial
For attribution, please cite this work as
Warin (2020, Jan. 28). Thierry Warin, PhD: [API] eurostat. Retrieved from https://warin.ca/posts/api-eurostat/
BibTeX citation
@misc{warin2020[api],
  author = {Warin, Thierry},
  title = {Thierry Warin, PhD: [API] eurostat},
  url = {https://warin.ca/posts/api-eurostat/},
  year = {2020}
}