Thierry Warin, PhD: [API] eurostat

Thierry Warin

doi:10.6084/m9.figshare.11763759.v2

Database description

Eurostat is the statistical office of the European Union. While statistical authorities in Member States collect and analyse data, Eurostat's role is to consolidate those data and ensure they are comparable. It provides statistics at the European level that enable comparisons between countries and regions. From EU policies, economy and finance to social conditions and the environment, Eurostat is a powerful tool that consolidates data using a harmonized methodology.

Eurostat: https://ec.europa.eu/eurostat/fr/about/overview

Functions

get_eurostat_toc()
search_eurostat()
get_eurostat()

Each of these functions is detailed in this course, with examples.

get_eurostat_toc()

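The function get_eurostat_toc() downloads the table of contents of Eurostat datasets. Each row describes one entry (a folder, a dataset, or a table) with its title, code, type, update dates, and the period covered.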

# Load the package
library(eurostat)
library(rvest)

# Get Eurostat data listing
toc <- get_eurostat_toc()

title	code	type	last update of data	last table structure change	data start	data end	values
Database by themes	data	folder	NA	NA	NA	NA	NA
General and regional statistics	general	folder	NA	NA	NA	NA	NA
European and national indicators for short-term analysis	euroind	folder	NA	NA	NA	NA	NA
Business and consumer surveys (source: DG ECFIN)	ei_bcs	folder	NA	NA	NA	NA	NA
Consumer surveys (source: DG ECFIN)	ei_bcs_cs	folder	NA	NA	NA	NA	NA
Consumers - monthly data	ei_bsco_m	dataset	29.10.2020	29.10.2020	1980M01	2020M10	NA
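
Because the table of contents mixes folders with datasets, it can help to keep only the rows of a given type. The following is a minimal sketch in base R. To stay runnable offline, it rebuilds a few of the rows shown above by hand in a data frame called toc_sample (a hypothetical stand-in, not part of the package), on the assumption that the object returned by get_eurostat_toc() has the same title, code, and type columns.

```r
# Hypothetical stand-in for the object returned by get_eurostat_toc();
# the rows are copied from the listing above.
toc_sample <- data.frame(
  title = c("Database by themes",
            "Consumer surveys (source: DG ECFIN)",
            "Consumers - monthly data"),
  code  = c("data", "ei_bcs_cs", "ei_bsco_m"),
  type  = c("folder", "folder", "dataset"),
  stringsAsFactors = FALSE
)

# Keep only the entries whose type is "dataset"
datasets_only <- toc_sample[toc_sample$type == "dataset", ]
print(datasets_only$code)
```

The same subsetting expression should apply to the real toc object, provided its type column uses the values shown in the listing.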

search_eurostat()

With search_eurostat() you can search the table of contents for a particular pattern, e.g. all datasets related to passenger transport. The type argument of this function restricts the search to a given kind of entry, for instance datasets or tables.

# info about passengers
search_eurostat("passenger transport")

title	code	type	last update of data	last table structure change	data start	data end	values
Volume of passenger transport relative to GDP	tran_hv_pstra	dataset	01.09.2020	31.08.2020	1990	2018	NA
Modal split of passenger transport	tran_hv_psmod	dataset	01.09.2020	31.08.2020	1990	2018	NA
Air passenger transport by reporting country	avia_paoc	dataset	20.11.2020	09.10.2020	1993	2020Q3	NA
Air passenger transport by main airports in each reporting country	avia_paoa	dataset	20.11.2020	09.10.2020	1993	2020Q3	NA
Air passenger transport between reporting countries	avia_paocc	dataset	28.10.2020	09.10.2020	1993	2020Q3	NA
Air passenger transport between main airports in each reporting country and partner reporting countries	avia_paoac	dataset	20.11.2020	09.10.2020	1993	2020Q3	NA
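
Conceptually, this kind of search amounts to matching a pattern against the titles in the table of contents and, optionally, keeping only one type of entry. The sketch below illustrates that idea in base R on a hand-built data frame (toc_sample, a hypothetical stand-in whose rows are copied from the outputs above). It is an illustration of the idea, not the package's actual implementation.

```r
# Hypothetical stand-in for the table of contents;
# rows copied from the outputs shown in this course.
toc_sample <- data.frame(
  title = c("Volume of passenger transport relative to GDP",
            "Modal split of passenger transport",
            "Air passenger transport by reporting country",
            "Consumers - monthly data"),
  code  = c("tran_hv_pstra", "tran_hv_psmod", "avia_paoc", "ei_bsco_m"),
  type  = c("dataset", "dataset", "dataset", "dataset"),
  stringsAsFactors = FALSE
)

# Literal (non-regex) match of the pattern against the title column,
# restricted to entries of type "dataset"
pattern <- "passenger transport"
is_hit  <- grepl(pattern, toc_sample$title, fixed = TRUE) &
  toc_sample$type == "dataset"
hits <- toc_sample[is_hit, ]
print(hits[, c("title", "code")])
```

The match here is case-sensitive and literal, so a pattern such as "Passenger Transport" would return no rows in this sketch.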

Once you have found the dataset you are looking for, you can store its id (the value in the code column) in a variable of your choice.

id <- search_eurostat("Modal split of passenger transport", 
                         type = "table")$code[1]
print(id)

[1] "t2020_rk310"

get_eurostat()

The function get_eurostat() takes the id of a dataset as input and returns the data from that dataset. The str() function lets you inspect the structure of the downloaded data.

dat <- get_eurostat(id)
str(dat)

tibble [2,798 × 5] (S3: tbl_df/tbl/data.frame)
 $ unit   : chr [1:2798] "PC" "PC" "PC" "PC" ...
 $ vehicle: chr [1:2798] "BUS_TOT" "BUS_TOT" "BUS_TOT" "BUS_TOT" ...
 $ geo    : chr [1:2798] "AT" "BE" "CH" "DE" ...
 $ time   : Date[1:2798], format: "1990-01-01" ...
 $ values : num [1:2798] 8.2 10.6 3.7 9.1 11.3 32.4 14.9 13.5 6 24.8 ...

unit	vehicle	geo	time	values
PC	BUS_TOT	AT	1990-01-01	8.2
PC	BUS_TOT	BE	1990-01-01	10.6
PC	BUS_TOT	CH	1990-01-01	3.7
PC	BUS_TOT	DE	1990-01-01	9.1
PC	BUS_TOT	DK	1990-01-01	11.3
PC	BUS_TOT	EL	1990-01-01	32.4
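
Once the data are downloaded, they can be subset locally like any other data frame. The following is a minimal sketch in base R that rebuilds the six rows shown above in a data frame called dat_sample (a hypothetical stand-in for dat, so that the example runs offline) and keeps two countries.

```r
# Hypothetical stand-in for `dat`; rows copied from the output above.
dat_sample <- data.frame(
  unit    = "PC",
  vehicle = "BUS_TOT",
  geo     = c("AT", "BE", "CH", "DE", "DK", "EL"),
  time    = as.Date("1990-01-01"),
  values  = c(8.2, 10.6, 3.7, 9.1, 11.3, 32.4),
  stringsAsFactors = FALSE
)

# Keep two countries, then sort by descending share
sel <- subset(dat_sample, geo %in% c("DE", "EL"))
sel <- sel[order(-sel$values), ]
print(sel[, c("geo", "time", "values")])
```

Subsetting locally like this requires downloading the whole dataset first.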

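It is possible to add filters so that only a specific part of the dataset is retrieved. In the call below, geo restricts the result to the EU-28 aggregate and Finland, and lastTimePeriod = 1 keeps only the most recent period.

By default, variables are returned as Eurostat codes. To get human-readable labels instead, use the type = "label" argument.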

datl <- get_eurostat(id, filters = list(geo = c("EU28", "FI"), 
                                         lastTimePeriod = 1), 
                      type = "label", time_format = "num")

unit	vehicle	geo	time	values
Percentage	Motor coaches, buses and trolley buses	European Union - 28 countries (2013-2020)	2018	8.7
Percentage	Motor coaches, buses and trolley buses	Finland	2018	10.1
Percentage	Passenger cars	European Union - 28 countries (2013-2020)	2018	83.3
Percentage	Passenger cars	Finland	2018	84.2
Percentage	Trains	European Union - 28 countries (2013-2020)	2018	8.0
Percentage	Trains	Finland	2018	5.7

As we can see, we now have the modal split of passenger transport, in percent, for Finland compared with the European Union (28 countries) in 2018.
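
To make the comparison explicit, the labelled values can be reshaped so that each transport mode has one column per region. The sketch below does this in base R on a hand-built data frame (datl_sample, a hypothetical stand-in for datl, with values copied from the output above and the long EU label shortened to "EU28" for readability).

```r
# Hypothetical stand-in for `datl`; values copied from the output above (year 2018).
# The EU label is shortened to "EU28" here for readability.
datl_sample <- data.frame(
  vehicle = rep(c("Motor coaches, buses and trolley buses",
                  "Passenger cars", "Trains"), each = 2),
  geo     = rep(c("EU28", "Finland"), times = 3),
  values  = c(8.7, 10.1, 83.3, 84.2, 8.0, 5.7),
  stringsAsFactors = FALSE
)

# One row per transport mode, one column per region
eu   <- datl_sample[datl_sample$geo == "EU28", c("vehicle", "values")]
fi   <- datl_sample[datl_sample$geo == "Finland", c("vehicle", "values")]
wide <- merge(eu, fi, by = "vehicle", suffixes = c("_eu28", "_fi"))

# Gap between Finland and the EU-28 aggregate, in percentage points
wide$gap_pp <- round(wide$values_fi - wide$values_eu28, 1)
print(wide)
```

A positive gap_pp means the mode has a larger share in Finland than in the EU-28 aggregate; in these figures, that is the case for buses and passenger cars but not for trains.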

tl;dr

# Load the packages
library(eurostat)
library(rvest)
library(knitr)  # provides kable(), used below

# Get Eurostat data listing
toc <- get_eurostat_toc()

# Info about passengers
kable(head(search_eurostat("passenger transport")))

# Id of the dataset
id <- search_eurostat("Modal split of passenger transport",
                      type = "table")$code[1]

# Raw data
dat <- get_eurostat(id)
str(dat)

# Filters addition
datl <- get_eurostat(id, filters = list(geo = c("EU28", "FI"), 
                                         lastTimePeriod = 1), 
                      type = "label", time_format = "num")

Code learned this week

Command	Detail
get_eurostat_toc()	Downloads the table of contents of Eurostat datasets
search_eurostat()	Searches the table of contents for a particular pattern
get_eurostat()	Reads Eurostat data for a specific dataset id

References

This course uses the Eurostat Tutorial
