12  Geospatial Data Visualization in R for Geopolitics and Economics

Geospatial data – data that has an associated location component – is fundamental to analyzing economic and political phenomena. Many economic and social factors (from real estate markets to public health outcomes) are inherently spatial: they are tied to specific locations and regions. By incorporating location information, analysts can uncover spatial patterns of wealth, conflict, trade, and human development that would be invisible in non-spatial data. For example, mapping economic indicators can highlight regional disparities in income or infrastructure, informing targeted policy interventions. In geopolitics, maps are equally indispensable – from visualizing territorial disputes to tracking refugee movements and conflict hotspots. As one expert on geospatial intelligence notes, “The location-based approach to geopolitical risk management saves time, money, and [lives]”. In other words, geographic visualization and analysis help decision-makers quickly identify where crises or opportunities are unfolding on the ground, enabling more effective and timely responses.

Geographic Information Systems (GIS) and spatial analysis techniques have thus become vital tools in both social science research and policy applications. They are widely used in government, business, and academia to input, store, analyze, and map data tied to locations. GIS and spatial visualization support evidence-based policy by communicating complex data in an intuitive map format that can reveal relationships and trends at a glance. For instance, mapping population density against economic output can show how urbanization drives growth, while overlaying maps of ethnic groups and natural resources might help anticipate conflict zones. The combination of geospatial analysis with traditional economic and political analysis provides added value by considering where phenomena occur, not just how much or when. This spatial perspective is increasingly recognized as essential for sustainable development and security: integrating geospatial methods into economics and political science is a new but rapidly growing interdisciplinary approach. Educators are even calling for greater geospatial literacy among economists and policy analysts, arguing that spatial thinking and map-making skills are now crucial for reliable decision-making.

In this chapter, we provide a comprehensive overview of geospatial data visualization in R, with a focus on applications in economic development and geopolitical analysis. We will cover the types and formats of spatial data, tools for importing and manipulating such data in R, and techniques for creating both static thematic maps and interactive web maps. Throughout, we will use real-world open datasets – including country boundaries from GADM and Natural Earth, refugee statistics from UNHCR, and other socio-economic indicators – to demonstrate practical mapping examples. We will explore case studies such as mapping migration and displacement, visualizing international trade networks, and mapping conflict or risk indices. Best practices for effective map communication will be discussed, along with common pitfalls (like coordinate reference system issues and data quality concerns) and troubleshooting tips. Finally, we reflect on the importance of open data, reproducibility, and “spatial storytelling” in policy research – using maps not just as analytical tools but as compelling narratives that can influence decision-making.

By the end of this chapter, readers (especially students and researchers in social sciences and data science) should be comfortable working with spatial data in R and creating insightful maps for economic and geopolitical topics. We assume basic familiarity with R and the tidyverse, but no prior GIS experience is required. Code examples are provided in R Markdown style, showing how to load data, perform spatial operations, and produce visualizations step by step. Each example is accompanied by explanations to ensure clarity. We also provide references for further reading and links to all datasets and R packages used. Emphasis is placed on open-source tools and data so that readers can reproduce the examples and adapt them to their own research questions.

12.1 Types of Spatial Data and Common Formats

Spatial data comes in two primary forms: vector data and raster data. These differ in how they represent geographic phenomena:

  • Vector data uses discrete geometric features – points, lines, and polygons – to model real-world entities. Each feature has coordinates defining its shape and location (e.g., a point for a city, a line for a road or river, a polygon for a country or a land parcel). Vector data is typically used for political and administrative boundaries, transportation networks, locations of economic assets, etc. It is an efficient representation for discrete objects and is often easier to style and label on maps. For many cartographic products, vector data is preferred “because of [its] easier handling and smaller file sizes” (compared to high-resolution rasters). Vector datasets usually come with attribute tables – additional columns describing each feature (such as a country’s name, GDP, population, etc.), enabling rich thematic mapping.

  • Raster data represents the world as a grid of cells (pixels), each with a value. Rasters are used for continuous phenomena that vary over space, such as elevation, temperature, rainfall, land cover, or satellite imagery. For example, an elevation raster might assign each 30m x 30m cell a value equal to altitude in meters. Rasters are essentially images with georeferencing: a digital photograph is a raster, and so is a heatmap or any gridded statistical surface. The resolution (cell size) of a raster determines its detail and file size. Rasters are ideal for environmental and climatic data, remote sensing imagery, and other analyses where continuous variation is important. In R, raster data can also be used to create “heatmaps” of economic or conflict intensity across a region by interpolating point data.

Most spatial datasets you will encounter fall into one of these categories or combine them (e.g., a satellite image with vector overlays of city locations). When working in R, the sf package (simple features) provides a standardized way to handle vector data, and the terra package (and its predecessor raster) provides tools for raster data. We will introduce these packages in the next section. It is worth noting that spatial data often comes with an associated Coordinate Reference System (CRS), which defines how the 2D coordinates correspond to real locations on Earth’s surface (taking into account map projections). We will discuss CRSs and projections as needed, since using consistent CRSs is crucial when combining layers.

Spatial Data Formats: Spatial data can be stored and exchanged in various file formats. Some common formats include:

  • Shapefile (.shp + related files): A popular vector format introduced by ESRI in the 1990s, widely used in GIS software. A shapefile actually consists of multiple files (with extensions .shp, .dbf, .shx, etc.) that together describe the geometry and attributes. Shapefiles can store points, lines, or polygons (one type per file) along with a table of attributes. They remain common in many open data portals and legacy datasets. However, shapefiles have limitations: for example, attribute column names are limited to 10 characters, and they lack support for storing coordinate reference system info internally (usually a separate .prj file is provided). Despite these limitations, you will likely encounter shapefiles often. R’s sf package can read and write shapefiles easily (via the GDAL library).

  • GeoJSON (.geojson): An open standard format based on JSON (JavaScript Object Notation) for encoding vector data. GeoJSON is human-readable text and is widely used in web mapping and data sharing (especially with JavaScript libraries like Leaflet or D3). A GeoJSON file can contain points, lines, and polygons (and even a mixture of types) along with properties (attributes) for each feature. Many open data APIs (including some from governments) provide GeoJSON outputs. The sf package can read GeoJSON files (via st_read()), and packages like geojsonio can help with GeoJSON conversion if needed. One advantage of GeoJSON is that it’s convenient for web applications and small-to-medium datasets, but large GeoJSON files can be inefficient due to their text size.

  • GeoTIFF (.tif with geotags): A common raster data format, which is essentially an image file (TIFF) with embedded georeferencing information. GeoTIFFs are used for satellite imagery, scanned maps, digital elevation models, and any gridded data. They can contain multiple bands (e.g., multi-spectral imagery might have separate bands for red, green, blue, infrared, etc.). The terra (and raster) package can read GeoTIFFs with the rast() function. Other raster formats you might see include NetCDF (often for climate data and time series of rasters), GRIB (for meteorological data), and simple ASCII grids. The good news is terra can handle many of these transparently via GDAL.

  • File Geodatabase and GeoPackage: These are database formats for spatial data. Esri’s File Geodatabase (.gdb) is proprietary but widely used in professional GIS; R can read them with the sf package if GDAL is built with FileGDB support, or via GDAL’s open-source OpenFileGDB driver (read access). GeoPackage (.gpkg) is an open SQLite-based format that can store multiple layers of vector and raster data in one file. It’s gaining popularity as an efficient, single-file alternative to shapefiles (without shapefile’s limitations). You can read/write GeoPackages in R with sf (just like reading a shapefile, but giving the .gpkg file path).

  • CSV or Excel with coordinates: Many simple datasets are just tabular data with latitude and longitude columns (e.g. a list of cities with their coordinates, or a spreadsheet of conflict incidents with lat-long). While not a spatial format per se, these can be turned into spatial data by interpreting the coordinates. In R, one can read such tables with readr or readxl and then convert to an sf object via st_as_sf(coords = c("lon","lat"), crs = ...) specifying the coordinate columns and their CRS (usually WGS84 latitude/longitude). A minimal sketch follows this list, and we will use this pattern again in later examples.

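As a minimal sketch of that conversion, assume a hypothetical file events.csv with lon and lat columns in WGS84:

library(sf)
library(readr)
# Read the table, then promote it to an sf point layer
events_df <- read_csv("events.csv")
events_sf <- st_as_sf(events_df, coords = c("lon", "lat"), crs = 4326)  # EPSG:4326 = WGS84
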
There are many other formats (KML from Google Earth, GPX for GPS tracks, etc.), but the above are the most frequently encountered in our context. The good news is that R’s spatial packages can handle a wide range of formats through well-supported backend libraries. The sf package, for instance, uses GDAL under the hood, so st_read() can often read shapefiles, GeoJSON, GeoPackage, etc., with a single command. Likewise, terra::rast() can load various raster formats.

Example – Reading spatial data in R: Let’s demonstrate loading some publicly available boundary data. The GADM database provides high-resolution administrative boundaries for all countries (down to districts or communes). Natural Earth provides somewhat generalized cultural and physical geography layers at small, medium, and large scales. We can use the rnaturalearth package (which wraps Natural Earth data) to get a world countries map, and use sf for reading a GADM shapefile for a specific country. For example:

# Load required libraries
library(sf)
library(rnaturalearth)
# Get a world map of countries (Natural Earth data via rnaturalearth)
world <- rnaturalearth::ne_countries(scale = "small", returnclass = "sf")
# Inspect the world object
print(world)
## Simple feature collection with 177 features and 64 fields
## Geometry type: MULTIPOLYGON
## CRS: EPSG:4326 (WGS 84)
## ... (country polygons with attributes)

# Suppose we want detailed admin boundaries for a country (e.g., India) from GADM
# (Assume we downloaded the GADM shapefile for India, e.g., "gadm41_IND_shp.zip")
india <- st_read("gadm41_IND_1.shp")  # level 1 admin divisions (states)
# Note: GADM v4 uses EPSG:4326 by default, like Natural Earth.
st_crs(india)  
## Coordinate Reference System: EPSG:4326 (WGS 84)

In the above code, world is an sf object containing country polygons with various attributes (name, ISO code, population, etc.), all in latitude/longitude coordinates (EPSG:4326). The india object represents India’s state boundaries from GADM. The st_crs() function shows the coordinate reference system. Both datasets use WGS84 lat-long, which is convenient for combining with many global datasets. If a dataset were in another projection, we could transform it with st_transform(). For example, to project the world map to a Robinson projection for nicer global visualization, one could do: world_robinson <- st_transform(world, crs = "ESRI:54030") (ESRI:54030 is Robinson). We’ll see use of projections in map plotting later.
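
As a quick sketch of that reprojection:

# Reproject the world layer to Robinson and plot just the geometry
world_robinson <- st_transform(world, crs = "ESRI:54030")
plot(st_geometry(world_robinson), main = "World map, Robinson projection")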

12.2 Importing and Manipulating Spatial Data in R

Working with spatial data in R has been greatly simplified by the sf package for vector data and the terra package for raster data. We will introduce common operations like reading data, inspecting and subsetting spatial objects, performing joins, and basic transformations.

Vector Data with sf

The sf package (Simple Features) represents vector geometries (points, lines, polygons) in a user-friendly data frame format. Each row of an sf data frame is a spatial feature with attributes, and one special column (usually named geometry) contains the geometric shape (with class sfc – simple feature column). Edzer Pebesma introduced this standardized framework in 2016–2018 to replace earlier sp classes. The advantages of sf are that it integrates well with the tidyverse (you can use dplyr verbs on sf objects, for example) and abstracts away low-level details of spatial data handling.

Reading vector data: The primary function is st_read(), which can auto-detect format by file extension or connection string. In our example above, we used st_read("gadm41_IND_1.shp") to load a shapefile. We could similarly load a GeoJSON with st_read("cities.geojson") or a GeoPackage layer with st_read("mydata.gpkg", layer="cities"). The sf package will return an sf object. If we print or glimpse an sf object, we see the attribute columns and a summary of the geometry (type and CRS). For example:

library(sf)
# Read a GeoJSON of world cities (for illustration; replace with actual path or URL)
cities <- st_read("world_cities.geojson")
# Check first few rows
head(cities)
##            name        country      pop            geometry
## 1        London United Kingdom  8982000   POINT (-0.1 51.5)
## 2         Paris         France  2148000    POINT (2.3 48.9)
## 3       Nairobi          Kenya  4475000   POINT (36.8 -1.3)
## 4         Tokyo          Japan 13960000  POINT (139.7 35.7)
## 5        Mumbai          India 20670000   POINT (72.8 19.1)
## 6 New York City  United States  8419000  POINT (-74.0 40.7)

Each city is a point geometry with attributes like name, country, population, etc. Notice the geometry column shows coordinates (longitude, latitude for these points). By default, st_read will often set the CRS to WGS84 if the data source defines it (most GeoJSON are in WGS84 by standard). We can verify using st_crs(cities).

Basic spatial operations: With an sf object, we can do many data manipulation operations using dplyr syntax directly. For example, filtering to select certain features or creating new columns. We can also use spatial-specific operations like st_bbox() to get the bounding box, st_area() to compute polygon areas, st_length() for line lengths, or st_centroid() to get centroid points of polygons. A few examples:

library(dplyr)
# Filter world dataset to Asia region
asia <- world %>% filter(continent == "Asia")
# Compute area of each country (in square kilometers)
world <- world %>% mutate(area_km2 = as.numeric(st_area(geometry)) / 1e6)
# st_area() returns areas with units (square meters here). For lon/lat (WGS84) data,
# recent versions of sf compute geodesic areas on the ellipsoid/sphere (via the s2 library),
# so the values are generally accurate. For planar workflows, or with tools that assume
# projected coordinates, transform to an equal-area projection before measuring.

In the above, note the caveat: how measurements are computed depends on the CRS. With geographic (lat-long) coordinates, modern sf delegates to the s2 geometry library, so st_area() and st_distance() return geodesic measures in meters and are generally reliable. Many planar operations (e.g., st_buffer(), st_centroid()), however, implicitly assume projected coordinates and can produce warnings or distorted results on lat-long data. A robust approach is to transform to an appropriate projected CRS, such as an equal-area projection (e.g., EPSG:6933, a global equal-area projection), before computing areas. We mention this as a common pitfall: always be aware of the CRS when doing measurements.
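
A sketch of the equal-area route (EPSG:6933 is the global cylindrical equal-area CRS mentioned above):

# Transform to an equal-area projection, then compute areas in km^2
world_ea <- st_transform(world, crs = "EPSG:6933")
world_ea <- world_ea %>% mutate(area_km2 = as.numeric(st_area(geometry)) / 1e6)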

Spatial subsetting and joins: We can also filter spatially – e.g., select all cities within a certain country polygon, or all conflict points within 100 km of a city, etc. The sf package provides powerful spatial join and query functions. For instance, st_join() can join two spatial layers based on intersection of geometries (e.g., adding country attributes to each city by locating which polygon contains the city point), and functions like st_intersection(), st_union(), st_buffer() allow geometric operations. As an example, suppose we have a dataset of conflict event points and we want to count how many events occurred in each country. We could do a spatial join of points to country polygons and then aggregate:

# Example: count conflict events per country
conflicts <- st_read("conflict_events.geojson")       # a set of points (each with e.g. fatalities, date, etc.)
conflicts <- st_set_crs(conflicts, 4326)              # label missing CRS as WGS84 (st_set_crs does not reproject)
events_in_countries <- st_join(conflicts, world, join = st_within)  
# st_within ensures we join each point to the polygon it's within.
# events_in_countries now has event attributes and also country attributes.
# To count events by country:
event_counts <- events_in_countries %>% 
  st_drop_geometry() %>%            # drop geometry for aggregation
  count(name_long) %>%              # count events per country name
  arrange(desc(n))
head(event_counts)
##      name_long   n
## 1    Nigeria    350
## 2    Iraq       280
## 3    ...        ...

Above, st_drop_geometry() was used because after the join we only need the data columns for counting (dropping geometry converts it to a regular data frame). We used st_within to ensure a point must fall inside a country polygon to join (there’s also st_intersects which is more general). The result event_counts might show, for example, Nigeria has 350 events in our dataset, Iraq 280, etc. We could then join this summary back to the world sf object to visualize it as a choropleth map (we will demonstrate thematic mapping in the next section).
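
That re-join might look like the following (assuming both layers carry the name_long country name):

# Attach event counts to the country polygons; countries with no events get NA
world_conflicts <- world %>% left_join(event_counts, by = "name_long")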

Combining spatial and non-spatial data: Often we have attribute data in CSVs (like economic indicators) that we want to link to a spatial boundary. A common scenario in geopolitics/economics is you have country-level statistics in a table (identified by country code or name) and a world map shapefile with matching codes. We can merge them easily in R. For example, say we have a data frame gdp with columns iso3 and GDP_per_capita. The world sf object from Natural Earth has an ISO3 country code column (often named iso_a3 or similar). We can do:

# Combine world spatial data with GDP data frame
world <- world %>% left_join(gdp, by = c("iso_a3" = "iso3"))

After this, world will have a new column for GDP_per_capita. Any country with a matching code gets the data, and others get NA if no match. This kind of attribute join is extremely common – effectively treating the spatial layer as a keyed table. The ability to use tidyverse verbs here (like left_join) is a benefit of sf objects behaving like data frames.

Raster Data with terra (and raster)

The terra package is the modern replacement for the older raster package, written by Robert Hijmans and colleagues. It provides extensive functionality to create, manipulate, and analyze raster data (and can also handle vector data in SpatVector format, though we’ll mostly use it for rasters). A raster in terra is represented by a SpatRaster object, which can be single-layer or multi-layer (stack of rasters). Terra is optimized for large datasets and faster performance than the older raster package, and it is recommended for new projects (though many older scripts use raster::RasterLayer). We will focus on terra here, but note that basic usage of terra is similar in many ways to raster.

Reading and creating rasters: Use terra::rast() to read a raster file. For example:

library(terra)
# Read a single-band raster (e.g., an elevation TIFF)
elev <- rast("data/elevation.tif")
elev
## class       : SpatRaster 
## dimensions  : 1200, 1200, 1  (nrow, ncol, nlyr)
## resolution  : 0.008333, 0.008333  (x, y)  # degree resolution (approx 1km if WGS84)
## extent      : 67.0, 77.0, 8.0, 18.0  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326) 
## source      : elevation.tif 
## name        : elevation 
## min value   : 0.0 
## max value   : 2457.0

The printed summary of elev shows it’s a SpatRaster with given dimensions (rows, cols), resolution, extent (the bounding box coordinates), CRS (here WGS84 lat/lon), source file, and the min/max of values. Suppose this is a digital elevation model for a region (10x10 degrees grid at ~1km resolution, in this hypothetical example). We can plot it quickly using base plotting:

plot(elev, main = "Elevation (m)")

This will produce a raster plot with a legend. By default, it chooses a color scale (often a blue-to-green-to-brown palette for elevation). We could also use ggplot2 to plot raster data by converting it to a data frame of points, but for large rasters that’s not efficient. Another approach is using tmap or rasterVis, which we will see. Terra also has plotRGB() for multi-band imagery.
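
For completeness, here is a sketch of the data-frame route with ggplot2, fine for small rasters like our elev example:

library(ggplot2)
# Convert the SpatRaster to a data frame of cell coordinates and values
elev_df <- as.data.frame(elev, xy = TRUE)   # columns: x, y, elevation
ggplot(elev_df) +
  geom_raster(aes(x = x, y = y, fill = elevation)) +
  scale_fill_viridis_c(name = "Elevation (m)") +
  coord_equal()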

If the raster has multiple layers (bands), rast() will create a multilayer SpatRaster. You can subset layers with double brackets (e.g., r[[1]] first layer). Terra supports common raster manipulations: e.g. crop() to crop to an extent, resample() to change resolution or align grids, mask() to apply a polygon mask, and mathematical operations on rasters (adding, multiplying, calculating indices, etc.). Many operations can be done using overloaded arithmetic (e.g., if r is a SpatRaster, r * 2 multiplies every cell by 2, sqrt(r) takes square root of each cell, etc.).

Example raster operations: Suppose we have a multi-layer raster representing monthly rainfall (12 layers). We could calculate an annual sum by simply doing annual_rain <- sum(rain_stack) – this will sum cell-wise across layers. Or calculate an average: mean_rain <- mean(rain_stack). Terra functions like app(r, fun, ...) apply an R function across layers or cells. For instance, app(rain_stack, fun = sd) could give a raster of the standard deviation across months for each cell.
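
A self-contained sketch of these layer-wise operations, using a synthetic 12-layer “monthly rainfall” stack:

# Build a 12-layer SpatRaster and fill it with random monthly rainfall values
rain_stack <- rast(nrows = 10, ncols = 10, nlyrs = 12,
                   xmin = 0, xmax = 10, ymin = 0, ymax = 10, crs = "EPSG:4326")
values(rain_stack) <- runif(ncell(rain_stack) * nlyr(rain_stack), min = 0, max = 300)
annual_rain <- sum(rain_stack)           # cell-wise sum across the 12 layers
rain_sd <- app(rain_stack, fun = sd)     # cell-wise standard deviation across layers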

Coordinate reference systems for rasters are handled similarly: crs(elev) gets or sets the CRS, and project(elev, new_crs) would reproject the raster to a new coordinate system (interpolating values accordingly).

Raster and vector interplay: Often we combine raster and vector data – e.g., extracting raster values at point locations, or aggregating raster statistics by polygon. Terra provides functions like extract() to get raster values at specified points or for polygons (e.g., average over a polygon). There is also an as.data.frame() method (with xy=TRUE) to get a data frame of coordinates and values for use in ggplot. For example:

# Extract elevation at city locations:
cities_elev <- terra::extract(elev, vect(cities))  # 'cities' was an sf, need to convert to SpatVector with vect()
cities$elevation <- cities_elev$elevation  # add extracted values back to cities sf

Here, terra::extract took the elev raster and a set of points (we converted cities to SpatVector with vect() since terra’s extract expects its own vector type or matrix of coords). The result is a data frame of elevation values for each point, which we then appended to the cities data.
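
Extraction also works with polygons; for example, a per-country mean (this sketch assumes elev actually covers the polygons of interest, unlike our small regional example):

# Mean raster value within each polygon; the result has one row per polygon
country_elev <- terra::extract(elev, vect(world), fun = mean, na.rm = TRUE)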

Note on the raster package: The older raster package is still widely used in many tutorials and older code. Its main object is RasterLayer (or RasterStack for multi-layer). Most of its functions are analogous (e.g., raster::crop, raster::extract, etc.). If you encounter code using raster, know that you can achieve the same with terra; in fact, terra::rast() can convert a RasterLayer object directly. Both packages use 1-based layer indexing (consistent with R); the practical differences lie mainly in class and function names and in terra’s faster, more memory-efficient handling of large files. For new projects, use terra; but if an existing project heavily uses raster, you might continue with it. In this chapter, we’ll use terra syntax.

Example – Raster manipulation in R: We’ll do a quick demonstration by creating a simple raster and doing an operation:

library(terra)
# Create a synthetic raster of 10x10 cells covering a region
r <- rast(nrows=10, ncols=10, xmin=0, xmax=10, ymin=0, ymax=10, crs="EPSG:4326")
values(r) <- 1:100  # assign some values 1 to 100
r2 <- sqrt(r)       # take square root of each cell value
# Plot the original and the transformed raster
par(mfrow=c(1,2))
plot(r, main="Original values")
plot(r2, main="Square root of values")

This will produce side-by-side maps (though obviously our “region” and values are arbitrary here). The example shows how we can treat the raster as a numeric object for cell-wise operations. In practical scenarios, such cell-wise calculations are used for things like computing indices (e.g., NDVI from satellite bands), applying thresholds (e.g., classify land cover), etc.

Tidyverse Integration and Spatial Data

One strength of using R for spatial analysis is the ease of integrating with the tidyverse collection of packages. We have already seen using dplyr verbs on sf objects (which preserve the spatial geometry while modifying attributes). Likewise, one can use ggplot2 to plot sf data frames directly. In fact, geom_sf() in ggplot2 allows you to create maps from sf objects in one line. For example, to plot country outlines:

library(ggplot2)
ggplot(world) + 
  geom_sf(fill = "cornsilk", color = "gray40") + 
  ggtitle("World map") + 
  theme_minimal()

This uses world (sf of country polygons) and draws them with a light fill and gray borders. We did not have to manually supply aes(x, y); geom_sf knows how to handle the geometry column. We can map attributes to aesthetics similarly: e.g., geom_sf(aes(fill = GDP_per_capita)) would create a choropleth map if that column exists (with a continuous legend by default).

The tidyverse also includes packages like tidyr and purrr that can help if you need to manipulate data frames prior to spatial conversion, or to deal with lists of spatial objects. Another useful package is lubridate if you have spatio-temporal data (e.g., events over time – you can easily handle dates and then perhaps animate maps or facet by time). While not specifically spatial, these tools often come into play in data preparation.

In summary, R’s ecosystem allows you to treat spatial data in a very similar way to regular data frames, making it accessible to those who are already comfortable with data wrangling in R. Next, we turn to the core topic of this chapter: visualizing geospatial data – i.e. making maps – including static thematic maps and interactive maps.

12.3 Creating Thematic Maps (Choropleths, Symbol Maps, Flow Maps)

A thematic map is a map that focuses on a particular theme or data variable. Common types include choropleth maps (areas colored according to a value), proportional symbol maps or bubble maps (symbols sized according to values at locations), and various types of flow maps (arrows or lines showing movement or connections between places). In the context of economic and geopolitical data, these thematic maps allow us to visualize patterns such as GDP by region, refugee counts by country, or trade flows between nations. R offers multiple approaches for creating such maps. We will illustrate using both ggplot2 and tmap for static maps, and we’ll discuss specialized tools for flows.

Choropleth Maps

A choropleth map shades geographic areas (polygons) based on a data variable, typically using a color scale. This is useful for any region-based metric: e.g., unemployment rate by country, conflict risk by province, election results by district, etc. To create a choropleth, you need a spatial polygon layer and a numeric (or categorical) variable for each polygon.

Using ggplot2: As shown earlier, geom_sf(aes(fill = variable)) is the basic approach. For example, if our world dataset has a column GDP_per_capita, we can do:

ggplot(world) +
  geom_sf(aes(fill = GDP_per_capita), color = NA) +
  scale_fill_viridis_c(option="plasma", trans="log", na.value="lightgray", name="GDP per capita") +
  labs(title="GDP per Capita by Country (2021)") +
  theme_minimal()

In this hypothetical code, we used color = NA to remove polygon boundaries (to emphasize continuous regions), and we applied a viridis color scale (which is colorblind-friendly) on a log scale (often economic data have long-tailed distributions, and a log transform can make the choropleth more interpretable). We also handle NA values by coloring them light gray, and label the legend.

Using tmap: The tmap package is specifically designed for thematic maps and can create both static and interactive maps. It has an intuitive syntax with tm_shape() and layer functions like tm_polygons(). A choropleth with tmap might look like:

library(tmap)
tmap_mode("plot")  # static plotting mode
tm_shape(world) +
  tm_polygons("GDP_per_capita", style="quantile", palette="Blues", title="GDP per capita") +
  tm_layout(title="GDP per Capita by Country (2021)", frame = FALSE)

This will produce a static map with countries shaded by GDP per capita, using a blue palette and quantile breakpoints (each color category has an equal number of countries, which is often a good way to show variation). tm_layout allows various customizations (here we removed the frame border and added a title). Tmap automatically adds a legend for the fill. The result is similar to ggplot’s, but tmap provides many high-level options for cartography (like easier addition of scale bars, north arrows, etc., and a simple switch to interactive mode as we’ll see).

Classed vs. continuous choropleth: You can decide whether to show the variable as continuous (a smooth gradient) or classified into bins. For example, using style="quantile" or style="jenks" in tm_polygons classifies data. In ggplot, one would instead discretize beforehand (e.g., using cut() on the numeric variable to make it a factor of ranges). The UNHCR example we will see below classifies refugee counts into categories (“<10k”, “10k-100k”, etc.) to make a clearer legend. Classification is useful to simplify interpretation, especially if exact values are less important than relative magnitude categories.
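
For instance, a classed version of the earlier ggplot2 choropleth might bin the variable first (the break points below are arbitrary, for illustration):

world <- world %>%
  mutate(gdp_class = cut(GDP_per_capita,
                         breaks = c(0, 1000, 5000, 20000, 50000, Inf),
                         labels = c("<1k", "1k-5k", "5k-20k", "20k-50k", ">50k")))
ggplot(world) +
  geom_sf(aes(fill = gdp_class), color = NA) +
  scale_fill_brewer(palette = "Blues", na.value = "lightgray", name = "GDP per capita (USD)")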

Example – Choropleth of Refugee Host Countries: Using a UNHCR dataset, suppose we have the number of refugees hosted by each country at the end of 2021. We join that data to our world map under a column say refugees. Then:

# Assuming 'world' has an ISO3 code and we have refugee counts by ISO3 in ref_data
world_ref <- left_join(world, ref_data, by=c("iso_a3"="CountryCode"))
# Plot using tmap
tm_shape(world_ref) +
  tm_polygons("refugees", style="jenks", palette="YlOrRd", title="Refugees (2021)") +
  tm_layout(title="Refugee Population by Country of Asylum (2021)",
            legend.outside = TRUE)

This will color each country by how many refugees it hosts, using an appropriate classification (Jenks natural breaks in this case) and a yellow-to-red palette (common for highlighting density or risk). High-value countries will be red/dark, low will be light yellow. The map would immediately show which countries host the largest refugee populations. For a refined map, we might exclude countries with 0 or no data, add labels, etc. We might also consider that a few countries (like Turkey, Colombia, etc.) might dominate the scale, so using a logged scale or manual breaks could be considered. Figure 1 below is an example of a choropleth map produced with R, shading countries by refugee counts:

Figure 1: Choropleth map of global refugee displacement by country of asylum (2021). Darker shading indicates countries hosting a larger number of refugees. (Source: UNHCR Refugee Data)

(Figure 1 shows a choropleth where countries are colored by the number of refugees they host. Such a map highlights, for example, that countries like Turkey, Pakistan, and Uganda host over a million refugees, whereas many other countries host far fewer.)

Proportional Symbol and Bubble Maps

Instead of coloring entire regions, sometimes it’s effective to use symbols (circles, squares, etc.) at specific locations scaled by a variable’s value. Proportional symbol maps place a symbol (often a circle – hence also called bubble maps) whose area (or diameter) corresponds to the data value. These are useful when the data is tied to point locations (e.g., cities, refugee camps, ports) or when we want to show an attribute for countries but avoid the visual issues of choropleths (like large countries dominating the map visually even if their value is moderate).

For example, to map total GDP by country, a choropleth can be misleading because large countries (Russia, Canada) might have large GDP simply due to size, whereas small high-density economies (Singapore, Luxembourg) would barely show up. A proportional symbol map could place a bubble at each country’s capital (or centroid) sized by GDP, giving a different perspective (though overlapping symbols can be an issue).
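
The ggplot2 example below assumes a centroid layer centroids_sf; one minimal way to build it (gdp_md_est is Natural Earth’s GDP estimate in millions of USD, so GDP_total here is approximate):

# Country centroids carrying a total-GDP attribute for bubble sizing
centroids_sf <- world %>%
  mutate(GDP_total = gdp_md_est * 1e6) %>%   # convert millions of USD to USD
  st_centroid()                              # warns for lon/lat data; acceptable for display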

Using ggplot2: We can add points with geom_sf or geom_point. If we have an sf point layer with a value, say we have an sf of country centroids with a column GDP_total, then:

ggplot() +
  geom_sf(data = world, fill="gray90", color="white") +   # base map
  geom_sf(data = centroids_sf, aes(size = GDP_total), color="tomato", alpha=0.6) +
  scale_size_area(max_size = 10, breaks = c(1e12, 1e13, 1e14), 
                  labels = c("1 Trillion", "10 Trillion", "100 Trillion")) +
  labs(title="Global GDP by Country (Bubble Size)", size="GDP (USD)") +
  theme_minimal()

Here we used geom_sf twice: first to draw countries with a neutral background, then to draw centroids with size mapped to GDP. We used scale_size_area which is good for bubble maps (ensures area of symbol is proportional to value). We set some breakpoints for legend clarity (this assumes GDP values on order of trillions). The result would be a bubble map where big economies (USA, China, EU countries, etc.) have large red circles, smaller economies small circles. This conveys the share of global GDP more intuitively than a choropleth (which would show GDP density or per capita differently).

Using tmap: tmap has tm_symbols() or specifically tm_bubbles() for this. If we had city locations with population, for instance, and wanted a bubble map:

tm_shape(world) + tm_polygons(col="gray80") +
tm_shape(cities) + tm_bubbles(size="population", col="skyblue", alpha=0.5,
                               scale=1.5, title.size="Population")

tmap_mode("plot") assumed. This would draw light gray countries and overlay blue bubbles for cities sized by population. The scale parameter can adjust the relative size of bubbles. The legend (title.size) will indicate population. One might also use col aesthetic to color bubbles by another variable (e.g., region or type) while sizing by population.

A concrete example from UNHCR’s visualization tutorials is a bubble map of refugee origins. In that example, each bubble is placed on the country (presumably at centroids) and size represents the number of refugees originating from that country. The code used geom_sf with shape=21 (circle) and mapped aes(size = ref). They also styled it with semi-transparent fill. The figure below (Figure 2) illustrates a bubble map concept:

Figure 2: Bubble map example – circles indicate magnitude of refugees originating from each country (2021). Larger bubbles = more refugees from that country. Map created with ggplot2 and sf.

(Figure 2 uses bubbles to compare the proportions of refugees by country of origin. Because bubble maps aren’t distorted by area size of countries, one can see, for example, a relatively small country with a very large bubble if it generated many refugees, highlighting it even if it’s geographically small.)

Design considerations: When making symbol maps, beware of overlapping symbols (use transparency alpha or slight repulsion techniques if many points are close). Also consider that human perception judges area differently than length, so be sure to use scale_size_area (which makes the symbol’s area, rather than its radius, proportional to the value). A legend with example sizes (as shown) is crucial for interpretation.

Flow Maps (Lines and Arrows for Movement)

Flow maps are used to depict movement or connections between places – such as migration flows, trade routes, or network links. In R, flow maps can be created by drawing lines (potentially with arrows or varying widths) between origin and destination locations. There are specialized tools and packages for network visualization, but we’ll discuss a few methods relevant to geopolitics/economics:

  • Great circle or curved lines: For global flows like migration or airline routes, drawing great-circle arcs can be more accurate (and visually pleasing) than straight lines on a projected map. The geosphere package has a function gcIntermediate() which can compute intermediate points along a great circle between two lat-long coordinates, which you can then plot as a line (or many small segments). The ggplot2 approach could use geom_curve or geom_segment with appropriate curvature for small areas, but on a global scale those are not true great circles. Another approach: convert origin-destination pairs into an sf LINESTRING geometry and let projection handle curvature.

  • sf simple features lines: You can programmatically create line geometries from point pairs. For instance, if you have a data frame of flows with columns from_lon, from_lat, to_lon, to_lat, you could do:

    flows_sf <- flow_data %>% 
      rowwise() %>% 
      mutate(geometry = st_sfc(st_linestring(matrix(c(from_lon, from_lat, to_lon, to_lat), ncol=2, byrow=TRUE)), crs=4326)) %>% 
      ungroup() %>%   # drop the rowwise grouping before converting
      st_as_sf()

    This is a bit advanced, but essentially creates a linestring from each pair of coords. Once you have an sf of lines, you can plot it with geom_sf (mapping perhaps a value to size or color). Alternatively, packages like sfheaders (which builds sf geometries efficiently from plain data frames) or stplanr (whose od2line() turns origin-destination tables into lines) can ease creation of such line geometries.

  • Specialized flow packages: There are packages like flowmapblue (by Ilya Boyandin) for interactive flow maps, and network packages like igraph or ggnetwork which can plot network links (though not on geographic coordinates by default, but you can merge with coordinates). In fact, geom_net and ggnetwork were used in a tutorial to map trade flows between countries. In that example, they did not place the nodes at their geo-locations but rather used a network layout, but they could assign coordinates (like latitude/longitude of capitals) to the network nodes to make a geographic network map.

A simple example: visualizing migration flows between countries. Suppose we have a dataset of refugee flows with origin country and asylum country, plus a value (number of people). We could use centroids for each country as nodes. Then for each origin-asylum pair, draw a line. If we use ggplot2, one way is:

library(geosphere)
# Example: create great circle points between origin and dest
flows <- refugee_flows  # assume this has OriginLon, OriginLat, DestLon, DestLat, Count
flow_lines <- list()
for(i in 1:nrow(flows)) {
  pts <- gcIntermediate(c(flows$OriginLon[i], flows$OriginLat[i]),
                        c(flows$DestLon[i], flows$DestLat[i]), n=50, addStartEnd=TRUE)
  flow_lines[[i]] <- st_linestring(pts)
}
flows_sf <- st_sf(geometry = st_sfc(flow_lines, crs=4326), count = flows$Count)
# Now flows_sf is an sf LINESTRING object for each flow, with count attribute.
ggplot() +
  geom_sf(data=world, fill="gray20", color="black") +
  geom_sf(data=flows_sf, aes(size = count), color="dodgerblue", alpha=0.7) +
  scale_size(range=c(0.1, 2), breaks=c(10000, 100000, 1000000), 
             labels=c("10k","100k","1M"), name="Migrants") +
  ggtitle("Refugee Flows (origin to destination)") +
  theme_void()

This hypothetical code would draw great-circle curves (with 50 points each) for each flow, sizing them by the number of refugees. We used a dark world map background to make the blue flow lines pop. The scale_size is adjusted to keep lines thin (0.1) for small flows and thicker (up to 2) for very large flows. You could also add arrows: in ggplot2, geom_sf() currently doesn’t have an easy arrow parameter, but geom_curve/geom_segment do (with arrow=arrow(type="closed", length=unit(2,"mm"))). However, adding arrows on many lines can clutter a map; often just showing direction via color or an animation is chosen instead.

Alternatively, tmap can do flow maps by adding line layers with tm_lines(). Tmap won’t compute the great circle for you, but if you have the sf lines as above, you could do:

tm_shape(flows_sf) +
  tm_lines(lwd="count", scale=0.000001, col="blue", alpha=0.6, title.lwd="Refugees") +
  tm_layout(bg.color="gray15") +
  tm_shape(world) + tm_borders(col="white", lwd=0.1)

Here we use a small scale factor to convert counts to line width. The styling is a bit manual. There are also specialized visualization approaches like flow maps with animation (e.g., using gganimate to show flows accumulating or moving).

Flow map interpretation: A challenge with flow maps is overlapping lines and directional ambiguity. If origin and destination are swapped, the same line would be drawn. One solution is to color code direction (for example, green for exports from A to B vs red for exports from B to A), or use arrows. Another is to use interactive hover to inspect a particular flow (which we’ll address in interactive maps).

In economics, trade flow maps often use chord diagrams or network graphs instead of cluttering a geographic map with too many lines. However, if focusing on a subset (say trade within Asia), a flow map on a map can be effective. Similarly, migration corridors can be highlighted by selecting major routes.

As an example from practice, Figure 3 (conceptual) might show trade flows between a few countries with arrows:

(For brevity, we won’t embed another figure here, but imagine a map with curved arrows between the US, China, and Europe indicating trade volumes.)

In summary, flow maps require careful design to avoid turning into “spaghetti.” But with R, you have the flexibility to compute paths and customize how they’re drawn. The data preparation is often the hardest part (getting all origin/dest coordinates and pairing them with values).

12.4 Creating Interactive Maps with Leaflet and tmap

Thus far we focused on static maps – great for print or reports. However, interactive maps allow users to pan/zoom, click on features for more info, and toggle layers. This can be extremely valuable for exploring geospatial data or presenting it online. R provides excellent tools for interactive mapping, primarily through the leaflet package and the interactive mode of tmap (which actually leverages leaflet under the hood), as well as packages like mapview for quick viewing. We will discuss leaflet and tmap here.

Interactive Maps with leaflet

Leaflet is a popular open-source JavaScript library for web maps, and the leaflet R package (Cheng, Karambelkar, & Xie, 2015) is an R interface to create Leaflet maps. One big advantage is that it works well within RMarkdown, Shiny apps, or just in RStudio’s viewer – producing interactive maps without writing JavaScript. Leaflet maps are tile-based, meaning they typically use pre-rendered map tiles (from providers like OpenStreetMap, Carto, etc.) for the basemap, and you overlay your data as markers, circles, polygons, etc.

Basic usage involves creating a leaflet map widget and adding layers to it with the pipe (%>%) syntax:

library(leaflet)
leaflet(world) %>% 
  addTiles() %>%  # adds default OpenStreetMap tiles as background
  addPolygons(color = "white", weight = 1, fillColor = ~colorQuantile("YlGn", GDP_per_capita)(GDP_per_capita),
              fillOpacity = 0.7, popup = ~paste(name_long, "<br>GDP per cap:", round(GDP_per_capita)) )

Let’s unpack this:

  • leaflet(world) initializes a map centered/bounded on the world data (since world is an sf, leaflet knows its bounds).
  • addTiles() adds a base layer (slippy map tiles from OSM by default).
  • addPolygons() adds our country polygons. We specified a border color (white) and weight. For fillColor, we demonstrated using a palette function: colorQuantile("YlGn", GDP_per_capita) creates a function that assigns a color from the Yellow-Green palette based on the percentile of GDP_per_capita. We then call that function with each country’s value. This is a quick way to choropleth in leaflet (alternatively, one could pre-define a palette or use colorBin, colorNumeric etc., which leaflet provides).
  • fillOpacity is set to 0.7 for some transparency.
  • The popup argument takes HTML content (we use the country name and GDP value) that will show when a user clicks a country.

The result is an interactive choropleth where hovering or clicking shows popups. The user can zoom and pan freely. Leaflet in R has many add* functions:

  • addCircles or addCircleMarkers for point data (with radius, color, etc., and popups/labels similarly).
  • addMarkers for simple markers (you can even use custom icons).
  • addPolylines for line data.
  • addGeoJSON if you have data in GeoJSON form.
  • addLayersControl to allow toggling layers on/off (useful for multiple overlay layers).
  • addLegend to add a legend for the color or size scales.

For example, if we had a conflict events dataframe conflicts_df with lon, lat, and fatalities, we could do:

leaflet() %>% addTiles() %>%
  addCircleMarkers(data = conflicts_df, lng = ~longitude, lat = ~latitude, 
                   radius = ~sqrt(fatalities), fillColor="red", fillOpacity=0.5,
                   popup = ~paste(location, "<br>Fatalities:", fatalities) )

This would put a semi-transparent red circle for each event, sized by sqrt of fatalities (so area ~ fatalities). Popups would show location name and number of fatalities. A user could interactively explore where conflicts are severe.
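
To add a legend, leaflet’s palette functions pair naturally with addLegend(); a sketch for the GDP choropleth shown earlier:

# Define a palette function once, then reuse it for both fill and legend
pal <- colorNumeric("YlGn", domain = world$GDP_per_capita)
leaflet(world) %>%
  addTiles() %>%
  addPolygons(color = "white", weight = 1,
              fillColor = ~pal(GDP_per_capita), fillOpacity = 0.7) %>%
  addLegend(pal = pal, values = ~GDP_per_capita, title = "GDP per capita")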

Leaflet tips: By default, leaflet maps do not show scale bar or north arrow (since north is usually “up” on web mercator projections and scale depends on zoom). But one can add a scale bar via addScaleBar(). For projections, leaflet is essentially always in Web Mercator (EPSG:3857) for tiling – but it handles the reprojecting of your data behind the scenes if your data is lat/long. It’s usually best to supply data in WGS84 lat-long to leaflet (so it can project correctly). Another important point is efficiency: adding thousands of polygon objects might be heavy. Simplifying the data (reducing polygon detail) or using tile-based approaches for large data is advisable (beyond our scope here).
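
One common simplification route uses the rmapshaper package (assumed installed); its keep argument controls the fraction of vertices retained:

library(rmapshaper)
# Retain ~5% of vertices while preserving topology between neighboring polygons
world_simple <- ms_simplify(world, keep = 0.05)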

Interactive mode with tmap

The tmap package offers an easy switch between static and interactive maps. By setting tmap_mode("view"), tmap will produce an interactive leaflet map instead of a static image. This is incredibly convenient because you can design your map once and get both outputs as needed.

For example, building on the earlier tmap choropleth example:

tmap_mode("view")
tm_shape(world_ref) +
  tm_polygons("refugees", style="quantile", palette="Reds", title="Refugees") +
  tm_basemap("OpenStreetMap") +
  tm_view(set.view = 2)  # a single number sets the initial zoom level

This will open an interactive map (in RStudio viewer or browser) where you can zoom/pan. The polygons are drawn and an interactive legend is shown. In view mode, tmap adds tooltips and click popups for the mapped value automatically; the popup.vars argument on a layer controls which attributes appear. The call tm_basemap("OpenStreetMap") adds a base layer; tmap in view mode can use various basemaps (there’s also tm_tiles for other tile services). tm_view(set.view = 2) sets the initial zoom.

If we wanted to allow toggling layers, tmap has tm_facets or can separate groups. But for more control, direct leaflet is often used. Still, tmap’s strength is rapid development: you can experiment with your thematic map in static mode (high quality output), then just switch to view mode to allow interactive exploration, with minimal changes in code.

Example – Interactive refugee map: Imagine we want an interactive map of refugee migration: origin to host country flows and the ability to click on a country to see details. We could create two layers: one for flows (lines) and one for host country symbols (bubbles sized by number of refugees hosted). In tmap (view mode), we could add both layers and allow the user to toggle them:

tmap_mode("view")
tm_shape(flows_sf) + tm_lines(col="skyblue", lwd="count", scale=5e-7, alpha=0.5, group="Flows") +
tm_shape(world_ref) + tm_bubbles(size="refugees", col="orange", alpha=0.7, border.col="white", group="Host Countries",
                                 popup.vars = c("Country"="name_long", "Refugees"="refugees")) +
tm_basemap("CartoDB.DarkMatter") +
tm_view(set.view = c(0,30,2))  # center lon=0, lat=30, zoom=2

This uses a dark basemap, draws flow lines in blue (with width proportional to count) and draws orange bubbles for host countries (with a popup showing the exact number). We assigned each to a group – in view mode, tmap will allow toggling these groups via a layer control (like a checkbox for “Flows” and “Host Countries”). The resulting interactive map would let a user zoom into regions, turn on/off the flow lines or the bubbles, and click on a bubble to see country name and refugee count.

Shiny and advanced interactivity: The leaflet maps can be integrated into Shiny web applications to respond to user inputs (for example, selecting different variables to map, or drawing on the map). The leafletProxy function allows dynamic updating of a leaflet map in Shiny. While a full Shiny tutorial is beyond our scope, be aware that interactive R maps can be made fully dynamic.

For most exploratory purposes, leaflet() or mapview::mapview() (which is a quick view wrapper) are enough. Mapview is worth noting: calling mapview(world) will immediately display an interactive map of that object with sensible defaults, including popups of all attributes. This is great for a quick look at your spatial object to verify it loaded correctly. Mapview also has functionalities to add mouse coordinates display, measure distances, etc., for quick exploration.

12.5 Combining Spatial Data with Socio-Economic Data

Geospatial visualization becomes powerful when we overlay multiple data sources – for instance, combining spatial boundaries with economic indicators, or attaching conflict event data to demographic data. In this section, we discuss some common scenarios of combining data for geopolitics/economics applications:

  • Joining tabular socio-economic indicators to spatial boundaries: We touched on this earlier. A typical case: you have a CSV of indicators by country (GDP, Human Development Index, trade volume, etc.) and a shapefile of country boundaries. Using a common key (an ISO country code, or country names after careful standardization), you perform a join so that those indicators become attributes of the spatial data. After that, you can map them (choropleth or others). We demonstrated this with GDP and refugee counts. The key is ensuring the join key matches (one may need to standardize names or codes).

  • Linking point locations to region data: Perhaps you have data on cities (points) and you want to attach country-level info (like country’s population or region name). A spatial join using st_join (with st_within predicate) can accomplish that – essentially a point-in-polygon query. Conversely, if you want to summarize point data by region (like counting events per region, summing values, etc.), you can do a spatial join then aggregate (as we did with conflict events per country example). This is common in conflict studies (e.g., number of incidents per province) or epidemiology (cases per district).

  • Combining raster and socio-economic data: One might extract values from environmental rasters to attach to socio-economic units. For example, attach average rainfall or elevation to each country or each conflict event. We used terra::extract in an example to get elevation for city points. Another use: mask or multiply rasters by region masks (e.g., multiply a population density raster by a binary raster of conflict zones to estimate population at risk). Tools like terra::mask or terra::zonal (to do zonal statistics by polygon) can help. For instance, to compute average nightlights brightness per country, you could use terra::zonal(nightlights_raster, countries_raster, fun="mean") if you have a raster of country IDs.

  • Temporal data merging: Sometimes we have time series data by location. For example, trade indicators by country-year. To map a specific year, you filter the data to that year and join to spatial data. If making an animation, you might keep the data in a “long” format and use facets or animation frames to iterate over time. The gganimate and tmap (with tmap_animation) packages can create animations showing change over time. For instance, mapping yearly changes in conflict intensity or refugee numbers with a frame for each year.

  • APIs and data sources in R: There are R packages that directly fetch socio-economic data. For example:

    • WDI for World Bank Development Indicators (returns data frames of country indicators; a sketch follows this list).
    • tidycensus for US Census data (with geometries if needed).
    • acled.api for ACLED conflict event data, though licensing terms often mean relying on registered API access or pre-downloaded extracts.
    • unhcrdatapackage, which bundles the UNHCR Refugee Data Finder statistics as ready-to-use R datasets. In the snippet we saw earlier, they pulled data from an UNHCR package dataset (unhcrdatapackage::end_year_population_totals_long) and merged it with spatial centroids to map refugee distribution.
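
As a sketch of the WDI route mentioned above (NY.GDP.PCAP.CD is the World Bank code for GDP per capita in current USD; naming the indicator vector renames the returned column):

library(WDI)
gdp <- WDI(country = "all", indicator = c(GDP_per_capita = "NY.GDP.PCAP.CD"),
           start = 2021, end = 2021)
# WDI returns an iso2c code column; Natural Earth carries a matching iso_a2 code
world <- world %>% left_join(gdp, by = c("iso_a2" = "iso2c"))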

Case examples of data combination:

  • Conflict mapping: You might take ACLED conflict event data (which has latitude/longitude for each event, plus attributes like fatalities, actors, etc.) and overlay it on a map of ethnic groups or administrative boundaries. This could reveal patterns like conflicts clustering in certain regions. If you have a risk index by region, you could color regions by risk and also dot the events.

  • Trade and economic zones: Suppose you have data on trade agreements or economic unions (EU, NAFTA, etc.). You could color countries by membership in a trade bloc, and also overlay lines or arrows showing trade volumes between major partners. Combining categorical region highlighting with flow lines could show how trade is concentrated within certain blocs.

  • Migration and demographics: Combine refugee flow data with host country indicators (like GDP or population). You could illustrate, for instance, refugees per 1,000 population by country (a normalized measure) by merging refugee counts with population data and mapping that ratio (a sketch follows this list). Or use symbols for refugees and color countries by some other socio-economic indicator to provide context (perhaps the countries with highest refugees are not the wealthiest, raising policy questions).
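
A sketch of that normalized measure, reusing world_ref from earlier (pop_est is Natural Earth’s population estimate):

world_ref <- world_ref %>% mutate(ref_per_1000 = refugees / pop_est * 1000)
tm_shape(world_ref) +
  tm_polygons("ref_per_1000", style = "jenks", palette = "PuBu",
              title = "Refugees per 1,000 population")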

The process of combination is often: get spatial layer -> get data -> join -> visualize. Always keep track of the CRS during spatial operations (most join/aggregation tasks are easier if everything is in a common projection or lat-long). And consider the scale: using GADM level-2 (districts) for a global map might be overkill and slow; aggregate or use a generalized boundary if possible. Conversely, if focusing on one country, you might bring in detailed local data.

Now that we have covered techniques, let’s walk through a few case studies that integrate these ideas in a more narrative way.

12.6 Case Studies in Geospatial Visualization

In this section, we explore three applied scenarios:

  1. Mapping Migration and Displacement – using UNHCR refugee data.
  2. Visualizing Trade Patterns and Economic Zones – using trade flow data.
  3. Mapping Territorial Conflict and Geopolitical Risk – using conflict event data and risk indices.

Each case study will outline the context, the data sources and preparation, the visualization approach, and the insights gained. These illustrate end-to-end examples of geospatial data visualization in R for real-world geopolitical and economic topics.

Case Study 1: Migration and Displacement (Refugee Flows and Distribution)

Context: By the end of 2022, over 100 million people were forcibly displaced worldwide – including refugees, asylum-seekers, and internally displaced persons – a record high. Understanding where displaced people originate, which countries they flee to, and the magnitude of flows is crucial for humanitarian policy and international relations. Maps are a key tool to communicate the scale of displacement crises and to plan interventions (e.g., refugee camp locations, international aid distribution).

Data sources: We use open data from UNHCR (The UN Refugee Agency). The UNHCR Refugee Data Finder provides country-level statistics on refugees (by origin and asylum country, annually). We also use country boundary data from Natural Earth (via rnaturalearth) for mapping. Additionally, UNHCR’s geoservices provide location data for camps and UNHCR operations, but here we focus on country-level aggregates.

For flows, we obtained a table of refugee origin-destination pairs (e.g., “Syria -> Germany: X refugees”). UNHCR’s population statistics report such flows for major populations. Alternatively, one could use the unhcrdatapackage in R which contains datasets of end-of-year refugee populations by origin and asylum.

Data Preparation: We clean and merge the refugee data with spatial data. For example, using end_year_population_totals_long from unhcrdatapackage, we filtered for Year = 2021 and Population type = “REF” (refugees) and grouped by Country of Asylum and Country of Origin to get totals. We then merged this with a world countries sf object to get both “origin country” and “asylum country” geometries. For mapping flows, we created line geometries from origin centroid to asylum centroid for each pair, as sketched below.
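A sketch of that flow-line construction, assuming the aggregated table flows has hypothetical columns origin_iso3, asylum_iso3, and refugees:

    library(sf)
    library(rnaturalearth)

    world <- ne_countries(scale = "medium", returnclass = "sf")
    cents <- st_centroid(st_geometry(world))   # one point per country (lat-long)

    # Match each flow's ISO3 codes to a country centroid; drop non-matches
    i_o <- match(flows$origin_iso3, world$iso_a3)
    i_a <- match(flows$asylum_iso3, world$iso_a3)
    ok  <- !is.na(i_o) & !is.na(i_a)

    # Straight segment from origin centroid to asylum centroid
    seg <- function(i, j) {
      st_linestring(rbind(st_coordinates(cents[[i]]), st_coordinates(cents[[j]])))
    }
    flows_sf <- st_sf(
      flows[ok, ],
      geometry = st_sfc(mapply(seg, i_o[ok], i_a[ok], SIMPLIFY = FALSE), crs = 4326)
    )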

Visualizations:

  1. Global Choropleth of Refugee Hosting: We created Figure 1 earlier, a choropleth shading each country by the number of refugees it hosts (asylum). This immediately showed that a few countries (often neighboring conflict zones) bear the largest burden – for 2021, countries like Turkey, Pakistan, Uganda, Iran, and Germany were among the top hosts, each hosting from hundreds of thousands to several million refugees. The choropleth effectively communicated the uneven distribution of refugees across the world – many developed countries in Europe had significant numbers, but developing countries near conflict regions (e.g., Turkey next to Syria, or Uganda next to South Sudan/DRC) also had very large refugee populations, highlighting potential resource strains. The map also included a note (in the caption) that “The boundaries and names shown… do not imply official endorsement”, reflecting UN map standards (a best practice when mapping politically sensitive data).

  2. Bubble Map of Origins: Figure 2 (bubble map) complements the host choropleth by showing countries of origin with proportional circles. This map vividly showed Syria, Venezuela, Afghanistan, South Sudan, and Myanmar, among others, as major source countries (large bubbles). It emphasized that a relatively small number of conflicts or crises account for a large share of global refugees. Combining Figures 1 and 2, one can observe, for instance, that Syria (origin) has a large bubble while neighbors like Turkey and Lebanon are shaded dark because they host those refugees. This pairing of maps tells the displacement story powerfully from both sides.

  3. Flow Lines: To illustrate routes, we created an interactive flow map (not easily shown in a static figure here). Using leaflet, we drew curved arrows for major refugee flows – for example from Syria to Turkey, Venezuela to Colombia, Afghanistan to Pakistan/Iran, South Sudan to Uganda – with line width proportional to the flow. In static print such a flow map can get cluttered; interactively, one can filter by origin or destination. In our case, we limited the map to the top 10 flows worldwide and annotated them. Even with just a few flows, the geographic pattern is clear: most refugees flee to neighboring countries (short lines on the map), not distant ones – an important geopolitical insight, since neighboring countries often carry regional instability burdens.

  4. Refugees per Capita Map: As another perspective, we combined refugee counts with host country population to map refugees per 1,000 inhabitants. This put smaller countries like Lebanon and Jordan at the top of the ranking (Lebanon’s own population is relatively small but it hosts many Syrians, yielding a ratio at times approaching one refugee for every six residents). This per-capita choropleth can invert perceptions – some wealthy countries host many refugees in absolute terms but not relative to their population, whereas some poorer countries host a huge share relative to their size. Such a map is a useful tool in policy debates about fair burden-sharing.

Technical notes: We had to ensure country names/codes matched between UNHCR data and Natural Earth. ISO3 codes were used; these are largely standard, but a few manual adjustments were needed for territories and entities not present in Natural Earth. We used sf::st_centroid to get origin and destination points for drawing flow lines, and we kept all data in lat-long so that great-circle interpolation with geosphere would work correctly (sketched below). For clarity, we kept only significant flows – filtering out flows below 10,000, for example.
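The great-circle step looked roughly like this, assuming flow_df holds coordinates in hypothetical columns o_lon, o_lat, d_lon, d_lat plus a refugees count:

    library(geosphere)
    library(leaflet)

    # Interpolate each origin-destination pair along the great circle
    gc_lines <- gcIntermediate(
      flow_df[, c("o_lon", "o_lat")],
      flow_df[, c("d_lon", "d_lat")],
      n = 50, breakAtDateLine = TRUE, addStartEnd = TRUE,
      sp = TRUE   # return SpatialLines, which leaflet accepts directly
    )

    leaflet(gc_lines) |>
      addProviderTiles(providers$CartoDB.Positron) |>
      addPolylines(weight = sqrt(flow_df$refugees) / 100,  # ad hoc width scaling
                   color = "firebrick", opacity = 0.7)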

Insight and communication: The combination of these visualizations in an interactive dashboard (which one could create with Shiny) allows policymakers to explore questions like: “If conflict X worsens, which neighboring countries will be most affected by refugee inflows?” or “Which host countries might need more support given how many refugees they have relative to their capacity?” Maps provide an intuitive way to grasp the scale and geography of displacement, complementing raw statistics. In reports, UNHCR frequently uses such maps to advocate for aid (e.g., showing how Syrian refugees are concentrated in just a few countries like Turkey and Lebanon). By using R and open data, students can recreate and adapt these analyses – for instance, overlaying additional data like GDP or conflict events to see correlations between conflict intensity and refugee outflow.

Case Study 2: Trade Patterns and Economic Zones

Context: International trade is a cornerstone of global economics and geopolitics. Trade agreements and economic blocs (like the EU, NAFTA/USMCA, ASEAN) shape trading patterns, as do geography (neighboring countries often trade heavily) and transportation networks (shipping lanes, etc.). Visualizing trade flows can highlight the structure of the global economy: for example, major trade corridors, regional integration, or countries that serve as key hubs. It can also reveal vulnerabilities (e.g., dependence on a single trade partner).

Data sources: For this case, we consider a simplified scenario: trade flows of certain commodities between key countries. As an example, we might use data from the UN Comtrade database (which has bilateral trade statistics). In a teaching context, one might limit to a subset, e.g., grain trade among a few countries, or total trade volume among G20 nations. There are R packages like comtradr or direct downloads from Comtrade (CSV). We also use Natural Earth for country boundaries. Additionally, we might categorize countries by trade bloc membership (which can be manually curated or taken from an official list).

Data Preparation: Suppose we retrieved data on total exports between some major economies. We will have a table of Exporter, Importer, TradeValue. We ensure each country has a centroid coordinate for mapping lines. We also create a factor or group for trade bloc (e.g., label each country as EU, Asia-Pac, etc., if illustrating intra-bloc trade).

We might filter to the top N flows globally to keep the map readable. Alternatively, we could focus on a specific region (say Asia-Pacific trade network) and map all flows there.

Visualizations:

  1. Global Trade Network (simplified): Using geom_curve (or network layers like geomnet’s geom_net), we plotted flows between the US, China, Europe (aggregated), and a few other large players: big arrows between China and the US, the US and the EU, and the EU and China form a triangle of major world trade, with additional flows from those hubs to others (China–Japan, US–Mexico, etc.). The map can use curved arrows over a world map; see the sketch after this list. A specific example: if mapping grain trade, one might show arrows from North America and the Black Sea region to import-dependent regions (Middle East, North Africa). In our demonstration, we made a map with arrows whose thickness represents trade volume and color indicating direction (e.g., blue for exports from Country A, red for the opposite direction).

  2. Choropleth of Trade Blocs: We colored countries by their economic mega-bloc (e.g., EU, NAFTA, ASEAN, etc.) to provide context. While not numeric, this categorical map serves to show how the world is partitioned into trading regions. Overlaid on that, we could place symbols or annotations for trade volumes. For example, a map showing the European Union in one color, NAFTA in another, etc., then lines illustrating trade between these blocs. One might observe that trade volumes are often highest within blocs (e.g., European countries trading with each other) compared to between far-flung partners, although globalization has also increased inter-bloc trade.

  3. Interactive drill-down: In an interactive setting, one could allow clicking a country to highlight all its trade links. For instance, click on UK – see its export/import lines light up. With leaflet, one could achieve this by using highlightOptions or using Shiny to dynamically subset.
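A sketch of the curved-arrow idea from item 1, using illustrative placeholder coordinates and volumes rather than real trade statistics:

    library(ggplot2)
    library(grid)            # for arrow() and unit()
    library(rnaturalearth)

    world <- ne_countries(scale = "small", returnclass = "sf")

    # Placeholder flows: rough centroids for CHN -> USA, USA -> DEU, DEU -> CHN
    flows <- data.frame(
      x    = c(104, -98,  10), y    = c(35, 39, 51),
      xend = c(-98,  10, 104), yend = c(39, 51, 35),
      value = c(690, 270, 320)   # made-up volumes, US$ bn
    )

    ggplot() +
      geom_sf(data = world, fill = "grey92", color = "white") +
      geom_curve(
        data = flows,
        aes(x = x, y = y, xend = xend, yend = yend, linewidth = value),
        curvature = 0.25, alpha = 0.7, color = "steelblue",
        arrow = arrow(length = unit(2, "mm"))
      ) +
      scale_linewidth(range = c(0.3, 2.5), name = "Trade (US$ bn)") +
      theme_void()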

Insights: These maps can show patterns like:

  • The dominance of a few corridors: e.g., trans-Atlantic (US-EU) and trans-Pacific (US-China) trade flows overshadow many others in sheer value.
  • Regional trade: e.g., European countries sending majority of exports to other EU members.
  • The effect of geography: neighbors often trade a lot due to lower transport costs (US-Canada/Mexico, Germany-France, China-Japan, etc. all high).
  • Outliers: a country that trades heavily with a distant partner (possibly due to natural resources, e.g., oil from the Middle East to East Asia, which can also be visualized).

For instance, if mapping oil trade, you’d draw arrows from the Middle East, Russia, and North America to major import regions like East Asia or Europe. You’d immediately see Europe’s reliance on Russian pipelines (if included) or East Asia’s reliance on Middle Eastern tanker routes. In fact, one could incorporate actual routes (sea lanes) – but that gets complex (requires mapping along actual maritime paths; beyond our simple great-circle lines).

Technical notes: Working with trade flow data often involves dealing with directed edges and possible duplication (Country A to B vs B to A); we treated the two directions separately for clarity. If using great circles, take care that lines crossing the antimeridian do not appear to wrap the long way around the map: gcIntermediate follows the shortest great-circle path, breakAtDateLine=TRUE splits lines at the dateline, and addStartEnd=TRUE includes the exact endpoints. We also sometimes applied a slight offset for overlapping opposite flows (if two countries exchange similar volumes, their lines could overlap – nudging one curve’s midpoint separates them). There are advanced packages like flowmapper and network visualization libraries, but our approach with simple sf lines and ggplot/tmap suffices for a conceptual overview.

Communication: Trade flow maps are popular in publications to illustrate concepts like the rise of South-South trade or the relative decline of trans-Atlantic dominance. One must be cautious not to overclutter: often, a few key flows tell the story better than showing every minor exchange. Another best practice is to annotate flows with labels if possible (like “$500B” near an arrow, although doing that dynamically can be tricky if arrows move). In static maps, one might number flows and have a legend listing values.

Case Study 3: Territorial Conflict Mapping and Geopolitical Risk

Context: Geopolitical risk often manifests spatially – conflicts occur in specific locations, disputed territories have clear boundaries, and political risk indices vary by country/region. A classic task is mapping conflict events or intensity to understand patterns (e.g., concentration of violence in certain provinces) and perhaps overlaying other layers (ethnic maps, resource locations) to seek correlations. Similarly, mapping a “geopolitical risk index” by country (a composite of factors like conflict likelihood, political stability, etc.) provides a global risk overview for investors or policy makers.

Data sources: We use the Armed Conflict Location & Event Data (ACLED) for conflict events, which is a detailed dataset of political violence events with coordinates. For geopolitical risk index, there are proprietary indices (like by insurance companies or think-tanks) but for demonstration, we might construct a simple index combining things like conflict presence, governance indicators, etc., or use the Fragile States Index (available by country). For mapping disputed areas, one might use data such as Natural Earth’s disputed boundaries layer or specific shapefiles for areas like Kashmir, South China Sea claims, etc. We will focus on conflict events and a risk choropleth.

Data Preparation: We filtered ACLED data for a region (say Africa for year 2022) to make it manageable. We aggregated the number of events and fatalities by country and by sub-national regions. We also classified events into types (battles, protests, etc.). We obtained a country-level “conflict intensity score” (e.g., fatalities per 100k population) for use in a country choropleth.

Visualizations:

  1. Conflict Event Map with Points: We plotted all conflict events in Africa for a year as points on a map, using color or shape to denote event type (e.g., red dot = battle, blue = protest) and size to denote fatalities. Using leaflet allowed us to add popups with event details on click. The map showed clusters of events in certain areas (for example the Sahel, the Horn of Africa, eastern DR Congo, and northern Nigeria, depending on the year). A point map gives granular detail but can be overwhelming with thousands of points; to mitigate that, one can use leaflet’s clustering (via markerClusterOptions) to group nearby points when zoomed out – see the sketch after this list.

  2. Heatmap / Kernel density: As an alternative, we computed a kernel density raster of conflict event intensity and overlaid it on the map (e.g., with a red-yellow heat palette). This provides a smoother “heatmap” that highlights conflict hotspots without asking the reader to interpret thousands of points. stat_density_2d() in ggplot2, or helpers such as smooth_map() in tmaptools, can do this. We would see blobs over regions like the Lake Chad basin (Boko Haram conflict) or Somalia. This is good for conveying intensity but loses precise event-level information.

  3. Choropleth of Risk by Country: We created a global map where each country is colored by a geopolitical risk score on a 0–10 scale (10 = highest risk), derived, say, by combining conflict presence, regime type, and other indicators. Such a map shows high risk in war-torn countries (Yemen, Syria, DRC – in red), moderate risk in countries with some instability, and low risk in stable OECD countries (in green). It is somewhat subjective (it depends on how the index is constructed), but it is similar to maps produced by consultancy firms and gives a quick comparative assessment. It could be accompanied by an interactive element where clicking a country shows a tooltip with the index components (like “Risk 8.5 – factors: ongoing conflict, weak governance”).

  4. Territorial Dispute Illustration: To highlight a specific geopolitical issue, we mapped an area like the South China Sea with overlapping territorial claims. Using a polygon shapefile of claimed zones, colored by claimant country with transparency, we can show overlapping colors where disputes occur. Another example: mapping Kashmir with Indian vs Pakistani claim lines (perhaps using a dashed line style for the disputed border, as is common in international maps). This visual communicates the contested nature of territory. We could overlay conflict incidents or military deployment points if data were available.
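The clustered point map from item 1 can be sketched as follows, assuming acled is a data frame with ACLED-style columns longitude, latitude, event_type, fatalities, and notes:

    library(leaflet)

    pal <- colorFactor("Set1", domain = acled$event_type)

    leaflet(acled) |>
      addProviderTiles(providers$CartoDB.Positron) |>   # neutral basemap
      addCircleMarkers(
        lng = ~longitude, lat = ~latitude,
        radius = ~pmin(3 + sqrt(fatalities), 12),       # cap outlier sizes
        color = ~pal(event_type), stroke = FALSE, fillOpacity = 0.6,
        popup = ~paste0("<b>", event_type, "</b><br/>", notes),
        clusterOptions = markerClusterOptions()
      ) |>
      addLegend(pal = pal, values = ~event_type, title = "Event type")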

Insights: From the conflict maps, one quickly sees that conflicts are not randomly distributed; they concentrate in certain countries and regions. Overlaying factors can yield insight (for example, overlay a map of mineral resources or ethnic groups on conflict locations to test hypotheses like resource conflicts or ethnic tensions). The risk choropleth simplifies a lot but is useful for high-level briefing – e.g., an investor might use it to decide where their supply chain might be exposed. It’s a way to summarize many complex sub-indicators into one view, but the analyst must be cautious about oversimplification (one should always accompany such maps with an explanation of what the index entails).

Technical notes: When mapping large event datasets (ACLED has 100k+ points), performance can be an issue. We used leaflet with clustering to handle this interactively, and down-sampled or focused on one year for static mapping. We also used sf::st_join to count events per polygon (country or province). We had to join country names with population data for normalization. For disputed boundaries, since it’s sensitive, we added a note on the map if needed (e.g., similar to UNHCR maps, one could add “Borders are not authoritative” disclaimer).

Communication: Conflict maps often evoke strong responses, so clarity and neutrality are key. Using neutral basemaps and clear legends (with units like number of incidents, etc.) ensures the map is seen as informative, not sensational. Also, one should cite sources (e.g., “Conflict data © ACLED, used with permission” if required, and source of the risk index). In a policy paper, one might include a static map highlighting a particular hotspot, then provide an interactive link for the reader to explore further details online.

12.7 Best Practices in Geospatial Visualization for Communication

Creating maps for economic and political insights is not just about technical execution; it also requires careful design and awareness of how viewers interpret maps. Here are some best practices to ensure your geospatial visualizations communicate effectively and ethically:

  • Choose Appropriate Map Projections: The projection should suit the purpose. For global thematic maps, an equal-area projection (like Mollweide or Gall-Peters) can be preferable to avoid exaggerating areas (e.g., Mercator makes high-latitude countries look bigger, potentially misrepresenting values in choropleths). If using Web Mercator (for interactive maps), be mindful of its distortion – it’s fine for visualization but not for comparing areas. For regional maps, use a projection centered on that region. R’s coord_sf(crs= ...) or st_transform makes it easy to change projections. A well-chosen projection can also enhance aesthetics (e.g., Robinson for world maps gives a pleasing balance). Always be cautious if mixing data – ensure they’re in the same projection, or use coord_sf which will auto-project layers to the specified CRS.
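For example, with a Natural Earth world layer:

    library(sf)
    library(ggplot2)
    library(rnaturalearth)

    world <- ne_countries(scale = "small", returnclass = "sf")

    # Keep the data in lat-long and let coord_sf reproject for display
    ggplot(world) + geom_sf() + coord_sf(crs = "+proj=robin")    # Robinson

    # Or transform the object itself, e.g., to Mollweide (equal-area)
    world_moll <- st_transform(world, "+proj=moll")
    ggplot(world_moll) + geom_sf()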

  • Color Choices and Legends: Use color ramps appropriate to your data type. Sequential palettes (light to dark of one hue) for 0-to-high numeric values (e.g., population). Divergent palettes (two hues diverging from a midpoint) for variables with a meaningful middle (e.g., above/below average, or positive/negative changes). Qualitative palettes for categorical data (distinct colors, ideally tested for colorblind friendliness). Tools like ColorBrewer (available via RColorBrewer or built-in in tmap/ggplot) are helpful. In our examples, we used viridis and brewer palettes that are perceptually uniform and colorblind-safe. Avoid using too many colors – it can confuse. Also ensure sufficient contrast between map background and your data layers. For instance, if using a dark basemap, use bright colors for overlays (and vice versa).

Always include a legend that clearly explains colors or symbol sizes. In tmap or ggplot, ensure the legend has intuitive labels (we often customized breaks for this reason). If using unusual units (e.g., log scale), indicate that. In interactive maps, a legend is still needed; plus tooltips on hover can provide exact values, which is great for user engagement.
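A sketch of these palette and legend choices, reusing the hypothetical world_gdp layer from earlier:

    library(ggplot2)

    ggplot(world_gdp) +
      geom_sf(aes(fill = NY.GDP.PCAP.KD)) +
      scale_fill_viridis_c(
        trans = "log10",                # flag the log scale in the legend name
        breaks = c(1e3, 1e4, 1e5),      # customized, intuitive breaks
        labels = scales::label_comma(),
        name = "GDP per capita (US$, log scale)"
      ) +
      theme_minimal()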

  • Avoid Misleading by Normalizing Data: Choropleths of raw counts can mislead if areas differ greatly in size or population. It’s often better to map densities or rates (e.g., GDP per capita instead of total GDP, conflict incidents per km² or per capita, etc.) so that values are comparable. If mapping absolute counts, be aware that large countries might catch the eye simply due to size, not higher intensity. For example, mapping total CO₂ emissions by country makes Russia look huge partly due to area – mapping emissions per capita or per GDP might flip the story. In our refugee maps, we made both absolute and per-capita versions to provide context.
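The normalization itself is a one-line mutate, assuming refugee counts have already been joined to Natural Earth’s pop_est column in a layer we call world_ref:

    library(dplyr)

    world_ref <- world_ref |>
      mutate(ref_per_1000 = refugees / pop_est * 1000)  # refugees per 1,000 people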

  • Keep it Simple and Clear: Don’t overload a single map with too many layers or information types. It’s better to produce a small series of maps (or an interactive where layers can be toggled) than one static map trying to show 5 datasets at once. Each map should ideally emphasize one key message. Use supporting elements like titles, subtitles, and captions to guide the reader. For example, a title “Major Refugee Flows in 2021” immediately tells what to look for. A caption can add context or source info.

General cartographic elements like north arrows and scale bars are usually optional for thematic maps at country/world scale (since orientation is obvious and scale varies, especially in projections). For local maps or where distance matters (e.g., a map of a city plan or buffer distances), include a scale bar. North arrows are more crucial if the map is rotated or using a non-standard orientation.

  • Use Annotations and Labels Wisely: Sometimes adding labels (text) on the map can greatly aid understanding – e.g., label the top 5 countries in a data set directly on the map (instead of readers cross-referencing the legend). In R, geom_text or tmap’s text layers can place labels. But be careful to avoid clutter and overlapping text. You might need to manually adjust positions or use techniques like geom_text_repel from ggrepel package to avoid overlap. In our trade map, labeling flows by value was tricky, so we opted for a legend. But we did label country names in some cases to make sure the reader knows which country is which if the map doesn’t already have that (especially for smaller countries that might not be labeled by default).
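A sketch of non-overlapping labels for the top hosts, using ggrepel’s support for sf coordinates (world_ref and its columns are the hypothetical layer from above):

    library(ggplot2)
    library(ggrepel)
    library(dplyr)

    top5 <- slice_max(world_ref, refugees, n = 5)

    ggplot(world_ref) +
      geom_sf(aes(fill = refugees)) +
      geom_text_repel(
        data = top5,
        aes(label = name, geometry = geometry),
        stat = "sf_coordinates",      # place labels at each geometry's coordinates
        size = 3, min.segment.length = 0
      )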

  • Respect Data Uncertainty and Ethics: In geopolitical mapping, some data may be sensitive or uncertain (unofficial borders, disputed areas, estimated underground economies). It’s best practice to acknowledge this. For disputed boundaries, use special symbology (e.g., dashed lines) and a note. If data are outdated or estimated, say so, so they are not taken as precise. We saw in UNHCR maps the disclaimer about boundaries; a similar disclaimer might be needed when mapping, say, Taiwan as a separate entity, depending on the audience. The goal is not to stir controversy with the map’s depiction – unless you are intentionally making a political point, in which case be prepared with rationale and sources.

  • Accessibility: Ensure the map is readable by people with color-vision deficiencies. Avoid solely red/green contrasts for important differences (common issue as ~8% of men have red-green colorblindness). Tools like viridis palette or ColorBrewer’s colorblind-safe schemes are helpful. Also ensure text (labels, legend) is large enough and clear. In R, it’s easy to adjust theme text sizes or legend key sizes.

  • Reproducibility and Documentation: Keep track of the transformations you do. With R Markdown, your entire process (from data import to final plotting) can be documented in one place – this is great for ensuring you (or others) can reproduce the map later, and it also documents the sources and any assumptions. Always cite data sources on the map or in the accompanying text. In an academic or policy report, maps typically have a caption with sources, e.g., “Data: UNHCR (2023), Natural Earth.”

By following these practices, your maps will be more trustworthy and effective. A well-designed map can powerfully convey a point that might require many paragraphs to explain – but a poorly designed map can confuse or mislead. Always take a moment to step back and view your map as a reader would: does it answer the intended question? Is anything ambiguous or overly complex? Did we choose the right representation (e.g., symbols vs choropleth) for the data? Iterate on the design based on feedback.

12.8 Troubleshooting and Common Pitfalls in Spatial Workflows

Working with spatial data in R is rewarding but can sometimes lead to frustrations. Here are common issues and how to address them:

  • Coordinate Reference System (CRS) mismatches: One of the most frequent gotchas is plotting or joining layers that are in different CRSs without realizing. This can result in layers not aligning (e.g., your points appear off in the ocean because the base map is in a projected coordinate system and your data is still in lat-long). With sf, if you plot multiple layers in ggplot, it will try to align them if they have crs set; but if one lacks crs info, it won’t know. Always check st_crs() of your objects. Use st_transform to a common CRS when needed (for analytic geometry operations like intersection, it’s often best to use an equal-area projection in a common datum). If you see an error like “cannot transform sfc object” or a warning about missing CRS, fix the CRS definitions. Pro tip: If you know the data’s supposed CRS but it’s missing (NA), use st_set_crs() to define it (not transform, just declare). If the data has the wrong CRS, you might need to reproject appropriately.
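The three operations to keep distinct, with a hypothetical pts layer:

    library(sf)

    st_crs(pts)                          # inspect: what CRS does the object claim?
    pts    <- st_set_crs(pts, 4326)      # declare a missing CRS (no reprojection)
    pts_eq <- st_transform(pts, "+proj=moll")  # actually reproject (equal-area)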

  • Large file sizes and performance: Spatial datasets (especially fine-resolution rasters or large vector layers with many detailed polygons) can be huge. Reading a 1GB shapefile can be slow, and plotting it even slower. Strategies: simplify geometries if high detail isn’t needed (rmapshaper::ms_simplify or sf::st_simplify reduce polygon vertex counts). When mapping at country level, a simplified world boundaries file (like Natural Earth’s small-scale data) is better than high-resolution GADM. For rasters, avoid reading a huge file fully into memory (terra processes many operations lazily, but be mindful of memory limits); crop or downsample if full resolution isn’t required. If running into performance issues in loops (like creating many sf objects), vectorize operations or use spatial joins (st_join is faster than manually looping through each point to find where it falls). When using leaflet, too many objects (say >10k) can make the browser lag; use clustering for points, pre-tile heavy polygon layers, or switch to WebGL rendering with packages like leafgl.
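For example, simplifying before plotting (the keep ratio is a tuning choice):

    library(rmapshaper)

    # Keep ~5% of vertices; keep_shapes prevents small polygons from vanishing
    world_simpl <- ms_simplify(world, keep = 0.05, keep_shapes = TRUE)

    object.size(world)         # compare in-memory size before and after
    object.size(world_simpl)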

  • Topology errors: Sometimes geometry data has issues: self-intersecting polygons, slivers, etc. Functions like st_union or st_buffer(x, 0) (a classic trick to “clean” geometry) can help, but the robust fix is sf::st_make_valid() (older sf versions required the lwgeom package). If you get errors like “TopologyException” during spatial operations, run st_make_valid() on the offending layer first. Also note that st_intersection drops features that don’t intersect; use st_join instead if you need to keep all features.

  • Joining by names issues: When merging data by country names or other text fields, inconsistencies will cause NA values for some (e.g., “Côte d’Ivoire” vs “Ivory Coast”, or different spellings). It’s a common pitfall that leads to missing data on the map. The solution is to standardize names or use ISO codes. One approach is to have a reference table of equivalences. The countrycode package can convert country names to different versions or to ISO codes which you can then match. Always verify after a join how many matches happened and which didn’t (e.g., use anti_join to see which keys didn’t match).
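A typical standardize-then-audit pattern (df and its country column are assumptions):

    library(countrycode)
    library(dplyr)
    library(sf)

    # Convert free-text names ("Ivory Coast", "Cote d'Ivoire", ...) to ISO3
    df$iso3 <- countrycode(df$country,
                           origin = "country.name", destination = "iso3c")

    # Audit the join: which rows found no matching geometry?
    anti_join(df, st_drop_geometry(world), by = c("iso3" = "iso_a3"))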

  • Plotting order and layering: In static maps, the order in which you add layers matters – if you add points and then polygons, the polygons may obscure the points. We typically draw polygons (base map) first and then points/lines on top. In ggplot, that is literally the order of geom_sf calls; in tmap, it is the order of tm_shape layers (first is bottom); in leaflet, each add* call draws on top by default (though z-index can be set). If some points are not visible, check whether they are underneath another layer. Also watch fill opacity – fully opaque fills can hide gridlines or boundaries behind them.

  • Encoding and locale issues: Reading shapefiles or data containing non-ASCII characters (accents, etc.) can produce garbled text if the encoding is not handled. This is not R-specific. Ensuring your R session and data are UTF-8 usually solves it; for shapefiles, you can pass the encoding to GDAL when reading, e.g., st_read("file.shp", options = "ENCODING=UTF-8"), or post-process names with iconv.

  • Common error messages:

    • “Error: st_crs(x) == st_crs(y) is not TRUE” – You attempted an operation on two layers with different CRSs (like st_join or st_intersection). Fix by transforming one to the other’s CRS.
    • “object not found” when plotting – If using ggplot with aes in geom_sf, remember to put data = ... inside geom_sf if outside main ggplot() call. E.g., ggplot() + geom_sf(data=my_sf, aes(fill=var)). If you do ggplot(my_sf) + geom_sf(aes(fill=var)) that also works. Just ensure the object is in the correct place.
    • “long-running operation / session hangs” – likely reading a huge file or plotting too many things. Abort (if in RStudio) and try a lighter approach or check memory usage.
    • Leaflet maps not showing geometry or shifted – If you see nothing, possibly the data’s coordinates were not in lat-long and leaflet didn’t know how to project properly. Ensure data given to leaflet is WGS84 (EPSG:4326). If using sf objects, the leaflet package should warn if CRS is not 4326. Use st_transform(your_sf, 4326) before leaflet.

  • Package version issues: The R spatial ecosystem transitioned from sp/rgdal to the sf and terra world. If you encounter older code using sp objects (SpatialPolygonsDataFrame, etc.), you may need to convert (st_as_sf() for vectors, rast() for rasters). Note that rgdal and rgeos were retired in 2023 in favor of sf’s built-in functionality, so stick to modern packages to avoid deprecation issues. Check package docs for changes (e.g., tmap changed how modes are set between versions). The CRAN Task View for Spatial (long maintained by Roger Bivand) is a good reference.

  • Visualization pitfalls: Not exactly technical errors, but keep in mind:

    • If mapping rates or percentages, a small region with a high rate might visually dominate, so consider inset maps or arrows pointing it out if it’s too small to see (e.g., mapping GDP per capita, tiny Luxembourg could be highest but barely visible on a map).
    • When using bubble maps, beware of extreme outliers that skew the scale – you might cap the max size or use a transform.
    • Combining 3D effects (like rayshader package) looks cool but can distort perception of area if overused – typically avoid for serious analysis maps, but can be used in moderation for visual appeal in presentations.

By anticipating these issues, you can save time. And when something looks off, systematically check: Are my data in the right coordinate system and range? Did my join drop data? Is the plotting order correct?

A handy debugging step: plot intermediate steps. E.g., if points aren’t showing, do plot(st_geometry(points)) and plot(st_geometry(polygons)) separately to ensure they both have sensible locations. If the point plot looks blank, maybe coordinates were wrong (e.g., lat/long swapped or in degrees when you thought meters). Always validate data by printing a few rows or summary (min/max of coordinates, etc.).

12.9 Final Reflection: Reproducibility, Open Data, and the Role of Spatial Storytelling

In this chapter, we demonstrated how R can be used to turn raw geospatial data into meaningful visual stories about our world – from migration crises to trade networks and conflict zones. A few closing thoughts tie together the importance of reproducibility, open data, and spatial storytelling in the realm of geopolitics and economics:

Reproducibility: Using R and R Markdown (or Quarto) for geospatial analysis ensures that every map and analysis is reproducible. Instead of manually clicking in a GIS software and producing a map (which might be hard to exactly recreate), our code-centric approach means that anyone with the same data and code can regenerate the identical map. This is crucial in research and policy: if a map is used to justify a decision, one should be able to audit how it was made. Reproducible workflows also allow easy updates – for instance, when new yearly data comes in, you can re-run the notebook to produce updated maps. Throughout this chapter, we used script examples to emphasize this. By integrating narrative, code, and output in one document, an R Markdown report can serve as both analysis and communication piece. We encourage readers to adopt this approach – it might seem complex at first, but it pays off in transparency and efficiency.

Open Data and Tools: We leveraged open-source datasets (UNHCR, Natural Earth, GADM, etc.) and open-source software (R and its packages). This democratizes access to insights. Twenty years ago, one might have needed expensive GIS software and proprietary data to do what we did here. Now, with a few R packages and internet access, a student can map global issues using authoritative data. Open data portals (like the World Bank, UN open data, and national open data sites) are treasure troves for analysis. The UNHCR Operational Data Portal, for example, provides up-to-date refugee statistics under a CC-BY license, meaning we are free to use and remix them as long as we attribute the source. Natural Earth’s public domain data can be used in any project without licensing worries. These openly available resources reduce barriers for researchers in developing countries and independent analysts to contribute knowledge. The R community’s open-source packages (often developed by academics or volunteers) also mean we can stand on the shoulders of giants (sf, ggplot2, etc.) without reinventing the wheel.

Spatial Storytelling: Maps are not just analytical tools; they are storytelling devices. A compelling map can evoke empathy, highlight injustice, or celebrate progress in ways that pure numbers cannot. In policy research, this is sometimes called the “power of the map” – the ability to present evidence in a visceral, intuitive form. But with that power comes responsibility: we as map-makers must ensure the story we tell is accurate and not misleading. We discussed how design choices can bias interpretation (e.g., color scales or data normalization). Being aware of these pitfalls is part of ethical storytelling.

On the positive side, spatial storytelling can greatly enhance communication. For example, instead of just stating that “most refugees are hosted in developing nations,” showing a refugee choropleth in which wealthy countries are mostly light-colored and certain poorer countries are dark drives the point home immediately. Or consider urban economics: a map of night-time lights (as a proxy for economic activity) tells a story of regional inequality – for instance between Northern Africa and Sub-Saharan Africa – far more strikingly than a table of GDP.

Integrating into Decision Making: Increasingly, maps are used interactively in briefings – think of dashboards for COVID-19 or conflict tracking. The skills covered here allow one to create such dashboards (with Shiny) or static reports that decision-makers can use. It is important to tailor the complexity of the map to the audience: a minister might want a simple, clear, big-picture map, whereas an analyst might want the detailed version. With R, generating both from the same data is feasible (just different filtering or styling).

Continued Learning: This chapter covered a broad range of topics. There is always more to learn – specialized methods like geospatial modeling, interactive storytelling (e.g., combining maps with time sliders), 3D visualizations (using packages like rayshader), and more. Students are encouraged to explore further:

  • Geocomputation with R (Lovelace et al. 2019) is an excellent free online book that delves deeper into analysis and even predictive modeling with spatial data.
  • CRAN’s task views and vignettes of packages (e.g., Tennekes 2018 for tmap, Pebesma 2018 for sf) provide more advanced usage examples.
  • For interactive maps beyond the basics, look at mapdeck (which integrates with Mapbox GL) or deckgl for truly large-scale webmapping in R.
  • If interested in remote sensing and multi-dimensional raster data, check out Pebesma’s stars package and resources on analyzing satellite imagery in R.

Final thought: In a world increasingly driven by data, spatial data adds an important dimension – literally. It helps root abstract numbers in the real world context. For a policymaker, seeing a map of “hotspots” can spur action targeted at those places. For the public, an interactive map can make complex reports accessible and engaging. As you continue to develop geospatial skills in R, remember that clarity, accuracy, and purpose are your guiding principles. Every map should have a question it’s answering or a story it’s telling. By combining robust analysis with thoughtful design, your geospatial visualizations can become powerful tools for understanding and improving the world.

References

Cheng, J., Karambelkar, B., & Xie, Y. (2022). leaflet: Create interactive web maps with the JavaScript Leaflet library (Version 2.1.1) [Computer software]. https://rstudio.github.io/leaflet/

GADM. (2022). GADM database of global administrative areas (Version 4.1) [Data set]. https://gadm.org/

Hijmans, R. J. (2022). terra: Spatial data analysis (Version 1.5) [Computer software]. https://rspatial.org/terra/

Hisham, S. (2022, July 13). Geospatial intelligence to mitigate global geopolitical risks [Interview with Manuel Ntumba]. Geospatial World. https://geospatialworld.net/prime/interviews/geospatial-intelligence-to-mitigate-global-geopolitical-risks/

Juergens, C., & Redecker, A. P. (2023). Basic geo‑spatial data literacy education for economic applications. KN – Journal of Cartography and Geographic Information, 73(1), 147–159. https://doi.org/10.1007/s42489-023-00135-9

Kelso, N. V., & Patterson, T. (2009). Natural Earth vector. Cartographic Perspectives, 64, 45–50. https://doi.org/10.14714/CP64.148

Lovelace, R., Nowosad, J., & Muenchow, J. (2019). Geocomputation with R. CRC Press. https://doi.org/10.1201/9780203730058

Pebesma, E. (2018). Simple features for R: Standardized support for spatial vector data. The R Journal, 10(1), 439–446. https://doi.org/10.32614/RJ-2018-009

Tennekes, M. (2018). tmap: Thematic maps in R. Journal of Statistical Software, 84(6), 1–39. https://doi.org/10.18637/jss.v084.i06

UNHCR. (n.d.). Operational Data Portal: Refugee situations [Data repository]. Retrieved July 15, 2025, from https://data.unhcr.org/

UNHCR. (2023). Refugee Data Finder [Database]. https://www.unhcr.org/refugee-statistics/

Zimmermannová, J., Redecker, A. P., Menšík, M., & Jürgens, C. (2021). Geospatial data analysis and economic evaluation of companies for sustainable business development—An interdisciplinary teaching approach. Sustainability, 13(20), 11245. https://doi.org/10.3390/su132011245