Exercises
Exercise: Analyzing Subnational Armed Conflict Data Using SUNGEO and xSub
Objective:
In this exercise, you will learn how to use the SUNGEO package in R to analyze subnational armed conflict data from the xSub dataset. You will download event-level data, perform spatial analysis, and visualize conflict patterns using Geographic Information Systems (GIS) tools available in SUNGEO.
Step 1: Install and Load the SUNGEO Package
If you have not already installed the SUNGEO package, you can install it from CRAN or GitHub:
# Install from CRAN
install.packages("SUNGEO", dependencies = TRUE)
# Install from GitHub (if you prefer the latest version)
library(devtools)
devtools::install_github("zhukovyuri/SUNGEO", dependencies = TRUE)
# Load the SUNGEO package
library(SUNGEO)
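Reinstalling packages in every session is slow and unnecessary. The following is a minimal sketch, using only base R, that reports which of the packages used in this exercise are not yet installed. The helper name missing_packages is hypothetical and is not part of SUNGEO.

```r
# Hypothetical helper: return the packages from `pkgs` that are not installed
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("SUNGEO", "leaflet", "ggplot2", "dplyr"))
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
  # install.packages(to_install, dependencies = TRUE)  # uncomment to install
}
```

The install call is left commented out so that the check has no side effects beyond loading package namespaces until you decide to install.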
Step 2: Accessing xSub Event Data
You will now download event data from the xSub repository using the SUNGEO package. For this exercise, we will use the “ACLED: Armed Conflict Location and Event Data Project” dataset for a specific country, such as Afghanistan.
# Downloading ACLED event data for Afghanistan
acled_afg <- get_data(
  country_name = "Afghanistan",
  topics = "ACLED: Armed Conflict Location and Event Data Project, v20191207"
)
# View the first few rows of the dataset
head(acled_afg)
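The later steps assume that the downloaded data contain columns named longitude, latitude, event_date, and event_type. Those names are assumptions carried over from this exercise, so it may be worth checking them before going further. A minimal sketch, using a made-up toy data frame in place of acled_afg and a hypothetical helper check_columns:

```r
# Hypothetical helper: stop early if expected columns are absent
check_columns <- function(df, required) {
  missing_cols <- setdiff(required, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy stand-in for the downloaded data (values are made up for illustration)
toy_events <- data.frame(
  longitude  = c(69.17, 62.20),
  latitude   = c(34.53, 34.35),
  event_date = c("2019-01-05", "2019-02-11"),
  event_type = c("Battles", "Explosions")
)

check_columns(toy_events, c("longitude", "latitude", "event_date", "event_type"))
```

In your own session you would pass acled_afg instead of toy_events. If the check fails, names(acled_afg) shows which columns are actually available.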
Step 3: Geocoding Events
To perform spatial analysis, you need to ensure that your data has geographic coordinates. The xSub dataset already contains georeferenced event data, but you can perform additional geocoding if needed.
For example, if you have a list of addresses or place names instead of coordinates, you can use the geocode_osm function:
# Example geocoding (if needed)
coords <- geocode_osm("Kabul, Afghanistan")
print(coords)
Step 4: Visualizing Conflict Data
Next, you will visualize the armed conflict events on a map. You can use the base R plot function to create a simple static plot, or the leaflet package to create an interactive map.
# Simple plot of event locations
plot(acled_afg$longitude, acled_afg$latitude,
main = "Conflict Events in Afghanistan",
xlab = "Longitude", ylab = "Latitude",
pch = 20, col = "red")
# Interactive map using leaflet
library(leaflet)
m <- leaflet(data = acled_afg) %>%
  addTiles() %>%
  addCircleMarkers(~longitude, ~latitude,
                   popup = ~paste("Event:", event_type, "<br>",
                                  "Date:", event_date),
                   radius = 3, color = "red")
m
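Both plots will misbehave if some events have missing or out-of-range coordinates. A minimal sketch of a filter you could apply before mapping, assuming the coordinate columns are named longitude and latitude. The helper keep_valid_coords is hypothetical and the toy values are made up:

```r
# Hypothetical helper: keep rows with non-missing, in-range coordinates
keep_valid_coords <- function(df, lon = "longitude", lat = "latitude") {
  ok <- !is.na(df[[lon]]) & !is.na(df[[lat]]) &
    df[[lon]] >= -180 & df[[lon]] <= 180 &
    df[[lat]] >= -90 & df[[lat]] <= 90
  df[ok, , drop = FALSE]
}

# Toy data: one valid row, one missing longitude, one impossible longitude
toy_events <- data.frame(
  longitude = c(69.17, NA, 250),
  latitude  = c(34.53, 34.35, 31.6)
)

clean_events <- keep_valid_coords(toy_events)
nrow(clean_events)  # 1
```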
Step 5: Spatial Analysis
You can now perform more advanced spatial analysis, such as identifying hotspots of violence, analyzing patterns over time, or interpolating conflict intensity across regions.
Example 1: Heatmap of Conflict Events
library(ggplot2)
# Create a basic heatmap of conflict events
# after_stat(level) replaces the deprecated ..level.. notation (ggplot2 >= 3.3.0)
ggplot(acled_afg, aes(x = longitude, y = latitude)) +
  stat_density_2d(aes(fill = after_stat(level)), geom = "polygon") +
  scale_fill_viridis_c() +
  theme_minimal() +
  labs(title = "Heatmap of Conflict Events in Afghanistan")
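A density plot shows where events cluster but does not give counts. One simple complement is to bin events into a regular latitude-longitude grid and count events per cell. This is a sketch of that idea rather than a formal hotspot statistic. The helper count_by_cell and the 0.5-degree cell size are illustrative assumptions, and the coordinates are made up:

```r
# Hypothetical helper: count events per grid cell of `cell_size` degrees
count_by_cell <- function(lon, lat, cell_size = 0.5) {
  # Label each cell by its south-west corner
  cell_lon <- floor(lon / cell_size) * cell_size
  cell_lat <- floor(lat / cell_size) * cell_size
  counts <- aggregate(
    data.frame(n_events = rep(1L, length(lon))),
    by = list(cell_lon = cell_lon, cell_lat = cell_lat),
    FUN = sum
  )
  # Most active cells first
  counts[order(-counts$n_events), ]
}

# Toy coordinates for illustration only
lon <- c(69.10, 69.20, 69.30, 62.20)
lat <- c(34.50, 34.52, 34.60, 34.35)

cells <- count_by_cell(lon, lat)
cells
```

With real data you would call count_by_cell(acled_afg$longitude, acled_afg$latitude). Note that cells measured in degrees do not have equal area, which matters more at high latitudes.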
Example 2: Temporal Analysis
# Count the number of events per year
library(dplyr)
conflict_by_year <- acled_afg %>%
  mutate(year = as.numeric(format(as.Date(event_date), "%Y"))) %>%
  group_by(year) %>%
  summarize(event_count = n())
# Plot the number of events over time
ggplot(conflict_by_year, aes(x = year, y = event_count)) +
geom_line() +
theme_minimal() +
labs(title = "Number of Conflict Events in Afghanistan Over Time",
x = "Year", y = "Number of Events")
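Yearly totals can hide seasonal variation. A minimal base R sketch of the same aggregation at monthly resolution, using made-up dates. It assumes the dates are stored as ISO strings (YYYY-MM-DD); for other layouts, pass a format argument to as.Date:

```r
# Toy event dates for illustration only
event_dates <- as.Date(c("2019-01-05", "2019-01-20", "2019-02-11"))

# Build a year-month key and count events per key
month_key <- format(event_dates, "%Y-%m")
events_per_month <- as.data.frame(table(month = month_key),
                                  responseName = "event_count")
events_per_month
```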
Step 6: Reporting and Insights
Finally, summarize your findings from the spatial and temporal analysis. Consider discussing the following points:
- Patterns and Trends: Are there specific regions or periods with high concentrations of conflict?
- Impact of Geography: How does the geographic distribution of events relate to key geographic features, such as borders or population centers?
- Policy Implications: What insights can be drawn for policymakers or humanitarian organizations based on the observed patterns?
Conclusion
By completing this exercise, you have learned how to use the SUNGEO package to access and analyze subnational event data on armed conflict from the xSub repository. The exercise demonstrated how to download and visualize event data, perform basic geospatial analysis, and generate insights from the results.
Exercise: Analyzing the 2022 Russian Invasion of Ukraine with VIINA Data Using SUNGEO
Objective:
In this exercise, you will learn how to use the VIINA dataset to analyze real-time data from the 2022 Russian invasion of Ukraine. You will work with geocoded event data, perform spatial analysis, and visualize conflict patterns using the tools available in the SUNGEO package.
Step 1: Install and Load the SUNGEO Package
If you haven’t already installed the SUNGEO package, you can do so from CRAN or GitHub:
# Install from CRAN
install.packages("SUNGEO", dependencies = TRUE)
# Install from GitHub (for the latest version)
library(devtools)
devtools::install_github("zhukovyuri/SUNGEO", dependencies = TRUE)
# Load the SUNGEO package
library(SUNGEO)
Step 2: Download VIINA Data
The VIINA dataset provides near-real-time event data on the Russian invasion of Ukraine. We’ll download the latest event reports for the year 2023.
# Download event data for 2023
download.file("https://github.com/zhukovyuri/VIINA/raw/main/event_info_latest_2023.zip",
destfile = "event_info_latest_2023.zip")
# Unzip the downloaded file
unzip("event_info_latest_2023.zip", exdir = "VIINA_data")
# Load the data into R
viina_data <- read.csv("VIINA_data/event_info_latest_2023.csv")
# View the first few rows of the dataset
head(viina_data)
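The name of the CSV file inside the archive is assumed above. If read.csv cannot find it, you can list what was actually extracted. A minimal sketch, demonstrated on a temporary directory with a made-up file name. The helper list_csv_files is hypothetical:

```r
# Hypothetical helper: list the CSV files present in a directory
list_csv_files <- function(dir) {
  list.files(dir, pattern = "\\.csv$", full.names = TRUE, ignore.case = TRUE)
}

# Illustration with a temporary directory and a made-up file name
demo_dir <- file.path(tempdir(), "viina_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(a = 1), file.path(demo_dir, "example_events.csv"),
          row.names = FALSE)

csv_files <- list_csv_files(demo_dir)
basename(csv_files)
```

In the exercise you would call list_csv_files("VIINA_data") and pass the matching path to read.csv.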
Step 3: Map Event Locations
The VIINA dataset already includes geocoded event locations, but you can visualize these on a map to better understand the spatial distribution of conflict events.
# Simple plot of event locations
plot(viina_data$longitude, viina_data$latitude,
main = "Conflict Events in Ukraine (2023)",
xlab = "Longitude", ylab = "Latitude",
pch = 20, col = "red")
# Interactive map using leaflet
library(leaflet)
m <- leaflet(data = viina_data) %>%
  addTiles() %>%
  addCircleMarkers(~longitude, ~latitude,
                   popup = ~paste("Event:", headline, "<br>",
                                  "Date:", event_date),
                   radius = 3, color = "red")
m
Step 4: Temporal Analysis of Conflict Events
You can analyze how the conflict evolved over time by examining the frequency and distribution of events.
# Convert event_date to Date format
viina_data$event_date <- as.Date(viina_data$event_date)
# Aggregate the number of events per day
library(dplyr)
events_per_day <- viina_data %>%
  group_by(event_date) %>%
  summarize(event_count = n())
# Plot the number of events over time
library(ggplot2)
ggplot(events_per_day, aes(x = event_date, y = event_count)) +
geom_line() +
theme_minimal() +
labs(title = "Daily Conflict Events in Ukraine (2023)",
x = "Date", y = "Number of Events")
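Daily counts are often noisy, and a trailing moving average can make the trend easier to read. A minimal sketch using stats::filter from base R on made-up counts. The helper rolling_mean and the 7-day window are illustrative choices. The sketch also assumes one value per consecutive day, so days with zero events (which drop out of a grouped count) would need to be filled in first:

```r
# Hypothetical helper: trailing moving average over `window` observations
rolling_mean <- function(x, window = 7) {
  as.numeric(stats::filter(x, rep(1 / window, window), sides = 1))
}

# Toy daily counts for illustration only
daily_counts <- c(4, 6, 5, 9, 12, 7, 6, 10, 3, 8)

smoothed <- rolling_mean(daily_counts, window = 7)
smoothed  # the first 6 values are NA because the window is incomplete
```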
Step 5: Analyzing Territorial Control
The VIINA dataset also includes data on territorial control. You can use this data to analyze which areas were controlled by which forces and how control shifted over time.
# Download and load territorial control data
download.file("https://github.com/zhukovyuri/VIINA/raw/main/control_latest.zip",
destfile = "control_latest.zip")
unzip("control_latest.zip", exdir = "VIINA_data")
control_data <- read.csv("VIINA_data/control_latest.csv")
# Visualize territorial control
# Convert the control variable to a factor so each category maps to a plot color
control_status <- as.factor(control_data$control)
plot(control_data$longitude, control_data$latitude,
     main = "Territorial Control in Ukraine",
     xlab = "Longitude", ylab = "Latitude",
     col = as.integer(control_status), pch = 20)
# Add a legend whose colors match the factor levels
legend("topright", legend = levels(control_status),
       col = seq_along(levels(control_status)), pch = 20)
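To describe how control shifted over time, one option is to tabulate the share of locations in each control category by date. A minimal sketch on made-up data. The column names date and control and the category labels "A" and "B" are placeholders, not the actual VIINA coding:

```r
# Toy control records for illustration only (placeholder labels)
toy_control <- data.frame(
  date    = rep(c("2023-01-01", "2023-06-01"), each = 4),
  control = c("A", "A", "B", "B", "A", "B", "B", "B")
)

# Row-wise proportions: share of locations in each category on each date
control_share <- prop.table(table(toy_control$date, toy_control$control),
                            margin = 1)
control_share
```

Each row sums to 1, so changes down a column show how the share held by one category moved between dates.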
Step 6: Spatial Analysis and Reporting
Finally, you can perform more advanced spatial analysis, such as identifying hotspots of conflict or analyzing changes in territorial control. Summarize your findings and consider the following:
- Conflict Patterns: Where are the most intense areas of conflict?
- Territorial Control: How has control of key areas changed over time?
- Geospatial Insights: What can be inferred about the overall progress of the conflict?
Conclusion
Through this exercise, you have learned how to use the VIINA dataset to analyze near-real-time data from the Russian invasion of Ukraine. This exercise demonstrated how to download, visualize, and analyze geocoded event data, providing valuable insights into the spatial and temporal dynamics of the conflict.
Exercise: Analyzing Vaccine Hesitancy Using SUNGEO and High Frequency Phone Survey Data
Objective:
This exercise guides you through analyzing vaccine hesitancy in Low- and Middle-Income Countries (LMICs) using the SUNGEO package and High Frequency Phone Survey (HFPS) data. You will learn how to integrate spatially misaligned datasets, analyze how contextual factors such as political violence and infrastructure relate to vaccine hesitancy, and interpret the results.
Step 1: Install and Load the Necessary Packages
Ensure that you have the SUNGEO package and other essential packages installed in R.
# Install necessary packages if you haven't already
install.packages("SUNGEO", dependencies = TRUE)
install.packages("sf")
install.packages("dplyr")
install.packages("ggplot2")
# Load the packages
library(SUNGEO)
library(sf)
library(dplyr)
library(ggplot2)
Step 2: Download and Load the HFPS Data
For this exercise, we will use the COVID-19 High Frequency Phone Survey data from Kenya, which is available from the Inter-university Consortium for Political and Social Research (ICPSR).
# Download the Kenya HFPS data
download.file("https://example.com/path_to_Kenya_HFPS_data.zip",
destfile = "Kenya_HFPS_data.zip")
# Unzip and load the dataset
unzip("Kenya_HFPS_data.zip")
kenya_hfps <- read.csv("Kenya_HFPS_data.csv")
# View the first few rows of the dataset
head(kenya_hfps)
Step 3: Integrate Contextual Factors with HFPS Data Using SUNGEO
Using SUNGEO, we will integrate the HFPS data with contextual factors such as political violence and infrastructure availability.
# Example: Integrating political violence data
# Load the political violence data (hypothetical example)
political_violence <- st_read("path_to_political_violence_data.geojson")
# Integrate with HFPS data using a spatial join
kenya_hfps_sf <- st_as_sf(kenya_hfps, coords = c("longitude", "latitude"), crs = 4326)
integrated_data <- st_join(kenya_hfps_sf, political_violence, join = st_intersects)
# View the integrated dataset
head(integrated_data)
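By default st_join performs a left join, so survey points that fall outside every polygon are kept with NA in the joined columns. It is worth checking how many respondents went unmatched before analyzing the result. A minimal sketch on a made-up data frame. The helper share_unmatched and the column name political_violence are assumptions for illustration:

```r
# Hypothetical helper: share of rows with no match in a joined column
share_unmatched <- function(df, joined_col) {
  mean(is.na(df[[joined_col]]))
}

# Toy stand-in for the joined data (values are made up)
toy_joined <- data.frame(
  respondent_id      = 1:4,
  political_violence = c("Yes", NA, "No", "Yes")
)

share_unmatched(toy_joined, "political_violence")  # 0.25
```

A large unmatched share usually points to a coordinate reference system mismatch or to polygons that do not cover the survey area.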
Step 4: Analyze the Impact of Contextual Factors on Vaccine Hesitancy
With the integrated dataset, you can now analyze how factors like political violence and infrastructure influence vaccine hesitancy.
# Example: Analyzing the impact of political violence on vaccine hesitancy
hesitancy_by_violence <- integrated_data %>%
  st_drop_geometry() %>%  # drop geometries so summarize() does not union them
  group_by(political_violence) %>%
  summarize(mean_hesitancy = mean(vaccine_hesitancy, na.rm = TRUE))
# Visualize the results
ggplot(hesitancy_by_violence, aes(x = political_violence, y = mean_hesitancy)) +
geom_bar(stat = "identity", fill = "steelblue") +
theme_minimal() +
labs(title = "Impact of Political Violence on Vaccine Hesitancy in Kenya",
x = "Political Violence (Yes/No)",
y = "Mean Vaccine Hesitancy")
Step 5: Interpret the Results
Consider the following questions to interpret your analysis:
- How does political violence affect vaccine hesitancy in Kenya?
- Are there significant differences in vaccine hesitancy between regions with and without political violence?
- How might infrastructure availability or other factors further influence these results?
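To assess whether a difference between groups is statistically distinguishable from zero, one option is a logistic regression of hesitancy on exposure to political violence. The sketch below runs on simulated data, so its numbers carry no substantive meaning. The variable names match those assumed earlier in this exercise, and a binary 0/1 coding of vaccine_hesitancy is an additional assumption:

```r
# Simulated data for illustration only; not real survey results
set.seed(42)
toy <- data.frame(
  political_violence = rep(c("No", "Yes"), each = 50),
  vaccine_hesitancy  = c(rbinom(50, 1, 0.3), rbinom(50, 1, 0.5))
)

# Logistic regression: hesitancy (0/1) on exposure to political violence
fit <- glm(vaccine_hesitancy ~ political_violence,
           data = toy, family = binomial())

# Log-odds estimate and p-value for the "Yes" group relative to "No"
coef_table <- summary(fit)$coefficients
coef_table["political_violenceYes", c("Estimate", "Pr(>|z|)")]
```

A coefficient of this kind describes an association, not a causal effect. Adding covariates such as infrastructure availability to the formula is one way to probe how robust the association is.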
Conclusion
In this exercise, you learned how to integrate spatially misaligned datasets using the SUNGEO package, and how to analyze the impact of contextual factors on vaccine hesitancy in Kenya. These skills are essential for conducting interdisciplinary public health research, especially in the context of the COVID-19 pandemic in LMICs.