Chapter 11: Spatial Autocorrelation

11.1 Introduction

Spatial autocorrelation is a core concept in geospatial data science: it describes the degree to which observations that are close together in space have correlated attribute values. Its presence reveals underlying spatial structure, such as clustering or dispersion, that is often central to understanding geographic phenomena. This chapter covers methods to quantify, interpret, and account for spatial autocorrelation using R (spdep, spatialreg) and Python (libpysal, esda, splot, spreg). These concepts help analysts interpret spatial data more accurately and make better-informed, context-sensitive decisions.


11.2 Understanding Spatial Autocorrelation

Spatial autocorrelation captures how spatial proximity affects similarity among observed values. Spatial phenomena typically do not occur randomly but instead display patterns influenced by physical, environmental, economic, and social processes.

Types of Spatial Autocorrelation

Spatial autocorrelation can be categorized into three types:

  • Positive Autocorrelation: Spatially adjacent observations exhibit similar values, forming identifiable clusters (e.g., crime hotspots, regional economic growth).
  • Negative Autocorrelation: Adjacent observations demonstrate dissimilar values, creating dispersion or spatial repulsion patterns (e.g., competitive market locations).
  • Zero Autocorrelation: Spatial randomness where values exhibit no discernible spatial relationship.

Understanding these types is essential for designing appropriate spatial analyses and interpreting results effectively.
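
The three types can be illustrated on a synthetic grid. The following is a minimal sketch, assuming a 10 × 10 regular grid with binary rook contiguity; the helper names `rook_weights` and `morans_i` are hypothetical and written in plain NumPy rather than taken from any library.

```python
import numpy as np

def rook_weights(nrows, ncols):
    """Binary rook-contiguity matrix for a regular grid (cells sharing an edge)."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:  # neighbor directly below
                W[i, i + ncols] = W[i + ncols, i] = 1.0
            if c + 1 < ncols:  # neighbor directly to the right
                W[i, i + 1] = W[i + 1, i] = 1.0
    return W

def morans_i(x, W):
    """Global Moran's I for values x and weights matrix W."""
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

nrows = ncols = 10
W = rook_weights(nrows, ncols)
rows, cols = np.indices((nrows, ncols))

# Positive: two solid halves. Negative: checkerboard. Zero: independent noise.
clustered = (cols >= ncols // 2).astype(float).ravel()
checkerboard = ((rows + cols) % 2).astype(float).ravel()
random_vals = np.random.default_rng(0).normal(size=nrows * ncols)

i_clustered = morans_i(clustered, W)
i_checker = morans_i(checkerboard, W)
i_random = morans_i(random_vals, W)

print(f"clustered:    {i_clustered:.3f}")
print(f"checkerboard: {i_checker:.3f}")
print(f"random:       {i_random:.3f}")
```

The clustered layout gives a strongly positive statistic, the checkerboard a strongly negative one, and the random layout a value close to zero.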


11.3 Global Spatial Autocorrelation

Global spatial autocorrelation assesses overall spatial relationships across an entire study area, providing a single measure of spatial dependency.

Moran’s I

Moran’s I is the most frequently applied measure of global spatial autocorrelation. This statistic evaluates whether observed spatial patterns significantly differ from spatial randomness.
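
For reference, the statistic is usually written as follows. The notation is an assumption introduced here: n observations with values x_i and mean x̄, and spatial weights w_ij between locations i and j.

```latex
I \;=\; \frac{n}{S_0}\,
\frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i-\bar{x})(x_j-\bar{x})}
     {\sum_{i=1}^{n}(x_i-\bar{x})^{2}},
\qquad
S_0=\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij},
\qquad
\mathrm{E}[I] = -\frac{1}{n-1}
\ \text{ under the null hypothesis of no spatial autocorrelation.}
```

With row-standardized weights, S_0 = n and the leading factor n / S_0 equals 1.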

Interpretation of Moran’s I:

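  • Positive Moran’s I (> 0): Suggests clustering of similar values.
  • Negative Moran’s I (< 0): Indicates spatial dispersion, with dissimilar values located next to each other.
  • Moran’s I near 0: Consistent with spatial randomness. Strictly, the expected value under the null hypothesis of no spatial autocorrelation is −1/(n − 1), which approaches 0 as the number of observations n grows, so significance should be judged from the test’s p-value rather than from the sign alone.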
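
Whether an observed value differs meaningfully from randomness is commonly assessed by permutation. The sketch below is one plain-NumPy way to do this, assuming 20 locations along a line with binary adjacency; `line_weights`, `morans_i`, and `permutation_test` are hypothetical helper names, and the pseudo p-value is similar in spirit to the permutation-based p-values reported by spatial statistics packages, not a reproduction of any particular one.

```python
import numpy as np

def line_weights(n):
    """Binary adjacency for n locations arranged along a line."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W

def morans_i(x, W):
    """Global Moran's I for values x and weights matrix W."""
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

def permutation_test(x, W, n_perm=999, seed=0):
    """Shuffle values across locations to build a reference distribution."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, W)
    sims = np.array([morans_i(rng.permutation(x), W) for _ in range(n_perm)])
    # Count simulations at least as extreme, in the direction of the observed value
    larger = int(np.sum(sims >= observed))
    extreme = min(larger, n_perm - larger)
    p_value = (extreme + 1) / (n_perm + 1)
    return observed, p_value

x = np.arange(20, dtype=float)  # a smooth trend: neighbors have similar values
W = line_weights(len(x))
observed, p_value = permutation_test(x, W)
print(f"Moran's I = {observed:.3f}, pseudo p-value = {p_value:.3f}")
```

The smallest attainable pseudo p-value is 1 / (n_perm + 1), so the number of permutations sets the resolution of the test.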

Implementing Moran’s I

Example in R:

library(spdep)
library(sf)

# Load spatial dataset
data <- st_read("data/regions.shp")

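# Define neighborhood relationships (queen contiguity: shared edge or vertex)
neighbors <- poly2nb(data, queen = TRUE)
# Row-standardize the weights (style "W") so each region's weights sum to 1
weights <- nb2listw(neighbors, style = "W")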

# Calculate Moran's I
morans_i <- moran.test(data$income, weights)
print(morans_i)

Example in Python:

import geopandas as gpd
import libpysal
from esda.moran import Moran

# Load data
data = gpd.read_file("data/regions.shp")

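# Define spatial weights (Queen contiguity) and row-standardize them
w = libpysal.weights.Queen.from_dataframe(data)
w.transform = "r"

# Calculate Moran's I (p_sim is a permutation-based pseudo p-value)
moran = Moran(data['income'], w)
print(f"Moran's I: {moran.I:.3f}, pseudo p-value: {moran.p_sim:.3f}")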

11.4 Local Spatial Autocorrelation

Global measures may obscure localized variations. Local spatial autocorrelation methods, such as Local Indicators of Spatial Association (LISA), reveal specific locations contributing significantly to overall patterns.

Local Indicators of Spatial Association (LISA)

LISA statistics classify each location with a statistically significant local association as part of a high-high cluster (hotspot), a low-low cluster (coldspot), or as a spatial outlier (high-low or low-high).
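
The quadrant logic behind these categories can be sketched in a few lines. This is an illustration only, assuming eight locations along a line with row-standardized adjacency weights; `line_weights_rowstd` and `local_moran` are hypothetical helpers, and the sketch labels every location without testing significance, which a real analysis would add.

```python
import numpy as np

def line_weights_rowstd(n):
    """Row-standardized adjacency for n locations arranged along a line."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def local_moran(x, W):
    """Local Moran's I_i and quadrant labels (own value vs. neighbors' average)."""
    z = (x - x.mean()) / x.std()   # standardized values
    lag = W @ z                    # spatial lag: average z of the neighbors
    local_i = z * lag
    labels = np.where(
        z >= 0,
        np.where(lag >= 0, "high-high", "high-low"),
        np.where(lag >= 0, "low-high", "low-low"),
    )
    return local_i, labels

x = np.array([9.0, 8.0, 9.0, 1.0, 2.0, 1.0, 9.0, 1.0])
W = line_weights_rowstd(len(x))
local_i, labels = local_moran(x, W)

# With row-standardized weights, the local values average to the global statistic
z = x - x.mean()
global_i = (z @ W @ z) / (z @ z)

for value, li, label in zip(x, local_i, labels):
    print(f"value={value:4.1f}  I_i={li:6.2f}  {label}")
print(f"mean of local I = {local_i.mean():.3f}, global I = {global_i:.3f}")
```

The isolated high value near the end of the line is labeled high-low because its neighbors are both low, which is the spatial-outlier case.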

Implementing LISA

Example in R:

library(spdep)

# Compute LISA statistics (column 1 = local Moran's I, column 5 = p-value)
local_moran <- localmoran(data$income, weights)
data$LISA_I <- local_moran[, 1]
data$LISA_p <- local_moran[, 5]

# Map the local Moran's I values
plot(data["LISA_I"], main = "Local Moran's I")

Example in Python:

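import numpy as np
from esda.moran import Moran_Local

# Calculate Local Moran's I (p-values come from conditional permutations)
lisa = Moran_Local(data['income'], w)

# Quadrant codes: 1 = High-High, 2 = Low-High, 3 = Low-Low, 4 = High-Low.
# Locations that are not significant at the (assumed) 0.05 level are masked.
labels = np.array(['Not significant', 'High-High', 'Low-High', 'Low-Low', 'High-Low'])
data['LISA'] = labels[lisa.q * (lisa.p_sim < 0.05)]

# Map the cluster categories
data.plot(column='LISA', categorical=True, legend=True)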

11.5 Visualizing Spatial Autocorrelation

Visualization significantly aids the interpretation of spatial autocorrelation results, making complex patterns immediately understandable.

Moran Scatterplots

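Moran scatterplots plot each observation’s (standardized) value against its spatial lag, the weighted average of neighboring values. The four quadrants correspond to high-high, low-high, low-low, and high-low associations, and with row-standardized weights the slope of the fitted line equals Moran’s I.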
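
The link between the scatterplot and the global statistic can be checked numerically. The sketch below assumes a synthetic 6 × 6 grid with row-standardized rook weights (the helper `rook_weights_rowstd` is hypothetical); it computes the scatterplot coordinates and compares the least-squares slope with Moran's I, without drawing the figure.

```python
import numpy as np

def rook_weights_rowstd(nrows, ncols):
    """Row-standardized rook-contiguity matrix for a regular grid."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:
                W[i, i + ncols] = W[i + ncols, i] = 1.0
            if c + 1 < ncols:
                W[i, i + 1] = W[i + 1, i] = 1.0
    return W / W.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
nrows = ncols = 6
rows, cols = np.indices((nrows, ncols))

# Synthetic values: a diagonal gradient plus noise
x = (rows + cols + rng.normal(scale=1.0, size=(nrows, ncols))).ravel()
W = rook_weights_rowstd(nrows, ncols)

z = (x - x.mean()) / x.std()   # horizontal axis of the scatterplot
lag = W @ z                    # vertical axis: spatial lag of z

slope, intercept = np.polyfit(z, lag, 1)   # least-squares line through the cloud
moran_i = (z @ lag) / (z @ z)              # Moran's I with row-standardized weights

print(f"slope = {slope:.4f}, Moran's I = {moran_i:.4f}")
```

Plotting `z` against `lag` with any plotting library reproduces the scatterplot itself.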

R Example:

moran.plot(data$income, weights, main="Moran Scatterplot")

Python Example:

import splot.esda as esdaplot
import matplotlib.pyplot as plt

fig, ax = esdaplot.moran_scatterplot(moran)
plt.title('Moran Scatterplot')
plt.show()

11.6 Applications of Spatial Autocorrelation Analysis

Spatial autocorrelation analysis is fundamental in multiple fields, enhancing understanding and decision-making by explicitly considering spatial structures.

Urban Planning

  • Identifying urban growth corridors and clustering of development.
  • Assessing spatial distribution of urban amenities or deficiencies.

Environmental Management

  • Locating pollution hotspots and sensitive ecological areas.
  • Assessing the spatial distribution of invasive species or disease outbreaks.

Public Health

  • Identifying disease clusters and targeting interventions.
  • Mapping health disparities and their geographic contexts.

Economic Geography

  • Pinpointing economic clusters or innovation hubs.
  • Evaluating spatial inequality and resource distribution.

11.7 Addressing Spatial Autocorrelation in Modeling

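Ignoring spatial autocorrelation in modeling violates the independence assumption of standard regression. Depending on the form of the dependence, this can produce biased or inefficient coefficient estimates and understated standard errors, and therefore incorrect conclusions. Spatial regression models explicitly incorporate spatial dependence and generally yield more reliable inference and predictions.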
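
A common first check is whether the residuals of an ordinary regression are themselves spatially autocorrelated. The following sketch simulates such a case under stated assumptions: a 15 × 15 grid, row-standardized rook weights, and errors generated with a spatial dependence parameter of 0.8. All names and parameter values are illustrative.

```python
import numpy as np

def rook_weights_rowstd(nrows, ncols):
    """Row-standardized rook-contiguity matrix for a regular grid."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:
                W[i, i + ncols] = W[i + ncols, i] = 1.0
            if c + 1 < ncols:
                W[i, i + 1] = W[i + 1, i] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def morans_i(x, W):
    """Global Moran's I for values x and weights matrix W."""
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

rng = np.random.default_rng(42)
nrows = ncols = 15
n = nrows * ncols
W = rook_weights_rowstd(nrows, ncols)

# Simulate spatially autocorrelated errors: u = lam * W u + eps
lam = 0.8
eps = rng.normal(size=n)
u = np.linalg.solve(np.eye(n) - lam * W, eps)

# Outcome depends on one predictor plus the autocorrelated error
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + u

# Ordinary least squares, ignoring the spatial structure
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

i_resid = morans_i(resid, W)
print(f"OLS coefficients: {beta.round(3)}")
print(f"Moran's I of OLS residuals: {i_resid:.3f}")
```

A clearly positive residual Moran's I, as here, signals that the independence assumption does not hold and that a spatial model is worth considering.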

Spatial Lag Models

Spatial lag models include the spatially lagged dependent variable (the weighted average of the outcome in neighboring locations) as a predictor, capturing direct spatial dependence in the outcome itself.
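
In matrix form, the model is commonly written as below. The symbols are assumptions introduced here: W is the spatial weights matrix, ρ the spatial autoregressive coefficient, and the errors are taken to be independent normal, as in maximum likelihood estimation.

```latex
y = \rho W y + X\beta + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^{2} I_n)
\quad\Longrightarrow\quad
y = (I_n - \rho W)^{-1}\,(X\beta + \varepsilon)
```

The reduced form on the right shows that a change in a predictor at one location propagates to other locations through (I_n − ρW)⁻¹, which is why coefficients in a lag model are not interpreted like ordinary regression slopes.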

Spatial Lag Model Example (R):

library(spatialreg)

# Fit Spatial Lag Model
lag_model <- lagsarlm(income ~ education + employment, data, listw=weights)
summary(lag_model)

Spatial Error Models

Spatial error models address spatial autocorrelation in the error term, which often arises from omitted variables that are themselves spatially correlated.
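
The corresponding specification is usually written as follows, again with assumed notation: W is the spatial weights matrix and λ the spatial error coefficient.

```latex
y = X\beta + u,
\qquad u = \lambda W u + \varepsilon,
\qquad \varepsilon \sim N(0, \sigma^{2} I_n)
```

Here the spatial dependence enters only through the disturbance u, so the coefficients β keep their usual regression interpretation.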

Spatial Error Model Example (Python):

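from spreg import ML_Error

# spreg expects y as an (n, 1) array and X as an (n, k) array
y = data['income'].values.reshape(-1, 1)
X = data[['education', 'employment']].values

# Fit Spatial Error Model (w should be row-standardized)
error_model = ML_Error(y, X, w=w, name_y='income', name_x=['education', 'employment'])
print(error_model.summary)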

11.8 Best Practices for Spatial Autocorrelation Analysis

Robust spatial autocorrelation analysis necessitates careful methodological choices and interpretive clarity:

  • Clearly define spatial weights and neighborhoods: These definitions significantly impact autocorrelation measures.
  • Use multiple visualization methods: Combine statistical tests with visualizations for comprehensive insights.
  • Carefully validate spatial regression models: Conduct residual diagnostics and sensitivity analyses.
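
The influence of the weights definition can be checked directly by recomputing the statistic under alternative neighborhoods. The sketch below is a simple sensitivity check on synthetic data, assuming an 8 × 8 grid and comparing rook and queen contiguity; `grid_weights` and `morans_i` are hypothetical helpers.

```python
import numpy as np

def grid_weights(nrows, ncols, queen=False):
    """Row-standardized contiguity weights for a grid (rook by default)."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if (dr, dc) == (0, 0):
                        continue
                    if not queen and dr != 0 and dc != 0:
                        continue  # rook: skip diagonal neighbors
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < nrows and 0 <= cc < ncols:
                        W[r * ncols + c, rr * ncols + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def morans_i(x, W):
    """Global Moran's I for values x and weights matrix W."""
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

rng = np.random.default_rng(7)
nrows = ncols = 8
rows, cols = np.indices((nrows, ncols))

# Synthetic values: a diagonal gradient plus noise
x = (rows + cols + rng.normal(scale=1.0, size=(nrows, ncols))).ravel()

W_rook = grid_weights(nrows, ncols, queen=False)
W_queen = grid_weights(nrows, ncols, queen=True)

i_rook = morans_i(x, W_rook)
i_queen = morans_i(x, W_queen)
print(f"rook:  {i_rook:.3f}")
print(f"queen: {i_queen:.3f}")
```

If conclusions change materially between reasonable weight definitions, that sensitivity is worth reporting alongside the results.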

11.9 Limitations and Critical Considerations

While spatial autocorrelation provides valuable insights, analysts must remain mindful of limitations:

  • Spatial Scale Sensitivity: Autocorrelation measures vary with the scale of analysis, influencing interpretation.
  • Edge Effects: Boundary definitions may artificially influence autocorrelation results.
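  • Assumption of Spatial Homogeneity: Global statistics assume that the spatial process is the same across the study area (stationarity). Real-world conditions may violate this assumption, requiring local statistics or more advanced modeling approaches.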
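
Scale sensitivity in particular is easy to demonstrate. The sketch below uses deliberately constructed data: independent values on an 8 × 8 grid, each repeated over a 2 × 2 block to form a 16 × 16 grid. Measured on the fine grid the pattern looks clustered; aggregated back to the block level the clustering disappears. Helper names are hypothetical.

```python
import numpy as np

def rook_weights(nrows, ncols):
    """Binary rook-contiguity matrix for a regular grid."""
    n = nrows * ncols
    W = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if r + 1 < nrows:
                W[i, i + ncols] = W[i + ncols, i] = 1.0
            if c + 1 < ncols:
                W[i, i + 1] = W[i + 1, i] = 1.0
    return W

def morans_i(x, W):
    """Global Moran's I for values x and weights matrix W."""
    z = x - x.mean()
    return (len(x) / W.sum()) * (z @ W @ z) / (z @ z)

rng = np.random.default_rng(3)

# Independent values on an 8 x 8 grid, each repeated over a 2 x 2 block
base = rng.normal(size=(8, 8))
fine = np.kron(base, np.ones((2, 2)))            # 16 x 16 grid

# Aggregate the fine grid back to 8 x 8 by averaging each 2 x 2 block
coarse = fine.reshape(8, 2, 8, 2).mean(axis=(1, 3))

i_fine = morans_i(fine.ravel(), rook_weights(16, 16))
i_coarse = morans_i(coarse.ravel(), rook_weights(8, 8))

print(f"fine grid (16 x 16):  {i_fine:.3f}")
print(f"coarse grid (8 x 8):  {i_coarse:.3f}")
```

The same underlying data therefore supports different conclusions depending on the unit of analysis, which is why the scale should be chosen and reported deliberately.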

11.10 Conclusion

Spatial autocorrelation is central to understanding spatial data, and accounting for it improves the accuracy and reliability of geospatial analyses. Global measures such as Moran’s I summarize dependence across a study area, local measures such as LISA locate the clusters and outliers behind it, and spatial lag and spatial error models carry that dependence into regression. Applied together with careful choices of weights, visual checks, and model diagnostics, these techniques support informed, spatially aware decision-making.