1 Intro

Analysis of market access in Somalia. We will look at food market locations in Somalia and assess their accessibilty by regular sampled points.

2 Setup

This section for reproducibility. We use R in the version 4.2.2. In order to be able to reproduce this analysis you need an openrouteservice API key. Get one here. Put it in a text file file called config.txt in the working directory. It has no structure, just the key in the first line.

ZXCVBNMLKJHGFDSAQWERTYUIOP

2.1 libraries

Almost all of the libraries can be installed via CRAN with the command install.packages('packagename'). The openrouteservice package is not on CRAN and needs to be installed via github. With the command remotes::install_github("GIScience/openrouteservice-r"); Also tmap in its most recent version 4 is not yet available on CRAN so we need to download and install it via github too.

# main libraries
library(tidyverse)
library(glue)
library(sf)
library(units)
library(ggplot2)
library(tictoc)
library(openrouteservice) # remotes::install_github("GIScience/openrouteservice-r");
library(jsonlite)
library(terra)
library(exactextractr)
library(tmap) # remotes::install_github("r-tmap/tmap")
library(leaflet)
library(furrr)
library(purrr)
library(kableExtra)
library(mapsf)
library(RColorBrewer)
library(geonode4R)
library(scales)


#config <- readLines("config.txt")
#api_key <- config[1]
#user <- config[2]
#password <- config[3]
api_key <- ""
# Function to  adjust a list of coordinates to the location of the closest road segment via the snapping endpoint of ORS. ORS itself only allows for a maximum distance of 350m to snap.
ors_snap <- function(x, rowids, local=F) {
  x=coord_list
  rowids=grid5km$rowid
  local = T
  
  library(httr)
  
  # Define the body
  body <- list(
    locations = x,
    radius = 100000000 # apparently the maximum snapping distance is 5km
  )
  
  # Define the headers
  headers <- c(
    'Accept' = 'application/json, application/geo+json, application/gpx+xml, img/png; charset=utf-8',
    'Authorization' = api_key,
    'Content-Type' = 'application/json; charset=utf-8'
  )
  
  endpoint <- if(local){'http://0.0.0.0:8080/ors/v2/snap/driving-car'} else {'https://api.openrouteservice.org/v2/snap/driving-car'}
  # Make the POST request
  response <- POST( endpoint,
                    #'https://api.openrouteservice.org/v2/snap/driving-car', 
                    body = body, 
                    add_headers(headers), 
                    encode = "json")
  
  resp_content <- content(response)
  
  extract_data <- function(x, index) {
    if (is.null(x)) {
      return(data.frame(
        rowid = index,
        lon = NA,
        lat = NA,
        snapped_distance = NA
      ))
    } else {
      return(data.frame(
        rowid = index,
        lon = x$location[[1]],
        lat = x$location[[2]],
        snapped_distance = x$snapped_distance
      ))
    }
  }
  
  # Extract data from the list and create a dataframe
  df <- do.call(rbind, lapply(seq_along(resp_content$locations), function(i) extract_data(resp_content$locations[[i]], i)))
  df$rowid <- rowids
  # Convert the dataframe to an sf object
  sf_df <- df |> 
    drop_na(lon, lat) |>  # Drop rows with NA coordinates
    st_as_sf(coords = c("lon", "lat"), crs = 4326) |>  # Define the coordinate columns and set CRS
    st_sf()
  
  return(sf_df)
  
}

3 Processing pipeline

The workflow is as follows:

  • prepare market location data
  • generate hexagonal 5km grid in Somalia
  • add worldpop data per grid cell
  • snap the grid centroids to the nearest road segment
  • create a matrix for the distance and time for each market to the snapped centroids
  • detect the closest market for each grid cell

By snapping we refer to the process of changing the location of an origin or destination of a route to the nearest road segment. This is necessary because the openrouteservice API only allows for a maximum distance of 350m to snap. However in the context of Somalia greater snapping distances are desired as the road density in some regions might be low.

3.1 Data input

We use the following datasets as inputs:

  • pop: WorldPop population counts constrained
  • market locations data
  • admin boundaries level 2
pop <- rast('data/som_ppp_2020_UNadj_constrained.tif')
markets_som <- read_csv("data/wfp_food_prices_som.csv")
admin <- st_read('data/Som_Admbnda_Adm2_UNDP.shp', quiet = T)

3.2 Markets preprocessing

In order to use the market locations we need to create a sf object by converting the latitude and longitude columns to a point geometry column. By checking the distinct locations of the geometry column we got 44 food markets in Somalia.

# Clean the latitude and longitude columns
markets_som_clean <- markets_som |> 
  filter(!is.na(latitude) & !is.na(longitude)) |> 
  slice(-1) |> 
  mutate(latitude = as.numeric(latitude),
         longitude = as.numeric(longitude),
         date = as.Date(date, format = "%Y-%m-%d"))
         
#add year
markets_som_clean <- markets_som_clean |> 
  mutate(year = as.numeric(format(date, "%Y")))

# Convert to sf object
markets_som_sf <- markets_som_clean |> 
  st_as_sf(coords = c("longitude", "latitude"), crs = 4326) 

#unique markets
markets_som_asc_sf <- markets_som_sf |> 
  distinct(geometry, .keep_all = TRUE) 

#unique markets descending
markets_som_desc_sf <- markets_som_sf |> 
  arrange(desc(row_number())) |>    # Sort the dataframe such that the last rows come first
  distinct(geometry, .keep_all = TRUE)

#combine both
markets_som_sf <- markets_som_desc_sf |> left_join(markets_som_asc_sf |> st_drop_geometry() |> select(market, year), by ="market")

#prepare final market sf
markets_som_sf <- markets_som_sf |> rename(first_year = year.y, last_year = year.x) |> 
  select(-date) |>
  mutate(rowid = row_number())

4 Grid sampling

Finding the shortest/fastest way between a number of origins and destinations is a job for the Matrix endpoint of ors. Lets start with the grid sampling. We decide for a 5000m width hexagonal grid. A hexagonal grid is closer to a circle and therefore the better joice for our analysis. Within each grid cell we will use the centroid and search for road network locations to snap to.

# create 5km grid
grid5km <-
  st_make_grid(admin |> st_transform(20538), cellsize = 5000, square = F, flat_topped = T)
# convert to sf df
grid5km <- grid5km |> st_as_sf()
names(grid5km) <- 'geometry'
st_geometry(grid5km) <- 'geometry'
# get rid of all cells that are not within somalia
grid5km <-
  grid5km |> st_transform(4326) |> st_filter(admin |> st_union())
# join admin codes
grid5km <-
  grid5km |> st_join(admin |> select(admin2Name, admin2Pcod, admin1Name, admin1Pcod, admin0Name, admin0Pcod),
                     left = T,
                     largest = T)
# join pop information
grid5km$pop <- exact_extract(pop, grid5km, 'sum', progress = F)

# add rowid
grid5km <- grid5km |> mutate(rowid = row_number())

#filter for pop > 0
grid5km <- grid5km |> filter(pop > 0)

Time for this code chunk to run: 323.11 seconds

4.1 Snapping

The grid is created, now snap the centroids.

# convert
coord_list <-
  lapply(grid5km$geometry |> st_centroid(), function(x)
    c(st_coordinates(x)[, 1], st_coordinates(x)[, 2]))

# snap the list of cords
snapped_coords <- ors_snap(coord_list,rowids=grid5km$rowid, local = T)

# join back
grid5km_snapped <- grid5km |>
  left_join(snapped_coords |> tibble(), by = "rowid") |>
  mutate(centroid = st_centroid(geometry.x))

# refactor names in order to keep, the grid geometry, centorid and snapped centroid
names(grid5km_snapped)[10:11] <- c('geometry', 'snapped_centroid')
st_geometry(grid5km_snapped) <- 'geometry'

grid5km_snapped <- grid5km_snapped |> mutate(snapped = ifelse(is.na(snapped_distance), F, T)) 

#filter for snapped values
grid5km_snapped_NA <- grid5km_snapped |> filter(snapped == F)
grid5km_snapped_notNA <- grid5km_snapped |> filter(snapped == T)

Time for this code chunk to run: 22.18 seconds

Awesome we now got 17738 snapped and 3045 not snapped grid cells.

4.2 ORS matrix request and post processing

In the next chunk we will calculate the travel times and distances from each grid cell to each market. We will use the openrouteservice API for this.

## run 1 / 44: 9.52 sec elapsed
## run 2 / 44: 7.935 sec elapsed
## run 3 / 44: 6.953 sec elapsed
## run 4 / 44: 8.041 sec elapsed
## run 5 / 44: 8.159 sec elapsed
## run 6 / 44: 8.902 sec elapsed
## run 7 / 44: 6.408 sec elapsed
## run 8 / 44: 6.297 sec elapsed
## run 9 / 44: 6.924 sec elapsed
## run 10 / 44: 7.681 sec elapsed
## run 11 / 44: 9.629 sec elapsed
## run 12 / 44: 6.77 sec elapsed
## run 13 / 44: 7.055 sec elapsed
## run 14 / 44: 8.066 sec elapsed
## run 15 / 44: 8.987 sec elapsed
## run 16 / 44: 9.499 sec elapsed
## run 17 / 44: 7.035 sec elapsed
## run 18 / 44: 6.867 sec elapsed
## run 19 / 44: 10.687 sec elapsed
## run 20 / 44: 8.204 sec elapsed
## run 21 / 44: 6.9 sec elapsed
## run 22 / 44: 7.636 sec elapsed
## run 23 / 44: 6.417 sec elapsed
## run 24 / 44: 9.359 sec elapsed
## run 25 / 44: 10.075 sec elapsed
## run 26 / 44: 8.183 sec elapsed
## run 27 / 44: 6.431 sec elapsed
## run 28 / 44: 7.323 sec elapsed
## run 29 / 44: 6.654 sec elapsed
## run 30 / 44: 6.708 sec elapsed
## run 31 / 44: 7.185 sec elapsed
## run 32 / 44: 6.548 sec elapsed
## run 33 / 44: 8.213 sec elapsed
## run 34 / 44: 11.124 sec elapsed
## run 35 / 44: 7.165 sec elapsed
## run 36 / 44: 7.288 sec elapsed
## run 37 / 44: 6.707 sec elapsed
## run 38 / 44: 6.615 sec elapsed
## run 39 / 44: 9.709 sec elapsed
## run 40 / 44: 7.462 sec elapsed
## run 41 / 44: 7.559 sec elapsed
## run 42 / 44: 6.868 sec elapsed
## run 43 / 44: 6.913 sec elapsed
## run 44 / 44: 11.627 sec elapsed

Awesome we now have the distances from all 44 markets to all grids. The resulting dataframe has 780472 rows. Now we want to find the market that is closest from every grid cell in terms of traveltime.

5 Visualization and Assessment

In the following chunks some visualizations are executed to assess the results.

5.1 Non Spatial

What is the relation between Traveltime and population for each grid cell?

5.2 Spatial

Maps of population distribution, travel time to the closest market and the closest market with respective administrative boundaries and the markets locations.

5.2.1 Population distribution

## tmap mode set to 'view'

5.2.2 Traveltime to closest market

5.2.3 Closest market

6 Output

Grid export in geopackage format.