Analysis of market access in Somalia. We will look at food market locations in Somalia and assess their accessibilty by regular sampled points.
This section for reproducibility. We use R in the version 4.2.2. In
order to be able to reproduce this analysis you need an openrouteservice
API key. Get one here. Put it in a text
file file called config.txt
in the working directory. It
has no structure, just the key in the first line.
ZXCVBNMLKJHGFDSAQWERTYUIOP
Almost all of the libraries can be installed via CRAN with the
command install.packages('packagename')
. The
openrouteservice package is not on CRAN and needs to be installed via
github. With the command
remotes::install_github("GIScience/openrouteservice-r");
Also tmap in its most recent version 4 is not yet available on CRAN so
we need to download and install it via github too.
# main libraries
library(tidyverse)
library(glue)
library(sf)
library(units)
library(ggplot2)
library(tictoc)
library(openrouteservice) # remotes::install_github("GIScience/openrouteservice-r");
library(jsonlite)
library(terra)
library(exactextractr)
library(tmap) # remotes::install_github("r-tmap/tmap")
library(leaflet)
library(furrr)
library(purrr)
library(kableExtra)
library(mapsf)
library(RColorBrewer)
library(geonode4R)
library(scales)
#config <- readLines("config.txt")
#api_key <- config[1]
#user <- config[2]
#password <- config[3]
api_key <- ""
# Function to adjust a list of coordinates to the location of the closest road segment via the snapping endpoint of ORS. ORS itself only allows for a maximum distance of 350m to snap.
ors_snap <- function(x, rowids, local=F) {
x=coord_list
rowids=grid5km$rowid
local = T
library(httr)
# Define the body
body <- list(
locations = x,
radius = 100000000 # apparently the maximum snapping distance is 5km
)
# Define the headers
headers <- c(
'Accept' = 'application/json, application/geo+json, application/gpx+xml, img/png; charset=utf-8',
'Authorization' = api_key,
'Content-Type' = 'application/json; charset=utf-8'
)
endpoint <- if(local){'http://0.0.0.0:8080/ors/v2/snap/driving-car'} else {'https://api.openrouteservice.org/v2/snap/driving-car'}
# Make the POST request
response <- POST( endpoint,
#'https://api.openrouteservice.org/v2/snap/driving-car',
body = body,
add_headers(headers),
encode = "json")
resp_content <- content(response)
extract_data <- function(x, index) {
if (is.null(x)) {
return(data.frame(
rowid = index,
lon = NA,
lat = NA,
snapped_distance = NA
))
} else {
return(data.frame(
rowid = index,
lon = x$location[[1]],
lat = x$location[[2]],
snapped_distance = x$snapped_distance
))
}
}
# Extract data from the list and create a dataframe
df <- do.call(rbind, lapply(seq_along(resp_content$locations), function(i) extract_data(resp_content$locations[[i]], i)))
df$rowid <- rowids
# Convert the dataframe to an sf object
sf_df <- df |>
drop_na(lon, lat) |> # Drop rows with NA coordinates
st_as_sf(coords = c("lon", "lat"), crs = 4326) |> # Define the coordinate columns and set CRS
st_sf()
return(sf_df)
}
The workflow is as follows:
By snapping we refer to the process of changing the location of an origin or destination of a route to the nearest road segment. This is necessary because the openrouteservice API only allows for a maximum distance of 350m to snap. However in the context of Somalia greater snapping distances are desired as the road density in some regions might be low.
We use the following datasets as inputs:
pop <- rast('data/som_ppp_2020_UNadj_constrained.tif')
markets_som <- read_csv("data/wfp_food_prices_som.csv")
admin <- st_read('data/Som_Admbnda_Adm2_UNDP.shp', quiet = T)
In order to use the market locations we need to create a sf object by converting the latitude and longitude columns to a point geometry column. By checking the distinct locations of the geometry column we got 44 food markets in Somalia.
# Clean the latitude and longitude columns
markets_som_clean <- markets_som |>
filter(!is.na(latitude) & !is.na(longitude)) |>
slice(-1) |>
mutate(latitude = as.numeric(latitude),
longitude = as.numeric(longitude),
date = as.Date(date, format = "%Y-%m-%d"))
#add year
markets_som_clean <- markets_som_clean |>
mutate(year = as.numeric(format(date, "%Y")))
# Convert to sf object
markets_som_sf <- markets_som_clean |>
st_as_sf(coords = c("longitude", "latitude"), crs = 4326)
#unique markets
markets_som_asc_sf <- markets_som_sf |>
distinct(geometry, .keep_all = TRUE)
#unique markets descending
markets_som_desc_sf <- markets_som_sf |>
arrange(desc(row_number())) |> # Sort the dataframe such that the last rows come first
distinct(geometry, .keep_all = TRUE)
#combine both
markets_som_sf <- markets_som_desc_sf |> left_join(markets_som_asc_sf |> st_drop_geometry() |> select(market, year), by ="market")
#prepare final market sf
markets_som_sf <- markets_som_sf |> rename(first_year = year.y, last_year = year.x) |>
select(-date) |>
mutate(rowid = row_number())
Finding the shortest/fastest way between a number of origins and destinations is a job for the Matrix endpoint of ors. Lets start with the grid sampling. We decide for a 5000m width hexagonal grid. A hexagonal grid is closer to a circle and therefore the better joice for our analysis. Within each grid cell we will use the centroid and search for road network locations to snap to.
# create 5km grid
grid5km <-
st_make_grid(admin |> st_transform(20538), cellsize = 5000, square = F, flat_topped = T)
# convert to sf df
grid5km <- grid5km |> st_as_sf()
names(grid5km) <- 'geometry'
st_geometry(grid5km) <- 'geometry'
# get rid of all cells that are not within somalia
grid5km <-
grid5km |> st_transform(4326) |> st_filter(admin |> st_union())
# join admin codes
grid5km <-
grid5km |> st_join(admin |> select(admin2Name, admin2Pcod, admin1Name, admin1Pcod, admin0Name, admin0Pcod),
left = T,
largest = T)
# join pop information
grid5km$pop <- exact_extract(pop, grid5km, 'sum', progress = F)
# add rowid
grid5km <- grid5km |> mutate(rowid = row_number())
#filter for pop > 0
grid5km <- grid5km |> filter(pop > 0)
Time for this code chunk to run: 323.11 seconds
The grid is created, now snap the centroids.
# convert
coord_list <-
lapply(grid5km$geometry |> st_centroid(), function(x)
c(st_coordinates(x)[, 1], st_coordinates(x)[, 2]))
# snap the list of cords
snapped_coords <- ors_snap(coord_list,rowids=grid5km$rowid, local = T)
# join back
grid5km_snapped <- grid5km |>
left_join(snapped_coords |> tibble(), by = "rowid") |>
mutate(centroid = st_centroid(geometry.x))
# refactor names in order to keep, the grid geometry, centorid and snapped centroid
names(grid5km_snapped)[10:11] <- c('geometry', 'snapped_centroid')
st_geometry(grid5km_snapped) <- 'geometry'
grid5km_snapped <- grid5km_snapped |> mutate(snapped = ifelse(is.na(snapped_distance), F, T))
#filter for snapped values
grid5km_snapped_NA <- grid5km_snapped |> filter(snapped == F)
grid5km_snapped_notNA <- grid5km_snapped |> filter(snapped == T)
Time for this code chunk to run: 22.18 seconds
Awesome we now got 17738 snapped and 3045 not snapped grid cells.
In the next chunk we will calculate the travel times and distances from each grid cell to each market. We will use the openrouteservice API for this.
## run 1 / 44: 9.52 sec elapsed
## run 2 / 44: 7.935 sec elapsed
## run 3 / 44: 6.953 sec elapsed
## run 4 / 44: 8.041 sec elapsed
## run 5 / 44: 8.159 sec elapsed
## run 6 / 44: 8.902 sec elapsed
## run 7 / 44: 6.408 sec elapsed
## run 8 / 44: 6.297 sec elapsed
## run 9 / 44: 6.924 sec elapsed
## run 10 / 44: 7.681 sec elapsed
## run 11 / 44: 9.629 sec elapsed
## run 12 / 44: 6.77 sec elapsed
## run 13 / 44: 7.055 sec elapsed
## run 14 / 44: 8.066 sec elapsed
## run 15 / 44: 8.987 sec elapsed
## run 16 / 44: 9.499 sec elapsed
## run 17 / 44: 7.035 sec elapsed
## run 18 / 44: 6.867 sec elapsed
## run 19 / 44: 10.687 sec elapsed
## run 20 / 44: 8.204 sec elapsed
## run 21 / 44: 6.9 sec elapsed
## run 22 / 44: 7.636 sec elapsed
## run 23 / 44: 6.417 sec elapsed
## run 24 / 44: 9.359 sec elapsed
## run 25 / 44: 10.075 sec elapsed
## run 26 / 44: 8.183 sec elapsed
## run 27 / 44: 6.431 sec elapsed
## run 28 / 44: 7.323 sec elapsed
## run 29 / 44: 6.654 sec elapsed
## run 30 / 44: 6.708 sec elapsed
## run 31 / 44: 7.185 sec elapsed
## run 32 / 44: 6.548 sec elapsed
## run 33 / 44: 8.213 sec elapsed
## run 34 / 44: 11.124 sec elapsed
## run 35 / 44: 7.165 sec elapsed
## run 36 / 44: 7.288 sec elapsed
## run 37 / 44: 6.707 sec elapsed
## run 38 / 44: 6.615 sec elapsed
## run 39 / 44: 9.709 sec elapsed
## run 40 / 44: 7.462 sec elapsed
## run 41 / 44: 7.559 sec elapsed
## run 42 / 44: 6.868 sec elapsed
## run 43 / 44: 6.913 sec elapsed
## run 44 / 44: 11.627 sec elapsed
Awesome we now have the distances from all 44 markets to all grids. The resulting dataframe has 780472 rows. Now we want to find the market that is closest from every grid cell in terms of traveltime.
In the following chunks some visualizations are executed to assess the results.
What is the relation between Traveltime and population for each grid cell?
Maps of population distribution, travel time to the closest market and the closest market with respective administrative boundaries and the markets locations.
## tmap mode set to 'view'