Spatial regression analysis for the total number of covid-19 cases over the whole period

load("data/unionedListw_d_berlinNotSubdivided.Rdata")
load("data/kreiseWithCovidMeldedatumWeeklyPredictors.Rdata")

After the initial visual inspection of the temporal and spatial development of the covid-19 incidence at the districts level in Germany and the analysis of global and spatial autocorrelation we are now going to explore the association of the incidence rate with socio-economic predictors. We are starting with a GLM and check for spatial autocorrelation in the residuals. To deal with the spatial autocorrelation we use a spatial eigenvector approach. Furthermore, we investigate if regression coefficients vary in space. We use the combined neighborhood definition from the previous chapter for the analysis.

Explorative analysis

Maps of response and predictors

For reference we create a series of maps for the response and all predictors that we want to use in the following. We also characterise the spatial pattern by means of global Moran’s I.

kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
tm_shape() + tm_polygons(col=c("incidenceTotal"), 
                         legend.hist = TRUE, palette="-plasma",
                         legend.reverse = TRUE, 
                         title = "Covid-19 cases by\n100,000 inhabitants") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Incidence rate, total period") + 
  tm_scale_bar()

The accumulated number of covid-19 cases over the whole period is clearly spatially structured as indicated by the empirical bayes index modification of Moran’s I.

EBImoran.mc(n= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$sumCasesTotal,
              x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$EWZ, listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Empirical Bayes Index (mean subtracted)

data:  cases: kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$sumCasesTotal, risk population: kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$EWZ
weights: unionedListw_d
number of simulations + 1: 1000

statistic = 0.59645, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater
mForeigner <- kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
 tm_shape() + 
  tm_polygons(col=c("Auslaenderanteil"),
              legend.hist = TRUE, 
              legend.reverse = TRUE, 
              title = "Share of foreigners of inhabitants [%]",
              style= "pretty") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Foreign population") + 
  tm_scale_bar()

print(mForeigner)

The share of foreigners at the population is spatially clustered as indicated by Moran’s I.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Auslaenderanteil,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Auslaenderanteil 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = 0.45961, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater

The hypothesis is that the larger shares of foreigners indicate a larger share of inhabitants with limited skills in the German language a lower access to health and social distancing related information and that thereby districts with higher shares are associated with higher incidence rates.

kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
tm_shape() + 
  tm_polygons(col=c("Auspendler"),
              legend.hist = TRUE, 
              legend.reverse = TRUE, 
              title = "Outgoing commuters as share\nof total employees",
              style= "kmeans") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Outgoing commuters") + 
  tm_scale_bar()

The share of outgoing commuter is randomly distributed in space.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Auspendler,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Auspendler 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = -0.088668, observed rank = 1, p-value = 0.999
alternative hypothesis: greater
kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
tm_shape() + 
  tm_polygons(col=c("Einpendler"),   
              legend.hist = TRUE, 
              legend.reverse = TRUE, 
              title = "Incoming commuters as share\nof total employees",
              style= "kmeans") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Incoming commuters") + 
  tm_scale_bar()

The share of incoming commuter is weakly clustered.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Einpendler,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Einpendler 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = 0.22916, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater

The hypothesis for both incoming and outgoing commuters is that they indicate a spill-over of population between districts and thereby are associated with higher incidence rates.

mRural <- kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
 tm_shape() + 
  tm_polygons(col=c("Laendlichkeit"), 
              legend.hist = TRUE, palette = "Purples",
              legend.reverse = TRUE, 
              title = "Share of inhabitants in places\nwith less then 150 Inh/sqkm",
              style= "kmeans") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "rurality") + 
  tm_scale_bar()
print(mRural)

The share of inhabitants in places with less then 150 Inhabitants per sqkm is weakly spatially clustered.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Laendlichkeit,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Laendlichkeit 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = 0.21365, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater

The hypothesis is that lower population densities and other modes of transport (less public transport) might lead to lower incidence rates.

kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
tm_shape() + 
  tm_polygons(col=c("Studierende"), 
              legend.hist = TRUE, palette = "Purples",
              legend.reverse = TRUE, 
              title = "Students per \n1000 Inhabitants",
              style= "kmeans") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Students") + 
  tm_scale_bar()

The number of students per 1000 inhabitants is randomly distributed in space.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Studierende,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Studierende 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = -0.082863, observed rank = 1, p-value = 0.999
alternative hypothesis: greater

The hypothesis is that students have on average higher contact rates (under non-lockdown conditions) and also are spatially more mobile (moving between their living place and the town there their parents live for example) and districts with higher share of students might be associated with higher incidence rates.

kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
tm_shape() + 
  tm_polygons(col=c("Hochqualifizierte"), 
              legend.hist = TRUE, 
              legend.reverse = TRUE, 
              title = "Highly qualified employees/ total employees",
              style= "kmeans") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Highly qualified employees") + 
  tm_scale_bar()

The share of highly qualified employees (employees with Bachelor or Master degree) is weakly spatially structured.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Hochqualifizierte,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Hochqualifizierte 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = 0.15176, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater

The hypothesis is that employees with an academic degree more frequently work at home office and districts with a higher share are thereby associated with lower incidence rates.

kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
tm_shape() + 
  tm_polygons(col=c("Breitbandversorgung"), 
              legend.hist = TRUE, palette = "Purples",
              legend.reverse = TRUE, 
              title = "Share of households with internet \nconnectivity > 5 mBits/s",
              style= "kmeans") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Highspeed internet") + 
  tm_scale_bar()

The share of households with internet connectivity greater than 5 mBits per second is weakly spatially structured.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Breitbandversorgung,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Breitbandversorgung 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = 0.21852, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater

The hypothesis is that districts with better internet access offer higher potential for working in home office and thereby are associated with lower incidence rates.

kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
tm_shape() + 
  tm_polygons(col=c("Langzeitarbeitslosenquote"), 
              legend.hist = TRUE, palette="Oranges",
              legend.reverse = TRUE, 
              title = "Unemployment rate [%]") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Long term unemloyment rate") + 
  tm_scale_bar()

The long term unemployment rate is clearly spatially clustered.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Langzeitarbeitslosenquote,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Langzeitarbeitslosenquote 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = 0.59268, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater

The hypothesis is that the long term unemployment rate is associated with higher incidence rates as it is an indicator for economic conditions in general. Poorer districts might be facing higher incidence rates due to denser living conditions, less access to information and ressources.

kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded %>% 
tm_shape() + 
  tm_polygons(col=c("Stimmenanteile.AfD"),  
              legend.hist = TRUE, palette="Blues",
              legend.reverse = TRUE, 
              title = "Share of votes at national election 2017") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", 
            outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE, 
            main.title = "Right-winged party (AfD)") + 
  tm_scale_bar()

The share of votes for the biggest right-winged party (AfD) at the last election was clearly spatially clustured. The spatial distribution shows a clear east (former GDR)-west pattern with the the highest values in rural districts in Saxony.

moran.mc(x= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Stimmenanteile.AfD,
         listw = unionedListw_d, nsim =999)

    Monte-Carlo simulation of Moran I

data:  kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded$Stimmenanteile.AfD 
weights: unionedListw_d  
number of simulations + 1: 1000 

statistic = 0.73741, observed rank = 1000, p-value = 0.001
alternative hypothesis: greater

The hypothesis is that the share of right-winged voters is a proxy for the share of the population skeptical with respect to social distancing, mask wearing and vaccination and might therefor be associated with higher incidence rates at the district level.

Scatterplot matrix

As a start we create a scatterplot matrix for the response and a number of potential predictors. For this plot we drop the sticky geometry columns and select the variables of interest. For the meaning of the variables and their units we refer to the choreplethe maps above.

st_drop_geometry(kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded) %>% 
  dplyr::select(incidenceTotal, Langzeitarbeitslosenquote, 
                Hochqualifizierte, Breitbandversorgung, Stimmenanteile.AfD,
                Studierende, Einpendler, Auspendler, Einpendler, 
                Auslaenderanteil, Laendlichkeit) %>% 
  ggpairs() 

We see some predictors correlating with the incidence rate as well as some moderate collinearity between some of our predictors. We highlight a few observations and refer to the scatterplot matrix for further details.

The incidence rate is positively associated with the votes for the right-winged party and the share of foreigners at the district level. The long term unemployment rate is negatively associated with the share of outgoing and incoming commuters. The share of employees with an academic degree is positively associated with the availability of high-speed internet access, the number of students per population and the share of foreigners and negatively associated with the share of outgoing commuters and the share of the population living in rural areas. The share of votes for the right-winged party is negatively associated with the availability of highs-peed internet access and the share of foreigners. The share of foreigners is alo negatively associated with the rurality of the district. The share of students is negatively associated with the share of outgoing commuters. The availability of high-speed internet access is negatively associated with rurality, share of foreigners and the share of outgoing commutes and positively associated with the share of students.

Normal GLM

We start with a normal GLM before checking for spatial autocorrelation in the residuals. Since we have count data a Poisson GLM with an offset for the population at risk seems a natural choice.

Poisson GLM

modelGlm <- glm(sumCasesTotal ~ Stimmenanteile.AfD + 
                  Auslaenderanteil + Hochqualifizierte + 
                  Langzeitarbeitslosenquote + Einpendler + 
                  Studierende + Laendlichkeit +  
                  offset(log(EWZ)), 
                data= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
                family=poisson)
summary(modelGlm)

Call:
glm(formula = sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
    Hochqualifizierte + Langzeitarbeitslosenquote + Einpendler + 
    Studierende + Laendlichkeit + offset(log(EWZ)), family = poisson, 
    data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-75.552  -13.633   -2.309   10.382   70.683  

Coefficients:
                            Estimate Std. Error z value Pr(>|z|)    
(Intercept)               -3.449e+00  4.460e-03 -773.31   <2e-16 ***
Stimmenanteile.AfD         4.122e-02  9.852e-05  418.38   <2e-16 ***
Auslaenderanteil           3.289e-02  1.226e-04  268.35   <2e-16 ***
Hochqualifizierte         -1.276e-02  1.185e-04 -107.66   <2e-16 ***
Langzeitarbeitslosenquote -2.364e-02  3.704e-04  -63.81   <2e-16 ***
Einpendler                -1.599e-03  4.323e-05  -36.99   <2e-16 ***
Studierende                1.509e-04  1.211e-05   12.46   <2e-16 ***
Laendlichkeit             -1.327e-03  2.565e-05  -51.72   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for poisson family taken to be 1)

    Null deviance: 344479  on 399  degrees of freedom
Residual deviance: 156226  on 392  degrees of freedom
AIC: 160590

Number of Fisher Scoring iterations: 4

All selected predictors seem highly significant but the need to investigate overdispersion.

sumModelGlm <- summary(modelGlm) 
(phi <- sumModelGlm$deviance/sumModelGlm$df.residual)
[1] 398.5368

A quick check reveals serious overdisperion that needs to be taken into account.

Negative binomial GLM to account for overdispersion

Since we should be suspicious with respect to overdispersion we will run a negative binomial and afterwards a quasi-poisson GLM to account for that. Since the negative binomial GLM triggers some complications when using it with the spatial eigenvector mapping we will stay with the quasi-poisson model afterwards. While spatial eigenvector mapping can be use with a negative binomial GLM, we need to write code for that - due to shortage of time we will leave this for now.

modelGlmNb <- glm.nb(sumCasesTotal ~ Stimmenanteile.AfD + 
                       Auslaenderanteil + Hochqualifizierte + 
                       Langzeitarbeitslosenquote + Einpendler + 
                       Studierende + Laendlichkeit + 
                       offset(log(EWZ)), 
                     data= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded)
summary(modelGlmNb)

Call:
glm.nb(formula = sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
    Hochqualifizierte + Langzeitarbeitslosenquote + Einpendler + 
    Studierende + Laendlichkeit + offset(log(EWZ)), data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
    init.theta = 21.58561699, link = log)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.4933  -0.7228  -0.0782   0.5668   3.1286  

Coefficients:
                            Estimate Std. Error z value Pr(>|z|)    
(Intercept)               -3.5407641  0.1201151 -29.478  < 2e-16 ***
Stimmenanteile.AfD         0.0458598  0.0022967  19.968  < 2e-16 ***
Auslaenderanteil           0.0376844  0.0030124  12.510  < 2e-16 ***
Hochqualifizierte         -0.0144682  0.0029854  -4.846 1.26e-06 ***
Langzeitarbeitslosenquote -0.0480101  0.0090316  -5.316 1.06e-07 ***
Einpendler                -0.0012145  0.0013595  -0.893    0.372    
Studierende                0.0002775  0.0002649   1.048    0.295    
Laendlichkeit             -0.0008254  0.0005037  -1.639    0.101    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(21.5856) family taken to be 1)

    Null deviance: 851.01  on 399  degrees of freedom
Residual deviance: 403.29  on 392  degrees of freedom
AIC: 7156.5

Number of Fisher Scoring iterations: 1

              Theta:  21.59 
          Std. Err.:  1.52 

 2 x log-likelihood:  -7138.506 

Considering overdispersion renders three regression coefficients non significant. We will stepwise reduce model complexity based on the AIC. We use the convenience function drop1 that drops each term at a time and reports the AIC. Smaller AIC values indicate a better agreement of the model with the data.

\[ AIC = - 2* log-likelihood + 2*P\] there \(p\) indicates the number of parameters in the model.

drop1(modelGlmNb)
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + Hochqualifizierte + 
    Langzeitarbeitslosenquote + Einpendler + Studierende + Laendlichkeit + 
    offset(log(EWZ))
                          Df Deviance    AIC
<none>                         403.29 7154.5
Stimmenanteile.AfD         1   812.34 7561.6
Auslaenderanteil           1   556.93 7306.1
Hochqualifizierte          1   427.05 7176.3
Langzeitarbeitslosenquote  1   431.45 7180.7
Einpendler                 1   404.07 7153.3
Studierende                1   404.45 7153.7
Laendlichkeit              1   405.90 7155.1

Based on AIC we drop the share of incoming commuters.

modelGlmNb <- update(modelGlmNb, ~ . - Einpendler)
drop1(modelGlmNb)
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + Hochqualifizierte + 
    Langzeitarbeitslosenquote + Studierende + Laendlichkeit + 
    offset(log(EWZ))
                          Df Deviance    AIC
<none>                         403.30 7153.3
Stimmenanteile.AfD         1   810.92 7558.9
Auslaenderanteil           1   557.11 7305.1
Hochqualifizierte          1   426.54 7174.5
Langzeitarbeitslosenquote  1   435.98 7184.0
Studierende                1   404.80 7152.8
Laendlichkeit              1   405.85 7153.8

Next we drop the share of students.

modelGlmNb <- update(modelGlmNb, ~ . - Studierende)
drop1(modelGlmNb)
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + Hochqualifizierte + 
    Langzeitarbeitslosenquote + Laendlichkeit + offset(log(EWZ))
                          Df Deviance    AIC
<none>                         403.31 7152.8
Stimmenanteile.AfD         1   810.49 7558.0
Auslaenderanteil           1   556.70 7304.2
Hochqualifizierte          1   425.93 7173.4
Langzeitarbeitslosenquote  1   434.56 7182.0
Laendlichkeit              1   406.05 7153.5
summary(modelGlmNb)

Call:
glm.nb(formula = sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
    Hochqualifizierte + Langzeitarbeitslosenquote + Laendlichkeit + 
    offset(log(EWZ)), data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
    init.theta = 21.46423243, link = log)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.4607  -0.7231  -0.0854   0.5758   3.1371  

Coefficients:
                            Estimate Std. Error z value Pr(>|z|)    
(Intercept)               -3.6403759  0.0624123 -58.328  < 2e-16 ***
Stimmenanteile.AfD         0.0454243  0.0022782  19.939  < 2e-16 ***
Auslaenderanteil           0.0377774  0.0030208  12.506  < 2e-16 ***
Hochqualifizierte         -0.0126774  0.0026518  -4.781 1.75e-06 ***
Langzeitarbeitslosenquote -0.0424379  0.0075784  -5.600 2.15e-08 ***
Laendlichkeit             -0.0008477  0.0005042  -1.681   0.0927 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for Negative Binomial(21.4642) family taken to be 1)

    Null deviance: 846.24  on 399  degrees of freedom
Residual deviance: 403.31  on 394  degrees of freedom
AIC: 7154.8

Number of Fisher Scoring iterations: 1

              Theta:  21.46 
          Std. Err.:  1.51 

 2 x log-likelihood:  -7140.788 

Rurality is only marginally significant. The direction of the effects is in line with our hypothesis with the exception of the long term unemployment rate that is associated with lower incidence rates.

Quasi-Poisson GLM to account for overdispersion

As an alternative approach we use a quasi-poisson GLM. As only a pseudo-likelihood is defined for the quasi distribution families we cannot use the AIC for model comparison anymore. Instead we use an F-test to compare nested models.

modelGlmQp <- glm(sumCasesTotal ~ Stimmenanteile.AfD + 
                    Auslaenderanteil + Hochqualifizierte + 
                    Langzeitarbeitslosenquote + Einpendler  + 
                    Studierende + Laendlichkeit + 
                    offset(log(EWZ)), 
                  data= kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
                  family = quasipoisson)
summary(modelGlmQp)

Call:
glm(formula = sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
    Hochqualifizierte + Langzeitarbeitslosenquote + Einpendler + 
    Studierende + Laendlichkeit + offset(log(EWZ)), family = quasipoisson, 
    data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-75.552  -13.633   -2.309   10.382   70.683  

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -3.4489644  0.0884963 -38.973  < 2e-16 ***
Stimmenanteile.AfD         0.0412170  0.0019548  21.085  < 2e-16 ***
Auslaenderanteil           0.0328903  0.0024319  13.524  < 2e-16 ***
Hochqualifizierte         -0.0127585  0.0023516  -5.426 1.01e-07 ***
Langzeitarbeitslosenquote -0.0236376  0.0073502  -3.216  0.00141 ** 
Einpendler                -0.0015993  0.0008578  -1.864  0.06300 .  
Studierende                0.0001509  0.0002402   0.628  0.53040    
Laendlichkeit             -0.0013267  0.0005090  -2.607  0.00949 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 393.7124)

    Null deviance: 344479  on 399  degrees of freedom
Residual deviance: 156226  on 392  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4

The different distributional assumption leads to somewhat differt regression coefficient etstimates and stadard errors.

drop1(modelGlmQp, test = "F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + Hochqualifizierte + 
    Langzeitarbeitslosenquote + Einpendler + Studierende + Laendlichkeit + 
    offset(log(EWZ))
                          Df Deviance  F value    Pr(>F)    
<none>                         156226                       
Stimmenanteile.AfD         1   322415 416.9966 < 2.2e-16 ***
Auslaenderanteil           1   225877 174.7648 < 2.2e-16 ***
Hochqualifizierte          1   167833  29.1225 1.179e-07 ***
Langzeitarbeitslosenquote  1   160333  10.3051  0.001436 ** 
Einpendler                 1   157593   3.4279  0.064854 .  
Studierende                1   156381   0.3869  0.534276    
Laendlichkeit              1   158920   6.7577  0.009687 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
modelGlmQp <- update(modelGlmQp, ~ . - Studierende)
drop1(modelGlmQp, test = "F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + Hochqualifizierte + 
    Langzeitarbeitslosenquote + Einpendler + Laendlichkeit + 
    offset(log(EWZ))
                          Df Deviance  F value    Pr(>F)    
<none>                         156381                       
Stimmenanteile.AfD         1   325141 424.1105 < 2.2e-16 ***
Auslaenderanteil           1   225948 174.8296 < 2.2e-16 ***
Hochqualifizierte          1   168755  31.0982  4.57e-08 ***
Langzeitarbeitslosenquote  1   160334   9.9362  0.001745 ** 
Einpendler                 1   157783   3.5239  0.061231 .  
Laendlichkeit              1   159117   6.8761  0.009075 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Even if the share of incoming commuters is only marginaly significant we are going to stay with it for the moment.

summary(modelGlmQp)

Call:
glm(formula = sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
    Hochqualifizierte + Langzeitarbeitslosenquote + Einpendler + 
    Laendlichkeit + offset(log(EWZ)), family = quasipoisson, 
    data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-75.563  -13.869   -2.382   10.295   70.520  

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -3.4499273  0.0883910 -39.030  < 2e-16 ***
Stimmenanteile.AfD         0.0410287  0.0019311  21.246  < 2e-16 ***
Auslaenderanteil           0.0328809  0.0024315  13.523  < 2e-16 ***
Hochqualifizierte         -0.0122169  0.0021875  -5.585 4.37e-08 ***
Langzeitarbeitslosenquote -0.0228180  0.0072267  -3.157  0.00171 ** 
Einpendler                -0.0016177  0.0008560  -1.890  0.05953 .  
Laendlichkeit             -0.0013363  0.0005084  -2.628  0.00891 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 393.3036)

    Null deviance: 344479  on 399  degrees of freedom
Residual deviance: 156381  on 393  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4

Explained deviance:

1 - modelGlmQp$deviance / modelGlmQp$null.deviance
[1] 0.5460368

We end up with a model of decent quality. Directions of effect seem to be mostly aligned with expectations:

  • higher share of votes for right winged party is associated with higher incidence. Presumably a proxy for the share of population opposing mask wearing and social distancing and vaccination
  • higher share of foreigners is associated with higher incidence. Foreigners might not be reach by information campaigns with respect to social distancing due to language problems
  • higher share of highly qualified work force is associated with lower incidence rates. For those employees it might be easier to work from home office and to avoid close contacts during work hours at office
  • higher rurality (higher share of population living in rural areas) is associated with lower incidence rates. This might be due to lower contact rates e.g. by lower share of public transport.

The longterm unemployment rate and the share of incoming commuters is unexpectedly associated with lower incidence rates.

Checking for spatial autocorrelation in the residuals

Global Moran’s I

For regression residuals we need to use a different test as residuals are centered around zero and will sum up to zero.

lm.morantest(modelGlmNb, listw = unionedListw_d)

    Global Moran I for regression residuals

data:  
model: glm.nb(formula = sumCasesTotal ~ Stimmenanteile.AfD +
Auslaenderanteil + Hochqualifizierte + Langzeitarbeitslosenquote +
Laendlichkeit + offset(log(EWZ)), data =
kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, init.theta =
21.46423243, link = log)
weights: unionedListw_d

Moran I statistic standard deviate = 22.666, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Observed Moran I      Expectation         Variance 
    0.5078558227    -0.0003562342     0.0005027223 
(moranGlmQp <- lm.morantest(modelGlmQp, listw = unionedListw_d))

    Global Moran I for regression residuals

data:  
model: glm(formula = sumCasesTotal ~ Stimmenanteile.AfD +
Auslaenderanteil + Hochqualifizierte + Langzeitarbeitslosenquote +
Einpendler + Laendlichkeit + offset(log(EWZ)), family = quasipoisson,
data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded)
weights: unionedListw_d

Moran I statistic standard deviate = 21.736, p-value < 2.2e-16
alternative hypothesis: greater
sample estimates:
Observed Moran I      Expectation         Variance 
    4.903729e-01    -7.092485e-07     5.089760e-04 

Both model suffer from significant global spatial autocorrelation.

This implies that the usual assumption about independence of errors is violated. In turn, our standard errors might be too low, p-values too small, size (and potentially even sign) of the regression coefficients might be wrong. So we need to incorporate spatial autocorrelation in our analysis.

Plot residuals

First we create centrods that we use in turn for a proportional symbol map of the residuals.

kreiseCentroids <- st_centroid(kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
                               of_largest_polygon = TRUE)

Add residuals to the centroids for plotting. In addition to the (deviance) residuals we also store their absolute values as this is need to scale the symbols.

kreiseCentroids$residGlmNb <- residuals(modelGlmNb)
kreiseCentroids$residGlmNbAbs <- abs(kreiseCentroids$residGlmNb)

kreiseCentroids$residGlmQp <- residuals(modelGlmQp)
kreiseCentroids$residGlmQpAbs <- abs(kreiseCentroids$residGlmQp)

The size of the symbols is taken from the absolute value while the color is assigned based on the deviance residuals.

m1 <- tm_shape(kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded) + 
  tm_polygons(col="grey") +
  tm_shape(kreiseCentroids) + 
  tm_bubbles(size = "residGlmNbAbs", col= "residGlmNb", palette = "-RdBu", 
             alpha=.9, perceptual=TRUE, scale=.8, border.alpha=.3, 
             title.size = "Abs residual", title.col="Residuals", n=3) + 
  tm_layout(main.title = "Pearson residuals, GLM NB", bg="darkgrey", 
            legend.outside = TRUE, attr.outside = TRUE) +
  tm_scale_bar()

m2 <- tm_shape(kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded) + 
  tm_polygons(col="grey") +
  tm_shape(kreiseCentroids) + 
  tm_bubbles(size = "residGlmQpAbs", col= "residGlmQp", palette = "-RdBu", 
             alpha=.9, perceptual=TRUE, scale=.8, border.alpha=.3, 
             title.size = "Abs residual", title.col="Residuals", n=3) + 
  tm_layout(main.title = "Pearson residuals, GLM QP", bg="darkgrey", 
            legend.outside = TRUE, attr.outside = TRUE) +
  tm_scale_bar()

tmap_arrange(m1,m2)

As indicated by global Moran’s I we see that large positive and large negative residuals form some cluster. The higher complexity of the quasi-poisson GLM leads to smaller residuals for some districts.

Spatial Eigenvector Mapping

The idea behind the spatial eigenvector mapping approach is to use additional covariates that aborb the spatial autocorrelation, leading to unbiased estimators for the other predictors. The additional covariates are based on the eigenfunction decomposition of the spatial weight matrix \(W\). Eigenvectors of \(W\) represent the decompositions the spatial weight Matrix into all mutually orthogonal eigenvectors. Those with positive eigenvalues represent positive autocorrelation, whereas eigenvectors with negative eigenvalues represent negative autocorrelation. Only eigenvectors with positive eigenvalues are used for the selection.

Selection of eigenvectors

The function ME uses brute force eigenvector selection to reach a subset of such vectors to be added to the RHS of the GLM model to reduce residual autocorrelation to below the specified alpha value (defaults to 0.05). Since eigenvector selection only works on symmetric weights, the weights are made symmetric beforehand.

meQp <- spatialreg::ME(sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil +
                         Hochqualifizierte + Langzeitarbeitslosenquote + 
                         Einpendler + Laendlichkeit, 
                       family = quasipoisson, 
                       data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
                       offset = log(EWZ), listw = unionedListw_d)

Refitting GLM under incorporation of the selected spatial eigenvectors. The selected spatial eigenvectors are added by fitted(meQp).

modelSevmQp <- glm(sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil +
                         Hochqualifizierte + Langzeitarbeitslosenquote + 
                         Einpendler + Laendlichkeit + fitted(meQp),
                   family = quasipoisson, 
                   data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
                   offset = log(EWZ))
summary(modelSevmQp)

Call:
glm(formula = sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
    Hochqualifizierte + Langzeitarbeitslosenquote + Einpendler + 
    Laendlichkeit + fitted(meQp), family = quasipoisson, data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
    offset = log(EWZ))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-64.319  -10.813   -1.173    7.044   60.276  

Coefficients:
                            Estimate Std. Error t value Pr(>|t|)    
(Intercept)               -3.5686172  0.0762497 -46.802  < 2e-16 ***
Stimmenanteile.AfD         0.0335164  0.0020307  16.505  < 2e-16 ***
Auslaenderanteil           0.0398662  0.0020078  19.856  < 2e-16 ***
Hochqualifizierte         -0.0103186  0.0017849  -5.781 1.54e-08 ***
Langzeitarbeitslosenquote -0.0393161  0.0061192  -6.425 3.92e-10 ***
Einpendler                 0.0002148  0.0007280   0.295 0.768072    
Laendlichkeit             -0.0001591  0.0004204  -0.378 0.705280    
fitted(meQp)vec4          -2.0999358  0.1604029 -13.092  < 2e-16 ***
fitted(meQp)vec10          0.6514284  0.1668556   3.904 0.000112 ***
fitted(meQp)vec6          -0.6939561  0.1809248  -3.836 0.000146 ***
fitted(meQp)vec5           0.5601058  0.1687742   3.319 0.000991 ***
fitted(meQp)vec18         -0.6205227  0.1549663  -4.004 7.47e-05 ***
fitted(meQp)vec3           0.7554643  0.2013032   3.753 0.000202 ***
fitted(meQp)vec53          0.6670917  0.1613533   4.134 4.38e-05 ***
fitted(meQp)vec49          0.8408968  0.1751992   4.800 2.28e-06 ***
fitted(meQp)vec38         -0.4813610  0.1740430  -2.766 0.005954 ** 
fitted(meQp)vec89          0.4039978  0.1495802   2.701 0.007223 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 240.971)

    Null deviance: 344479  on 399  degrees of freedom
Residual deviance:  93018  on 383  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4

The procedure added 10 spatial eigenvectors to the model. Together this leads to a more than satisfying amount of explained deviance. However, we need to keep in mind that a good share of that come from the spatial eigenvectors.

1 - modelSevmQp$deviance / modelSevmQp$null.deviance
[1] 0.7299735

Rurality of the district and the share of commuter now became insignificant so we might want to drop tzem step by step from the model.

meQp <- spatialreg::ME(sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
                         Hochqualifizierte + Laendlichkeit,
                       family = quasipoisson, 
                       data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
                       offset = log(EWZ), 
                       listw = unionedListw_d)

Refitting GLM under incorporation of the selected spatial eigenvectors:

modelSevmQp <- glm(sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
                     Hochqualifizierte + Laendlichkeit + fitted(meQp),
                   family = quasipoisson, offset = log(EWZ), 
                   data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded)
summary(modelSevmQp)

Call:
glm(formula = sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
    Hochqualifizierte + Laendlichkeit + fitted(meQp), family = quasipoisson, 
    data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
    offset = log(EWZ))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-54.945  -10.368   -1.551    6.689   58.593  

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)        -3.6865335  0.0453202 -81.344  < 2e-16 ***
Stimmenanteile.AfD  0.0411685  0.0018518  22.232  < 2e-16 ***
Auslaenderanteil    0.0293059  0.0022670  12.927  < 2e-16 ***
Hochqualifizierte  -0.0057613  0.0018273  -3.153  0.00174 ** 
Laendlichkeit      -0.0005475  0.0004117  -1.330  0.18438    
fitted(meQp)vec2    1.2042313  0.1940489   6.206 1.41e-09 ***
fitted(meQp)vec6   -0.7841047  0.1780411  -4.404 1.38e-05 ***
fitted(meQp)vec4   -1.7225029  0.1491201 -11.551  < 2e-16 ***
fitted(meQp)vec10   0.1257958  0.1636061   0.769  0.44243    
fitted(meQp)vec1   -0.7449725  0.1592799  -4.677 4.03e-06 ***
fitted(meQp)vec18  -0.7010511  0.1491663  -4.700 3.63e-06 ***
fitted(meQp)vec21  -1.4645606  0.1716293  -8.533 3.31e-16 ***
fitted(meQp)vec11  -0.3173441  0.1492974  -2.126  0.03417 *  
fitted(meQp)vec19  -0.9967625  0.1699892  -5.864 9.75e-09 ***
fitted(meQp)vec5    0.5444928  0.1672192   3.256  0.00123 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 229.5023)

    Null deviance: 344479  on 399  degrees of freedom
Residual deviance:  88113  on 385  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4
meQp <- spatialreg::ME(sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
                         Hochqualifizierte,
                       family = quasipoisson, 
                       data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
                       offset = log(EWZ), 
                       listw = unionedListw_d)

Refitting GLM under incorporation of the selected spatial eigenvectors:

modelSevmQp <- glm(sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
                     Hochqualifizierte  + fitted(meQp),
                   family = quasipoisson, offset = log(EWZ), 
                   data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded)
summary(modelSevmQp)

Call:
glm(formula = sumCasesTotal ~ Stimmenanteile.AfD + Auslaenderanteil + 
    Hochqualifizierte + fitted(meQp), family = quasipoisson, 
    data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, 
    offset = log(EWZ))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-53.356  -10.321   -1.947    6.936   56.477  

Coefficients:
                    Estimate Std. Error  t value Pr(>|t|)    
(Intercept)        -3.721873   0.036744 -101.293  < 2e-16 ***
Stimmenanteile.AfD  0.041288   0.001849   22.333  < 2e-16 ***
Auslaenderanteil    0.030151   0.002176   13.853  < 2e-16 ***
Hochqualifizierte  -0.004930   0.001718   -2.870  0.00433 ** 
fitted(meQp)vec2    1.229067   0.193236    6.360 5.69e-10 ***
fitted(meQp)vec6   -0.791620   0.177978   -4.448 1.14e-05 ***
fitted(meQp)vec4   -1.718902   0.149249  -11.517  < 2e-16 ***
fitted(meQp)vec10   0.132188   0.163657    0.808  0.41975    
fitted(meQp)vec1   -0.661805   0.146610   -4.514 8.46e-06 ***
fitted(meQp)vec18  -0.713091   0.149129   -4.782 2.48e-06 ***
fitted(meQp)vec21  -1.467720   0.171605   -8.553 2.85e-16 ***
fitted(meQp)vec11  -0.332178   0.148986   -2.230  0.02635 *  
fitted(meQp)vec19  -1.001783   0.169929   -5.895 8.17e-09 ***
fitted(meQp)vec5    0.549126   0.167337    3.282  0.00113 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 229.6598)

    Null deviance: 344479  on 399  degrees of freedom
Residual deviance:  88520  on 386  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4

The eigenvectors selected changed then we dropped the coefficients.

Plotting selected spatial eigenvectors

Presumably we have missed a lot of other predictors that are now partially absorbed into different eigenvectors. It might be worth to plot and investigate the eigenvectors that made it into the model. Therefore, we attach the selected eigenvectors to the sf object and plot them.

summary(fitted(meQp))
      vec2               vec6                vec4               vec10          
 Min.   :-0.17702   Min.   :-0.190177   Min.   :-0.107585   Min.   :-0.148895  
 1st Qu.:-0.04621   1st Qu.:-0.014842   1st Qu.:-0.030753   1st Qu.:-0.018094  
 Median : 0.02081   Median : 0.003256   Median :-0.008749   Median : 0.003103  
 Mean   : 0.00000   Mean   : 0.000000   Mean   : 0.000000   Mean   : 0.000000  
 3rd Qu.: 0.03945   3rd Qu.: 0.013008   3rd Qu.: 0.040463   3rd Qu.: 0.017146  
 Max.   : 0.07715   Max.   : 0.185980   Max.   : 0.112574   Max.   : 0.141635  
      vec1               vec18                vec21           
 Min.   :-0.157693   Min.   :-0.2348596   Min.   :-0.1337760  
 1st Qu.:-0.028969   1st Qu.:-0.0206895   1st Qu.:-0.0257369  
 Median : 0.001129   Median : 0.0003868   Median :-0.0003304  
 Mean   : 0.000000   Mean   : 0.0000000   Mean   : 0.0000000  
 3rd Qu.: 0.031472   3rd Qu.: 0.0263951   3rd Qu.: 0.0304928  
 Max.   : 0.115554   Max.   : 0.2985828   Max.   : 0.2642922  
     vec11                vec19                 vec5          
 Min.   :-0.2190326   Min.   :-0.1735843   Min.   :-0.224152  
 1st Qu.:-0.0184901   1st Qu.:-0.0303940   1st Qu.:-0.026723  
 Median : 0.0007727   Median :-0.0002972   Median :-0.007429  
 Mean   : 0.0000000   Mean   : 0.0000000   Mean   : 0.000000  
 3rd Qu.: 0.0151115   3rd Qu.: 0.0299778   3rd Qu.: 0.021318  
 Max.   : 0.1915285   Max.   : 0.1358331   Max.   : 0.162426  
sevQp <- fitted(meQp)
kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem <- st_sf(data.frame(kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded, sevQp))
tm_shape(kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem) + 
  tm_polygons(col= colnames(sevQp), palette = "-RdBu", lwd=.5,   
             n=6, midpoint=0,  legend.show = FALSE) + 
  tm_layout(main.title = "Selected spatial eigenvectors", legend.outside = TRUE,
            attr.outside = TRUE, panel.show = TRUE, 
            panel.labels = colnames(sevQp)) +
  tm_scale_bar()

The spatial eigenvectors included in the model capture broadly speaking: - north west gradients - differences between regions in the northern part - patterns that involve Mecklenburg-Vorpommern, Saxony and parts of Bavaria - some of these might remind us of clusters we have seen in the local Moran’s I analysis

The spatial eigenvectors might help us to derive hypothesis about missing covariates that could be incorporated in the model. In any case they absorbe a good share of the spatial autocorrelation in the residuals.

Rechecking spatial autocorrelation

(moranSevmQp <- lm.morantest(modelSevmQp, listw = unionedListw_d))

    Global Moran I for regression residuals

data:  
model: glm(formula = sumCasesTotal ~ Stimmenanteile.AfD +
Auslaenderanteil + Hochqualifizierte + fitted(meQp), family =
quasipoisson, data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded,
offset = log(EWZ))
weights: unionedListw_d

Moran I statistic standard deviate = 7.3156, p-value = 1.281e-13
alternative hypothesis: greater
sample estimates:
Observed Moran I      Expectation         Variance 
    1.680228e-01    -2.875261e-06     5.275358e-04 

We see that were is still some left over spatial autocorrelation not absorbed by the spatial eigenvectors. However, the amount of spatial autocorrelation has been reduced by a strong degree, from 0.49 to 0.17.

kreiseCentroids$residSevmQp <- residuals(modelSevmQp)
kreiseCentroids$residSevmQpAbs <- abs(kreiseCentroids$residSevmQp)

tm_shape(kreiseWithCovidMeldeWeeklyCleanedPredictorsAdded) + 
  tm_polygons(col="grey") +
  tm_shape(kreiseCentroids) + 
  tm_bubbles(size = "residSevmQpAbs", col= "residSevmQp", palette = "-RdBu", 
             alpha=.9, perceptual=TRUE, scale=.8, border.alpha=.3, 
             title.size = "Abs residual", title.col="Residuals", n=5) + 
  tm_layout(main.title = "Pearson residuals, SEVM QP", bg="darkgrey", 
            legend.outside = TRUE, attr.outside = TRUE) +
  tm_scale_bar()

Spatial varying coefficients?

Share of foreigners

We could use the eigenvectors to analyse is regression coefficients (and thereby the effect of predictors) vary in space. We will show this at the example of the share of foreigners.

modelSevmQpInt <- glm(sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + 
                        (vec1 + vec2 + vec4 + vec5 + vec6 + vec10 + 
                           vec11 + vec18 + vec19 + vec21) * Auslaenderanteil,
                   family = quasipoisson, offset = log(EWZ), 
                   data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem)
summary(modelSevmQpInt)

Call:
glm(formula = sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + 
    (vec1 + vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + 
        vec19 + vec21) * Auslaenderanteil, family = quasipoisson, 
    data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem, 
    offset = log(EWZ))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-48.485   -9.314   -1.985    7.526   58.011  

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)            -3.709104   0.044139 -84.032  < 2e-16 ***
Stimmenanteile.AfD      0.041830   0.002369  17.659  < 2e-16 ***
Hochqualifizierte      -0.005620   0.001689  -3.328 0.000961 ***
vec1                   -1.183936   0.398696  -2.970 0.003174 ** 
vec2                    2.795996   0.405771   6.891 2.35e-11 ***
vec4                   -1.920999   0.400329  -4.799 2.31e-06 ***
vec5                    0.772453   0.315487   2.448 0.014804 *  
vec6                   -1.852276   0.421817  -4.391 1.47e-05 ***
vec10                   0.721893   0.322485   2.239 0.025771 *  
vec11                  -0.701577   0.311309  -2.254 0.024794 *  
vec18                  -0.705391   0.318133  -2.217 0.027201 *  
vec19                  -1.207329   0.339269  -3.559 0.000421 ***
vec21                  -0.243836   0.382893  -0.637 0.524625    
Auslaenderanteil        0.029712   0.002281  13.027  < 2e-16 ***
vec1:Auslaenderanteil   0.045935   0.027841   1.650 0.099799 .  
vec2:Auslaenderanteil  -0.117309   0.033779  -3.473 0.000575 ***
vec4:Auslaenderanteil   0.029258   0.027653   1.058 0.290712    
vec5:Auslaenderanteil  -0.031821   0.036066  -0.882 0.378168    
vec6:Auslaenderanteil   0.110721   0.038308   2.890 0.004072 ** 
vec10:Auslaenderanteil -0.051295   0.031245  -1.642 0.101482    
vec11:Auslaenderanteil  0.025580   0.037768   0.677 0.498648    
vec18:Auslaenderanteil  0.009136   0.032409   0.282 0.778183    
vec19:Auslaenderanteil  0.027595   0.033572   0.822 0.411609    
vec21:Auslaenderanteil -0.079033   0.035188  -2.246 0.025281 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 197.5302)

    Null deviance: 344479  on 399  degrees of freedom
Residual deviance:  73677  on 376  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4
drop1(modelSevmQpInt, test="F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + (vec1 + 
    vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + vec19 + 
    vec21) * Auslaenderanteil
                       Df Deviance  F value    Pr(>F)    
<none>                       73677                       
Stimmenanteile.AfD      1   135675 316.3959 < 2.2e-16 ***
Hochqualifizierte       1    75871  11.1972 0.0009018 ***
vec1:Auslaenderanteil   1    74216   2.7478 0.0982201 .  
vec2:Auslaenderanteil   1    76057  12.1466 0.0005498 ***
vec4:Auslaenderanteil   1    73898   1.1273 0.2890409    
vec5:Auslaenderanteil   1    73831   0.7875 0.3754265    
vec6:Auslaenderanteil   1    75342   8.4959 0.0037726 ** 
vec10:Auslaenderanteil  1    74209   2.7162 0.1001672    
vec11:Auslaenderanteil  1    73768   0.4631 0.4966093    
vec18:Auslaenderanteil  1    73693   0.0800 0.7773966    
vec19:Auslaenderanteil  1    73811   0.6808 0.4098215    
vec21:Auslaenderanteil  1    74674   5.0849 0.0247091 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
modelSevmQpInt <- update(modelSevmQpInt, ~. - vec18:Auslaenderanteil)
drop1(modelSevmQpInt, test="F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + vec1 + 
    vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + vec19 + 
    vec21 + Auslaenderanteil + vec1:Auslaenderanteil + vec2:Auslaenderanteil + 
    vec4:Auslaenderanteil + vec5:Auslaenderanteil + vec6:Auslaenderanteil + 
    vec10:Auslaenderanteil + vec11:Auslaenderanteil + vec19:Auslaenderanteil + 
    vec21:Auslaenderanteil
                       Df Deviance  F value    Pr(>F)    
<none>                       73693                       
Stimmenanteile.AfD      1   136759 322.6378 < 2.2e-16 ***
Hochqualifizierte       1    75871  11.1448 0.0009266 ***
vec18                   1    77276  18.3329 2.356e-05 ***
vec1:Auslaenderanteil   1    74237   2.7850 0.0959785 .  
vec2:Auslaenderanteil   1    76058  12.0988 0.0005635 ***
vec4:Auslaenderanteil   1    73907   1.0935 0.2963770    
vec5:Auslaenderanteil   1    73901   1.0648 0.3027796    
vec6:Auslaenderanteil   1    75346   8.4588 0.0038479 ** 
vec10:Auslaenderanteil  1    74215   2.6722 0.1029475    
vec11:Auslaenderanteil  1    73781   0.4530 0.5013160    
vec19:Auslaenderanteil  1    73832   0.7122 0.3992391    
vec21:Auslaenderanteil  1    74716   5.2352 0.0226876 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
modelSevmQpInt <- update(modelSevmQpInt, ~. - vec11:Auslaenderanteil)
drop1(modelSevmQpInt, test="F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + vec1 + 
    vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + vec19 + 
    vec21 + Auslaenderanteil + vec1:Auslaenderanteil + vec2:Auslaenderanteil + 
    vec4:Auslaenderanteil + vec5:Auslaenderanteil + vec6:Auslaenderanteil + 
    vec10:Auslaenderanteil + vec19:Auslaenderanteil + vec21:Auslaenderanteil
                       Df Deviance  F value    Pr(>F)    
<none>                       73781                       
Stimmenanteile.AfD      1   136841 323.0708 < 2.2e-16 ***
Hochqualifizierte       1    75912  10.9180 0.0010433 ** 
vec11                   1    75853  10.6157 0.0012228 ** 
vec18                   1    77277  17.9064 2.915e-05 ***
vec1:Auslaenderanteil   1    74455   3.4488 0.0640749 .  
vec2:Auslaenderanteil   1    76231  12.5487 0.0004461 ***
vec4:Auslaenderanteil   1    73976   0.9979 0.3184492    
vec5:Auslaenderanteil   1    74162   1.9506 0.1633452    
vec6:Auslaenderanteil   1    75402   8.3004 0.0041897 ** 
vec10:Auslaenderanteil  1    74542   3.8961 0.0491257 *  
vec19:Auslaenderanteil  1    73892   0.5645 0.4529303    
vec21:Auslaenderanteil  1    74782   5.1279 0.0241096 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
modelSevmQpInt <- update(modelSevmQpInt, ~. - vec19:Auslaenderanteil)
drop1(modelSevmQpInt, test="F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + vec1 + 
    vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + vec19 + 
    vec21 + Auslaenderanteil + vec1:Auslaenderanteil + vec2:Auslaenderanteil + 
    vec4:Auslaenderanteil + vec5:Auslaenderanteil + vec6:Auslaenderanteil + 
    vec10:Auslaenderanteil + vec21:Auslaenderanteil
                       Df Deviance  F value    Pr(>F)    
<none>                       73892                       
Stimmenanteile.AfD      1   136846 322.9036 < 2.2e-16 ***
Hochqualifizierte       1    76046  11.0511 0.0009727 ***
vec11                   1    76000  10.8137 0.0011017 ** 
vec18                   1    77281  17.3838 3.788e-05 ***
vec19                   1    79480  28.6624 1.496e-07 ***
vec1:Auslaenderanteil   1    74486   3.0505 0.0815209 .  
vec2:Auslaenderanteil   1    76233  12.0070 0.0005907 ***
vec4:Auslaenderanteil   1    74011   0.6106 0.4350480    
vec5:Auslaenderanteil   1    74179   1.4743 0.2254280    
vec6:Auslaenderanteil   1    75403   7.7526 0.0056326 ** 
vec10:Auslaenderanteil  1    74864   4.9877 0.0261115 *  
vec21:Auslaenderanteil  1    75383   7.6520 0.0059486 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
modelSevmQpInt <- update(modelSevmQpInt, ~. - vec4:Auslaenderanteil)
drop1(modelSevmQpInt, test="F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + vec1 + 
    vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + vec19 + 
    vec21 + Auslaenderanteil + vec1:Auslaenderanteil + vec2:Auslaenderanteil + 
    vec5:Auslaenderanteil + vec6:Auslaenderanteil + vec10:Auslaenderanteil + 
    vec21:Auslaenderanteil
                       Df Deviance  F value    Pr(>F)    
<none>                       74011                       
Stimmenanteile.AfD      1   143517 356.8727 < 2.2e-16 ***
Hochqualifizierte       1    76145  10.9588 0.0010207 ** 
vec4                    1    96567 115.8129 < 2.2e-16 ***
vec11                   1    76196  11.2224 0.0008891 ***
vec18                   1    77474  17.7816 3.100e-05 ***
vec19                   1    80311  32.3478 2.575e-08 ***
vec1:Auslaenderanteil   1    74600   3.0268 0.0827096 .  
vec2:Auslaenderanteil   1    76316  11.8368 0.0006452 ***
vec5:Auslaenderanteil   1    74242   1.1893 0.2761663    
vec6:Auslaenderanteil   1    75499   7.6400 0.0059867 ** 
vec10:Auslaenderanteil  1    74965   4.9025 0.0274106 *  
vec21:Auslaenderanteil  1    75678   8.5615 0.0036396 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
modelSevmQpInt <- update(modelSevmQpInt, ~. - vec5:Auslaenderanteil)
drop1(modelSevmQpInt, test="F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + vec1 + 
    vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + vec19 + 
    vec21 + Auslaenderanteil + vec1:Auslaenderanteil + vec2:Auslaenderanteil + 
    vec6:Auslaenderanteil + vec10:Auslaenderanteil + vec21:Auslaenderanteil
                       Df Deviance  F value    Pr(>F)    
<none>                       74242                       
Stimmenanteile.AfD      1   143634 356.1092 < 2.2e-16 ***
Hochqualifizierte       1    76213  10.1141 0.0015919 ** 
vec4                    1    97901 121.4155 < 2.2e-16 ***
vec5                    1    75787   7.9283 0.0051196 ** 
vec11                   1    76208  10.0899 0.0016123 ** 
vec18                   1    77476  16.5968 5.627e-05 ***
vec19                   1    81596  37.7387 2.037e-09 ***
vec1:Auslaenderanteil   1    74747   2.5913 0.1082781    
vec2:Auslaenderanteil   1    77107  14.7022 0.0001473 ***
vec6:Auslaenderanteil   1    75648   7.2166 0.0075393 ** 
vec10:Auslaenderanteil  1    75215   4.9924 0.0260382 *  
vec21:Auslaenderanteil  1    75881   8.4114 0.0039449 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
modelSevmQpInt <- update(modelSevmQpInt, ~. - vec1:Auslaenderanteil)
drop1(modelSevmQpInt, test="F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + vec1 + 
    vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + vec19 + 
    vec21 + Auslaenderanteil + vec2:Auslaenderanteil + vec6:Auslaenderanteil + 
    vec10:Auslaenderanteil + vec21:Auslaenderanteil
                       Df Deviance  F value    Pr(>F)    
<none>                       74747                       
Stimmenanteile.AfD      1   154155 405.8199 < 2.2e-16 ***
Hochqualifizierte       1    77023  11.6330 0.0007171 ***
vec1                    1    78262  17.9625 2.828e-05 ***
vec4                    1    99309 125.5247 < 2.2e-16 ***
vec5                    1    76573   9.3286 0.0024143 ** 
vec11                   1    76600   9.4676 0.0022421 ** 
vec18                   1    77962  16.4301 6.118e-05 ***
vec19                   1    82818  41.2439 3.996e-10 ***
vec2:Auslaenderanteil   1    77850  15.8546 8.189e-05 ***
vec6:Auslaenderanteil   1    76475   8.8296 0.0031513 ** 
vec10:Auslaenderanteil  1    75654   4.6325 0.0319985 *  
vec21:Auslaenderanteil  1    76415   8.5213 0.0037179 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

To avoid inclusion of too many eigenvectors in the interaction we might drop vec10 from it as well.

modelSevmQpInt <- update(modelSevmQpInt, ~. - vec10:Auslaenderanteil)
drop1(modelSevmQpInt, test="F")
Single term deletions

Model:
sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + vec1 + 
    vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + vec19 + 
    vec21 + Auslaenderanteil + vec2:Auslaenderanteil + vec6:Auslaenderanteil + 
    vec21:Auslaenderanteil
                       Df Deviance  F value    Pr(>F)    
<none>                       75654                       
Stimmenanteile.AfD      1   187202 564.7166 < 2.2e-16 ***
Hochqualifizierte       1    77830  11.0204 0.0009876 ***
vec1                    1    78779  15.8199 8.331e-05 ***
vec4                    1   102080 133.7867 < 2.2e-16 ***
vec5                    1    78178  12.7781 0.0003955 ***
vec10                   1    76224   2.8854 0.0901952 .  
vec11                   1    77446   9.0738 0.0027652 ** 
vec18                   1    79517  19.5577 1.273e-05 ***
vec19                   1    83068  37.5378 2.227e-09 ***
vec2:Auslaenderanteil   1    82359  33.9436 1.204e-08 ***
vec6:Auslaenderanteil   1    77716  10.4429 0.0013378 ** 
vec21:Auslaenderanteil  1    77104   7.3403 0.0070453 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(modelSevmQpInt)

Call:
glm(formula = sumCasesTotal ~ Stimmenanteile.AfD + Hochqualifizierte + 
    vec1 + vec2 + vec4 + vec5 + vec6 + vec10 + vec11 + vec18 + 
    vec19 + vec21 + Auslaenderanteil + vec2:Auslaenderanteil + 
    vec6:Auslaenderanteil + vec21:Auslaenderanteil, family = quasipoisson, 
    data = kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem, 
    offset = log(EWZ))

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-52.195   -9.857   -1.991    7.150   57.219  

Coefficients:
                        Estimate Std. Error  t value Pr(>|t|)    
(Intercept)            -3.770701   0.037291 -101.116  < 2e-16 ***
Stimmenanteile.AfD      0.045359   0.001928   23.528  < 2e-16 ***
Hochqualifizierte      -0.005444   0.001648   -3.304 0.001045 ** 
vec1                   -0.567920   0.142786   -3.977 8.33e-05 ***
vec2                    3.228949   0.372712    8.663  < 2e-16 ***
vec4                   -1.647744   0.144071  -11.437  < 2e-16 ***
vec5                    0.558888   0.156431    3.573 0.000398 ***
vec6                   -1.973910   0.411062   -4.802 2.26e-06 ***
vec10                   0.263843   0.156001    1.691 0.091596 .  
vec11                  -0.442877   0.147499   -3.003 0.002853 ** 
vec18                  -0.609971   0.138855   -4.393 1.45e-05 ***
vec19                  -1.020030   0.166830   -6.114 2.39e-09 ***
vec21                  -0.231237   0.374973   -0.617 0.537814    
Auslaenderanteil        0.031829   0.002067   15.398  < 2e-16 ***
vec2:Auslaenderanteil  -0.158122   0.027338   -5.784 1.52e-08 ***
vec6:Auslaenderanteil   0.116384   0.036307    3.206 0.001461 ** 
vec21:Auslaenderanteil -0.086399   0.031952   -2.704 0.007157 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for quasipoisson family taken to be 199.106)

    Null deviance: 344479  on 399  degrees of freedom
Residual deviance:  75654  on 383  degrees of freedom
AIC: NA

Number of Fisher Scoring iterations: 4
1 - modelSevmQpInt$deviance / modelSevmQpInt$null.deviance
[1] 0.7803822

Let’s map the resulting regression coefficient for the share of foreigners.

First we map the results of the interaction between foreigners and each of the three eigenvectors.

mForeignerVec2 <- kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem %>%
  mutate(vec2rurality =  vec2 * Auslaenderanteil) %>%
  tm_shape() + tm_polygons(col=c("vec2rurality"),
                           legend.hist = TRUE, palette="-RdBu", 
                           style = "fisher", n = 6, midpoint = 0, 
                         legend.reverse = TRUE, 
                         title = "Share foreigners moderated by eigenvector 2") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE,  legend.outside.position = "left",
            main.title = "Share foreigners by sev 2") + 
  tm_scale_bar()

mForeignerVec6 <- kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem %>%
  mutate(vec2rurality =  vec6 * Auslaenderanteil) %>%
  tm_shape() + tm_polygons(col=c("vec2rurality"),
                           legend.hist = TRUE, palette="-RdBu", 
                           style = "fisher", n = 6, midpoint = 0, 
                         legend.reverse = TRUE, 
                         title = "Share foreigners moderated by eigenvector 6") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE,  legend.outside.position = "left",
            main.title = "Share foreigners by sev 6") + 
  tm_scale_bar()

mForeignerVec21 <- kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem %>%
  mutate(vec2rurality =  vec21 * Auslaenderanteil) %>%
  tm_shape() + tm_polygons(col=c("vec2rurality"),
                           legend.hist = TRUE, palette="-RdBu", 
                           style = "fisher", n = 6, midpoint = 0, 
                         legend.reverse = TRUE, 
                         title = "Share foreigners moderated by eigenvector 21") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE,  legend.outside.position = "right",
            main.title = "Share foreigners by sev 21") + 
  tm_scale_bar()

tmap_arrange(mForeignerVec2, mForeigner, mForeignerVec6, mForeignerVec21,
             ncol = 2, nrow=2)

We can calculate the resulting regression coefficient for each district.

\[ y = \beta_1 * Auslaenderanteil + \beta_2 * Auslaenderanteil * vec2 + + \beta_3 * Auslaenderanteil * vec6 + \beta_4 * Auslaenderanteil * vec21 + ... = \] \[ = Auslaenderanteil * (\beta_1 + \beta_2 * vec2 + \beta_3 * vec6 + \beta_4 * vec21) + ...\] where \((\beta_1 + \beta_2 * vec2 + \beta_3 * vec6 + \beta_4 * vec21)\) represents the resulting regression coefficient per district.

kreiseWithCovidMeldeWeeklyCleanedPredictorsAddedSvem %>% 
  mutate(coefForeigner = coef(modelSevmQpInt)["Auslaenderanteil"] + 
           vec2*coef(modelSevmQpInt)["vec2:Auslaenderanteil"] + 
           vec6*coef(modelSevmQpInt)["vec6:Auslaenderanteil"] + 
           vec21*coef(modelSevmQpInt)["vec21:Auslaenderanteil"]  ) %>%
  tm_shape() + tm_polygons(col = "coefForeigner",
                           title = "Regression coefficient") +
  tm_layout(legend.outside = TRUE, bg.color = "darkgrey", outer.bg.color = "lightgrey",
            legend.outside.size = 0.5, attr.outside = TRUE,  legend.outside.position = "right",
            main.title = "Spatial varying regression coefficient for share foreigners") + 
  tm_scale_bar()

We see that the share of foreigners is always associated with higher incidence rates but that the strength of this relationship varies in space. This might reflect the different communities formed by foreigners (different education, integration, language,…) as well as how well the different groups of foreigners are addressed by e.g. the health administration.

The effect on incidence rates is largest in Hamburg and adjacen districts, Berlin as well as in Cottbus.

Of course there is much more potential to explore relationships between predictors and spatial eigenvectors and we have not even touched on interactions between “normal” predictors or higher order effects. We could (and should) try other predictor eigenvector combinations as well as interactions between the predictors.


Back to overview

All contents are licensed under GNU General Public License v3.0