Exercise 1: Creating an overview map of the health system and vaccination coverage#

Background#

Over the past month, health authorities in Chad have reported a surge in measles cases, particularly in Mandoul, Mayo-Kebbi Est, and Logone Oriental regions. The surveillance unit has provided line-list data and existing health facility data. Your first task is to create a base map showing health facility distribution and classify them by service capacity to understand the available response infrastructure.

Available Data#

| Dataset name | Original title | Publisher | Downloaded from |
| --- | --- | --- | --- |
| tcd_admbnda_adm0_20250212_AB.shp (Polygons) | Chad Subnational Administrative Boundaries (level 0: country) | United Nations Office for the Coordination of Humanitarian Affairs (OCHA) | HDX |
| tcd_admbnda_adm1_20250212_AB.shp (Polygons) | Chad Administrative Boundaries (level 1: regions) | United Nations Office for the Coordination of Humanitarian Affairs (OCHA) | HDX |
| tcd_admbnda_adm2_20250212_AB.shp (Polygons) | Chad Administrative Boundaries (level 2: province) | United Nations Office for the Coordination of Humanitarian Affairs (OCHA) | HDX |
| hotosm_tcd_health_facilities_points_gpkg.gpkg (Points) | Chad Health Facilities (OpenStreetMap Export) | Humanitarian OpenStreetMap Team | HOTOSM |
| tcd_roads_ocha.shp (Lines) | Chad - Road Network | United Nations Office for the Coordination of Humanitarian Affairs (OCHA) | HDX |
| tcd_healthsite_capacities.csv | Healthsite Capacities | HeiGIT | This is a fictional dataset generated for the purpose of this exercise. |
| vaccination_coverage_adm2.csv | Measles vaccination coverage | HeiGIT | This is a fictional dataset generated for the purpose of this exercise. |

Note

In this exercise, we will download real datasets from the Humanitarian Data Exchange (HDX) to identify and analyse relevant information. However, the Healthsite Capacities and Vaccination Coverage datasets used here are fictional and created solely for training purposes. They do not represent real-world data.

Download all datasets provided by HeiGIT from the link below, save the ZIP file on your computer, and unzip it.

https://nexus.heigit.org/repository/gis-training-resource-center/public_health/GIS_Training_Public_Health.zip
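If you prefer to script the download instead of using the browser, the following is a minimal sketch using only the Python standard library. The function names `download` and `unzip` are hypothetical helpers written for this illustration; only the URL comes from this exercise.

```python
import urllib.request
import zipfile
from pathlib import Path

# URL of the training data archive as given in this exercise.
ZIP_URL = (
    "https://nexus.heigit.org/repository/gis-training-resource-center/"
    "public_health/GIS_Training_Public_Health.zip"
)


def download(url, target):
    """Download `url` to the file `target` and return the target path."""
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    urllib.request.urlretrieve(url, str(target))
    return target


def unzip(archive, destination):
    """Extract `archive` into `destination` and return the archived names."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)
        return zf.namelist()
```

A typical call would be `unzip(download(ZIP_URL, "downloads/training.zip"), "downloads")`, where the `downloads` folder is only an example location.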

Tasks#

Task 1: Setting up the folder structure and creating a new QGIS project#

Standard Folder Structure

The single most important geodata management practice is to use a standardised folder structure that contains all parts of the QGIS project. We will save all of our data that we use or create within our QGIS project inside of this folder structure. The paths from a QGIS project to the geodata are by default relative. This means when the data and the project are in a fixed folder structure, you can move the whole structure without impacting the QGIS project or the paths to the data.

A standard folder structure has two principal advantages:

  1. If we share the whole project folder, we can expect the project to run without problems on a different computer.

  2. The folder structure supports the proper organisation of project data and helps ensure the QGIS project will work as intended.
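To illustrate why relative paths survive moving the whole folder, here is a small sketch in plain Python. It builds the folder layout used in this exercise inside a temporary directory, computes the relative path from a project file to a data file, moves the whole structure, and checks that the relative path still resolves. The file names `example.qgz` and `example.csv` are placeholders invented for this illustration, and the sketch does not involve QGIS itself.

```python
import os
import shutil
import tempfile
from pathlib import Path

# Sub-folders of the standard folder structure used in this exercise.
SUBFOLDERS = ["project", "results", "styles",
              "data/input", "data/interim", "data/output"]


def create_structure(root):
    """Create the standard sub-folders below `root`."""
    for sub in SUBFOLDERS:
        (Path(root) / sub).mkdir(parents=True, exist_ok=True)


def relative_data_path(project_file, data_file):
    """Return the path from the project file's folder to the data file."""
    return Path(os.path.relpath(data_file, start=Path(project_file).parent))


base = Path(tempfile.mkdtemp())
old_root = base / "GIS_Training_Public_Health"
create_structure(old_root)

# Placeholder files standing in for a QGIS project and a dataset.
project_file = old_root / "project" / "example.qgz"
data_file = old_root / "data" / "input" / "example.csv"
project_file.touch()
data_file.write_text("a,b\n1,2\n")

rel = relative_data_path(project_file, data_file)

# Move the whole structure to a different location.
new_root = base / "moved" / "GIS_Training_Public_Health"
new_root.parent.mkdir()
shutil.move(str(old_root), str(new_root))

# The same relative path still points at the data file after the move.
still_found = (new_root / "project" / rel).exists()
print(rel.as_posix(), still_found)
```

The relative path only depends on how the project file and the data sit relative to each other, which is why the structure has to be moved as a whole.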

  1. Create a new folder on your computer with the name “GIS_Training_Public_Health_Day_1-2”. In the folder create the following folder structure:

GIS_Training_Public_Health
├── project
├── results
├── styles
└── data
    ├── input
    ├── interim
    └── output
  2. Open QGIS and create a new project.

  3. Save the project via Project → Save As.... Navigate to the folder for this training and save it in the /project subfolder. Give it a name (e.g., GIS_Training_Public_Health_Part_1) and click Save.

  4. Now we should set up the Project CRS.

    • In the bottom right corner of the QGIS window, click on the Projection icon. Let’s choose a metric CRS that depicts Chad without too much distortion. For this exercise, we will use “Africa Albers Equal Area Conic” (ESRI:102022). In the Filter bar, enter the name or the code 102022. The CRS should appear in the “Predefined Coordinate Reference Systems” box. Select it, then click Apply and OK.

    ../../_images/en_3.40_m3_ex_8_pub_health_1_project_crs.png

    Setting the Project CRS in QGIS#

Attention

The project CRS determines which coordinate reference system is being used to display the geodata on the QGIS map canvas. However, it does not change the CRS of layers. Each layer, or dataset, is encoded with a CRS. QGIS reprojects these layers “on the fly” to display layers with different CRS on the map canvas. This does not change the units of measurements or distortion of the actual layers. To perform distance calculations, you will need to reproject the layer to a metric coordinate reference system.

Setting the Project CRS to your desired CRS can help you choose the correct CRS more quickly when running algorithms.
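To see why a layer stored in degrees is unsuitable for distance calculations, consider how long one degree of longitude is on the ground. The sketch below assumes a spherical Earth with a mean radius of 6,371 km and uses 8°N and 23°N as rough latitudes for the south and north of Chad; both are simplifying assumptions made for this illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def metres_per_degree_longitude(latitude_deg):
    """Ground length of one degree of longitude at the given latitude."""
    return (math.radians(1) * EARTH_RADIUS_M
            * math.cos(math.radians(latitude_deg)))


south = metres_per_degree_longitude(8.0)   # approximate southern Chad
north = metres_per_degree_longitude(23.0)  # approximate northern Chad

print(f"1 degree of longitude at 8 N:  {south / 1000:.1f} km")
print(f"1 degree of longitude at 23 N: {north / 1000:.1f} km")
```

Under these assumptions, one degree corresponds to roughly 110 km in the south but only about 102 km in the north, so a distance expressed in degrees has no fixed meaning in metres.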

Task 2: Downloading the relevant data#

  1. In your browser, head over to humdata.org

  2. Search for the following datasets

    • Chad Administrative Boundaries (OCHA): ADM0, ADM1, ADM2

    • Chad Health Facilities (OpenStreetMap Export)

    • Chad Roads (OCHA)

  3. Download the layers.

    • On the download page, you can usually select different data formats. The formats are indicated by their file extensions (e.g., .shp, .gpkg, .gdb).

    • Sometimes the data is still zipped, so the file extension isn’t visible.

    • Choose the following formats:

      • Chad Administrative Boundaries (OCHA): Shapefile

      • Chad Health Facilities (OpenStreetMap Export): GeoPackage (points only; we don’t need the polygon data for this exercise)

      • Chad Roads (OCHA): Shapefile

      ../../_images/en_m3_ex_8_public_health_part_1_hdx_data_formats.png
  4. Unzip the downloaded folders and save them in the data/input/ folder of the standard folder structure.

Task 3: Importing the datasets#

  1. In your QGIS project, import the following datasets via drag-and-drop:

    • tcd_admbnda_adm0_20250212_AB.shp

    • tcd_admbnda_adm1_20250212_AB.shp

    • tcd_admbnda_adm2_20250212_AB.shp

    • hotosm_tcd_health_facilities_points_gpkg.gpkg

    • tcd_trs_roads_OCHA.shp

Video: Importing Shapefiles into a QGIS project via drag-and-drop

Attention

Imported files are not saved within the QGIS project. If you move or delete the original file, QGIS will no longer find the respective dataset.

Task 4: The layers panel and the layer concept#

  1. Once we’ve imported all the relevant layers, let’s arrange them logically so we can work with them more easily. On the left is the Layers panel. Here you can see all the datasets we’ve imported so far.

    • QGIS displays geodata in layers, where each dataset is represented in one layer. The layers are stacked on top of each other.

    • Arrange the layers in the following order:

      • The ADM0-layer should go at the bottom, followed by ADM1, then ADM2.

      • Then we can add the road network.

      • The healthsites should go on top.

  2. Let’s add a basemap:

    • In the Browser panel, scroll down until you see XYZ Tiles.

    • Expand it and double-click on OpenStreetMap. A new layer will be added to your layers panel. Make sure the layer sits at the bottom of the layers panel so that all your other layers remain visible.

Your QGIS window should look similar to this (with different colours for the layers).

../../_images/en_3.40_m3_ex_8_pub_health_1_ordering_layers.png
  3. Let’s investigate the layers that we have added so far. Each vector layer has an attribute table, where each row represents a geometric feature on the map canvas.

    • Open the attribute table by right-clicking on the ADM2 layer in the layers panel on the left → Open Attribute Table.

    • A new window will open. This is the attribute table. It shows the vector layer in a tabular format, allowing you to see the attribute values, sort the table, and edit the values using the tools in the top bar.

    • Take a look at the different columns in the attribute table. What do they show?

    • Try sorting the attribute table by clicking on the column headers.

    • Open the attribute tables for the layers hotosm_tcd_health_facilities_points_gpkg and tcd_admbnda_adm2_20250212_AB.shp and familiarise yourself with the data.

    • Right-click on each layer and select

Task 5: Joining Vaccination Coverage Data with administrative boundaries#

In our data/input folder, we can find a CSV file called vaccination_coverage_adm2. This file includes the vaccination coverage for both the mcv1 and mcv2 vaccines. Thankfully, the dataset includes the district name (amd2_name) and the adm2 pcode. With this information, we can perform a non-spatial join to add the vaccination coverage data to our district boundaries layer (adm2).

Attention

Admin Pcodes are well suited for non-spatial joins in QGIS because they provide unique, standardized identifiers that avoid name mismatches and ensure accurate, reliable data linking.

  1. Import the vaccination_coverage_adm2 into your QGIS project:

    • In the top bar, navigate to Layer → Add Layer → Add Delimited Text Layer...

    • To the right of the File name-field, click on the three dots, navigate to the data/input/vaccination_coverage_adm2.csv file and click Open.

    • In the import window, you will see sample data in the sample data field. Take a look at the columns and data available. What kind of data is present in each column?

    • This table contains no coordinate columns, so it cannot be loaded as a spatial layer. Under Geometry Definition select No geometry (attribute only table).

    • Click Add. The layer will appear in your layers panel as a data table, but will not be shown in the map canvas.

    ../../_images/en_3.40_m3_ex_8_pub_health_1_add_vacc_coverage_csv.png
  2. Investigate the new vaccination coverage table further:

    • Right-click on the new layer and open the attribute table. What information is available? How is the table structured? We can see that the pcode column (adm2_pcode in this table, ADM2_PCODE in the boundaries layer) can be used to perform a non-spatial join.

  3. In the processing toolbox on the right, search for the tool “Join attributes by field value” and double-click on it.

    • A new window will open. Here we can specify the parameters for the Join attributes by field value-tool.

    • As “Input layer”, select the layer tcd_admbnda_adm2_20250212_AB.

    • Under “Table field”, select ADM2_PCODE.

    • As “Input layer 2”, select vaccination_coverage_adm2.

    • Under “Table field 2”, select adm2_pcode.

    • Under “Layer 2 fields to copy”, we can select which columns we want to copy. Click on the three dots to the right of the field and select vaccination_rate_mcv1 and vaccination_rate_mcv2. Then, click OK.

    • Finally, to execute the algorithm, click on Run.

    ../../_images/en_3.40_m3_ex_8_pub_health_1_join_attr_vaccine_coverage.png

A new layer called “Joined Layer” will appear in the layers panel. To the right of it, you will see a symbol. This symbol indicates that the layer is a temporary scratch layer. This means it will be deleted once you close your QGIS project, even if you save the project.

  4. We can save the scratch layer by right-clicking on it and selecting Make Permanent....

    • A new window will open. Here we need to specify the file location and the layer name.

    • Leave the Format on “GeoPackage”.

    • Click on the three dots, navigate to the data/interim/ folder and enter a file name such as tcd_adm2_vacc_coverage. Click Save.

    • Enter the same name into the Layer name field (this will be the name of the layer in the layers panel).

    • Leave the rest as it is and click OK.

Great! We have added the information on vaccination coverage to our adm2-layer. Now, we can visualise the information by adding a graduated symbology to the layer.
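The logic of a non-spatial join by key can be sketched in plain Python. This is only an illustration of the idea, not the QGIS implementation: the column names ADM2_PCODE, adm2_pcode, vaccination_rate_mcv1 and vaccination_rate_mcv2 come from this exercise, while the pcodes, labels, and coverage values below are invented sample data, and `join_by_key` is a hypothetical helper.

```python
import csv
import io

# Invented sample rows standing in for the adm2 boundaries attribute table.
boundaries = [
    {"ADM2_PCODE": "XX0101", "label": "District A"},
    {"ADM2_PCODE": "XX0102", "label": "District B"},
    {"ADM2_PCODE": "XX0103", "label": "District C"},
]

# Invented sample content standing in for vaccination_coverage_adm2.csv.
COVERAGE_CSV = """adm2_pcode,vaccination_rate_mcv1,vaccination_rate_mcv2
XX0101,81.5,64.0
XX0102,58.2,41.7
"""


def join_by_key(features, table_rows, key_left, key_right, fields):
    """Copy `fields` from the first matching table row onto each feature.

    Features without a match are kept and receive None for the new fields.
    """
    lookup = {}
    for row in table_rows:
        lookup.setdefault(row[key_right], row)  # keep the first match only
    joined = []
    for feature in features:
        match = lookup.get(feature[key_left])
        extra = {f: (match[f] if match else None) for f in fields}
        joined.append({**feature, **extra})
    return joined


coverage_rows = list(csv.DictReader(io.StringIO(COVERAGE_CSV)))
joined = join_by_key(
    boundaries, coverage_rows, "ADM2_PCODE", "adm2_pcode",
    ["vaccination_rate_mcv1", "vaccination_rate_mcv2"],
)
for row in joined:
    print(row)
```

In this sample, District C has no entry in the coverage table and therefore ends up with empty values, which is the situation to look out for when checking a join result.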

Task 6: Visualising the vaccination coverage#

Saving your progress

Remember to save your project regularly to keep your progress, either by clicking the Save Project button or by pressing Ctrl+S. QGIS is constantly being developed by the open source community and is known to crash from time to time.

Now that we have the vaccination coverage information in our adm2-layer, we can visualise the information in order to understand the spatial distribution of the vaccination coverage.

  1. Open the symbology tab via the Properties window for the layer:

    • Right-click on the tcd_adm2_vacc_coverage-layer → Properties.

    • Navigate to “Symbology” in the tab section on the left.

    • Here we can change the symbology method from Single Symbol to Graduated.

    • Next, we need to select the value which will be used for the classification. Under Value, select the column vaccination_rate_mcv1. For a first assessment of the vaccination coverage, set the Mode to Equal Interval and the number of Classes to 5, then click on Classify.

../../_images/en_3.40_m3_ex_8_pub_health_1_vacc_coverage_map.png

Screenshot of classified vaccination_rate_mcv1 variable#
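The Equal Interval mode divides the range between the smallest and largest value into classes of equal width. The following sketch shows the arithmetic on a handful of invented coverage values; the function names are hypothetical, and the handling of values that fall exactly on a class boundary is an assumption of this sketch rather than a statement about QGIS.

```python
def equal_interval_breaks(values, n_classes):
    """Return n_classes + 1 break values spanning min(values)..max(values)."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_classes
    return [lowest + i * width for i in range(n_classes + 1)]


def class_index(value, breaks):
    """Return the index of the first class whose upper bound covers value."""
    for i, upper in enumerate(breaks[1:]):
        if value <= upper:
            return i
    return len(breaks) - 2  # fallback for rounding effects at the maximum


# Invented sample coverage rates in percent.
rates = [40.0, 55.0, 62.5, 71.0, 90.0]
breaks = equal_interval_breaks(rates, 5)
classes = [class_index(rate, breaks) for rate in rates]

print(breaks)
print(classes)
```

Because the class width depends only on the minimum and maximum, a single extreme value can leave some classes empty, which is worth keeping in mind when interpreting the map.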

Task 7: Enriching the Healthsites dataset#

In this step, we want to enrich the layer containing the healthsites with additional data on the capacity of the healthsites. The file tcd_healthsite_capacities.csv contains information about the bed capacity in the pediatric care unit as well as the cold chain capacity. This information is valuable for assessing the capacity of the health sector to treat acute measles cases and to coordinate a vaccination campaign.

Gathering the information on capacities

In a realistic scenario, this data might have been collected during a rapid facility assessment led by the Ministry of Health and Red Cross volunteers. Because data collection was decentralised and partially paper-based, some facility names differ slightly across datasets (e.g., spelling variants, abbreviations). When performing joins, pay attention to such inconsistencies.

  1. Let’s import the tcd_healthsite_capacities.csv into your QGIS project:

    • In the top bar, navigate to Layer → Add Layer → Add Delimited Text Layer...

    • To the right of the File name-field, click on the three dots, navigate to the data/input/tcd_healthsite_capacities.csv file and click Open.

    • In the import window, you will see sample data in the sample data field. Take a look at the columns and data available. What kind of data is present in each column?

    • Unfortunately, there are no coordinates for the individual healthsites in this data table. Under Geometry Definition select No geometry (attribute only table).

    • Click Add. The layer will appear in your layers panel as a data table, but will not be shown in the map canvas.

  2. Let’s investigate the capacities table further.

    • Right-click on the layer and open the attribute table.

    • In the top bar, you can see how many entries the dataset contains (148 features).

    • The data table includes a column called name, which contains the names of the health facilities. These are the same names that are stored in the healthsites point layer we imported earlier.

    • This means that we can join both tables using the attribute values of the name-column.

  3. In the processing toolbox, search for the tool Join attributes by field value and open it by double-clicking on it.

    • A new window will open. Here we can specify the parameters for the Join attributes by field value-tool.

    • As “Input layer”, select the layer hotosm_tcd_health_facilities_points_gpkg.

    • Under “Table field”, select name.

    • As “Input layer 2”, select tcd_healthsite_capacities.

    • Under “Table field 2”, select name.

    • Under “Layer 2 fields to copy”, we can select which columns we want to copy. Click on the three dots to the right of the field and select cold_chain, measles_vaccination, measles_treatment, beds_total, pediatric_beds, staff_total, and remarks. Then, click OK.

    • Finally, to execute the algorithm, click on Run.

    ../../_images/en_3.40_m3_ex_8_pub_health_1_join_attr_by_field_value.png

    Note

    After running the algorithm, the window will switch to the Log window. Here you can see if the algorithm encountered any problems. In our case, we can see that 149 features were successfully joined, while 183 features could not be joined. This happens when the identifying value (table field) is missing from the corresponding column in layer 2. It may occur because the data is unavailable or because of inconsistencies in the identifying value, such as typos or different spellings.

    • After reviewing the Log, we can close the tool-window. A new layer called Joined layer should appear in your layers panel. Rename it to healthsites_points_capacities and move it to the top.

We now have a new point layer with the capacities of relevant healthsites. With this information, we can create a map showing the capacities of the health sector.
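When a join on names leaves many features unmatched, it can help to check whether spelling variants are the cause. The sketch below compares exact matching with matching on normalised names (lower-cased, accents removed, whitespace collapsed). The facility names are invented, and `normalise` and `match_report` are hypothetical helpers; such a normalisation would be a preprocessing step of your own and is not part of the QGIS join tool used above.

```python
import unicodedata


def normalise(name):
    """Lower-case a name, strip accents, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.casefold().split())


def match_report(left_names, right_names, key=lambda s: s):
    """Split left_names into those with and without a partner on the right."""
    right_keys = {key(name) for name in right_names}
    matched = [n for n in left_names if key(n) in right_keys]
    unmatched = [n for n in left_names if key(n) not in right_keys]
    return matched, unmatched


# Invented names standing in for the two `name` columns.
facility_names = ["Hôpital de District", "Centre de Santé  Nord", "Clinique Sud"]
capacity_names = ["hopital de district", "Centre de Sante Nord", "Clinique Est"]

exact_matched, exact_unmatched = match_report(facility_names, capacity_names)
norm_matched, norm_unmatched = match_report(
    facility_names, capacity_names, key=normalise
)

print("exact:", len(exact_matched), "matched,", len(exact_unmatched), "unmatched")
print("normalised:", len(norm_matched), "matched,", len(norm_unmatched), "unmatched")
```

Names that remain unmatched after normalisation, such as abbreviations or genuinely different facilities, still need to be reviewed by hand.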

Task 8: Cleaning the Healthsite Data#

  1. Let’s take a look at the new layer we just created including the healthsite capacities by opening the attribute table.

    • Right-Click on the layer and open the attribute table.

    • In the attribute table, if you scroll to the right, you will see the new columns with the information we added using the “join attributes by field value”-tool.

    • Sort the attribute table by the new columns. As you can see, not every feature has information about the capacity.

    • We can remove the healthsites without additional information, as they are still available in the original dataset.

    • Sort the beds_total column in ascending order, so the features with “NULL” values appear at the top.

    • Click on the row number (on the left) of the first feature to select it. A selected feature is highlighted in blue.

    • Now scroll down until you see the first feature with a value other than “NULL” in the beds_total column.

    • Hold Shift and click on the row number of the last feature with the value NULL.

    • In the toolbar of the attribute table, click on the Toggle Editing Mode-button to enter the editing mode for the attribute table.

    • Next, click on Delete selected features to delete the points with no capacity information.

    • Click on Save Edits to save your changes, then click on Toggle Editing Mode again to exit the editing mode.

    • Save the cleaned healthsite capacity layer by right-clicking on it and selecting Make Permanent.... Select “GeoPackage” as the output format, navigate to the data/interim/ folder and enter a file name such as tcd_healthsites_points_capacities. Click Save.

Our new healthsites point layer now includes only the healthsites for which we received additional data.
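The cleaning step above boils down to sorting by beds_total so that empty values come first and then dropping every feature without a value. A minimal sketch of that logic in plain Python follows; the facility names and bed counts are invented, None stands in for NULL, and both function names are hypothetical.

```python
def sort_nulls_first(features, field="beds_total"):
    """Sort features by `field` in ascending order with None values on top."""
    return sorted(
        features,
        key=lambda f: (f[field] is not None,
                       f[field] if f[field] is not None else 0),
    )


def drop_without_capacity(features, field="beds_total"):
    """Keep only the features that have a value in `field`."""
    return [f for f in features if f.get(field) is not None]


# Invented sample features; None represents NULL in the attribute table.
healthsites = [
    {"name": "Facility A", "beds_total": 20},
    {"name": "Facility B", "beds_total": None},
    {"name": "Facility C", "beds_total": 5},
    {"name": "Facility D", "beds_total": None},
]

sorted_sites = sort_nulls_first(healthsites)
cleaned = drop_without_capacity(healthsites)

print([f["name"] for f in sorted_sites])
print([f["name"] for f in cleaned])
```

Note that the filter works on a copy of the list, which mirrors the advice to keep the original healthsites layer untouched.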

Task 9: Classifying the Healthsites#

  1. Now, we can classify the healthsites points to indicate which healthsites have a cold chain in order to store measles vaccines.

    • Right-click on the tcd_healthsites_points_capacities layer and select Properties. A new window will open.

    • On the left, navigate to the Symbology-tab.

    ../../_images/en_3.40_m3_ex_8_pub_health_1_classifying_healthpoint_capacity.png
    • Instead of Single Symbol, we will now select Categorised as the visualisation method.

    • As “Value”, select cold_chain.

    • Next, click on Classify.

    • If you want, you can adjust the symbology for the classes by double clicking on one.

    • As we are not interested in cold_chain values which equal NULL, we can remove this classification entry by selecting it and clicking on the red minus right next to the Classify button.

    • Click Apply, then close the properties window.

    • Analyze how healthsites with a true cold_chain value are distributed.
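A categorised symbology groups features by the distinct values of one field. The sketch below reproduces that grouping for the cold_chain field and skips empty values, similar to removing the NULL class in the steps above. The sample features are invented, the use of Python booleans for cold_chain is an assumption about how the values are stored, and `categorise` is a hypothetical helper.

```python
from collections import defaultdict


def categorise(features, field="cold_chain"):
    """Group feature names by the value of `field`, ignoring None values."""
    groups = defaultdict(list)
    for feature in features:
        value = feature.get(field)
        if value is None:
            continue  # corresponds to removing the NULL class
        groups[value].append(feature["name"])
    return dict(groups)


# Invented sample features.
sites = [
    {"name": "Facility A", "cold_chain": True},
    {"name": "Facility B", "cold_chain": False},
    {"name": "Facility C", "cold_chain": True},
    {"name": "Facility D", "cold_chain": None},
]

groups = categorise(sites)
for value, names in groups.items():
    print(value, len(names), names)
```

Counting the members of each class in this way gives a quick numerical check alongside the visual impression from the map.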