Let's get the SNIS RDC health pyramid in R
I have many visions for this blog, but we have to start somewhere and here seems as good a place as any. Today's challenge is to create the SNIS RDC "health pyramid" in R.
For anyone unfamiliar, the "health pyramid" is an affectionate term for the geopolitical organizational structure of the Democratic Republic of Congo's health system. The national Ministry of Health oversees provincial departments, which in turn oversee health zones. Each health zone is subdivided into health areas, the basic geographic unit of health distribution. In general, health areas contain clinics and hospitals, while health zone offices coordinate and monitor health care delivery across health areas within the health zone.
When working on DHIS2, the data collection platform powering the DRC's routine health information system, we referred to it as "the health pyramid." I'm not sure if anyone else called it that, but the name has stuck with me.
Before we begin, it is worth noting that DHIS2 has documentation on how to integrate DHIS2 with R. We won't be using that.
To a programmer, this organizational schema looks like a tree - health facilities are nested in health areas, contained in health zones, contained in provinces, all linked to the root national node. With this in mind, we can use DHIS2's API to try and recreate the pyramid by starting at the root and walking down.
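To make the tree picture concrete, here is a minimal sketch in base R. The table is made up (the ids and names are invented, not real SNIS RDC units), and walk_levels is a hypothetical helper, not part of DHIS2 or any package. Each row points to its parent, and we walk down from the root one level at a time.

```r
# Toy parent table: every unit stores the id of its parent (NA for the root).
# All ids and names here are invented for illustration.
toy <- data.frame(
  id       = c("root", "p1", "p2", "z1", "z2", "z3"),
  name     = c("Country", "Province A", "Province B", "Zone 1", "Zone 2", "Zone 3"),
  parentID = c(NA, "root", "root", "p1", "p1", "p2"),
  stringsAsFactors = FALSE
)

# Walk down from the root: at each step, the next level is every unit whose
# parent sits in the current level.
walk_levels <- function(units) {
  current <- units$id[is.na(units$parentID)]
  levels <- list()
  while (length(current) > 0) {
    levels[[length(levels) + 1]] <- current
    current <- units$id[units$parentID %in% current]
  }
  levels
}

lv <- walk_levels(toy)
```

Here lv is a list with one character vector of ids per level, root first. The real API saves us the walking, since each unit already carries its level, but the parent pointers are what tie the levels together.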
To query the API, we'll be using the httr2 library, which lets us compose requests with tidyverse-like pipe syntax. The snippet below shows a simple request to the API for all the dataElements registered in the database.
library(httr2)
library(dplyr)
library(jsonlite)
# SECRET!
# These are the credentials to the DHIS2 API
dhis2.user <- "MyUserName"
dhis2.pass <- "MyPassword"
# launch a request to the API
resp <- request("https://snisrdc.com/api/dataElements.json?paging=false") %>%
  req_auth_basic(dhis2.user, password=dhis2.pass) %>%
  req_perform() %>%
  resp_body_string() %>%
  jsonlite::fromJSON()
# play around with the returned response!
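Since the credentials are meant to stay secret, one option is to keep them out of the script entirely. The following is a minimal sketch in base R, assuming you have set two environment variables yourself. The names DHIS2_USER and DHIS2_PASS and the helper get_credential are hypothetical, not anything DHIS2 or httr2 requires.

```r
# Read a credential from an environment variable rather than the script itself.
# Fails loudly if the variable is missing or empty.
get_credential <- function(var) {
  value <- Sys.getenv(var, unset = "")
  if (!nzchar(value)) {
    stop(sprintf("Environment variable %s is not set", var))
  }
  value
}

# Usage, after setting the (hypothetical) variables, e.g. in ~/.Renviron:
# dhis2.user <- get_credential("DHIS2_USER")
# dhis2.pass <- get_credential("DHIS2_PASS")
```

The rest of the code can then use dhis2.user and dhis2.pass exactly as before.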
DHIS2 models the SNIS RDC health pyramid using organizational units. Each unit has an associated level and a parent ID. By querying all organizational units at "level 1", we can get the root node. Here's code to do that:
# Compose the URL for level "1". Only grabs the id, name, and parent id for each
# OrgUnit
url <- sprintf("https://snisrdc.com/api/organisationUnits.json?filter=level:eq:%d&paging=false&fields=id,name,parent[id]", 1)
# fire off the request
response <- request(url) %>%
  req_auth_basic(dhis2.user, password=dhis2.pass) %>%
  req_perform() %>%
  resp_body_string() %>%
  jsonlite::fromJSON()
# the response is in JSON format with all items contained in "organisationUnits"
units <- response$organisationUnits
# units now produces:
# name id
# 1 République Démocratique du Congo ymGeqzoPhN3
As shown above, we have gotten the root element and its ID. The parent field we asked for is missing from the output because the root sits at the top level and has no parent. Let's query one level below to confirm:
# Testing out Org Level 2
url <- sprintf("https://snisrdc.com/api/organisationUnits.json?filter=level:eq:%d&paging=false&fields=id,name,parent[id]", 2)
# fire off the request
response <- request(url) %>%
  req_auth_basic(dhis2.user, password=dhis2.pass) %>%
  req_perform() %>%
  resp_body_string() %>%
  jsonlite::fromJSON()
units <- response$organisationUnits
# units now produces:
# name id id
# 1 bu Bas Uele Province rWrCdr321Qu ymGeqzoPhN3
# 2 eq Equateur Province XjeRGfqHMrl ymGeqzoPhN3
# ...
Great! As you can see, we now have each organizational unit, its ID, and a column for the parent's ID. Though not obvious from the printout, the parent's ID is stored in units$parent$id. We can easily abstract this logic away into a function:
# grabs the organisation units at the determined level
getOrgUnitLevel <- function(lvl) {
  url <- sprintf("https://snisrdc.com/api/organisationUnits.json?filter=level:eq:%d&paging=false&fields=id,name,parent[id]", lvl)
  response <- request(url) %>%
    req_auth_basic(dhis2.user, password=dhis2.pass) %>%
    req_perform() %>%
    resp_body_string() %>%
    jsonlite::fromJSON()
  units <- response$organisationUnits
  # flatten the nested parent data frame into a plain parentID column
  units$parentID <- units$parent$id
  units$parent <- NULL
  rm(response)
  return(units)
}
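To see what the parent flattening inside getOrgUnitLevel does without hitting the API, here is a small sketch that parses a hand-written JSON string. The string is my own reconstruction of the response shape, built from the two provinces printed earlier. It is not a captured API response.

```r
library(jsonlite)

# Hand-written JSON mimicking the shape of the level-2 response.
# Names and ids are copied from the printed output above.
sample_json <- '{
  "organisationUnits": [
    {"name": "bu Bas Uele Province", "id": "rWrCdr321Qu", "parent": {"id": "ymGeqzoPhN3"}},
    {"name": "eq Equateur Province", "id": "XjeRGfqHMrl", "parent": {"id": "ymGeqzoPhN3"}}
  ]
}'

units <- fromJSON(sample_json)$organisationUnits

# "parent" arrives as a nested data frame with a single "id" column
str(units$parent)

# Same flattening as in getOrgUnitLevel()
units$parentID <- units$parent$id
units$parent <- NULL
```

After the last two lines, units is an ordinary flat data frame with a parentID column, which is what makes the merges in the next step straightforward.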
Now, for each level, we can simply do a function call! It's easy to see how we can take advantage of this:
# returns a data frame of the entire DHIS2 OrgUnit tree with proper names and
# IDs. Consolidates into an excel-like frame of:
# country -> province -> zone -> area -> facility
getDHIS2OrgUnitTree <- function() {
  country <- getOrgUnitLevel(1)
  provinces <- getOrgUnitLevel(2)
  zones <- getOrgUnitLevel(3)
  areas <- getOrgUnitLevel(4)
  facilities <- getOrgUnitLevel(5)
  # join each level to its children, bottom-up; columns are renamed by position
  combined <- merge(areas, facilities, by.x = "id", by.y = "parentID")
  names(combined) <- c("ha_code", "ha_name", "parentID", "hf_name", "hf_code")
  combined <- merge(zones, combined, by.x = "id", by.y = "parentID")
  names(combined) <- c("hz_code", "hz_name", "parentID", "ha_code", "ha_name", "hf_name", "hf_code")
  combined <- merge(provinces, combined, by.x = "id", by.y = "parentID")
  names(combined) <- c("prov_code", "prov_name", "parentID", "hz_code", "hz_name", "ha_code", "ha_name", "hf_name", "hf_code")
  combined <- merge(country, combined, by.x = "id", by.y = "parentID")
  names(combined) <- c("country_code", "country_name", "prov_code", "prov_name", "hz_code", "hz_name", "ha_code", "ha_name", "hf_name", "hf_code")
  rm(country, provinces, zones, areas, facilities)
  return(combined)
}
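One thing worth knowing about the merges above: merge() does an inner join by default, so a unit with no children drops out of the result. The sketch below shows this on two invented toy tables (the ids and names are made up) and how all.x = TRUE would keep such units. Whether any real health zone lacks areas is something I have not checked.

```r
# Toy stand-ins for two adjacent levels (invented ids and names).
zones <- data.frame(
  name = c("Zone 1", "Zone 2"),
  id = c("z1", "z2"),
  parentID = c("p1", "p1"),
  stringsAsFactors = FALSE
)
areas <- data.frame(
  name = c("Area A", "Area B"),
  id = c("a1", "a2"),
  parentID = c("z1", "z1"),
  stringsAsFactors = FALSE
)

# Default merge is an inner join: Zone 2 has no areas, so it disappears.
inner <- merge(zones, areas, by.x = "id", by.y = "parentID")

# all.x = TRUE keeps every zone, filling the missing area columns with NA.
left <- merge(zones, areas, by.x = "id", by.y = "parentID", all.x = TRUE)

# merge() adds suffixes to clashing column names; inspect them before
# renaming columns by position.
names(inner)
```

Printing names(inner) before each positional rename is a cheap way to confirm the columns are in the order the names() assignment expects.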
With a few HTTP queries from a single function call, we now have access to the entire SNIS RDC pyramid. Huzzah!
# Grab the SNIS RDC Health Pyramid
OrgUnits <- getDHIS2OrgUnitTree()
In a future post, I'll look at how to use this org unit tree to begin creating a time-series database for different data elements. Until then, ciao!