Background
Cancer during childhood is a life-threatening disease that develops rapidly and affects about 20000 patients in the EU every year. Since childhood cancer requires early and intensive treatment, fewer deaths from the disease (mortality) may reflect the overall impact of factors contributing to favourable outcomes. Such factors may be health expenditures, childhood cancer awareness, access to care, capacities, and availability of effective anti-cancer medicines.
Over the recent decades following the seminal demonstration that childhood cancer can successfully be treated, survival outcomes improved and conversely mortality decreased. For example, mortality descreased from 70-80 per million person years in 1962 to about 20 in 2002 cancer during childhood (birth to less than 15 years) in the US and certain countries in Europe .
Methods
The purpose of this post is to continue the analysis with the latest available data. The WHO mortality database (MDB) is used for this analysis. Data from 14 Member States of the European Union could be included. The mortality figures are presented across sexes and countries. Data for children from birth to 19 years of age were included. These age-standardised figures were averaged across countries weighted by the inverse of the standard deviation of a country"s figures across years. For commenting, the R code is included.
Acknowledgement
The WHO is responsible only for the provision of the original information (the MDB) and is acknowledged as the source of the published data. However, the author of this blog is responsible for all analyses, interpretations and conclusions using the published data.
Results
Fewer children die from cancer every year, and this trend seems slow yet continuous in Europe.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 | ### Initialise # load libraries library(ggplot2) ### Load new data # download from WHO web pages # - old: # http://www.who.int/healthinfo/statistics/mortality_rawdata/en/ # - current: # https://www.who.int/data/data-collection-tools/who-mortality-database if (!is.null(curl::nslookup("r-project.org", error = FALSE))) { who.webpage <- httr::GET(url = "https://www.who.int/data/data-collection-tools/who-mortality-database") if (httr::status_code(who.webpage) != 200) stop("Could not retrieve WHO web page.") who.webpage <- httr::content(who.webpage, "text") who.webpage <- gsub("\n", " ", who.webpage) # create dataframe with urls and update Sys.setlocale(category = "LC_ALL", locale = "C") who.links <- stringr::str_extract_all(who.webpage,"<div *>.*?Last update: (.*)<.*?href=\"(.*?)\".*")[[1]] who.links <- data.frame(date = sub(".*Last update: (.*?)<.*", "\\1", who.links), url = sub(".*href=\"(.*?)\".*", "\\1", who.links), stringsAsFactors = FALSE) who.links$date <- strptime(who.links$date, format = "%d %b %Y") # download files that need downloading and process data # files are zipped, names may not correspond to url if (!file.exists("last_download_marker") || (file.info("last_download_marker")$mtime < max(who.links$date))) { ## obtain data # do download, unzip, remove for (i in who.links$url) { if (!grepl("pop|country|icd10", i)) next tmp <- sub(".*/(.+zip)", "\\1", i) download.file(i, destfile = tmp) unzip(tmp, exdir = "./inputdata/", setTimes = TRUE) unlink(tmp) } # touch file to record last download file.create("last_download_marker") ## process data # create base data set who.population <- data.table::fread("inputdata/pop") who.countries <- data.table::fread("inputdata/country_codes") who.mortality <- NULL for (i in dir("./inputdata", "*icd10*", full.names = TRUE)) { tmp <- data.table::fread(i) who.mortality <- rbind(who.mortality, tmp) rm(tmp) } rm(list = c("who.mortality1", "who.mortality2")) save.image(file = "data-mortality.Data") } # update available # clean up rm(list = c("who.webpage", "who.links")) } # online ### Prepare data # load previously saved work load("data-mortality.Data") # https://en.wikipedia.org/wiki/European_Union # all countries with 10 million or more except # Romania, Czech republic due to registratin uncertainties countries.selected <- c("Germany", "France", "United Kingdom", "Italy", "Spain", "Poland", "Netherlands", "Belgium", "Ireland", "Austria", "Denmark", "Finland", "Sweden", "Norway") # "Portugal" excluded, different age format # select interesting countries and change into vector countries <- subset(who.countries, subset = name %in% countries.selected) # coding data by country if (FALSE) table(who.mortality$Country, who.mortality$List) # merge with mortality database and keep only selected countries who.mortality <- merge(who.mortality, countries, by.x = "Country", by.y = "country", all.x = FALSE) # merge with population database who.population <- merge(who.population, countries, by.x = "Country", by.y = "country", all.x = FALSE) # clean to avoid confusion rm(list = c("countries.selected")) ## WHO data definitions and subset selection # Q. How are the number of deaths and population presented? # A. They are all shown as absolute numbers. # ICD 10 Mortality Tabulation List 1 # code: 1026 = C00-D48 = Neoplasms # Mortality values were adjusted by the age distribution of the # European standard population11 to obtain European standardised rates. # 11 Waterhouse J, Muir C, Correa P, Powel J, eds. Cancer incidence in # five continents. Vol III. IARC scientific publication number 15. WHO 1976. # Data for the European countries are combined from Ireland, Netherlands, UK. # All data are from the UN6 7 and WHO. # Age-standardised mortality measures were weighted by their # country-specific and sex-specific standard errors. # Lists: # # 10M = ICD10 3 and 4 (detailed) character list # 101 = ICD10 Mortality Tabulation List 1 (condensed) # 103 = ICD10 3 (detailed) character list # 104 = ICD10 4 (detailed) character list # if (FALSE) table(who.mortality$name, who.mortality$List) # 103 104 10M # Austria 0 36227 0 # Belgium 2928 49186 0 # Denmark 0 54366 0 # Finland 0 45140 0 # France 0 71874 0 # Germany 0 107932 0 # Greece 0 2153 0 # Ireland 0 15683 0 # Italy 0 58619 0 # Netherlands 0 50545 12914 # Norway 0 40387 0 # Poland 0 72686 0 # Spain 0 78631 0 # Sweden 0 43618 5200 # United Kingdom 0 72534 0 # if (FALSE) table(who.mortality$name, who.mortality$Year) # 1994 1995 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 # Austria 0 0 0 0 0 0 0 0 2378 2258 2318 2434 2397 2371 # Belgium 0 0 0 0 1514 1414 3331 3221 3282 3304 3178 3251 3294 3325 # Denmark 2368 2366 2363 2338 2337 2407 2327 2269 2521 2399 2325 2355 2329 2809 # Finland 0 0 2287 2272 2361 2313 2347 2276 2249 2288 2267 2262 2249 2270 # France 0 0 0 0 0 0 4888 4900 4880 4963 4928 4862 4917 4885 # Germany 0 0 0 0 6098 5948 5982 5916 5805 5875 5942 5901 6025 6003 # Greece 0 0 0 0 0 0 0 0 0 0 0 0 0 0 # Ireland 0 0 0 0 0 0 0 0 0 0 0 0 0 2021 # Italy 0 0 0 0 0 0 0 0 0 5018 4909 4824 4845 4838 # Netherlands 0 0 3319 3193 3178 3224 3255 3321 3371 3278 3303 3287 3266 3175 # Norway 0 0 2219 2268 2168 2206 2196 2196 2187 2172 2000 1951 1885 1741 # Poland 0 0 0 0 0 4649 4585 4519 4425 4415 4320 4295 4268 4345 # Spain 0 0 0 0 0 4555 4563 4617 4607 4663 4662 4540 4477 4501 # Sweden 0 0 0 2592 2608 2579 2575 2509 2574 2536 2607 2545 2548 2590 # United Kingdom 0 0 0 0 0 0 0 4951 4886 4914 4900 4889 4861 4779 # if (FALSE) table(who.mortality$Year, who.mortality$List) # 103 104 10M # 1994 0 2368 0 # 1995 0 2366 0 # 1996 0 6869 3319 # 1997 0 6878 5785 # 1998 1514 12964 5786 # 1999 1414 24657 3224 # 2000 0 36049 0 # 2001 0 40695 0 # 2002 0 43165 0 # 2003 0 48083 0 # 2004 0 47659 0 # 2005 0 47396 0 # # select 104 for which data are most complete who.mortality <- subset(who.mortality, subset = who.mortality$List == "104") # Cause: # # Cause of death – For details consult Part 2 below or ICD publications # see http://apps.who.int/classifications/icd10/browse/2016/en # extract only the following from all mortalities: # C00-C97 Malignant neoplasms if (FALSE) table(who.mortality$Cause) who.mortality <- subset(who.mortality, subset = grepl("^C[0-9]+$", Cause)) # Frmat: # # Age-group format for breakdown of deaths at 0-95+ yrs # see Annex Table 1 = Frmat: 00 is chosen if (FALSE) table(who.mortality$name, who.mortality$Frmat) # 0 1 2 # Austria 5845 0 0 # Belgium 6710 0 0 # Denmark 7779 772 780 # Finland 10090 0 0 # France 9072 0 0 # Germany 13071 0 0 # Greece 368 0 0 # Ireland 2536 0 0 # Italy 7024 0 0 # Netherlands 6602 0 0 # Norway 7066 0 0 # Poland 11453 0 0 # Spain 10196 0 0 # Sweden 6788 0 0 # United Kingdom 8081 0 0 # Age: # # Deaths1 Deaths at all ages # Deaths2 Deaths at age 0 year # Deaths3 Deaths at age 1 year # Deaths4 Deaths at age 2 years # Deaths5 Deaths at age 3 years # Deaths6 Deaths at age 4 years # Deaths7 Deaths at age 5-9 years # Deaths8 Deaths at age 10-14 years # Deaths9 Deaths at age 15-19 years # Deaths10 Deaths at age 20-24 years # Deaths11 Deaths at age 25-29 years # Deaths12 Deaths at age 30-34 years # Deaths13 Deaths at age 35-39 years # Deaths14 Deaths at age 40-44 years # Deaths15 Deaths at age 45-49 years # Deaths16 Deaths at age 50-54 years # Deaths17 Deaths at age 55-59 years # Deaths18 Deaths at age 60-64 years # Deaths19 Deaths at age 65-69 years # Deaths20 Deaths at age 70-74 years # Deaths21 Deaths at age 75-79 years # Deaths22 Deaths at age 80-84 years # Deaths23 Deaths at age 85-89 years # Deaths24 Deaths at age 90-94 years # Deaths25 Deaths at age 95 years and above # Deaths26 Deaths at age unspecified ### Calculations # sum up cancer mortality in age groups by: country, year; across: Cause, Sex cancer.deaths <- aggregate(subset(who.mortality, select = grep("^Deaths.*$", names(who.mortality))), by = list("country" = who.mortality$name, "year" = who.mortality$Year), sum) # select WHO population data # table(who.population$Country, who.population$Frmat) # sum up population by: country, year; across: sex (sic) cancer.population <- aggregate(subset(who.population, select = grep("^Pop.*", names(who.population))), by = list("country" = who.population$name, "year" = who.population$Year, "format" = who.population$Frmat), sum) # check if more than one format (ie row) per country and year? no. if(FALSE) xtabs( ~ year + country, data = cancer.population) # merge for calculation of age(group)-specific empiric mortality rate mortality.rates <- merge(cancer.deaths, cancer.population) rm(list = c("cancer.deaths", "cancer.population")) # identify relevant columns tmp.dea <- grep("^Dea.*", names(mortality.rates)) tmp.pop <- grep("^Pop.*", names(mortality.rates)) if (length(tmp.dea) != length(tmp.pop)) stop("Something went wrong: different numbers of columns for deaths and populations.") # all columns with paediatric groups (Deaths2 ... Deaths9 = 15-19 years!) tmp.dea <- grep("^Deaths[2-9]$", names(mortality.rates)) tmp.pop <- grep("^Pop[2-9]$", names(mortality.rates)) ## For weighting calculate # # "Age-standardised mortality measures were weighted # by their country-specific and sex-specific standard errors." # # - calculate sd by age group, by country, across years # - use 1 / sd(sex, country) as weight when averaging mortality rates per year # 1. mortality by age group (age-standardised mortality) by country mortality.calc <- mortality.rates[tmp.dea] / mortality.rates[tmp.pop] * 10^6 names(mortality.calc) <- paste0("Mort", sub("Deaths", "", names(mortality.rates[tmp.dea]))) mortality.rates <- cbind(mortality.rates, mortality.calc) rm("mortality.calc") # if (FALSE) head(mortality.rates) # country year Deaths1 Deaths1 Deaths2 ... Pop1 ... Mort9 # 1 Austria 2002 18623 2 2 29.13001 # 2 Austria 2003 19231 2 2 35.30941 # 2. mortality across all age groups # # where an entry is NA for deaths or population numbers, # make corresponding death or population number also NA # to ensure summing up only entries with corresponding numbers if (FALSE) mortality.rates[36, c(1, 2, 5, 6, 7, 32, 33, 34)] if (FALSE) mortality.rates[67, c(1, 2, 5, 6, 7, 32, 33, 34)] # country year Deaths3 Deaths4 Deaths5 Pop3 Pop4 Pop5 # 36 Denmark 1999 9 NA NA 66379 68906 69456 # country year Deaths3 Deaths4 Deaths5 Pop3 Pop4 Pop5 # 120 Italy 2011 15 15 15 2229374 NA NA # # multiplication with NA will result in NA mortality.rates[tmp.pop][is.na(mortality.rates[tmp.dea])] <- NA mortality.rates[tmp.dea][is.na(mortality.rates[tmp.pop])] <- NA # # sum up mortality.rates$DeathsPaed <- rowSums(mortality.rates[tmp.dea], na.rm = TRUE) mortality.rates$PopPaed <- rowSums(mortality.rates[tmp.pop], na.rm = TRUE) mortality.rates$MortPaed <- mortality.rates$DeathsPaed / mortality.rates$PopPaed * 10^6 # 3. standard error of total number by country across years # helper function standarderror <- function(x) sqrt(var(x, na.rm = TRUE) / length(na.omit(x))) # country.sd <- aggregate(mortality.rates$MortPaed, by = list("country" = mortality.rates$country), standarderror) names(country.sd) <- c("country", "MortPaedSD") mortality.rates <- merge(mortality.rates, country.sd) rm("country.sd") # 4. weighted average over years tmp.by <- by(mortality.rates, list("year" = mortality.rates$year), function(x) weighted.mean(x = x$MortPaed, w = 1 / x$MortPaedSD, na.rm = TRUE)) mortality.paed <- data.frame("year" = as.numeric(dimnames(tmp.by)$year), "mort" = as.vector(tmp.by), stringsAsFactors = FALSE) # clean up rm("tmp.by") ### Plot results # all paediatric mortalities by country if (FALSE) { # ggplot(data = mortality.rates, aes(year, MortPaed, colour = country)) + geom_line() + geom_point() + ylim(0, 100) + xlim(1995, 2020) # } # prepare plot annotation plotlabel <- paste0( "Weighted average of ", nrow(countries), " EU Member States\n(", paste0(toupper(substr(countries$name, 1, 2)), collapse = ", "), ")", ifelse(file.exists("last_download_marker"), paste0("\nWHO data: latest year reported is ", max(who.mortality$Year, na.rm = TRUE), "; last retrieved on ", as.Date(file.info("last_download_marker")$mtime)), "")) # show external graphic if (FALSE) knitr::include_graphics(normalizePath("./pritchard-jones_2013.png")) # From: [zotpress items="86M642UG" style="apa"] # plot mortality rates ggplot(data = mortality.paed, aes(year, mort)) + geom_line() + geom_point() + ylim(0, 50) + theme_bw() + labs( y = "Mortality rate per Million", x = "Year", title = "Mortality from cancer in childhood (birth to less than 20 years)") + scale_x_continuous(breaks = seq(from = 1990, to = 2020, by = 5), minor_breaks = 1990:2020, limits = c(1994, max(who.mortality$Year, na.rm = TRUE) + 1)) + annotate("text", x = 1994, y = 0, size = 4, colour = "black", hjust = "left", vjust = "bottom", label = plotlabel) |
Plotted data
For the plot above, data of a variable number of countries were available:
1 2 3 4 5 6 7 | knitr::kable(t( tapply( who.mortality$name, who.mortality$Year, function(x) length(unique(x)), simplify = TRUE) )) |
1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 | 2016 | 2017 | 2018 | 2019 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1 | 2 | 2 | 3 | 6 | 9 | 10 | 11 | 12 | 12 | 12 | 12 | 13 | 13 | 13 | 13 | 13 | 13 | 13 | 13 | 13 | 12 | 8 | 6 | 2 |
Outlook
Factors contributing to reduce childhood cancer mortality may be identified and strengthened. Quality of survival is more and more important. Initiatives are ongoing to improve the outcomes of children across the globe and of young adults.
European Health Information explorer
The figure above can be compared with an official automated analysis of the WHO Europe databases as follows. Using these databases, the standardised death rate (SDR) from a malignant neoplasm (deaths per 100000) in the European Union population from 1 to 19 years (that is, less than 20 years) is shown below.
Hi, the WHO has updated their data, when will you update this post?
Thanks, Dave. I see data was added for 2017, but just for one European country; will update the post once more countries’ data have become available.