Overview
Our package ctrdata has been continually developed and offered for the R system since 2015. It facilitates investigating and understanding trends in the design and conduct of trials and their availability for participants, as well as using their protocols and results for research and meta-analyses. The package can be used with information and documents available in the following registers:
- EU Clinical Trials Register (“EUCTR”, https://www.clinicaltrialsregister.eu/)
- EU Clinical Trials Information System (“CTIS”, https://euclinicaltrials.eu/)
- ClinicalTrials.gov (“CTGOV2” since 2023, “CTGOV” from 2015 to 2024)
- ISRCTN Registry (https://www.isrctn.com/)
Its features include:
- Protocol- and results-related trial information is easily downloaded, including any trial documents available in registers.
- Information is stored as JSON in a document-centric database (DuckDB, PostgreSQL, RSQLite or MongoDB) for fast offline access.
- Find active substance synonyms, identify unique (de-duplicated) records across registers, merge and recode fields, easily access deeply-nested fields.
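As a minimal sketch of where the JSON ends up, the lines below open connections for three storage choices. They assume that packages nodbi, RSQLite and duckdb are installed; the file and collection names are placeholders, and the in-memory defaults mentioned in the comments are the behaviour of nodbi as far as I am aware.

```r
# Sketch only: file and collection names are placeholders
library(nodbi)

# SQLite in memory (used when no file is named);
# contents are lost when the R session ends
db_memory <- src_sqlite(collection = "some_collection_name")

# SQLite in a file, so that trial data persist across sessions
db_file <- src_sqlite(
  dbname = "some_database_name.sqlite_file",
  collection = "some_collection_name"
)

# DuckDB as an alternative backend (requires package duckdb)
db_duckdb <- src_duckdb(collection = "some_collection_name")
```

Any one of these objects can be passed as `con = ...` to the ctrdata functions shown in the code example further down.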
Download
Package ctrdata is on CRAN: https://cran.r-project.org/package=ctrdata
Within R, package ctrdata can be installed with: install.packages("ctrdata")
Documentation
Start here to find the Reference documentation and several Articles with detailed trial analysis: https://rfhb.github.io/ctrdata/
Support
The preferred way to flag issues is via https://github.com/rfhb/ctrdata/issues. Alternatively, use the comments form at the bottom of this page.
Milestones
- In November 2024, database and trial data functions were accelerated through contributions to RSQLite 2.3.8 and my package nodbi 0.11.0.
- Since 2024-06-30, CTIS2 (relaunched on 2024-06-17) is supported, and queries to CTGOV (retired on 2024-06-25) are made to work with CTGOV2.
- Version 1.18.0 of package ctrdata, published on 2024-05-13, can retrieve historic versions of studies as structured data from CTGOV2 (example).
- By November 2023, ctrdata was freed from all dependencies on command line tools of the operating system.
- The European Union's new Clinical Trials Information System (CTIS) has been supported since March 2023.
- Since 2019, ctrdata has supported several databases (through package nodbi, now maintained by the same author).
- Various refactoring efforts, such as factoring out functionality, have made downloading more robust and help to handle nested information and to generate data frames.
- Results from the EU Clinical Trials Register have been supported and imported since 2017 (results from ClinicalTrials.gov have been included since the beginning).
- On 15 September 2015, package ctrdata was first published on CRAN.
Disclaimer
When using package ctrdata, the registers’ terms and conditions need to be respected; they are shown with ctrOpenSearchPagesInBrowser(copyright = TRUE).
Please cite package ctrdata in any publication as: “Ralf Herold (2024). ctrdata: Retrieve and Analyze Clinical Trials in Public Registers. R package version 1.18.0, https://cran.r-project.org/package=ctrdata”.
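Both steps can also be done from within R. The following is a small sketch, assuming that package ctrdata is installed; citation() is the standard R function for this purpose and reports the entry for whichever package version is installed, which may differ from the version quoted above.

```r
# Sketch, assuming package ctrdata is installed
library(ctrdata)

# Open the registers' terms and conditions in the default browser
# (only when R runs interactively, since a browser is needed)
if (interactive()) {
  ctrOpenSearchPagesInBrowser(copyright = TRUE)
}

# Print the citation entry for the installed package version
ctrdata_citation <- citation("ctrdata")
print(ctrdata_citation)

# The same entry in BibTeX format, for reference managers
toBibtex(ctrdata_citation)
```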
References
Package ctrdata has been used for unpublished work and for:
- Lasch et al. (2022) The Impact of COVID‐19 on the Initiation of Clinical Trials in Europe and the United States. https://doi.org/10.1002/cpt.2534
- Blogging on Innovation coming to paediatric research
- Cancer Research UK (2017) The impact of collaboration: The value of UK medical research to EU science and health
Code example
This example shows how to obtain information on trials of interest from all supported registers, for plotting their start and completion over time. For more sophisticated examples, see the Articles under Documentation above.
# Load our package ctrdata from the library
library(ctrdata)

# Connect to (or newly create) an SQLite database
# - See help(nodbi) for how to store in the file
#   system and for connecting with other databases
db <- nodbi::src_sqlite(
  collection = "some_collection_name"
)

# Retrieve trials from public register:
ctrLoadQueryIntoDb(
  queryterm = "https://www.clinicaltrialsregister.eu/ctr-search/search?query=neuroblastoma&phase=phase-three",
  euctrresults = TRUE,
  con = db
)

# Retrieve trials from another register:
ctrLoadQueryIntoDb(
  queryterm = 'searchCriteria={"containAll":"","containAny":"neuroblastoma","containNot":""}',
  register = "CTIS",
  con = db
)

# Retrieve trials from another register:
ctrLoadQueryIntoDb(
  queryterm = "https://clinicaltrials.gov/search?cond=Neuroblastoma&aggFilters=ages:child,phase:3,studyType:int",
  con = db
)

# Retrieve trials from another register:
ctrLoadQueryIntoDb(
  queryterm = "https://www.isrctn.com/search?q=neuroblastoma",
  con = db
)

# Names of all fields / variables in the collection:
length(dbFindFields(".*", con = db, sample = FALSE))
# Finding fields in database collection (may take some time) . . . . .
# Field names cached for this session.
# [1] 3953

dbFindFields("(start.*date)|(date.*decision)", con = db)
# Using cache of fields.

# - Get trial data
result <- dbGetFieldsIntoDf(
  fields = c(
    "ctrname",
    "record_last_import",
    # CTGOV2
    "protocolSection.statusModule.startDateStruct.date",
    "protocolSection.statusModule.overallStatus",
    # EUCTR
    "n_date_of_competent_authority_decision",
    "trialInformation.recruitmentStartDate", # needs above: 'euctrresults = TRUE'
    "p_end_of_trial_status",
    # ISRCTN
    "trialDesign.overallStartDate",
    "trialDesign.overallEndDate",
    # CTIS
    "authorizedPartI.trialDetails.trialInformation.trialDuration.estimatedRecruitmentStartDate",
    "ctStatus"
  ),
  con = db
)

# use helper packages for plotting
library(dplyr)
library(tidyr)
library(ggplot2)

# - Deduplicate trials and obtain unique identifiers
#   for trials that have records in several registers
# - Calculate trial start date
# - Calculate simple status for ISRCTN: a trial whose end date
#   lies before the last import is taken as completed
# - Update end of trial status for EUCTR
# (plain assignment is used because dplyr does not provide %<>%)
result <- result %>%
  filter(`_id` %in% dbFindIdsUniqueTrials(
    preferregister = c("CTGOV2", "EUCTR"), con = db)) %>%
  rowwise() %>%
  mutate(start = max(c_across(matches("(date.*decision)|(start.*date)")), na.rm = TRUE)) %>%
  mutate(isrctnStatus = if_else(
    trialDesign.overallEndDate < record_last_import,
    "Completed", "Ongoing")) %>%
  mutate(p_end_of_trial_status = if_else(
    is.na(p_end_of_trial_status) & !is.na(n_date_of_competent_authority_decision),
    "Ongoing", p_end_of_trial_status)) %>%
  ungroup()

# - Merge fields from different registers with re-leveling
statusValues <- list(
  "ongoing" = c(
    # EUCTR
    "Recruiting", "Active", "Ongoing", "Temporarily Halted", "Restarted",
    # CTIS
    "Ongoing, recruiting", "Ongoing, recruitment ended",
    "Ongoing, not yet recruiting", "Authorised, not started"
  ),
  "completed" = c("Completed", "COMPLETED", "Ended"),
  "other" = c(
    "GB - no longer in EU/EEA", "Trial now transitioned",
    "Withdrawn", "Suspended", "No longer available",
    "Terminated", "TERMINATED", "UNKNOWN", "Prematurely Ended",
    "Under evaluation")
)

# Only columns retrieved above are merged
result[["state"]] <- dfMergeVariablesRelevel(
  df = result,
  colnames = c(
    "p_end_of_trial_status",
    "protocolSection.statusModule.overallStatus",
    "ctStatus",
    "isrctnStatus"
  ),
  levelslist = statusValues
)

# - Plot example
ggplot(result) +
  stat_ecdf(aes(x = start, colour = state)) +
  labs(
    title = "Evolution over time of neuroblastoma phase 3 trials",
    subtitle = "Data from EUCTR, CTIS, ISRCTN, CTGOV2",
    x = "Date of start (proposed or realised)",
    y = "Cumulative proportion of trials",
    colour = "Current status",
    caption = Sys.Date()
  )
Graph example
This example shows the plot resulting from the script above.
Data model used with ctrdata
Package ctrdata uses the data models that are implicit in data retrieved from the different registers. The approach is further explained here, together with the reasons for this choice. A possible future development is to provide a mapping to a canonical data model, which however does not exist at the moment and will require an international approach and alliance.
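To illustrate what an implicit data model means in practice, here is a small sketch using package jsonlite (assumed to be installed). It is not how ctrdata works internally; it only shows how a nested JSON record relates to the dot-separated field names used when retrieving fields. The record is hypothetical and heavily shortened: its structure mimics the CTIS fields used in the code example, and its identifier and values are invented.

```r
# Sketch only: the record below is invented for illustration
library(jsonlite)

record <- '[{
  "_id": "0000-000000-00-00",
  "ctStatus": "Ongoing, recruiting",
  "authorizedPartI": {
    "trialDetails": {
      "trialInformation": {
        "trialDuration": {
          "estimatedRecruitmentStartDate": "2024-01-15"
        }
      }
    }
  }
}]'

# flatten = TRUE turns nested objects into columns whose names
# join the nesting levels with dots
df <- fromJSON(record, flatten = TRUE)

# Column names now reflect the path to each value in the record
names(df)
```

Each register nests its information differently, which is why field names such as those requested with dbGetFieldsIntoDf() differ between registers.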
Here is the model of data from CTIS for a given trial, as an example of how ctrdata downloads, transforms and stores information from registers: