Script | Output | Analysis Dataset & Variables | Selection Criteria |
---|---|---|---|
tlf-demographic.r | tlf-demographic-pilot5.out | ADSL.AGE; ADSL.AGEGR1; ADSL.RACE; ADSL.HEIGHTBL; ADSL.WEIGHTBL; ADSL.BMIBL; ADSL.MMSETOT; ADSL.STUDYID; ADSL.ITTFL; ADSL.TRT01P | ADSL.STUDYID == "CDISCPILOT01"; ADSL.ITTFL == "Y" |
tlf-efficacy.r | tlf-efficacy-pilot5.rtf | ADSL.STUDYID; ADSL.USUBJID | ADSL.ITTFL == "Y"; ADLB.TRTPN %in% c(0, 81); ADLB.PARAMCD == "GLUC"; !is.na(ADLB.AVISITN); ADLB.AVISITN == 20; !is.na(ADLB.CHG); !is.na(ADLB.BASE); ADLB.AVISITN == 0 |
tlf-kmplot.r | tlf-kmplot-pilot5.pdf | ADSL.STUDYID; ADSL.USUBJID; ADSL.TRT01A | ADSL.SAFFL == "Y"; ADSL.STUDYID == "CDISCPILOT01"; ADTTE.PARAMCD == "TTDE"; ADTTE.STUDYID == "CDISCPILOT01" |
tlf-primary.r | tlf-primary-pilot5.rtf | ADADAS.EFFFL; ADADAS.ITTFL; ADADAS.PARAMCD; ADADAS.ANL01FL; ADADAS.TRTP; ADADAS.AVAL; ADADAS.AVISITN; ADADAS.CHG; ADADAS.TRTPN | ADAS.EFFFL == "Y"; ADAS.ITTFL == "Y"; ADAS.PARAMCD == "ACTOT"; ADAS.ANL01FL == "Y"; ADSL.EFFFL == "Y" & ADSL.ITTFL == "Y"; ADAS.AVISITN == 0; ADAS.AVISITN == 24 |
Analysis Data Reviewer’s Guide
R Consortium R Submission Pilot 5
1 Introduction
1.1 Purpose
This document provides context for the analysis datasets and terminology that benefit from additional explanation beyond the Data Definition document (define.xml). In addition, this document provides a summary of ADaM conformance findings. Section 9 provides detailed procedures for installing and configuring a local R environment.
1.2 Study Data Standards and Dictionary Inventory
Standard or Dictionary | Versions Used |
---|---|
SDTM | SDTM Implementation Guide Version 3.1.2; SDTM Version 1.2 |
SDTM Controlled Terminology | CDISC SDTM Controlled Terminology, 2022-12-16 |
ADaM | ADaM-IG v1.1; ADaM v2.1 |
ADaM Controlled Terminology | CDISC ADaM Controlled Terminology, 2022-06-24 |
Data Definitions | Define-XML v2.0 |
Medical Events Dictionary | MedDRA version 8.0 |
1.3 Source Data Used for Analysis Dataset Creation
The ADaM datasets were derived from SDTM version 1.2. For traceability, the SDTM data are publicly available at the PHUSE GitHub Repository and can be traced back to the original CDISC SDTM & ADaM Pilot Project.
2 Protocol Description
2.1 Protocol Number and Title
- Protocol Number: CDISCPilot1
- Protocol Title: Safety and Efficacy of the Xanomeline Transdermal Therapeutic System (TTS) in Patients with Mild to Moderate Alzheimer’s Disease
The reference documents can be found here.
2.2 Protocol Design in Relation to ADaM Concepts
2.2.1 Objectives:
The objectives of the study were to evaluate the efficacy and safety of transdermal xanomeline, 50 cm² and 75 cm², and placebo in subjects with mild to moderate Alzheimer’s disease.
2.2.2 Methodology:
This was a prospective, randomized, multi-center, double-blind, placebo-controlled, parallel-group study. Subjects were randomized equally to placebo, xanomeline low dose, or xanomeline high dose. Subjects applied 2 patches daily and were followed for a total of 26 weeks.
2.2.3 Number of Subjects Planned:
300 subjects total (100 subjects in each of 3 groups)
2.2.4 Study schema:
4 Analysis Data Creation and Processing Issues
4.1 Split Datasets
There were no datasets that required splitting due to size constraints.
4.2 Data Dependencies
Analysis Dataset | Dependent on Following Analysis Datasets |
---|---|
ADAE | ADSL |
ADTTE | ADSL, ADAE |
ADADAS | ADSL |
ADLBC | ADSL |
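The dependency table above implies an order in which the ADaM programs must be built. The following is a minimal sketch, not part of the submission programs, of how a valid build order could be derived from those dependencies. The helper name `build_order` is hypothetical, and ADSL is assumed to have no dependencies because it does not appear in the first column of the table.

```r
# Dependencies as listed in the table above (dataset -> datasets it needs).
# Assumption: ADSL depends on no other analysis dataset.
deps <- list(
  ADSL   = character(0),
  ADAE   = "ADSL",
  ADTTE  = c("ADSL", "ADAE"),
  ADADAS = "ADSL",
  ADLBC  = "ADSL"
)

# Hypothetical helper: repeatedly pick the datasets whose dependencies
# have all been built already, until every dataset is placed.
build_order <- function(deps) {
  done <- character(0)
  while (length(done) < length(deps)) {
    remaining <- setdiff(names(deps), done)
    ready <- remaining[vapply(
      deps[remaining],
      function(d) all(d %in% done),
      logical(1)
    )]
    if (length(ready) == 0) stop("Circular dependency detected")
    done <- c(done, ready)
  }
  done
}

order_found <- build_order(deps)
print(order_found)
```

Any order that builds ADSL first and ADAE before ADTTE satisfies the table.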
4.3 Intermediate Datasets
No intermediate datasets were created for this trial.
5 Analysis Dataset Descriptions
5.1 Overview
The following provides detailed information for each analysis dataset included in the Pilot 3 submission, which were used to generate the outputs in Pilot 1. These ADaM datasets are ADSL, ADAE, ADTTE, ADADAS, ADLBC.
5.2 Analysis Datasets
Dataset - Dataset Label | Class | Efficacy | Safety | Baseline or other subject characteristics | Primary Objective | Structure |
---|---|---|---|---|---|---|
ADSL - Subject-Level Analysis Dataset | SUBJECT LEVEL ANALYSIS DATASET | x | One record per subject | |||
ADADAS - ADAS-COG Analysis Dataset | BASIC DATA STRUCTURE | x | x | One or more records per subject per analysis parameter per analysis timepoint | ||
ADAE - Adverse Events Analysis Dataset | OCCURRENCE DATA STRUCTURE | x | One record per subject per adverse event | |||
ADLBC - Analysis Dataset Lab Blood Chemistry | BASIC DATA STRUCTURE | x | One or more records per subject per analysis parameter per analysis timepoint | |||
ADTTE - AE Time To 1st Derm. Event Analysis | BASIC DATA STRUCTURE | x | x | One or more records per subject per analysis parameter per analysis timepoint |
5.2.1 ADSL - Subject-Level Analysis Dataset
The subject level analysis dataset (ADSL) contains required variables for demographics, treatment groups, and population flags. In addition, it contains other baseline characteristics that were used in both safety and efficacy analyses. All patients in DM were included in ADSL. The following key population flags are used in analyses for patients:
SAFFL – Safety Population Flag (all patients having received any study treatment)
ITTFL – Intent-to-Treat Population Flag (all randomized patients)
5.2.2 ADADAS - ADAS-COG Analysis Dataset
ADADAS contains analysis data from the ADAS-Cog questionnaire, one of the primary efficacy endpoints. It contains one record per subject per parameter (ADAS-Cog questionnaire item) per VISIT. Visits are placed into analysis visits (represented by AVISIT and AVISITN) based on the date of the visit and the visit windows.
5.2.3 ADAE - Adverse Events Analysis Dataset
ADAE contains one record per reported event per subject. Subjects who did not report any Adverse Events are not represented in this dataset. The data reference for ADAE is the SDTM AE (Adverse Events) domain and there is a 1-1 correspondence between records in the source and this analysis dataset. These records can be linked uniquely by STUDYID, USUBJID, and AESEQ. Events of particular interest (dermatologic) are captured in the customized query variable (CQ01NAM) in this dataset. Since ADAE is a source for ADTTE, the first chronological occurrence based on the start dates (and sequence numbers) of the treatment emergent dermatological events are flagged (AOCC01FL) to facilitate traceability between these two analysis datasets.
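As an illustration of the first-occurrence flag described above, the sketch below marks the earliest treatment-emergent dermatologic event per subject on toy data. It is not the derivation used in adae.r. The data values, the `CQ01NAM` text, and the use of `TRTEMFL` and `ASTDT` as the treatment-emergent flag and start date are assumptions made for the example.

```r
# Toy ADAE-like data; all values are invented for illustration only.
adae <- data.frame(
  USUBJID = c("01", "01", "01", "02", "02"),
  AESEQ   = c(1, 2, 3, 1, 2),
  ASTDT   = as.Date(c("2013-01-10", "2013-01-05", "2013-01-05",
                      "2013-02-01", "2013-02-03")),
  TRTEMFL = c("Y", "Y", "Y", "Y", "Y"),
  CQ01NAM = c("DERMATOLOGIC EVENTS", "DERMATOLOGIC EVENTS", NA,
              NA, "DERMATOLOGIC EVENTS"),
  stringsAsFactors = FALSE
)

# Candidate rows: treatment-emergent events captured by the customized query.
is_candidate <- !is.na(adae$CQ01NAM) & adae$TRTEMFL %in% "Y"

# Sort candidates by subject, start date, then sequence number,
# and keep the first candidate row for each subject.
cand <- which(is_candidate)
cand <- cand[order(adae$USUBJID[cand], adae$ASTDT[cand], adae$AESEQ[cand])]
first_rows <- cand[!duplicated(adae$USUBJID[cand])]

adae$AOCC01FL <- NA_character_
adae$AOCC01FL[first_rows] <- "Y"
print(adae)
```

Rows flagged this way are the ones a time-to-first-event dataset such as ADTTE would trace back to.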
5.2.4 ADLBC - Analysis Dataset Lab Blood Chemistry
ADLBC contains one record per lab analysis parameter, per time point, per subject. ADLBC contains lab chemistry parameters and these data are derived from the SDTM LB (Laboratory Tests) domain. Two sets of lab parameters exist in ADLBC. One set contains the standardised lab value from the LB domain and the second set contains change from previous visit relative to normal range values. In some of the summaries the derived end-of-treatment visit (AVISITN=99) is also presented.
5.2.5 ADTTE - AE Time To 1st Derm. Event Analysis
ADTTE contains one observation per parameter per subject. ADTTE is specifically for safety analyses of the time to the first dermatologic adverse event. Dermatologic AEs are considered adverse events of special interest. The key parameter used for the analysis of time to the first dermatological event is identified by a PARAMCD of “TTDE”.
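The sketch below shows, on toy data, how the “TTDE” records might be selected and prepared for a Kaplan-Meier analysis. It is not the code in tlf-kmplot.r. The variables `AVAL`, `CNSR`, and `TRTA` are assumed to follow the usual ADaM time-to-event conventions (`CNSR` of 0 for an event, 1 for a censored record), and the model fit runs only if the survival package is available.

```r
# Toy ADTTE-like data; all values are invented for illustration only.
adtte <- data.frame(
  USUBJID = c("01", "02", "03", "04"),
  PARAMCD = c("TTDE", "TTDE", "TTDE", "TTDE"),
  TRTA    = c("Placebo", "Placebo",
              "Xanomeline High Dose", "Xanomeline High Dose"),
  AVAL    = c(120, 45, 30, 60),  # days to event or censoring
  CNSR    = c(1, 0, 0, 1),       # assumed convention: 0 = event, 1 = censored
  stringsAsFactors = FALSE
)

# Keep only the time-to-first-dermatologic-event parameter.
ttde <- adtte[adtte$PARAMCD == "TTDE", ]

# Survival routines expect an event indicator (1 = event), so invert CNSR.
ttde$EVENT <- 1 - ttde$CNSR

# Fit Kaplan-Meier curves by treatment if the survival package is installed.
if (requireNamespace("survival", quietly = TRUE)) {
  fit <- survival::survfit(survival::Surv(AVAL, EVENT) ~ TRTA, data = ttde)
  print(fit)
}
```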
6 Data Conformance Summary
6.1 Conformance Inputs
Were the analysis datasets evaluated for conformance with CDISC ADaM Validation Checks?
Yes. Version of CDISC ADaM Validation Checks and software used: Pinnacle 21® Community 4.0.2
Were the ADaM datasets evaluated in relation to define.xml?
Yes
Was define.xml evaluated?
Yes
6.2 Issues Summary
Check ID | Diagnostic Message | Dataset | Count (Issue Rate) | Explanation |
---|---|---|---|---|
AD1012 | Secondary custom variable is present but its primary variable is not present | ADSL | 1 (50.00%) | This is a Sponsor Extension to the ADaM Model. The VISNUMEN [End of Trt Visit (Vis 12 or Early Term.)] variable is an integer variable which is not related to any character variable. |
6.3 QC Findings and Common Issues
In this Pilot 3 study, our focus was to create a subset of ADaMs based on the CDISCPILOT data, using R. We compared our R-generated ADaMs against the CDISCPILOT ADaMs, created in SAS, as a QC step. From these comparisons we listed the QC findings, with explanations of why they exist. We also came across common issues throughout the ADaM generation process, which could help improve future use of the CDISC Pilot data. More details can be found in the appendix (Appendix 2 and Appendix 3).
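A QC comparison of this kind can be sketched with base R on toy data, as below. This is not the comparison procedure used in the pilot; the diffdf package listed in Section 7.4 offers more detailed reports. The data frames `adsl_sas` and `adsl_r` and their values are invented for the example.

```r
# Toy versions of the same dataset from two sources; values are invented.
adsl_sas <- data.frame(
  USUBJID = c("01", "02", "03"),
  AGE     = c(63, 71, 80),
  stringsAsFactors = FALSE
)
adsl_r <- data.frame(
  USUBJID = c("01", "02", "03"),
  AGE     = c(63, 72, 80),
  stringsAsFactors = FALSE
)

# all.equal() returns TRUE when the data frames match,
# otherwise a character description of the differences.
cmp <- all.equal(adsl_sas, adsl_r)
print(cmp)

# Locate the rows that disagree on a given variable.
mismatch_rows <- which(adsl_sas$AGE != adsl_r$AGE)
print(adsl_sas$USUBJID[mismatch_rows])
```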
7 Submission of Programs
7.1 Description
The sponsor has provided all programs for analysis results. They are all created on a Linux platform using R version 4.4.3.
7.2 ADaM Programs
The following table contains the list of programs that generate the analysis datasets in Pilot 3. It shows the program file name, the analysis dataset name and the label of the analysis dataset. The recommended steps to execute the analysis results using R are described in the Appendix.
Program Name | Analysis Dataset Name | Analysis Dataset Label |
---|---|---|
adsl.r | adsl.json | Subject-Level Analysis Dataset |
adadas.r | adadas.json | ADAS-Cog Analysis |
adlbc.r | adlbc.json | Analysis Dataset Lab Blood Chemistry |
adae.r | adae.json | Adverse Events Analysis Dataset |
adtte.r | adtte.json | AE Time to 1st Derm. Event Analysis |
7.3 Analysis Output Programs
The following table contains a list of programs that generate outputs used in the R Consortium R Submission Pilot 1. These outputs were rerun in Pilot 3 using the analysis datasets generated by the Dataset-JSON programs. It shows the program file names, the related outputs, the input datasets and variables used, and any data selection criteria that need to be applied per Pilot 1.
For reference, below is a description of the analysis programs utilized and outputs generated in Pilot 1.
Program Name | Output Table Number | Title |
---|---|---|
tlf-demographic.r | Table 14-2.01 | Summary of Demographic and Baseline Characteristics |
tlf-primary.r | Table 14-3.01 | Primary Endpoint Analysis: ADAS Cog (11) - Change from Baseline to Week 24 - LOCF |
tlf-efficacy.r | Table 14-3.02 | ANCOVA of Change from Baseline at Week 20 |
tlf-kmplot.r | Figure 14-1 | KM plot for Time to First Dermatologic Event: Safety population |
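The data selection criteria listed for each output program can be read as row filters on the analysis dataset. As a minimal sketch, assuming toy data and base R rather than the code in the actual programs, the criteria given for tlf-demographic.r translate as follows.

```r
# Toy ADSL-like data; all values are invented for illustration only.
adsl <- data.frame(
  STUDYID = rep("CDISCPILOT01", 4),
  USUBJID = c("01", "02", "03", "04"),
  ITTFL   = c("Y", "Y", "N", "Y"),
  AGE     = c(63, 71, 80, 77),
  stringsAsFactors = FALSE
)

# Criteria listed for tlf-demographic.r:
#   ADSL.STUDYID == "CDISCPILOT01"; ADSL.ITTFL == "Y"
adsl_itt <- subset(adsl, STUDYID == "CDISCPILOT01" & ITTFL == "Y")
print(nrow(adsl_itt))
```

Each semicolon-separated criterion becomes one condition, and all conditions on the same dataset are combined with a logical AND.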
7.4 Open-source R Packages
Package | Version | Description |
---|---|---|
admiral | 1.3.0 | This R package provides tools for creating Clinical Data Interchange Standards Consortium (CDISC) compliant Analysis Data Model (ADaM) datasets, essential for submissions to the United States FDA, following the guidelines of the CDISC Analysis Data Model Implementation Guide. |
cowplot | 1.2.0 | This package offers tools for enhancing 'ggplot2' with themes, plot alignment, complex figure arrangement, annotations, and image mixing, originally created for the Wilke lab and featured in the book "Fundamentals of Data Visualization." |
diffdf | 1.1.1 | This package offers tools to comprehensively compare two data frames, detailing their differences and providing utilities to identify sources of discrepancies. |
dplyr | 1.1.4 | The package provides a robust and consistent toolset for managing and manipulating data frame-like structures efficiently, both in-memory and out-of-memory. |
emmeans | 1.11.2 | The package provides tools to obtain estimated marginal means (EMMs) for a variety of linear, generalized linear, and mixed models, along with functions to perform contrasts, trend analysis, and comparisons of slopes, as well as visualization options. |
ggplot2 | 3.5.2 | The package provides a declarative approach to creating graphics by allowing users to map data variables to aesthetics and specify graphical primitives, automating the intricate details based on the principles of "The Grammar of Graphics." |
haven | 2.5.5 | The package facilitates importing foreign statistical file formats into R by leveraging the 'ReadStat' C library. |
lubridate | 1.9.4 | The 'lubridate' package provides tools for fast and user-friendly parsing, extraction, updating, and algebraic manipulation of date-time and time-span objects in R. |
metacore | 0.2.0 | The package provides an immutable container for metadata to enhance programming activities and functionality within the clinical programming workflow. |
metatools | 0.1.6 | This package utilizes metadata information from 'metacore' objects to validate and construct metadata-related columns. |
pharmaRTF | 0.1.4 | This package provides an enhanced RTF wrapper for R tables created with packages like 'Huxtable' or 'GT', allowing the addition of metadata and features essential for regulatory reports, such as multiple levels of titles, footnotes, landscape formatting, and margin control. |
r2rtf | 1.1.4 | This package facilitates the creation of production-ready Rich Text Format (RTF) tables and figures with customizable formatting options. |
rtables | 0.6.13 | The 'rtables' package provides a framework for creating complex, multi-level reporting tables with hierarchical, tree-like structures, enabling advanced data tabulation, grouping, and contextual summary computations. |
stringr | 1.5.1 | The package provides a uniform, user-friendly set of wrappers for the 'stringi' package, ensuring consistent function and argument usage, seamless handling of "NA" values and zero length vectors, and facilitating easy integration between functions. |
tidyr | 1.3.1 | The package "tidyr" provides tools for restructuring and cleaning data into a tidy format, with capabilities for pivoting, nesting, unnesting, handling nested lists, string extraction, and managing missing values. |
Tplyr | 1.2.1 | The package is designed to streamline data manipulation processes for generating clinical summaries, with a focus on traceability. |
visR | 0.3.1 | This package provides fit-for-purpose, reusable visualizations and tables tailored for clinical and medical research, incorporating sensible defaults and following established graphical principles. |
xportr | 0.4.3 | The package provides tools to create CDISC-compliant datasets and verify their compliance with CDISC standards. |
datasetjson | 0.3.0 | The package provides tools for reading, constructing, writing, and validating CDISC Dataset JSON files according to the Dataset JSON schema standards set by CDISC. |
8 Directory Structure
Study datasets and the R programs are organized in accordance with the Study Data Technical Conformance Guide.
├── m1
│   └── us
│       └── cover-letter.pdf
└── m5
    └── datasets
        └── rconsortiumpilot5
            ├── analysis
            │   └── adam
            │       ├── datasets
            │       │   ├── adadas.json
            │       │   ├── adae.json
            │       │   ├── adlbc.json
            │       │   ├── adrg.pdf
            │       │   ├── adsl.json
            │       │   └── adtte.json
            │       └── programs
            │           ├── adadas.r
            │           ├── adae.r
            │           ├── adlbc.r
            │           ├── adsl.r
            │           ├── adtte.r
            │           ├── pilot5-helper-fcns.r
            │           ├── renv-lock.txt
            │           ├── tlf-demographic.r
            │           ├── tlf-efficacy.r
            │           ├── tlf-kmplot.r
            │           └── tlf-primary.r
            └── tabulations
                └── sdtm
                    ├── ae.json
                    ├── cm.json
                    ├── dm.json
                    ├── ds.json
                    ├── ex.json
                    ├── lb.json
                    ├── mh.json
                    ├── qs.json
                    ├── relrec.json
                    ├── sc.json
                    ├── se.json
                    ├── suppae.json
                    ├── suppdm.json
                    ├── suppds.json
                    ├── supplb.json
                    ├── sv.json
                    ├── ta.json
                    ├── te.json
                    ├── ti.json
                    ├── tv.json
                    └── vs.json
9 Appendix 1: Pilot 5 R Environment Installation and Usage
To execute the R programs included in this Pilot, follow all of the procedures below. Note the location where you downloaded the Pilot 5 eCTD submission files. For demonstration purposes, the procedures below assume the transfer has been saved to this location: C:\pilot5.
In addition, create a new directory to hold the unpacked Pilot 5 data files and associated programs. For demonstration purposes, the procedures below assume the new directory is this location: C:\pilot5-files.
9.1 Installation of R and RStudio
Download and install R 4.4.3 for Windows from https://cloud.r-project.org/bin/windows/base/old/4.4.3/R-4.4.3-win.exe.
Download and install RStudio for Windows from https://posit.co/download/rstudio-desktop/#download.
9.2 Installation of Rtools
Because certain R packages require compilation from source, you must also install the Rtools Windows utility from CRAN. You can download Rtools built for R version 4.4.3 by visiting https://cloud.r-project.org/bin/windows/Rtools/rtools44/files/rtools44-6459-6401.exe. During the installation procedure, keep the default choices in the settings presented in the installation dialog.
Once the installation is complete, launch a new R session (if you have an existing session open, close that session first) and run the following command in the console to verify that the installation of Rtools was successful:
> Sys.which("make")
[1] "C:\\rtools44\\usr\\bin\\make.exe"
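The same check can be written so that it reports the result explicitly. This is a minimal sketch that relies only on the fact that `Sys.which()` returns an empty string when the executable is not found on the PATH.

```r
# Sys.which() returns "" when the executable cannot be found on the PATH.
make_path <- Sys.which("make")

if (nzchar(make_path)) {
  message("make found at: ", make_path)
} else {
  message("make not found; Rtools may not be installed or not on the PATH")
}
```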
9.3 Initialize R Program Execution Environment
The dependencies for executing the R programs are managed by the renv R package management system. To bootstrap the customized R package library, launch a new R session in the directory where you unpacked the source files in the previous step.
Launching RStudio
Create a new RStudio project within the pilot5-files directory using the following procedure:
- Launch RStudio
- Select File -> New Project
- In the Create Project dialog box, choose Existing Directory
- In the Create Project from Existing Directory dialog box, click the Browse button and navigate to the C:\pilot5-files directory.
- Once the location has been confirmed, click the Create Project button. A new directory called .Rproj.user and the project file pilot5-files.Rproj will appear in the directory.
The .Rproj.user folder may not have been generated, or may not be visible because it is a hidden folder. Either case is fine, as the folder is not needed to run the R programs below.
9.4 Installation of R Packages
A minimum set of R packages is required to ensure the Pilot 5 R programs can be executed correctly. Use the following procedure to configure the Pilot 5 R package environment:
- Run the following commands in the R console to install the remotes and renv packages:
install.packages("remotes")
# install version 1.1.4 of the renv package:
remotes::install_version("renv", version = "1.1.4")
- If you receive a warning showing “cannot open URL 'https://cran.rstudio.com/src/contrib/PACKAGES'”, this is due to the default RStudio option ‘Use secure download method for HTTP’. In RStudio, go to Tools → Global Options → Packages, uncheck the ‘Use secure download method for HTTP’ option, and retry the installation.
Verify that the working directory is set to the project folder:
- Run the following command in the R console:
getwd()
- If the output of this command does not match C:\pilot5-files, run the following command to set the working directory:
setwd("C:/pilot5-files")
- Copy the renv-lock.txt file to the root project directory under the new name renv.lock by running the following command in the R console:
file.copy(
  "C:/pilot5-files/m5/datasets/rconsortiumpilot5/analysis/adam/programs/renv-lock.txt",
  "C:/pilot5-files/renv.lock"
)
- Restart the R session in RStudio:
- Select Session -> Restart R
- Within the new R session, run the following command in the R console:
renv::init()
The function will prompt you to make a choice because a lockfile is present. Enter 1 in the console to choose Restore the project from the lockfile.
- To install the packages managed by renv, run the following command in the R console:
renv::restore(prompt = FALSE)
Because certain R packages must be compiled from source, the entire package restoration procedure may take ten minutes or longer to complete, depending on internet bandwidth and your computer’s hardware profile.
After all packages have been installed, you should Restart your Session.
- Select
Session -> Restart R
A message similar to the following should appear in your console. This indicates that your R session is synced to all Pilot 5 packages needed to reproduce the Pilot 5 analysis.
Restarting R session...
- Project 'C:/pilot5-files' loaded. [renv 1.1.4]
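As an optional sanity check, the versions of R and renv in the current session can be compared with the versions named in this guide. This is a minimal sketch rather than part of the documented procedure; the expected values are copied from the text above.

```r
expected_r    <- "4.4.3"  # R version stated in this guide
expected_renv <- "1.1.4"  # renv version stated in this guide

# Compare the running R version with the expected one.
r_ok <- getRversion() == expected_r
message(
  "R version ", as.character(getRversion()),
  if (r_ok) " matches " else " differs from ",
  "the expected ", expected_r
)

# Compare the installed renv version, if renv is available at all.
if (requireNamespace("renv", quietly = TRUE)) {
  renv_ok <- utils::packageVersion("renv") == expected_renv
  message(
    "renv version ", as.character(utils::packageVersion("renv")),
    if (renv_ok) " matches " else " differs from ",
    "the expected ", expected_renv
  )
} else {
  message("renv is not installed in the current library")
}
```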
9.5 Execute R Programs
To reproduce the analysis results from the JSON transport files, set up and run the following programs in the order below:
- Setting up .Rprofile
Edit the .Rprofile file created in the working directory to match the following contents:
source("renv/activate.R")
Sys.setenv(RENV_DOWNLOAD_FILE_METHOD = "libcurl")
# File locations
path <- list(
  sdtm = file.path(getwd(), "m5/datasets/rconsortiumpilot5/tabulations/sdtm"),
  adam = file.path(getwd(), "m5/datasets/rconsortiumpilot5/analysis/adam/datasets"),
  output = file.path(getwd(), "m5/datasets/rconsortiumpilot5/analysis/adam/programs"),
  adam_json = file.path(getwd(), "m5/datasets/rconsortiumpilot5/analysis/adam/datasets"),
  programs = file.path(getwd(), "m5/datasets/rconsortiumpilot5/analysis/adam/programs")
)
- Restart R Session
- Select Session -> Restart R
- This will ensure that the list of paths in your Global Environment is populated.
Double-check that the path object has been created in your Global Environment by running exists("path"). You should see the following in your console:
> exists("path")
[1] TRUE
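Beyond confirming that `path` exists, it can help to confirm that each directory it points to is actually present. The sketch below is an optional check, not part of the documented procedure. It reuses `path` when .Rprofile has already created it and otherwise rebuilds the same list from the current working directory so that the example runs on its own.

```r
# Reuse `path` if .Rprofile created it; otherwise rebuild it (same layout as above).
if (!exists("path", mode = "list")) {
  root <- getwd()
  path <- list(
    sdtm      = file.path(root, "m5/datasets/rconsortiumpilot5/tabulations/sdtm"),
    adam      = file.path(root, "m5/datasets/rconsortiumpilot5/analysis/adam/datasets"),
    output    = file.path(root, "m5/datasets/rconsortiumpilot5/analysis/adam/programs"),
    adam_json = file.path(root, "m5/datasets/rconsortiumpilot5/analysis/adam/datasets"),
    programs  = file.path(root, "m5/datasets/rconsortiumpilot5/analysis/adam/programs")
  )
}

# TRUE for each entry whose directory exists on disk.
dirs_exist <- vapply(path, dir.exists, logical(1))
print(dirs_exist)

missing_dirs <- names(dirs_exist)[!dirs_exist]
if (length(missing_dirs) > 0) {
  message("Missing directories for: ", paste(missing_dirs, collapse = ", "))
}
```

A missing directory usually means the working directory is not the project root or the submission files were unpacked elsewhere.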
- Using the source function, run the pilot5-helper-fcns.r program, which will load all helper functions for datasets and displays into your global environment.
source(file.path(path$programs, "pilot5-helper-fcns.r"))
- Convert SDTM JSON files to rds files. The SDTM files are in JSON transport file format and need to be converted to rds files to run the ADaM programs.
Run the following code:
sdtm_files <- list.files(
  path = file.path(path$sdtm),
  pattern = "\\.json$",
  full.names = TRUE
)
convert_json_to_rds(sdtm_files, output_dir = file.path(path$sdtm))
- Execute ADaM programs in the order below:
adsl.r
adadas.r
adae.r
adlbc.r
adtte.r
You can use the following command to execute each ADaM program; just change the program name in the command. Rds files will be created for each ADaM in the adamdata folder and in your global environment.
source(file.path(path$programs, "adsl.r"))
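Rather than editing the command for each program, the programs can be sourced in a loop that preserves the listed order. The following is a minimal sketch; the helper name `run_programs` is hypothetical and is not defined in pilot5-helper-fcns.r.

```r
# ADaM programs in the execution order given above.
adam_programs <- c("adsl.r", "adadas.r", "adae.r", "adlbc.r", "adtte.r")

# Hypothetical helper: source each program in order and stop early
# if a program file is missing. source() evaluates in the global
# environment by default, matching the single-command approach above.
run_programs <- function(program_dir, programs) {
  for (prog in programs) {
    prog_path <- file.path(program_dir, prog)
    if (!file.exists(prog_path)) stop("Program not found: ", prog_path)
    message("Running ", prog)
    source(prog_path)
  }
  invisible(programs)
}

# Usage, once `path` has been created by .Rprofile:
# run_programs(path$programs, adam_programs)
```

The same helper could be reused for the display programs by passing their file names instead.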
- Execute Display programs in the order below:
tlf-demographic.r
tlf-efficacy.r
tlf-kmplot.r
tlf-primary.r
As with the ADaMs, you can run this command to execute each display program. The newly generated display outputs will be available in the pilot5-tlfs folder.
source(file.path(path$programs, "tlf-demographic.r"))
10 Appendix 2
Cross-check if anything has changed from Pilot 3 to Pilot 5 for QC Findings https://github.com/RConsortium/submissions-pilot5-datasetjson/wiki/QC-Findings