What are SDTM supplemental qualifiers?
Another blog post, on the 9 SDTM mapping scenarios you need to know, ended on a cliffhanger: the supplemental qualifier. Supplemental qualifiers hold variables from non-CDISC datasets that cannot be mapped to a standard SDTM variable. If you’re interested in learning more about SDTM, you can read our blog post All you need to know about SDTM.
SDTM does not allow new variables to be added to a standard domain. If a user has additional data for a domain that cannot be stored in the standard variables, a supplemental qualifier dataset must be used. This dataset is separate from the “parent” domain, and its vertical structure lets the user add supplemental data in a “variable name – variable value” format, one row per value.
Here is an SDTM dataset, DM (Demographics):
So, in DM we have the standard SDTM data:
- STUDYID – this is the Study ID
- DOMAIN – the dataset domain code
- USUBJID – the unique subject identifier
- AGE, SEX, RACE – the patient’s age, sex and race
In addition, we have two variables that aren’t included in the SDTM: ITTFL and PPROTFL. These are the Intent-to-Treat and Per-Protocol population flags.
In this instance, we create a supplemental qualifier dataset, SUPPDM, which is shown below. The naming convention for supplemental qualifier datasets must be adhered to: the name always begins with “SUPP”, followed by the two-character code of the SDTM domain it was created for. So, as in this example, the supplemental qualifier dataset for DM is SUPPDM. Creating a supplemental qualifier dataset for the domain EX (Exposure) would result in SUPPEX.
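The naming rule is mechanical enough to express in a few lines. The Python sketch below is purely illustrative; the function name `supp_dataset_name` is hypothetical and not part of any CDISC tooling.

```python
def supp_dataset_name(domain: str) -> str:
    """Return the supplemental qualifier dataset name for a two-character domain code."""
    code = domain.strip().upper()
    if len(code) != 2 or not code.isalnum():
        raise ValueError(f"expected a two-character domain code, got {domain!r}")
    return "SUPP" + code


print(supp_dataset_name("DM"))  # SUPPDM
print(supp_dataset_name("EX"))  # SUPPEX
```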
Each supplemental qualifier dataset contains 10 variables. Variables in a supplemental qualifier domain are either “required” or “expected”. There are five key variables that reference a specific record in its parent domain and five Q-variables that contain the supplemental data itself.
The key variables are:
- STUDYID – this is the Study ID
- RDOMAIN – Related domain
- USUBJID – The unique subject identifier
- IDVAR – Variable which identifies the related records (usually the Sequence variable)
- IDVARVAL – The value of IDVAR (in SUPPDM, IDVAR and IDVARVAL are blank because the SDTM dataset DM contains only one observation per subject, so USUBJID is sufficient to reference the records)
The supplemental data is:
- QNAM – The variable name
- QLABEL – The variable label
- QVAL – The data value
- QORIG – The origin (CRF/derived, etc.)
- QEVAL – The evaluator
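To make the vertical structure concrete, here is a minimal Python sketch that moves the two population flags out of a DM-like record and into SUPPDM-style rows. It is an illustration under stated assumptions rather than a production mapping: the helper `build_supp`, the sample study and subject identifiers, the QLABEL text, and the QORIG value "DERIVED" are all made up for the example.

```python
# The 10 supplemental qualifier variables, in the order listed above.
SUPP_COLUMNS = ["STUDYID", "RDOMAIN", "USUBJID", "IDVAR", "IDVARVAL",
                "QNAM", "QLABEL", "QVAL", "QORIG", "QEVAL"]

# Non-standard variables to move out of DM: QNAM -> (QLABEL, QORIG).
# Labels and origins are illustrative assumptions.
NON_STANDARD = {
    "ITTFL": ("Intent-to-Treat Population Flag", "DERIVED"),
    "PPROTFL": ("Per-Protocol Population Flag", "DERIVED"),
}


def build_supp(records, non_standard, idvar=""):
    """Turn wide non-standard variables into one SUPP row per value."""
    rows = []
    for rec in records:
        for qnam, (qlabel, qorig) in non_standard.items():
            value = rec.get(qnam)
            if value in (None, ""):
                continue  # nothing collected, so no supplemental row
            rows.append({
                "STUDYID": rec["STUDYID"],
                "RDOMAIN": rec["DOMAIN"],
                "USUBJID": rec["USUBJID"],
                # Blank for DM: USUBJID alone identifies the parent record.
                "IDVAR": idvar,
                "IDVARVAL": str(rec[idvar]) if idvar else "",
                "QNAM": qnam,
                "QLABEL": qlabel,
                "QVAL": str(value),
                "QORIG": qorig,
                "QEVAL": "",
            })
    return rows


# Made-up sample record; not taken from any real study.
dm = [{"STUDYID": "STUDY01", "DOMAIN": "DM", "USUBJID": "STUDY01-001",
       "AGE": 34, "SEX": "F", "RACE": "ASIAN", "ITTFL": "Y", "PPROTFL": "N"}]

suppdm = build_supp(dm, NON_STANDARD)
for row in suppdm:
    print([row[col] for col in SUPP_COLUMNS])
```

For a domain with several records per subject, the same helper could be called with `idvar` set to the sequence variable so that each supplemental row points back at one parent record.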
Each domain that has non-SDTM standard variables needs a supplemental qualifier dataset. So, if there are 10 datasets that contain non-SDTM standard variables, 10 supplemental qualifier datasets would be created.
In the course of researching this blog, I stumbled upon the codelist for reporting the race of a patient and noticed that the NCI preferred terms only included American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White.
So what happens in a situation where a patient is, say, Australian Aborigine? The SDTM Implementation Guide answers this question perfectly with the following example.
DM Example 3 – Multiple Race Choices
In this example, the subject is permitted to check all applicable races.
Row 1 (DM) and Row 1 (SUPPDM): Subject 001 checked “Other, Specify” and entered “Brazilian” as race.
Row 2 (DM) and Rows 2, 3, 4, 5 (SUPPDM): Subject 002 checked 3 races, including an “Other, Specify” value. The 3 values are reported in SUPPDM using QNAM values RACE1 – RACE3. The specified information describing other race is submitted in the same manner as for subject 001.
Row 3 (DM): Subject 003 refused to provide information on race.
Row 4 (DM): Subject 004 checked “Asian” as their only race.
In the SDTM dataset dm.xpt, we see the demographics of Patients 001 – 004.
Here, Patient 001 chose “Other”. Because their specified race does not match any term in the codelist, a supplemental qualifier is created to store the additional data. We can see from suppdm.xpt that the patient specified “Brazilian”.
Patient 002 has a much more diverse background. They ticked “Black or African American,” “American Indian or Alaska Native,” and “Other”, and in the “Other” field they specified “Aborigine”.
Patient 003 has not answered the question, and Patient 004 has chosen the SDTM-compliant “Asian,” so their information is not included in the supplemental qualifier.
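The logic behind this example can be sketched in Python. This is a hedged illustration, not the Implementation Guide's own algorithm: the function `map_race`, the study identifier, the DM.RACE values "OTHER" and "MULTIPLE", the QNAM `RACEOTH`, the QLABEL text, and QORIG "CRF" are assumptions made for the sketch. Only the QNAM values RACE1 – RACE3 come from the example text. How a refusal is represented is not stated above, so the empty string used for Subject 003 is a placeholder.

```python
def map_race(usubjid, checked, other_text="", studyid="STUDY01"):
    """Return (DM.RACE value, list of SUPPDM rows) for one subject."""

    def supp(qnam, qlabel, qval):
        # IDVAR and IDVARVAL stay blank: DM has one record per subject.
        return {"STUDYID": studyid, "RDOMAIN": "DM", "USUBJID": usubjid,
                "IDVAR": "", "IDVARVAL": "", "QNAM": qnam, "QLABEL": qlabel,
                "QVAL": qval, "QORIG": "CRF", "QEVAL": ""}

    rows = []
    if not checked:
        race = ""  # placeholder for "refused"; an assumption, see lead-in
    elif len(checked) == 1:
        race = checked[0]
    else:
        race = "MULTIPLE"  # assumed DM.RACE value when several boxes are ticked
        for i, value in enumerate(checked, start=1):
            rows.append(supp(f"RACE{i}", f"Race {i}", value))
    if "OTHER" in checked and other_text:
        # The free-text "Other, Specify" entry gets its own row.
        rows.append(supp("RACEOTH", "Race, Other", other_text.upper()))
    return race, rows


subjects = {
    "001": (["OTHER"], "Brazilian"),
    "002": (["BLACK OR AFRICAN AMERICAN",
             "AMERICAN INDIAN OR ALASKA NATIVE", "OTHER"], "Aborigine"),
    "003": ([], ""),
    "004": (["ASIAN"], ""),
}

results = {sid: map_race(sid, checked, other)
           for sid, (checked, other) in subjects.items()}
for sid, (race, rows) in results.items():
    print(sid, repr(race), [(r["QNAM"], r["QVAL"]) for r in rows])
```

With these inputs, Subject 001 gets one supplemental row and Subject 002 gets four, which matches the row counts described in the example, while Subjects 003 and 004 get none.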
Creating SUPPQUAL datasets can be challenging and time-consuming. But if you have datasets containing variables that cannot be mapped to standard SDTM variables and you want to make them SDTM-compliant, it’s necessary!