7 Type 1 Input Files

The documentation pages linked below provide all the information needed to build a Sentinel Routine Querying System package using the Calculate Background Rates cohort identification strategy. Table 7.1 (a) lists required input files, Table 7.1 (b) lists optional files used for reporting, and Table 7.1 (c) lists additional optional files. Specifications for each listed file can be found in Section 7.1.

Note: to read the documentation in logical order, make selections from left to right.

Table 7.1: Type 1 Input Files

(a) Type 1 Input Files—Required

Cohort File	Type 1 File	Monitoring File
Cohort Codes File	User-defined Strata Levels Lookup Table	QRP Parameters File

(b) Type 1 Input Files—Optional Reporting Files

Report Parameters File

Inclusion/Exclusion Codes File	Covariate Codes File	Risk Score File
Utilization File	Stockpiling File	Most Frequent Utilization File
Combo Codes Input File	Combo Input File	Combo Stockpiling File

7.1 Type 1 Input File Specifications

7.1.1 Cohort File

The Cohort File is required. It is used to define enrollment and demographic requirements, select the type of cohort identification strategy for the request, and indicate if extraction should be restricted to individuals for whom medical records may be requested. Table 7.2 contains detailed specifications for this file.

Table 7.2: Cohort File Specifications

Parameter	Field Name	Description
Name of Cohort (Scenario)	COHORTGRP	Standardized name used to differentiate cohorts. Notes Multiple cohorts can be defined within the same Cohort File. In this case all cohorts are queried independently and results are reported separately and labeled using each COHORTGRP name specified. COHORTGRP is the primary key linking cohorts across input files; COHORTGRP values in the COHORTFILE must match (including case) GROUP values in other input files. COHORTGRP values must remain consistent during the course of a surveillance activity. It is recommended to limit the value of COHORTGRP to a maximum of 20 characters. No special characters (e.g., commas, periods, hyphens, spaces, etc.) allowed, and underscores must be used to mark spaces. COHORTGRP values must be lowercase. Format: SAS character $40 Example: `insulin`
Coverage Type Requirement	COVERAGE	Indicates medical and drug coverage type requirements for the cohort. Valid values are: M: only enrollment spans with at least medical coverage should be considered by the QRP algorithm D: only enrollment spans with at least drug coverage should be considered by the QRP algorithm MD: only enrollment spans with both medical and drug coverage should be considered by the QRP algorithm (default value) Notes Users must specify multiple COHORTGRPs if different COVERAGE requirements are needed. The type of coverage specified is used when creating continuous enrollment periods and assessing cohort eligibility requirements. If the COVERAGE value is left blank or contains invalid values (i.e., values other than `M`, `D`, or `MD`), the QRP algorithm will consider only enrollment spans with both medical and drug coverage by default. Format: SAS character $2 Example: `MD`
Enrollment Gap	ENROLGAP	Sets the maximum number of days that will be bridged between two consecutive enrollment periods to create a ‘continuously enrolled’ period. For example, if ENROLGAP=30 and a member is eligible for medical and drug coverage in periods 1/1/2007-3/27/2007 and 4/1/2007-12/21/2007 (i.e., a 4-day gap between two consecutive enrollment episodes), the member will be considered continuously enrolled from 1/1/2007 to 12/21/2007. Any gap in enrollment greater than 30 days will result in a new enrollment period, and all the days in the gap will be considered un-enrolled. Notes A gap of 45 days is recommended for most uses. Multiple continuous enrollment periods per member may be assessed. Format: Numeric Example: `45` (gaps less than or equal to 45 days will be ‘bridged’ to form one ‘continuously enrolled’ sequence)
Minimum Pre-Index Enrollment Days	ENRDAYS	Optional parameter to specify the number of days of continuous enrollment required before the index date. Notes ENRDAYS should be left blank for a Type 4 query, and T4PREGENRDAYS (below) should be specified instead. For all other query types, if left blank, ENRDAYS will default to 0. ENRDAYS is a timeline label. If ENRDAYS = 1, then the member must be enrolled at least 1 day prior to the index date. This parameter requires users to specify an enrollment requirement that is greater than or equal in duration to any washout period, covariate evaluation window, exclusion criteria, minimum cumulative dose look back window, most-frequent utilization window, secondary episode observation windows, or HDPS window specified. The value of ENRDAYS must meet the following criteria. If any criterion is violated, QRP execution will abort: If a Type 1 File is specified: ENRDAYS ≥ T1WASHPER. If a Type 2 File is specified: ENRDAYS ≥ T2WASHPER and ENRDAYS ≥ T2FUPWASHPER and ENRDAYS ≥ T2CUMDOSEPER. If a Type 3 File is specified: ENRDAYS ≥ T3WASHPER and ENRDAYS ≥ T3FUPWASHPER (if control window is after exposure) and ENRDAYS ≥ |T3CTRLFROM-T3FUPWASHPER| (if control window is before exposure). If a Type 5 File is specified: ENRDAYS ≥ T5WASHPER. If a Type 6 File is specified: ENRDAYS ≥ T6WASHPER. If an Inclusion Codes File is specified: for Types 1, 2, 3, 5, 6: ENRDAYS ≥ |CONDFROM| (when CONDFROM <0, if specified). If a Covariate Codes File is specified: ENRDAYS ≥ |COVFROM| (when COVFROM <0). If a Most Frequent Utilization File is specified: ENRDAYS ≥ |MFUFROM| (when MFUFROM <0). If the PSA module is being utilized and a HDPS analysis is requested: ENRDAYS ≥ |HDPSWINFROM| (when HDPSWINFROM <0). If a Multiple Events File is specified: ENRDAYS ≥ |OBSFROM| (when OBSFROM <0 and OBSFROMANCHOR = Index). The following criterion is NOT applied to the primary episode and will NOT cause QRP execution to abort, but is applied when computing the Multiple Event episodes. If a primary episode doesn’t meet this criterion, it is included in the original CIDA cohort, but is not considered for the Multiple Events cohort: ENRDAYS ≥ |OBSFROM| + length of episode (when OBSFROM <0 and OBSFROMANCHOR = EpisodeEnd). If an Overlap File is specified: ENRDAYS ≥ |OBSFROM| (when OBSFROM <0 and OBSFROMANCHOR = Index). The following criterion is NOT applied to the primary episode and will NOT cause QRP execution to abort, but is applied when computing the Overlap episodes. If a primary episode doesn’t meet this criterion, it is included in the original CIDA cohort, but is not considered for the Overlap cohort: ENRDAYS ≥ |OBSFROM| + length of episode (when OBSFROM <0 and OBSFROMANCHOR = EpisodeEnd). In a manufacturer-level product utilization and switching patterns cohort identification strategy (Type 6 analysis), ENRDAYS is only assessed when computing incident episodes. Prevalent episodes do not have an enrollment requirement. In a signal identification analysis (one in which ANALYSIS=TREE): in a Type 2 analysis performed in conjunction with a propensity-score matched analysis, ENRDAYS ≥ TREEWASHPER; in a Type 3 (self-controlled risk interval design) analysis, ENRDAYS ≥ TREEWASHPER (if observation window is after exposure) and ENRDAYS ≥ |T3CTRLFROM-TREEWASHPER| (if observation window is before exposure). Format: Numeric Example: `365`
Maternal Enrollment Days in Relation to Estimated Pregnancy Start	T4PREGENRDAYS	Type 4 parameter to specify the number of days the mother is required to be enrolled in relation to the estimated pregnancy start. The mother must be continuously enrolled from then until the date of pregnancy outcome (live birth delivery date or non-live birth pregnancy outcome date). T4PREGENRDAYS = -90 requires enrollment from 90 days prior to estimated pregnancy start until the pregnancy outcome. T4PREGENRDAYS = 0 requires enrollment from the estimated pregnancy start to the pregnancy outcome. T4PREGENRDAYS = 196 requires enrollment from 196 days after estimated pregnancy start until the pregnancy outcome. In this case, mothers are not required to be enrolled during the first two trimesters. Notes T4PREGENRDAYS is a timeline label and is bi-directional. If the length of the gestational period is shorter than the T4PREGENRDAYS period, then the mother is only required to be enrolled on the day of the pregnancy outcome. If no pre-outcome enrollment is required, then T4PREGENRDAYS should be set to missing; no enrollment will be enforced. T4PREGENRDAYS must cover the entire evaluation window (prior to the pregnancy outcome) for all specified inclusion/exclusion criteria (CONDFROM), covariates (COVFROM), high dimensional propensity score (HDPSWINFROM) and most frequent utilization (MFUFROM) criteria. For inferential queries, enrollment will be assessed during the exposure assessment period and any pre-pregnancy outcome HOI assessment period. The patient will be excluded if the enrollment requirement is not met. This parameter requires users to specify an enrollment requirement that is greater than or equal in duration to any washout period, covariate evaluation window, exclusion criteria, minimum cumulative dose look back window, most-frequent utilization window, secondary episode observation windows, or HDPS window specified. The value of T4PREGENRDAYS must meet the following criteria. If any criterion is violated, QRP execution will abort: If a Covariate Codes File is specified: T4PREGENRDAYS ≥ |COVFROM| (when COVFROM <0). If a Most Frequent Utilization File is specified: T4PREGENRDAYS ≥ |MFUFROM| (when MFUFROM <0). If the PSA module is being utilized and a HDPS analysis is requested: T4PREGENRDAYS ≥ |HDPSWINFROM| (when HDPSWINFROM <0). Format: Numeric Example: `-180`
Minimum Post-Index Required Days	REQDAYSAFTIND	Optional parameter to specify the number of days a patient is required to be in the data after the index date. Note that the program does not, by default, require patients to be in the data if the assessment period for covariates, exclusion criteria, most frequent utilization analyses, or high dimensional propensity score calculation extends beyond the index date. If requiring patients to be in the data post-index is desired, REQDAYSAFTIND must be specified for the appropriate duration. Notes May be left blank if patients are not required to be in the data post-index. REQDAYSAFTIND is a timeline label. If REQDAYSAFTIND = 1, then the member must be in the data the day after the index date. The post-index requirement is assessed after censoring due to evidence of death, disenrollment, end of study period, and end of data, when applicable. If conducting a Type 4 analysis, REQDAYSAFTIND will specify the number of days mothers and linked infants are required to be in the data after the pregnancy outcome. Format: Numeric Example: `183`
Cohort Identification Strategy Indicator	TYPE	Indicates cohort identification strategy to be performed. Valid values are: 1: Type 1 (background rate cohort identification strategy) 2: Type 2 (exposures and follow-up time cohort identification strategy) 3: Type 3 (self-controlled risk interval design cohort identification strategy) 4: Type 4 (pregnancy episodes cohort identification strategy) 5: Type 5 (medical product utilization cohort identification strategy) 6: Type 6 (product utilization and switching analysis) Notes The corresponding Type File must be specified and included in the program package. For example, if TYPE = 1, a TYPE1FILE must be specified and included in the program package. Format: Numeric Example: `1`
Chart Availability Restriction Indicator	CHARTRES	Indicates if extraction should exclude members for whom medical charts cannot be requested for the entire study period. Valid values are: Y: exclude the members for whom the charts cannot be requested in the entire study period N: do not exclude the members for whom the charts cannot be requested in the entire study period Notes If CHARTRES=`Y` the program will exclude individuals with at least one enrollment span with the SCDM variable Chart=`N` during the study period. Format: SAS character $1 Example: `N`
Sex criteria to apply to cohort	SEX	Optional parameter to restrict cohort to only specified Sex values. Blank will ensure that all Sex values are included in analyses with the exception of Type 4 analyses. Valid values are: A: ambiguous F: female M: male U: unknown Notes Valid values will be separated by a space. For a Type 4 analysis, if SEX is left blank, the pregnant cohort (and non-pregnant cohort if applicable) will be restricted to individuals with SEX = Female. Linked infants will not be restricted by sex. Restriction on sex of the infant should be made in the MILCOHORT file. Format: SAS character $7 Example: `F M A U`
Race criteria to apply to cohort	RACE	Optional parameter to restrict cohort to only specified Race values. Blank will ensure that all Race values are included in analyses. Valid values are: 0: Unknown 1: American Indian or Alaska Native 2: Asian 3: Black or African American 4: Native Hawaiian or Other Pacific Islander 5: White M: Multi-racial Notes Valid values will be separated by a space. For a Type 4 analysis, restriction by RACE values does not ensure that identification of non-pregnant matched episodes is performed within the values specified for RACE. Format: SAS character $13 Example: `2 3 4`
Hispanic criteria to apply to cohort	HISPANIC	Optional parameter to restrict cohort to only specified Hispanic values. Blank will ensure that all Hispanic values are included in analyses. Valid values are: N: no U: unknown Y: yes Notes Valid values will be separated by a space. For a Type 4 analysis, restriction by HISPANIC values does not ensure that identification of non-pregnant matched episodes is performed within the values specified for HISPANIC. Format: SAS character $5 Example: `Y N`
Age Groups	AGESTRAT	Age group categories for reporting. Specifying this parameter will (1) restrict to certain age groups and (2) specify how age groups will be stratified in result tables. For example, to have results stratified by 20-year increments for members 40-99 years of age, enter AGESTRAT=40-59 60-79 80-99. Valid values for the unit of time are: D: days W: weeks Q: quarters M: months Y: years (default value) Notes For Type 1, Type 2, Type 3, Type 5, and Type 6 analyses, age is calculated at index date. For Type 4 analysis, age is calculated at date of pregnancy outcome. If no unit of time is specified, AGESTRAT will default to years. Lower value is binding. If AGESTRAT=0-5 5-10, then all 5-year-olds will be placed in the second age group. If AGESTRAT=0-5 6-10, then all 5-year-olds will be placed in the first age group. For example, to have results stratified by 6-month increments for the first two years of life and then by 2-year increments until the age of 6, AGESTRAT = 00M-05M 06M-11M 12M-17M 18M-23M 02Y-03Y 04Y-05Y needs to be entered. Using an open-ended age category (e.g., 85+) imposes an age ceiling of 110 years. If age >110 is desired, the final age category ceiling must be specified (e.g., 85-125). Age groups must be mutually exclusive (i.e., non-overlapping). When constructing age categories that only include one age, the lower and upper values are equal. For example, 00M-<01M, 01M-<02M, 02M-<03M should be specified as 00M-00M 01M-01M 02M-02M. For PSA, age groups should be the same for both the exposure and control COHORTGRPs. If left blank, AGESTRAT will default to 00-01 02-04 05-09 10-14 15-18 19-21 22-44 45-64 65-74 75+ in years. Format: SAS character $100 Example: `40-59 60-79 80-99`
Produce Baseline Table	CREATEBASELINE	Indicates whether to produce a baseline table for corresponding COHORTGRP. Valid values are: Y: baseline table will be produced for corresponding COHORTGRP N: baseline table will NOT be produced for corresponding COHORTGRP Notes CREATEBASELINE must be “Y” for any group specified in the PS Estimation file or the Covariate Stratification file. If requesting stratification by a COVARNUM, then CREATEBASELINE must be “Y” for all groups for which the user requires to report that stratification. For COHORTGRPs where CREATEBASELINE = Y, enrollment is enforced for the length of the baseline characteristic lookback period. For COHORTGRPs where CREATEBASELINE = N, enrollment is NOT enforced for the length of the baseline characteristic lookback period. If conducting a Type 4 non-inferential analysis without Mother Infant Linkage (MIL), baseline characteristics are captured for each cohort (i.e., scenario). In order to obtain characteristics for patients exposed to a particular medical product, inclusion criteria for that exposure must be applied to a scenario. If conducting a Type 4 non-inferential analysis with MIL or an inferential analysis (with or without MIL), baseline characteristics are captured for each medical product exposure group. Format: SAS character $1 Example: `Y`
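
The ENROLGAP row in Table 7.2 describes how consecutive enrollment spans are bridged. The following is a minimal Python sketch of that rule, not the QRP implementation (which is a SAS program). The function name `bridge_enrollment` and the representation of spans as inclusive `(start, end)` date pairs are assumptions made for illustration.

```python
from datetime import date


def bridge_enrollment(spans, enrolgap):
    """Merge enrollment spans separated by a gap of at most `enrolgap` days.

    `spans` is an iterable of (start, end) date pairs, both inclusive.
    Hypothetical helper for illustration only.
    """
    merged = []
    for start, end in sorted(spans):
        if merged:
            prev_start, prev_end = merged[-1]
            # Number of un-enrolled days strictly between the two spans.
            gap = (start - prev_end).days - 1
            if gap <= enrolgap:
                # Bridge the gap: extend the previous continuous period.
                merged[-1] = (prev_start, max(prev_end, end))
                continue
        # Gap exceeds ENROLGAP (or this is the first span): new period.
        merged.append((start, end))
    return merged


# The example from the ENROLGAP description: a 4-day gap with ENROLGAP=30.
spans = [(date(2007, 1, 1), date(2007, 3, 27)),
         (date(2007, 4, 1), date(2007, 12, 21))]
bridged = bridge_enrollment(spans, 30)
```

With ENROLGAP=30 the two spans collapse into one period from 1/1/2007 to 12/21/2007; with a value smaller than 4 they would remain two separate enrollment periods.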

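The ENRDAYS criteria in Table 7.2 that apply to a Type 1 request can be checked before a package is submitted. The sketch below covers only the Type 1 washout, inclusion/exclusion, covariate, and most frequent utilization checks; the function name `enrdays_violations` and its keyword arguments are hypothetical, and `None` is assumed to stand for a blank or unspecified value.

```python
def enrdays_violations(enrdays, t1washper=None, condfrom=None,
                       covfrom=None, mfufrom=None):
    """Return the ENRDAYS criteria violated by a Type 1 specification.

    Hypothetical pre-check for illustration; QRP itself aborts execution
    when one of these criteria is violated.
    """
    # For non-Type 4 queries a blank ENRDAYS defaults to 0.
    enrdays = 0 if enrdays is None else enrdays
    violations = []

    # ENRDAYS must cover the Type 1 washout period when one is given.
    if t1washper is not None and enrdays < t1washper:
        violations.append("ENRDAYS >= T1WASHPER")

    # Lookback windows constrain ENRDAYS only when they start before
    # the index date (i.e., the *FROM value is negative).
    windows = (("CONDFROM", condfrom), ("COVFROM", covfrom),
               ("MFUFROM", mfufrom))
    for name, value in windows:
        if value is not None and value < 0 and enrdays < abs(value):
            violations.append(f"ENRDAYS >= |{name}|")
    return violations


ok = enrdays_violations(365, t1washper=365, covfrom=-365)
bad = enrdays_violations(183, t1washper=365, covfrom=-365, condfrom=30)
```

Here `ok` is empty, while `bad` reports the washout and covariate window criteria; the positive CONDFROM value is ignored because that window starts after the index date.
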
7.1.2 Type 1 File

The Type 1 File is optional in general; it must be specified only for a background rate calculation cohort identification strategy. Options include selecting the number of events an individual can contribute to the request, the number of days before index date to assess incidence criteria, whether to truncate enrollment at death date, and whether to output a table characterizing reason for censoring eligibility. Table 7.3 contains detailed specifications for this file.

Table 7.3: Type 1 File Specifications

Parameter	Field Name	Description
Name of Cohort	GROUP	Standardized name used to differentiate cohorts. Notes Multiple cohorts can be defined within the same Type 1 File. In this case all cohorts are queried independently and results are reported separately and labeled using each GROUP name specified. GROUP is the primary key linking cohorts across input files; GROUP values must match (including case) between the TYPE1FILE and other input files. GROUP values must begin with a letter or underscore. GROUP values must be lowercase. Format: SAS character $40; no special characters (e.g., commas, periods, hyphens, spaces, etc.) allowed, and underscores must be used to mark spaces. Example: `insulin`
Allowed Number of Index Dates per Individual	T1COHORTDEF	Indicates how many index dates an individual can contribute. Options include: 01: Cohort includes only the first valid index date per individual during the query period. 02: Cohort includes all valid index dates per individual during the query period. Notes T1COHORTDEF parameter is used in conjunction with the T1WASHPER variable (below) to define valid index date(s). Format: SAS character $2 Example: `01`
Type 1 Index Washout Period	T1WASHPER	Length of washout period in days. The washout period is a period before an index date during which an individual cannot have evidence of index-defining criteria (see Cohort Codes File specification for additional details on index-defining criteria). Notes Special case: when T1WASHPER is missing, the program requires ENRDAYS of continuous enrollment but only considers an exposure/event valid if, at index date, the member has no evidence of the exposure in their entire available enrollment history. Format: Numeric Example: `365`
Censor Enrollment at Evidence of Death	CENSOR_DTH	Indicates if enrollment should be censored based on death date. Allowable values are `Y` and `N`. Date of death can be determined two ways: Where `Discharge_Status` = `EX` in the SCDM Encounter table, date of death is set to discharge date. Using death date in the SCDM Death table for records where `Confidence` = `E`. Notes In cases where a death date appears in both the Death and Encounter tables, the date from the Death table will be used. Censoring is implemented by restricting enrollment eligibility. Member eligibility is truncated at death date. Once a death date is observed, a member can no longer contribute eligible periods (even if they are observed in the data). Format: SAS character $1 Example: `Y`
Categories for Follow-up Time	CENSOR_OUTPUT_CAT	Indicates ranges (in days) for stratification variable CENSDAYS_VALUE in [RUNID]_censor_cida.sas7bdat output. Notes Leave blank if only continuous values of CENSDAYS_VALUE are desired. If this field is left blank, output stratified by CENSDAYS_VALUE in [RUNID]_censor_cida will have one category that includes all values of CENSDAYS_VALUE. For each cohort in the TYPE1FILE, CENSOR_OUTPUT_CAT must specify the same value. Format: SAS character—length can vary Example: `0-364 365-729 730-1094 1095+`
Output Denominator Indicator	OUTPUTDENOM	Indicates whether to calculate denominator variables DENNUMPTS (number of patients eligible to have at least one index date) and DENNUMMEMDAYS (number of days that patients are eligible to have an index date). Valid values are: Y: DENNUMPTS and DENNUMMEMDAYS variables will be calculated and populated in [RUNID]_t1_cida.sas7bdat. M: DENNUMPTS variable will be calculated and populated in [RUNID]_t1_cida.sas7bdat. DENNUMMEMDAYS variable will NOT be calculated and will be set to missing in [RUNID]_t1_cida.sas7bdat. N: DENNUMPTS and DENNUMMEMDAYS variables will NOT be calculated and will be set to missing in [RUNID]_t1_cida.sas7bdat. Notes If MINRXDAYS is specified as an inclusion/exclusion criterion, OUTPUTDENOM will be automatically set to N. When performing a Type 1 or 2 analysis that calculates denominators (e.g., eligible members and eligible member-days) and requesting stratifications, the CIDA module will only populate denominators for the pre-defined stratification variables (see Table 2 in the User-defined Strata Levels Lookup Table documentation). Format: SAS character $1 Example: `Y`
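
The interaction between T1COHORTDEF and T1WASHPER in Table 7.3 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: enrollment (ENRDAYS), query period boundaries, and censoring are ignored, the washout window is assumed to span the T1WASHPER days immediately before the candidate index date, and the function name `valid_index_dates` is hypothetical.

```python
from datetime import date, timedelta


def valid_index_dates(event_dates, t1washper, t1cohortdef):
    """Select index dates for one individual (illustrative sketch only).

    `event_dates` are dates with evidence of the index-defining criteria.
    `t1washper` is the washout length in days, or None when missing.
    `t1cohortdef` is "01" (first valid index date) or "02" (all valid).
    """
    if t1cohortdef not in ("01", "02"):
        raise ValueError("T1COHORTDEF must be '01' or '02'")

    events = sorted(set(event_dates))
    valid = []
    for i, candidate in enumerate(events):
        prior = events[:i]
        if t1washper is None:
            # Special case: no prior evidence anywhere in the history.
            clean = not prior
        else:
            # Assumed window: the t1washper days before the candidate.
            window_start = candidate - timedelta(days=t1washper)
            clean = not any(window_start <= p < candidate for p in prior)
        if clean:
            valid.append(candidate)

    # "01" keeps only the first valid index date; "02" keeps all of them.
    return valid[:1] if t1cohortdef == "01" else valid


events = [date(2010, 1, 1), date(2010, 6, 1), date(2012, 1, 1)]
all_valid = valid_index_dates(events, 365, "02")
first_only = valid_index_dates(events, 365, "01")
```

In this example the 6/1/2010 event falls inside the 365-day washout of the 1/1/2010 event and is rejected, while the 1/1/2012 event has a clean washout window and qualifies as a second index date when T1COHORTDEF is `02`.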