/*****************************************************************************
PROGRAM: HP_APR94_MAR15.SAS
TITLE: Defining the Hypertension Cohort
CREATED BY: Interdisciplinary Chronic Disease Collaboration (ICDC)/
            Alberta Kidney Disease Network (AKDN), September 2010
=====================================================================
Updated by Zhihai in Mar. 2016.
======================================================================
TERMS OF USE:
This code was created by the ICDC/AKDN for its own use.
Please be aware that the ICDC/AKDN DOES NOT:
  1) Provide any guarantees about this code
  2) Commit to the availability of any updates to the code
  3) Support any implementation outside of the ICDC/AKDN
Please use this code for information purposes only.
======================================================================
Goal: Identify people in the Alberta Health and Wellness (AHW) administrative data
      from April 1, 1994 to March 31, 2015 with hypertension. This will form a cohort
      of patients with incident and prevalent hypertension.

ICDC/AKDN Definition for Hypertension

DATASETS:
1. Hospitalization Discharge Abstract Database (DAD), Apr. 1994 - Mar. 2015,
   all ICD diagnostic fields
2. Supplemental Enhanced Service Event (SESE) Database (Physician claims),
   Apr. 1994 - Mar. 2015, all ICD diagnostic fields
   Visits with diagnostic prefix "X", "B" or "E" excluded
   (These represent labs, radiology, ophthalmologist, dental, and chiropractic visits).

CASE DEFINITION:
One hospitalization (ICD-9-CM or ICD-10-CA) excluding gestational hypertension
OR
Two physician claims (excluding claims containing a CCPx procedure code with
prefix 'X','B',or 'E') on separate dates with an ICD-9-CM code within two years
excluding gestational hypertension.
Gestational hypertension is excluded prior to applying the case definition
(i.e. records where the hypertension event falls on the same day as, or within the
150 days before, a gestational event (either hospitalization or physician claim)
are excluded.)
The date of hypertension onset is defined as the first date of hospitalization for
hypertension (excluding gestational hypertension), or the earlier of the two
physician claims (excluding gestational hypertension), whichever date is earliest.

CODES:
See Hypertension documentation for ICD-9-CM and ICD-10-CA codes for
hypertension/gestational hypertension.

HOSPITALIZATION TYPE:
Any diagnosis type, selected from all available fields, for both hypertension
hospitalizations and gestational hospitalizations

DATES TO BE USED:
All available data.

DIAGNOSIS DATE:
The date of hypertension onset is defined as the first date of hospitalization or
the earlier of the two physician claims, whichever date is earliest.

WASH-OUT PERIOD:
There will be a wash-out period of at least 3 years. At least one diagnosis of
hypertension present during the wash-out period excludes a patient as an incident case.
If the date of onset hypertension (by case definition) occurs on or after
April 1, 1997, hypertension will be classified as an incident case.
If the date of onset hypertension (by case definition) occurs between April 1, 1994
and March 31, 1997, there will not be a wash-out period of at least 3 years available.
In this case, hypertension will be considered prevalent.

USAGE NOTES (for this program):
The hp_94_15_cohort dataset includes the patients with hypertension identified from
the claims (no claims containing a procedure code with prefix X,B, or E) or
hospitalizations using any diagnosis type from all available fields.
This dataset will form a cohort of patients with incident and prevalent hypertension.
The dataset includes both adults and non-adults at the time of onset hypertension.
The dataset includes a DATA_SOURCE variable to identify the data source of the
hypertension case.
PHN is the unique identifier used to link the administrative data.
******************************************************************************/

options threads=yes nofmterr mlogic helpbrowser=sas;
* HELPBROWSER=SAS sets the SAS browser to default in the 64-bit OS --;

* Location of the AHW administrative datasets (Hospitalizations and claims) --;
libname ahw 'G:\IDENTIFIED\AHW\EXTRACTION_1994_2015\DATA\CleanDataCombined_2013_2015\DataCombined_2013_2015';

* Location to store the HP_94_15_cohort dataset --;
libname hp 'G:\IDENTIFIED\AHW\EXTRACTION_1994_2015\DATA\DERIVED_DATA\AHW_HYPERTENSION\DATA';

* Location of the demographics dataset created by the ICDC/AKDN. Includes information on
  gender, and dates of birth, death, and out-migration from Alberta --;
libname demog 'G:\IDENTIFIED\AHW\EXTRACTION_1994_2015\DATA\DERIVED_DATA\AHW_DEMOGRAPHICS\DEMOGRAPHICS\DATA';

*********************************;
* MACROS USED IN THIS PROGRAM -- ;
*********************************;

* MACRO for sorting by &srtvar --;
%macro srt(inds,outds,srtvar);
  proc sort data=&inds out=&outds;
    by &srtvar;
  run;
%mend;

* MACRO for first record of &frstvar --;
%macro first(inds,outds,srtvar,frstvar);
  data &outds;
    set &inds;
    by &srtvar;
    if first.&frstvar;
  run;
%mend;

* MACRO for defining Fiscal Year from April 1994 - March 2015 --;
%macro fy(in_t=,out_t=,date=);
  data &out_t;
    set &in_t;
    if not missing(&date) then do;
      if &date<='31MAR1994'd then FY='----/1994';
      if '01APR1994'd<=&date<='31MAR1995'd then FY='1994/1995';
      if '01APR1995'd<=&date<='31MAR1996'd then FY='1995/1996';
      if '01APR1996'd<=&date<='31MAR1997'd then FY='1996/1997';
      if '01APR1997'd<=&date<='31MAR1998'd then FY='1997/1998';
      if '01APR1998'd<=&date<='31MAR1999'd then FY='1998/1999';
      if '01APR1999'd<=&date<='31MAR2000'd then FY='1999/2000';
      if '01APR2000'd<=&date<='31MAR2001'd then FY='2000/2001';
      if '01APR2001'd<=&date<='31MAR2002'd then FY='2001/2002';
      if '01APR2002'd<=&date<='31MAR2003'd then FY='2002/2003';
      if '01APR2003'd<=&date<='31MAR2004'd then FY='2003/2004';
      if '01APR2004'd<=&date<='31MAR2005'd
        then FY='2004/2005';
      if '01APR2005'd<=&date<='31MAR2006'd then FY='2005/2006';
      if '01APR2006'd<=&date<='31MAR2007'd then FY='2006/2007';
      if '01APR2007'd<=&date<='31MAR2008'd then FY='2007/2008';
      if '01APR2008'd<=&date<='31MAR2009'd then FY='2008/2009';
      if '01APR2009'd<=&date<='31MAR2010'd then FY='2009/2010';
      if '01APR2010'd<=&date<='31MAR2011'd then FY='2010/2011';
      if '01APR2011'd<=&date<='31MAR2012'd then FY='2011/2012';
      if '01APR2012'd<=&date<='31MAR2013'd then FY='2012/2013';
      if '01APR2013'd<=&date<='31MAR2014'd then FY='2013/2014';
      if '01APR2014'd<=&date<='31MAR2015'd then FY='2014/2015';
      if '01APR2015'd<=&date then FY='2015/----';
    end;
    COUNT=1;
  run;
%mend;

* MACRO to delete records in datasets occurring before dob and after death or outmigration --;
* This macro uses the demographics dataset and SRT macro, defines the AGE variable, and produces
* frequencies of records to be deleted. It also creates the &out_t2 dataset with the records deleted --;
%macro demogcheck(in_t=,out_t1=,out_t2=,sourcedate=);
  data &out_t1;
    merge &in_t (in=in1) demog.ahw_demographics_2015(in=in2);
    by phn;
    if in1;
    age=(&sourcedate-pers_dob)/365.25;
    if (pers_dob^=. and pers_dob>&sourcedate) or pers_dob=. then dobdelete=1;
    if death_date^=. and death_date<&sourcedate then deathdelete=1;
    if out_migrate_date^=.
       and out_migrate_date<&sourcedate then outdelete=1;
    if dobdelete=1 | deathdelete=1 | outdelete=1 then rec_delete=1;
  run;

  proc freq data=&out_t1;
    table dobdelete*deathdelete*outdelete*rec_delete/list missing;
  run;

  %srt(&out_t1,&out_t1,phn);

  data &out_t2;
    set &out_t1;
    if rec_delete=1 then delete;
  run;
%mend;

* Begin ODS OUTPUT File for summary information --;
ods rtf file="G:\IDENTIFIED\AHW\EXTRACTION_1994_2015\DATA\DERIVED_DATA\AHW_HYPERTENSION\OUTPUT\hp_cohortsummary_2015.rtf" bodytitle;

*************************************************;
* 1) Identify Hypertension from Physician Claims ;
*************************************************;

* 1a) Identify hypertension codes using physician claims with no XBE --;
* This code identifies any ICD-9 code for hypertension from
* all 3 diagnosis fields from the physician claims database. It
* creates a variable 'CODE' to identify the code in the earliest
* diagnosis field. Datasets for hypertension and gestational hypertension are created --;

*note: no start_date variable for 2015 claim data;
data claims_noxbe_94_15;
  set ahw.claims_noxbe_94_15;
  if missing(start_date) then start_date=end_date;
run;

data claim_hp (keep=phn hp_claimdt hp_claim hp_claimyr code)
     claim_obst (keep=phn obst_dt obst_claim obst_claimyr);
  LENGTH CODE $7;
  set claims_noxbe_94_15;
  CODE=' ';
  ARRAY dx (*) $ HLTH_DX_ICD9X_CODE_1 HLTH_DX_ICD9X_CODE_2 HLTH_DX_ICD9X_CODE_3;
  do i=1 to DIM(dx);
    if substr(dx(i), 1, 3) in ('401', '402','403','404','405') then do;
      hp_claim = 1;
      if code=' ' then CODE=dx(i);
      hp_claimyr=year(start_date);
      hp_claimdt= start_date;
    end;
    if substr(dx(i), 1, 2) in ('65','66') then do;
      obst_claim = 1;
      obst_claimyr=year(start_date);
      obst_dt= start_date;
    end;
  end;
  format hp_claimdt yymmdd10. obst_dt yymmdd10.;
  if hp_claim=1 then output claim_hp;
  if obst_claim=1 then output claim_obst;
run;

* 1b) Claims must be on separate dates --;
* First, sort by PHN and START_DATE.
* Take first of START_DATE --;
%srt(claim_hp,claim_hp,phn hp_claimdt);
%first(claim_hp,claim_hp2,phn hp_claimdt,hp_claimdt);
%srt(claim_obst,claim_obst,phn obst_dt);
%first(claim_obst,claim_obst2,phn obst_dt,obst_dt);

* 1c) Delete claims after death date/out-migration and claims before dob --;
%demogcheck(in_t=claim_hp2,out_t1=claim_hp3,out_t2=claim_hp4,sourcedate=hp_claimdt)
%demogcheck(in_t=claim_obst2,out_t1=claim_obst3,out_t2=claim_obst4,sourcedate=obst_dt)

proc sort data=claim_hp4 out=test nodupkey;
  by phn;
run;

******************************************************;
* 2) Identify Hypertension from Hospitalization files ;
******************************************************;

* 2a) Create datasets with hospitalizations containing relevant ICD9 codes --;
* MACRO to extract codes for ICD9 dataset --;
* The ICD9 codes from the hospitalizations database are available from 3 datasets covering
* the fiscal years 1994-1997, 1997-1999, and 1999-2002. There are 16 fields for ICD9
* diagnosis codes. This code identifies any ICD-9 code for hypertension from
* all 16 diagnosis fields from the hospitalizations database. It creates a variable 'CODE'
* to identify the code in the earliest diagnosis field. This code also identifies cases
* of gestational hypertension --;
%macro hospicd9(inds,outds,outhp,outobst);
  data &outds;
    set &inds;
    ARRAY dx (*) $ SEPI_DX_MR_ICD9CM SEPI_DX_OTH_ICD9CM_1 - SEPI_DX_OTH_ICD9CM_15;
    ARRAY hos_ICD9_ (*) $ hosp_icd9_1-hosp_icd9_16;
    DO I=1 TO DIM(dx);
      hos_ICD9_(I)=dx(I);
    END;
    drop i;
    keep phn start_date hosp_icd9_1-hosp_icd9_16;
    if start_date=.
      then delete;
  run;

  data &outhp (keep=phn hp_hos hp_hosdt code)
       &outobst (keep=phn obst_hos obst_dt);
    LENGTH CODE $7;
    set &outds;
    ARRAY hos_ICD9_ (*) $ hosp_icd9_1-hosp_icd9_16;
    do i=1 to DIM(hos_ICD9_);
      * Hypertension hospitalization --;
      if substr(hos_icd9_(i), 1, 3) in ('401', '402','403','404','405') then do;
        hp_hos = 1;
        if code=' ' then code=hos_ICD9_(i);
        hp_hosdt = start_date;
      end;
      * Gestational hypertension --;
      if substr(hos_icd9_(i), 1, 2) in ('65','66') then do;
        obst_hos = 1;
        obst_dt = start_date;
      end;
    end;
    format hp_hosdt yymmdd10. obst_dt yymmdd10.;
    if hp_hos=1 then output &outhp;     * Hypertension hospitalizations --;
    if obst_hos=1 then output &outobst; * Gestational hypertension --;
  run;
%mend;

%hospicd9(ahw.hosp_94_97,hosp_94_97,hosp_94_97_hp,hosp_94_97_obst);
%hospicd9(ahw.hosp_97_99,hosp_97_99,hosp_97_99_hp,hosp_97_99_obst);
%hospicd9(ahw.hosp_99_02,hosp_99_02,hosp_99_02_hp,hosp_99_02_obst);

* 2b) Create dataset with hospitalizations containing relevant ICD10 codes --;
* ICD10 codes used since April 2002. This code identifies any ICD-10 code from
* all 25 diagnosis fields from the hospitalizations database. It creates a variable 'CODE'
* to identify the code in the earliest diagnosis field.
* This code also identifies cases
* of gestational hypertension --;
data hospicd10;
  set ahw.hosp_02_15;
  ARRAY dx (*) $ DX1-DX25;
  ARRAY hos_ICD10_ (*) $ hosp_icd10_1-hosp_icd10_25;
  DO I=1 TO DIM(dx);
    hos_ICD10_(I)=dx(I);
  END;
  drop i;
  keep phn start_date hosp_ICD10_1-hosp_ICD10_25;
run;

data hospicd10_hp (keep=phn hp_hos hp_hosdt code)
     hospicd10_obst (keep=phn obst_hos obst_dt);
  LENGTH CODE $7;
  set hospicd10;
  ARRAY hos_ICD10_ (*) $ hosp_ICD10_1-hosp_ICD10_25;
  do i=1 to DIM(hos_ICD10_);
    * Hypertension hospitalizations --;
    if substr(hos_ICD10_(i), 1, 3) in ('I10','I11','I12','I13','I15') then do;
      hp_hos = 1;
      if code=' ' then code=hos_ICD10_(i);
      hp_hosdt = start_date;
    end;
    * Gestational hypertension --;
    if substr(hos_ICD10_(i), 1, 2)='O6' or
       substr(hos_ICD10_(i), 1, 3) in ('O13','O14','O29','O47','O48','O70','O71','O72',
                                       'O73','O74','O75','O80','O81','O82','O83','O84') then do;
      obst_hos = 1;
      obst_dt = start_date;
    end;
  end;
  if hp_hos=1 then output hospicd10_hp;     * Hypertension hospitalizations --;
  if obst_hos=1 then output hospicd10_obst; * Gestational hypertension --;
run;

* 2c) Combine all 4 hospitalization data sources for hypertension and for gestational
      hypertension, and sort by phn/start date --;
data hosp_hp;
  set hosp_94_97_hp hosp_97_99_hp hosp_99_02_hp hospicd10_hp;
  hphosp_yr=year(hp_hosdt);
run;
%srt(hosp_hp,hosp_hp,phn hp_hosdt);
%first(hosp_hp,hosp_hp2,phn hp_hosdt,hp_hosdt);

data hosp_obst;
  set hosp_94_97_obst hosp_97_99_obst hosp_99_02_obst hospicd10_obst;
  obsthosp_yr=year(obst_dt);
run;
%srt(hosp_obst,hosp_obst,phn obst_dt);
%first(hosp_obst,hosp_obst2,phn obst_dt,obst_dt);

* 2d) Delete hospitalizations after death date/out-migration date or before DOB from
      hypertension and gestational hypertension datasets --;
%demogcheck(in_t=hosp_hp2,out_t1=hosp_hp3,out_t2=hosp_hp4,sourcedate=hp_hosdt);
%demogcheck(in_t=hosp_obst2,out_t1=hosp_obst3,out_t2=hosp_obst4,sourcedate=obst_dt);

proc sort data=hosp_hp4 out=test nodupkey;
  by phn;
run;
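
* Optional check (illustrative sketch, not part of the original ICDC/AKDN program) --;
* The PROC SORT NODUPKEY step above leaves one row per PHN in WORK.TEST, so the SAS log
* note for that step reports the number of distinct patients. Assuming that count is what
* the step is meant to show, the same figures can be written to the log explicitly.
* NOPRINT keeps this check out of the open ODS RTF summary file. The macro variable
* names N_HOSP_REC and N_HOSP_PT are hypothetical names chosen for this sketch --;
proc sql noprint;
  select count(*), count(distinct phn)
    into :n_hosp_rec, :n_hosp_pt
    from hosp_hp4;
quit;
%put NOTE: HOSP_HP4 has &n_hosp_rec records for &n_hosp_pt distinct patients.;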
*****************************************;
* 3) Removing gestational hypertension --;
*****************************************;

* 3a) Set gestational records from both claims and hospitalizations--;
data gest;
  set hosp_obst4 claim_obst4;
run;
%srt(gest,gest,phn obst_dt);
%first(gest,gest2,phn obst_dt,obst_dt);

* 3b) Transpose by PHN to obtain Wide format --;
proc transpose data = gest2 out = widegest prefix= obst_dt;
  var obst_dt;
  by phn;
run;
%srt(widegest,widegest,phn);

* 3c) Exclude gestational records from source datasets --;
* MACRO to exclude gestational events (records) from either claims or hospitalization files --;
* NOTE: the array size 86 is hardcoded. It presumably matches the largest number of
* obstetric dates per PHN in WIDEGEST and should be checked against the PROC TRANSPOSE
* output whenever the input data change --;
%macro excl(indt,outdt1,outdt2,sourcedate);
  data &outdt1;
    merge &indt (in=in1) widegest (in=in2);
    by phn;
    if in1;
  run;
  %first(&outdt1,subjb4_&outdt1,phn,phn);

  data &outdt2;
    set &outdt1;
    array obstdt(*) obst_dt1-obst_dt86;
    array diff(*) difftest1-difftest86;
    do dates = 1 to 86;
      diff[dates]=obstdt[dates]-&sourcedate;
      if 0<=diff[dates] and diff[dates]<=150 then flag=1;
    end;
    if flag=1 then delete;
  run;
  %first(&outdt2,subjaft_&outdt2,phn,phn);
%mend;

%excl(hosp_hp4,hosp_hp5,hosp_hp6,hp_hosdt);
%excl(claim_hp4,claim_hp5,claim_hp6,hp_claimdt);

************************************;
* 4) Continue with Claims Algorithm ;
************************************;

* 4a) Delete patients with a single claim (i.e. the criteria require at least 2 physician claims) --;
data claim_hp7;
  set claim_hp6(drop=_NAME_ death death_date out_migrate out_migrate_date PERS_GENDER_CODE
                     dobdelete deathdelete outdelete rec_delete
                     obst_dt1-obst_dt86 difftest1-difftest86 dates flag);
  by phn;
  if first.phn and last.phn then delete;
run;

* 4b) Identify two hypertension claims in 2 years --;
data claim_hp8;
  set claim_hp7;
  by phn hp_claimdt;
  days_diff=dif(hp_claimdt);
  if first.phn then do;
    days_diff=.;
  end;
run;

data claim_hp9 (drop=count);
  set claim_hp8;
  count+1;
  by phn;
  if first.phn then count=1;
  ind=count;
run;

data second_visit;
  set claim_hp9;
  if days_diff^=.
     & days_diff<=(365.25*2);
run;
%srt(second_visit,second_visit,phn hp_claimdt);
%first(second_visit,second_visit2,phn,phn);

data first_visit;
  set second_visit2 (keep=phn ind);
  ind2=ind-1;
  drop ind;
  rename ind2=ind;
run;

data first_visit2;
  merge claim_hp9 (in=in1) first_visit (in=in2);
  by phn ind;
  if in1 and in2;
run;

* 4c) Dataset with both visits where case definition is met --;
data hp_claim (keep=phn casedate yr_hp source code age);
  set first_visit2 second_visit2;
  yr_hp=year(hp_claimdt);
  SOURCE='CLAIM';
  rename hp_claimdt=casedate;
run;

* 4d) Take the date of the first visit for onset of hypertension --;
* Store all of the hypertension cases identified from the physician claims in the HP folder --;
%srt(hp_claim,hp_claim,phn casedate);
%first(hp_claim,hp.pt_hp_claim,phn,phn);

title1 'YEAR WHERE FIRST OF TWO CLAIMS IN 2 YEARS CASE DEFINITION MET';
proc freq data=hp.pt_hp_claim;
  tables yr_hp /list missing;
run;
title1;

*********************************************;
* 5) Continue with Hospitalization Algorithm ;
*********************************************;

* 5) Identify first hosp where case definition is met --;
data hosp_hp7 (keep=phn casedate yr_hp source code age);
  set hosp_hp6(drop=_NAME_ death death_date out_migrate out_migrate_date PERS_GENDER_CODE
                    obst_dt1-obst_dt86 difftest1-difftest86 dates flag);
  yr_hp=year(hp_hosdt);
  rename hp_hosdt=casedate;
  source='HOSP';
run;
%first(hosp_hp7,hp.pt_hp_hosp,phn,phn);

title1 'YEAR WHERE FIRST HOSPITALIZATION CASE DEFINITION MET';
proc freq data=hp.pt_hp_hosp;
  tables yr_hp /list missing;
run;
title1;

********************************************;
* Identify the Onset date of Hypertension --;
********************************************;

* 6) Combine hypertension (by case definition) from physician claims and hospitalizations.
     Take the first record per patient to identify the onset date of hypertension.
     Rename variables. Identify prevalent and incident cases. Identify adults.
--;
data hp;
  set hp.pt_hp_claim hp.pt_hp_hosp;
run;
%srt(hp,hp,phn casedate);

data hp2 (drop=phn rename=(phn2=PHN));
  length phn2 $9;
  set hp;
  by phn casedate;
  if first.phn;
  HP=1;
  if casedate<'01APR1997'd then CASE='PREVALENT';
  if casedate>='01APR1997'd or age<3 then CASE='INCIDENT';
  if age>=18 then ADULT=1;
  else ADULT=0;
  format casedate date9.;
  *label casedate='Date of First Hypertension (by defn)';
  rename casedate=HP_DATE;
  rename source=DATA_SOURCE;
  phn2=phn;
  *label phn2='Personal Health Number';
run;

* 7) Save HP cohort --;
data hp.hp_94_15_COHORT (drop=code) hp_94_15_cohort;
  retain PHN HP HP_DATE DATA_SOURCE CASE;
  set hp2(drop=yr_hp adult age);
run;

************************************************************************;
* DESCRIPTIVE STATISTICS FOR THE HYPERTENSION COHORT, HP_94_15_COHORT --;
************************************************************************;

* Add Fiscal Year --;
%fy(in_t=hp_94_15_cohort,out_t=test,date=hp_date)

title1 "PROC CONTENTS: HP_94_15_COHORT";
proc contents data=hp.hp_94_15_cohort varnum;
run;
title1;

title1 "Frequency of Subjects who are Adults (>=18 yrs) at time of First Hypertension (by Defn)";
proc freq data=hp2;
  table ADULT;
run;
title1;

title1 "Frequency of Prevalent and Incident Hypertension (by Defn)";
proc freq data=test;
  table Case/missing;
run;
title1;

title1 "Frequency of Prevalent and Incident Hypertension (by Defn) by Fiscal Year";
proc freq data=test;
  table Case*fy/list missing;
run;
title1;

title1 "Frequency: First Hypertension (by Defn) by Fiscal Year";
proc freq data=test;
  table fy/missing out=testa;
run;
title1;

title1 "Frequency: Prevalent Hypertension (by Defn) by Fiscal Year";
proc freq data=test(where=(case='PREVALENT'));
  table fy*case/ list missing;
run;
title1;

title1 "Frequency: Incident Hypertension (by Defn) by Fiscal Year";
proc freq data=test(where=(case='INCIDENT'));
  table fy*case/list missing;
run;
title1;

title1 "Frequency: First Hypertension (by Defn)- Data Source (CLAIM, HOSP)";
proc freq
  data=test;
  table data_source/missing;
run;
title1;

title1 "Frequency: First Hypertension (by Defn)- Data Source (CLAIM, HOSP) by Fiscal Year";
proc freq data=test;
  table fy*data_source/missing;
run;
title1;

title1 "Frequency: First Hypertension (by Defn)- ICD Code";
proc freq data=test;
  table code/missing;
run;
title1;

title1 "Frequency: First Hypertension (by Defn)- ICD Code by Fiscal Year";
proc freq data=test;
  table code*fy/missing;
run;
title1;

* Create Graph- First Case Hypertension (by Defn) by Fiscal Year --;
title1 "Graph: First Hypertension (by Defn) by Fiscal Year";
proc gplot data=testa;
  plot count*fy;
  symbol i=join v=circle c=black;
  label count="First Hypertension (by Defn)";
  label fy="Fiscal Year of First Hypertension (by Defn)";
run;
quit;

ods rtf close;

*******************;
* END OF PROGRAM --;
*******************;