Suppose I have two datasets:
data DB1;
input ID :$20. Reference_date :date9. Discharge :date9.;
format Reference_date Discharge date9.;
cards;
0001 14JUN2017 19JUN2017
0002 12MAR2016 17MAR2016
0003 01MAY2016 05MAY2016
0004 19MAR2017 22MAR2017
0005 10MAR2017 22MAR2017
0007 10OCT2015 14OCT2015
;
data DB2;
input ID :$20. Discharge_new :date9.;
format Discharge_new date9.;
cards;
0001 21JUN2017
0002 13MAR2016
0003 04MAY2016
0004 19MAR2017
0005 22MAR2017
0006 27JUN2022
0007 18OCT2015
;
Is there a way to check, for each ID, whether Discharge_new from DB2 falls within the interval from Reference_date to Discharge (inclusive: >=, <=) in DB1? If it does, add Flag = 1 (otherwise Flag = 0) to DB2 to get DB3. Note that some IDs in DB2 are not present in DB1; for those, Flag should remain missing. I don't know how to compare dates that come from two different datasets.
Thank you in advance.
Desired output:
data DB3;
input ID :$20. Discharge_new :date9. Flag :$20.;
format Discharge_new date9.;
cards;
0001 21JUN2017 0
0002 13MAR2016 1
0003 04MAY2016 1
0004 19MAR2017 1
0005 22MAR2017 1
0006 27JUN2022 .
0007 18OCT2015 0
;
This assumes your data are sorted by ID and contain no duplicate IDs (both of which are true of your sample data).
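If the real data are not already sorted, or you are unsure whether IDs repeat, a preparation step along these lines may help before merging. This is only a sketch: db1_unique and db1_dups are hypothetical dataset names introduced here for illustration, not part of the question.

```sas
* Sort both datasets by the merge key;
proc sort data=db1; by id; run;
proc sort data=db2; by id; run;

* Optional check for repeated IDs in db1. The first row per ID goes to
  db1_unique and any further rows go to db1_dups. Both names are hypothetical;
proc sort data=db1 out=db1_unique nodupkey dupout=db1_dups;
  by id;
run;
```

If db1_dups comes out empty, the one-row-per-ID assumption holds for db1; the same check can be repeated for db2.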
Use the (in = ) dataset option in the merge to create indicator variables that tell you whether a row is present in the respective dataset.
data db3;
  merge db1 (in = indb1) db2 (in = indb2);
  by id;
  * only keep rows from db2;
  if indb2;
  * only assign a flag of 1 or 0 if there is a match in db1;
  * (otherwise flag stays missing);
  if indb1 then do;
    * inclusive check: reference_date <= discharge_new <= discharge;
    if reference_date <= discharge_new <= discharge then flag = 1;
    else flag = 0;
  end;
  keep id discharge_new flag;
run;
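As an alternative that does not depend on the sort order, the same check could be written as a left join in PROC SQL. This is a sketch rather than part of the answer above: db3_sql is a hypothetical output name, and it still assumes at most one row per ID in db1 (duplicates there would produce several output rows for the same ID).

```sas
proc sql;
  create table db3_sql as
  select b.id,
         b.discharge_new,
         case
           /* no matching ID in db1: leave the flag missing */
           when a.id is null then .
           /* BETWEEN is inclusive on both ends */
           when b.discharge_new between a.reference_date and a.discharge then 1
           else 0
         end as flag
  from db2 as b
       left join db1 as a
       on a.id = b.id
  order by b.id;
quit;
```

The left join keeps every row of db2, which plays the same role as `if indb2;` in the data step, and the `a.id is null` branch corresponds to the case where `indb1` is false.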