I have two tables. One is called enrollment and the other is called students.
enrollment:
student_id class_id
004 compsci
010 compsci
002 engli
010 engli
003 engli
students:
student_id first_name
002 Abril
004 Chris
010 Davis
003 Riley
I need an output table like the one below, containing only class_id and first_name, built by matching student_id between the two tables above.
class_id first_name
compsci Davis
compsci Chris
engli Abril
engli Davis
engli Riley
Data:
enrollment <- data.frame(student_id = c(004, 010, 002, 010, 003), class_id = c("compsci", "compsci", "engli", "engli", "engli"))
student <- data.frame(student.id = c(002, 004, 010, 003), first_name = c("Abril", "Chris", "Davis", "Riley"))
With dplyr:
library(dplyr)
#Code
new <- student %>% left_join(enrollment,by = c('student.id'='student_id')) %>% select(-student.id)
Output:
first_name class_id
1 Abril engli
2 Chris compsci
3 Davis compsci
4 Davis engli
5 Riley engli
You can use the merge function, i.e.:
colnames(student)[1] <- 'student_id'
joined <- merge(enrollment,student,by='student_id')
joined <- joined[,c('class_id','first_name')]
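The renaming step above can be skipped by telling merge which key column to use on each side. The following is a minimal sketch, assuming the data frames are defined as in the question (with the key named student_id in enrollment and student.id in student); by.x and by.y are standard merge arguments.

```r
# Data as given in the question
enrollment <- data.frame(
  student_id = c(4, 10, 2, 10, 3),
  class_id = c("compsci", "compsci", "engli", "engli", "engli"),
  stringsAsFactors = FALSE
)
student <- data.frame(
  student.id = c(2, 4, 10, 3),
  first_name = c("Abril", "Chris", "Davis", "Riley"),
  stringsAsFactors = FALSE
)

# Join on differently named key columns without renaming either one
joined <- merge(enrollment, student, by.x = "student_id", by.y = "student.id")

# Keep only the two requested columns, in the requested order
joined <- joined[, c("class_id", "first_name")]
print(joined)
```

By default merge sorts the result by the key column, so the row order may differ from the order in enrollment even though the set of rows is the same.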
You can also use a data.table approach:
library(data.table)
setDT(enrollment)[student, first_name := i.first_name, on=.(student_id==student.id)][, -'student_id']
class_id first_name
<char> <char>
1: compsci Chris
2: compsci Davis
3: engli Abril
4: engli Davis
5: engli Riley
We can use match from base R:
enrollment$first_name <- student$first_name[match(enrollment$student_id,
student$student.id)]
enrollment[c('class_id', 'first_name')]
Output:
# class_id first_name
#1 compsci Chris
#2 compsci Davis
#3 engli Abril
#4 engli Davis
#5 engli Riley
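One caveat about the sample data: R reads literals such as 004 and 010 as the numbers 4 and 10, so the leading zeros shown in the tables are lost. The sketch below assumes the IDs should keep their zero-padded form and therefore stores them as character strings; the names enrollment_chr and student_chr are hypothetical and used only for illustration. The match lookup itself is unchanged.

```r
# Hypothetical variant of the question's data with IDs stored as strings
enrollment_chr <- data.frame(
  student_id = c("004", "010", "002", "010", "003"),
  class_id = c("compsci", "compsci", "engli", "engli", "engli"),
  stringsAsFactors = FALSE
)
student_chr <- data.frame(
  student.id = c("002", "004", "010", "003"),
  first_name = c("Abril", "Chris", "Davis", "Riley"),
  stringsAsFactors = FALSE
)

# For each enrollment row, find the position of its ID in student_chr
idx <- match(enrollment_chr$student_id, student_chr$student.id)

# Look up the first name at that position
enrollment_chr$first_name <- student_chr$first_name[idx]

result <- enrollment_chr[c("student_id", "class_id", "first_name")]
print(result)
```

If an ID in the enrollment table had no counterpart in the student table, match would return NA for that row and first_name would be NA there.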