Merging columns from one data frame into another (left_join not working) - rstudio

Question (Votes: 2, Answers: 2)

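I have two data frames:

df1: my main dataset, with an address column

df2: a database containing latitude and longitude plus an address column

I want to merge the Latitude and Longitude columns from df2 into df1.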

DF1:

ID    VAR1   VAR2   VARX      Address
 1     7      2       x     Road 1, 1234 City
 2     8      0       y     Road 4, 1234 City
 3     6      2       x     Road 5, 1234 City
 4     7      2       x     Road 6, 1234 City
 5     4      1       y     Road 10, 1234 City
 6     1      2       x     Road 11, 1234 City

DF2:

    Address            Latitude      Longitude
Road 1, 1234 City        12,67          56,78
Road 2, 1234 City        12,66          55,67
Road 3, 1234 City        12,45          55,10
Road 4, 1234 City        12,10          55,20
Road 5, 1234 City        11,50          55,30
Road 6, 1234 City        12,34          55,32
Road 7, 1234 City        12,89          55,40
Road 8, 1234 City        12,77          55,45
Road 9, 1234 City        11,67          55,67
Road 10, 1234 City       11,90          55,78
Road 11, 1234 City       11,12          56,59

So my new data frame would look like this:

New data frame, df3:

ID    VAR1   VAR2   VARX      Address            Latitude   Longitude
 1     7      2       x     Road 1, 1234 City     12,67       56,78
 2     8      0       y     Road 4, 1234 City     12,10       55,20
 3     6      2       x     Road 5, 1234 City     11,50       55,30
 4     7      2       x     Road 6, 1234 City     12,34       55,32
 5     4      1       y     Road 10, 1234 City    11,90       55,78
 6     1      2       x     Road 11, 1234 City    11,12       56,59

I tried left_join, but it only returns NA.

df3 <- left_join(df1, df2, by = c("Address"))

Edit: Solved. Apparently one of my address columns contained some stray whitespace. The code above does work.
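The stray-whitespace cause described in the edit can be reproduced and fixed with a small sketch. The data below is made up for illustration (a trailing space is added to one address on purpose), and base R's `merge()` with `all.x = TRUE` stands in for `left_join()` so the example runs without extra packages. The same `trimws()` cleanup should apply before a `dplyr::left_join()` call.

```r
# Hypothetical miniature versions of df1 and df2; the trailing space in
# the second address of df1 is deliberate.
df1 <- data.frame(
  ID = 1:3,
  Address = c("Road 1, 1234 City", "Road 4, 1234 City ", "Road 5, 1234 City"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Address = c("Road 1, 1234 City", "Road 4, 1234 City", "Road 5, 1234 City"),
  Latitude = c("12,67", "12,10", "11,50"),
  stringsAsFactors = FALSE
)

# Left join before cleaning: the address with the trailing space finds no match.
before <- merge(df1, df2, by = "Address", all.x = TRUE)
sum(is.na(before$Latitude))  # 1

# Strip leading and trailing whitespace from the key in both data frames.
df1$Address <- trimws(df1$Address)
df2$Address <- trimws(df2$Address)

after <- merge(df1, df2, by = "Address", all.x = TRUE)
sum(is.na(after$Latitude))   # 0
```

Trimming both key columns, not only the one suspected of being dirty, keeps the join robust if the other table picks up similar noise later.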

r merge left-join
2 Answers

1 vote

left_join should work fine. Take a look at this and check the structure of your data.

df3 <- dplyr::left_join(df1, df2, by = "Address")

yields

  ID VAR1 VAR2 VARX            Address Latitude Longitude
1  1    7    2    x  Road 1, 1234 City    12,67     56,78
2  2    8    0    y  Road 4, 1234 City    12,10     55,20
3  3    6    2    x  Road 5, 1234 City    11,50     55,30
4  4    7    2    x  Road 6, 1234 City    12,34     55,32
5  5    4    1    y Road 10, 1234 City    11,90     55,78
6  6    1    2    x Road 11, 1234 City    11,12     56,59
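When a join returns only `NA`, one way to check the data is to list the keys in `df1` that have no exact counterpart in `df2`. The sketch below uses made-up two-row data frames and only base R; `encodeString()` with a quote character is used here to make otherwise invisible whitespace visible.

```r
# Hypothetical data: the first address in df1 has a leading space.
df1 <- data.frame(Address = c(" Road 1, 1234 City", "Road 4, 1234 City"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Address = c("Road 1, 1234 City", "Road 4, 1234 City"),
                  stringsAsFactors = FALSE)

# Keys present in df1 but absent from df2 (exact string comparison).
unmatched <- setdiff(df1$Address, df2$Address)

# Wrap each string in quotes so leading or trailing spaces show up when printed.
writeLines(encodeString(unmatched, quote = '"'))

# Check whether the mismatch disappears once whitespace is trimmed.
fixable <- trimws(unmatched) %in% trimws(df2$Address)
```

If `fixable` is `TRUE` for every unmatched key, whitespace is the likely culprit; otherwise the keys probably differ in spelling, case, or encoding.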

Data

DF1

df1 <- structure(list(ID = 1:6, VAR1 = c(7L, 8L, 6L, 7L, 4L, 1L), VAR2 = c(2L, 
0L, 2L, 2L, 1L, 2L), VARX = structure(c(1L, 2L, 1L, 1L, 2L, 1L
), .Label = c("x", "y"), class = "factor"), Address = structure(c(1L, 
4L, 5L, 6L, 2L, 3L), .Label = c("Road 1, 1234 City", "Road 10, 1234 City", 
"Road 11, 1234 City", "Road 4, 1234 City", "Road 5, 1234 City", 
"Road 6, 1234 City"), class = "factor")), .Names = c("ID", "VAR1", 
"VAR2", "VARX", "Address"), class = "data.frame", row.names = c(NA, 
-6L))

DF2

df2 <- structure(list(Address = structure(c(1L, 4L, 5L, 6L, 7L, 8L, 
9L, 10L, 11L, 2L, 3L), .Label = c("Road 1, 1234 City", "Road 10, 1234 City", 
"Road 11, 1234 City", "Road 2, 1234 City", "Road 3, 1234 City", 
"Road 4, 1234 City", "Road 5, 1234 City", "Road 6, 1234 City", 
"Road 7, 1234 City", "Road 8, 1234 City", "Road 9, 1234 City"
), class = "factor"), Latitude = structure(c(9L, 8L, 7L, 5L, 
2L, 6L, 11L, 10L, 3L, 4L, 1L), .Label = c("11,12", "11,50", "11,67", 
"11,90", "12,10", "12,34", "12,45", "12,66", "12,67", "12,77", 
"12,89"), class = "factor"), Longitude = structure(c(10L, 7L, 
1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L), .Label = c("55,10", "55,20", 
"55,30", "55,32", "55,40", "55,45", "55,67", "55,78", "56,59", 
"56,78"), class = "factor")), .Names = c("Address", "Latitude", 
"Longitude"), class = "data.frame", row.names = c(NA, -11L))

1 vote

Base R function

merge(df1, df2, by = "Address")

yields

             Address ID VAR1 VAR2 VARX Latitude Longitude
1  Road 1, 1234 City  1    7    2    x    12,67     56,78
2 Road 10, 1234 City  5    4    1    y    11,90     55,78
3 Road 11, 1234 City  6    1    2    x    11,12     56,59
4  Road 4, 1234 City  2    8    0    y    12,10     55,20
5  Road 5, 1234 City  3    6    2    x    11,50     55,30
6  Road 6, 1234 City  4    7    2    x    12,34     55,32
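
Note that `merge()` performs an inner join by default and sorts the result by the key, which is why the rows above are ordered by `Address` rather than by `ID`. A minimal sketch of getting closer to `left_join()` behaviour, using made-up miniature data (including a hypothetical "Road 99" address with no match): `all.x = TRUE` keeps unmatched rows of `df1`, and reordering by `ID` afterwards restores the original row order.

```r
# Hypothetical data: the third address in df1 has no counterpart in df2.
df1 <- data.frame(
  ID = 1:3,
  Address = c("Road 4, 1234 City", "Road 10, 1234 City", "Road 99, 1234 City"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  Address = c("Road 4, 1234 City", "Road 10, 1234 City"),
  Latitude = c("12,10", "11,90"),
  stringsAsFactors = FALSE
)

inner <- merge(df1, df2, by = "Address")                # drops the unmatched row
left  <- merge(df1, df2, by = "Address", all.x = TRUE)  # keeps it, with NA
left  <- left[order(left$ID), ]                         # back to df1's row order

nrow(inner)  # 2
nrow(left)   # 3
```

With the question's data every address in `df1` has a match, so the inner join and the left join return the same six rows; the difference only shows once a key fails to match.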