Trying to merge two datasets of different lengths (using combine())

Question (votes: 0, answers: 2)

These are the two datasets in question:

> head(Housing_Training)
  Id MSSubClass LotFrontage LotArea OverallQual OverallCond YearBuilt YearRemodAdd MasVnrArea TotalBsmtSF GrLivArea FullBath
1  1         60          65    8450           7           5      2003         2003        196         856      1710        2
2  2         20          80    9600           6           8      1976         1976          0        1262      1262        2
3  3         60          68   11250           7           5      2001         2002        162         920      1786        2
4  4         70          60    9550           7           5      1915         1970          0         756      1717        1
5  5         60          84   14260           8           5      2000         2000        350        1145      2198        2
6  6         50          85   14115           5           5      1993         1995          0         796      1362        1
  HalfBath BedroomAbvGr KitchenAbvGr TotRmsAbvGrd Fireplaces GarageYrBlt GarageCars GarageArea WoodDeckSF OpenPorchSF MoSold YrSold
1        1            3            1            8          0        2003          2        548          0          61      2   2008
2        0            3            1            6          1        1976          2        460        298           0      5   2007
3        1            3            1            6          1        2001          2        608          0          42      9   2008
4        0            3            1            7          1        1998          3        642          0          35      2   2006
5        1            4            1            9          1        2000          3        836        192          84     12   2008
6        1            1            1            5          0        1993          2        480         40          30     10   2009
  SalePrice
1    208500
2    181500
3    223500
4    140000
5    250000
6    143000
> head(Housing_Testing)
  ï..Id MSSubClass LotFrontage LotArea OverallQual OverallCond YearBuilt YearRemodAdd MasVnrArea TotalBsmtSF GrLivArea FullBath
1  1001         20          74   10206           3           3      1952         1952          0           0       944        1
2  1002         30          60    5400           5           6      1920         1950          0         691       691        1
3  1003         20          75   11957           8           5      2006         2006         53        1574      1574        2
4  1004         90          NA   11500           5           6      1976         1976        164        1680      1680        2
5  1005        120          43    3182           7           5      2005         2006         16        1346      1504        2
6  1006         80          65    8385           5           8      1977         1977        220         985       985        2
  HalfBath BedroomAbvGr KitchenAbvGr TotRmsAbvGrd Fireplaces GarageYrBlt GarageCars GarageArea WoodDeckSF OpenPorchSF MoSold YrSold
1        0            2            1            4          0        1956          2        528          0           0      7   2009
2        0            2            1            4          0        1920          1        216          0          20      1   2007
3        0            3            1            7          1        2006          3        824        144         104      7   2008
4        0            4            2            8          0        1976          2        528          0           0      6   2007
5        0            1            1            7          1        2005          2        457        156           0      5   2009
6        0            3            1            6          0        1977          1        328        210           0     11   2008
  SalePrice
1     82000
2     86000
3    232000
4    136905
5    181000
6    149900

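I am trying to combine them.

The training dataset has 1000 rows and the testing dataset has 460 rows.

All I really want is to stack them so that I end up with 1460 rows.

The assignment says to use the combine function.

When I combine the two datasets, I get this: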

> combine(Housing_Training,Housing_Testing)
Error in `$<-.data.frame`(`*tmp*`, "layout", value = list(l = integer(0),  : 
  replacement has 1001 rows, data has 1000
In addition: Warning message:
'combine' is deprecated.
Use 'gtable_combine' instead.
See help("Deprecated") 

Then I tried:

> gtable_combine(Housing_Training,Housing_Testing)
Error in `$<-.data.frame`(`*tmp*`, "layout", value = list(l = integer(0),  : 
  replacement has 1001 rows, data has 1000
r merge dataset
2 Answers
0 votes

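According to ?dplyr::combine (using dplyr 1.0.0):

combine() is deprecated in favour of vctrs::vec_c(). combine() attempted to automatically guess whether you wanted c() or unlist(), but could fail in surprising ways. We now believe it's better to be explicit.

So combine() is meant for combining vectors. Assuming we need to bind the rows of the two data frames, one option is bind_rows: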

bind_rows(Housing_Training,Housing_Testing, .id = 'grp')
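
As a rough illustration of what that call does, here is a minimal sketch on toy data. The two small data frames are made-up stand-ins for Housing_Training and Housing_Testing (only the row counts match the question), and it assumes dplyr is installed.

```r
library(dplyr)

# Toy stand-ins for Housing_Training (1000 rows) and Housing_Testing (460 rows);
# the column values are invented for illustration only.
train <- data.frame(Id = 1:1000, SalePrice = rep(150000, 1000))
test  <- data.frame(Id = 1001:1460, SalePrice = rep(120000, 460))

# Stack the rows; .id adds a column recording which input each row came from.
combined <- bind_rows(train, test, .id = "grp")

nrow(combined)       # 1460
table(combined$grp)  # 1000 rows labelled "1", 460 rows labelled "2"
```

Note that bind_rows() matches columns by name. In the question, the first column is called Id in one data frame and ï..Id in the other, so without renaming it the result would likely contain both columns, each filled with NA for the rows from the other data frame.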

0 votes

The teacher said:

You might want to use the bind_rows() or rbind() function instead. Please try it and let me know. Thanks.

rbind() worked like a charm.
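
For anyone who lands here later, a minimal base R sketch of the rbind() approach on toy data. The data frames and the column name BadId are hypothetical; BadId stands in for the mangled ï..Id header shown in the question, which usually comes from a byte-order mark in the CSV file (reading the file with read.csv(..., fileEncoding = "UTF-8-BOM") is one common way to avoid it).

```r
# Toy stand-ins for the two data sets; "BadId" is a hypothetical placeholder
# for the mangled first column name seen in head(Housing_Testing).
train <- data.frame(Id = 1:1000, SalePrice = rep(150000, 1000))
test  <- data.frame(BadId = 1001:1460, SalePrice = rep(120000, 460))

# rbind() on data frames requires matching column names, so this attempt fails.
attempt <- try(rbind(train, test), silent = TRUE)
inherits(attempt, "try-error")  # TRUE

# Give the test set the same first column name as the training set, then stack.
names(test)[1] <- "Id"
combined <- rbind(train, test)

dim(combined)  # 1460 rows, 2 columns
```

Unlike bind_rows(), rbind() stops with an error when the column names differ, so a mismatch like this gets noticed immediately.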

(facepalm)
