Converting data from wide to long (using multiple columns) [duplicate]

Question

This question already has an answer here:

Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers)

I currently have a large amount of data that looks similar to this:

cid dyad f1 f2 op1 op2 ed1 ed2 junk 
1   2    0  0  2   4   5   7   0.876
1   5    0  1  2   4   4   3   0.765

and so on.

I would like to get it into a long data frame that looks like this:

cid dyad f op ed junk  id
1   2    0 2  5  0.876 1
1   2    0 4  7  0.876 2
1   5    0 2  4  0.765 1
1   5    1 4  3  0.765 2

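I have tried using the gather() function as well as the reshape() function, but I cannot figure out how to create multiple columns rather than collapsing all of the columns into a single long column.

Any help is appreciated.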

Answer 1

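You can use the base reshape() function to (roughly) simultaneously melt multiple sets of variables, by using the varying parameter and setting direction to "long".

For example, here you are supplying a list of three "sets" (vectors) of variable names to the varying argument: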

dat <- read.table(text="
cid dyad f1 f2 op1 op2 ed1 ed2 junk 
1   2    0  0  2   4   5   7   0.876
1   5    0  1  2   4   4   3   0.765
", header=TRUE)

reshape(dat, direction="long", 
        varying=list(c("f1","f2"), c("op1","op2"), c("ed1","ed2")), 
        v.names=c("f","op","ed"))

You end up with this:

    cid dyad  junk time f op ed id
1.1   1    2 0.876    1 0  2  5  1
2.1   1    5 0.765    1 0  2  4  2
1.2   1    2 0.876    2 0  4  7  1
2.2   1    5 0.765    2 1  4  3  2

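Notice that two variables get created in addition to the three collapsed sets: an $id variable, which tracks the row number in the original table (dat), and a $time variable, which corresponds to the position of each original variable within its set (1 for f1, op1, ed1; 2 for f2, op2, ed2). There are also now nested row names (1.1, 2.1, 1.2, 2.2), which here are just the values of $id and $time for that row.

Without knowing exactly what you are trying to track, it is hard to say whether $id or $time is what you want to use as the row identifier, but they are both there.

It might also be useful to play around with the timevar and idvar parameters (you can set timevar to NULL, for example).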

reshape(dat, direction="long", 
        varying=list(c("f1","f2"), c("op1","op2"), c("ed1","ed2")), 
        v.names=c("f","op","ed"), 
        timevar="id1", idvar="id2")
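To get from that result to the exact layout asked for in the question, one option is to sort by the two helper columns and rename $time. This is only a minimal sketch built on the same example data; it assumes the "id" column wanted in the question is the within-set index (what reshape() calls time), and the variable name long is just illustrative.

```r
# Same example data as above
dat <- read.table(text = "
cid dyad f1 f2 op1 op2 ed1 ed2 junk
1   2    0  0  2   4   5   7   0.876
1   5    0  1  2   4   4   3   0.765
", header = TRUE)

long <- reshape(dat, direction = "long",
                varying = list(c("f1", "f2"), c("op1", "op2"), c("ed1", "ed2")),
                v.names = c("f", "op", "ed"))

# Sort so the rows that came from the same original row sit together,
# and keep the columns in the order the question asked for
long <- long[order(long$id, long$time),
             c("cid", "dyad", "f", "op", "ed", "junk", "time")]

# Assumption: the question's "id" is the within-set index, i.e. reshape()'s time
names(long)[names(long) == "time"] <- "id"
rownames(long) <- NULL

long
```

If you need the original row number instead, keep reshape()'s own id column under a different name before renaming time.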

Answer 2

The tidyr package can solve this problem using the functions gather, separate and spread:

df <- read.table(header=TRUE, text="cid dyad f1 f2 op1 op2 ed1 ed2 junk
1   2    0  0  2   4   5   7   0.876
1   5    0  1  2   4   4   3   0.765")

library(tidyr)

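# All in one pipeline.
# Note: sep = -1 splits before the last character in tidyr >= 0.8.2;
# older versions needed sep = -2 for the same split.
print(df %>%
  gather(name, value, -c(cid, dyad, junk)) %>%
  separate(name, into = c("name", "id"), sep = -1) %>%
  spread(key = name, value = value)
)


# Step by step:
# collect the f, op and ed columns into name/value pairs, keeping cid, dyad and junk
df <- gather(df, name, value, -c(cid, dyad, junk))
# separate the trailing number (id) from the variable name
df <- separate(df, name, into = c("name", "id"), sep = -1)
# make it wide again, with one column per variable name
df <- spread(df, key = name, value = value)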
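
In newer versions of tidyr (1.0.0 and later), gather() and spread() are superseded by pivot_longer() and pivot_wider(), and the same reshape can be done in a single call. The following is a sketch rather than part of the original answer; it assumes every measurement column is named as lowercase letters followed by digits, as in the example data, and the name long is only illustrative.

```r
library(tidyr)

df <- read.table(header = TRUE, text = "cid dyad f1 f2 op1 op2 ed1 ed2 junk
1   2    0  0  2   4   5   7   0.876
1   5    0  1  2   4   4   3   0.765")

# ".value" tells pivot_longer() that the first captured group (f, op, ed)
# names the output columns; the second group (the digits) goes into "id".
long <- pivot_longer(
  df,
  cols = -c(cid, dyad, junk),
  names_to = c(".value", "id"),
  names_pattern = "([a-z]+)([0-9]+)"
)

long
```

Here id comes back as a character column; wrap it in as.integer() if you need a number.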
