Data frame of per-year counts for each label

Question  Votes: -3  Answers: 2

Given a data structure like this:

dtest <- data.frame(label=c("yahoo","google","yahoo","yahoo","google","google","yahoo","yahoo"), year=c(2000,2001,2000,2001,2003,2003,2003,2003))

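How can I derive a new data frame like this, with one row per label and year (including years with no rows) and the number of occurrences in volume: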

doutput <- data.frame(label=c("yahoo","yahoo","yahoo","yahoo","google","google","google","google"), year=c(2000,2001,2002,2003,2000,2001,2002,2003), volume=c(2,1,0,2,0,1,0,2))

> doutput
   label year volume
1  yahoo 2000      2
2  yahoo 2001      1
3  yahoo 2002      0
4  yahoo 2003      2
5 google 2000      0
6 google 2001      1
7 google 2002      0
8 google 2003      2
r
2 Answers
Votes: 1

One way is to use dplyr:

library(dplyr)

dtest %>%
  group_by(label, year) %>%
  tally(name = "volume")

# A tibble: 5 x 3
# Groups:   label [2]
  label   year volume
  <fct>  <dbl>  <int>
1 google  2001      1
2 google  2003      2
3 yahoo   2000      2
4 yahoo   2001      1
5 yahoo   2003      2
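
The result above only lists label/year combinations that actually occur in dtest, so the zero-count rows asked for in the question (for example 2002) are missing. A minimal sketch of one way to fill them in, assuming the tidyr package is also available (it is not used in the answer itself), is to count first and then expand to every year between the minimum and maximum with complete() and full_seq():

```r
library(dplyr)
library(tidyr)  # assumption: tidyr is installed; the original answer only uses dplyr

dtest <- data.frame(
  label = c("yahoo", "google", "yahoo", "yahoo", "google", "google", "yahoo", "yahoo"),
  year  = c(2000, 2001, 2000, 2001, 2003, 2003, 2003, 2003)
)

filled <- dtest %>%
  # one row per observed label/year pair, with the count stored in "volume"
  count(label, year, name = "volume") %>%
  # add every label/year pair for all years from min(year) to max(year);
  # pairs that were not observed get volume = 0
  complete(label, year = full_seq(year, period = 1),
           fill = list(volume = 0L))

filled
```

This should give eight rows (four years for each of the two labels), with rows ordered by label and then year rather than in the order shown in the question.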

Votes: 1

Here is a base R solution:

as.data.frame(table(transform(dtest,
                              year = factor(year, levels = seq(min(year), max(year))))))

Result:

   label year Freq
1 google 2000    0
2  yahoo 2000    2
3 google 2001    1
4  yahoo 2001    1
5 google 2002    0
6  yahoo 2002    0
7 google 2003    2
8  yahoo 2003    2
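
The counts above already include the zero rows, but the count column is called Freq, year is a factor, and the rows are ordered by year rather than by label. A possible follow-up sketch, still in base R, that reshapes the result toward the layout requested in the question (the name counts is only illustrative):

```r
dtest <- data.frame(
  label = c("yahoo", "google", "yahoo", "yahoo", "google", "google", "yahoo", "yahoo"),
  year  = c(2000, 2001, 2000, 2001, 2003, 2003, 2003, 2003)
)

# Treat year as a factor covering every year from min to max, so that
# table() also reports years with no rows (count 0).
counts <- as.data.frame(
  table(transform(dtest,
                  year = factor(year, levels = seq(min(year), max(year))))),
  responseName = "volume",    # name the count column "volume" instead of "Freq"
  stringsAsFactors = FALSE    # return label and year as character, not factor
)
counts$year <- as.numeric(counts$year)  # "2000" -> 2000

# Order rows by label in order of first appearance in dtest, then by year.
counts <- counts[order(match(counts$label, unique(dtest$label)), counts$year), ]
rownames(counts) <- NULL

counts
```

Ordering by first appearance puts yahoo before google here, which matches the row order in the question; sorting by counts$label alone would put google first.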