I have a data frame like this:
df <- structure(list(A = c(2, 3, 1), B = c(3, 2, 1), C = c(4, 5, 1),
D = c(4, 4, 1), Genus = c("Ensifer", "Ensifer", "Ensifer"
)), class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA,
-3L))
      A     B     C     D Genus
  <dbl> <dbl> <dbl> <dbl> <chr>
1     2     3     4     4 Ensifer
2     3     2     5     4 Ensifer
3     1     1     1     1 Ensifer
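This data frame has five columns: four hold numeric values and the fifth holds a name. The same name, Ensifer, is repeated across several rows. I want to collapse those rows into a single row per name, with the values in each column summed, like this: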
      A     B     C     D Genus
  <dbl> <dbl> <dbl> <dbl> <chr>
1     6     6    10     9 Ensifer
I want to do this in R because the data set is very large.
I have already tried the code below, but it takes too long to run:
count <- read.csv("count_data.csv", header=T)
shl <- aggregate(count, by=count['Genus'], sum)
Use the aggregate() function together with a subset of the columns:
count <- read.csv("count_data.csv", header = TRUE)
# Subset the data to only include the columns you need
count_subset <- count[, c("Genus", "Sample1", "Sample2", "Sample3")]
# Use the aggregate() function to group the data by "Genus" and sum the values
count_agg <- aggregate(. ~ Genus, data = count_subset, FUN = sum)
colnames(count_agg) <- c("Genus", "Sample1", "Sample2", "Sample3")
count_agg
Change the sample column names to match the ones in your image.
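Because the code above depends on a CSV file and on placeholder column names, here is a minimal self-contained sketch of the same formula approach applied to the example data from the question. The data frame is rebuilt here as a plain data.frame, and the name df_agg is only illustrative.

```r
# Example data from the question, rebuilt as a plain data frame
df <- data.frame(
  A = c(2, 3, 1),
  B = c(3, 2, 1),
  C = c(4, 5, 1),
  D = c(4, 4, 1),
  Genus = c("Ensifer", "Ensifer", "Ensifer")
)

# ". ~ Genus" means: summarise every other column, grouped by Genus
df_agg <- aggregate(. ~ Genus, data = df, FUN = sum)
df_agg
```

Note that the formula interface of aggregate() drops rows containing NA by default (na.action = na.omit), so check for missing values first if your real data may have them.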
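We can also do it this way:
dplyr: here we apply the sum() function across columns A:D, grouped by Genus. The .by argument requires dplyr 1.1.0 or later.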
library(dplyr)
df %>%
summarise(across(A:D, sum), .by=Genus)
  Genus       A     B     C     D
  <chr>   <dbl> <dbl> <dbl> <dbl>
1 Ensifer     6     6    10     9
Base R 1:
result <- aggregate(df[, 1:4], by = list(df$Genus), FUN = sum)
names(result)[1] <- "Genus"
    Genus A B  C D
1 Ensifer 6 6 10 9
Base R 2:
t(sapply(split(df[, 1:4], df$Genus), colSums))
        A B  C D
Ensifer 6 6 10 9
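Since the question says aggregate() was too slow, one more base R option that may be worth trying is rowsum(), which sums the rows of a numeric matrix within each group and is often faster on large inputs. This is only a sketch on the example data; I have not benchmarked it on your file, and the names sums and result are illustrative.

```r
# Example data from the question, rebuilt as a plain data frame
df <- data.frame(
  A = c(2, 3, 1),
  B = c(3, 2, 1),
  C = c(4, 5, 1),
  D = c(4, 4, 1),
  Genus = c("Ensifer", "Ensifer", "Ensifer")
)

# rowsum() adds up the rows of the numeric columns within each Genus
sums <- rowsum(as.matrix(df[, c("A", "B", "C", "D")]), group = df$Genus)

# Move the group labels from the row names into a regular Genus column
result <- data.frame(Genus = rownames(sums), sums, row.names = NULL)
result
```

As with the split()/colSums() version, the group labels come back as row names, so the last step is only needed if you want Genus as a column.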