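I have a table like this (below).

Gene_name | Sample_name | Gene_fraction
---|---|---
IGHV1-11 | sample_1 | 0.00057491
IGHV1-12 | sample_2 | 0.0044843
IGHV1-15 | sample_3 | 0.01253306
IGHV1-18 | sample_4 | 0.00942854
IGHV1-19 | sample_5 | 0.01747729
IGHV1-2 | sample_6 | 0.00034495
IGHV1-11 | sample_7 | 0.00103484
IGHV1-13 | sample_8 | 0.01517765
IGHV1-16 | sample_9 | 0.00758882
IGHV1-18 | sample_10 | 0.00827872

How can I convert the table above into a wide layout (one row per sample, one column per gene) in R?
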
df <- data.frame(
  Gene_name = c("IGHV1-11", "IGHV1-12", "IGHV1-15", "IGHV1-18", "IGHV1-19",
                "IGHV1-2", "IGHV1-11", "IGHV1-13", "IGHV1-16", "IGHV1-18"),
  Sample_name = c("sample_1", "sample_2", "sample_3", "sample_4", "sample_5",
                  "sample_6", "sample_7", "sample_8", "sample_9", "sample_10"),
  Gene_fraction = c(0.00057491, 0.0044843, 0.01253306, 0.00942854, 0.01747729,
                    0.00034495, 0.00103484, 0.01517765, 0.00758882, 0.00827872)
)
# In tidyverse
library(tidyverse)
df1 <- df %>%
  mutate(Sample_name = ifelse( # your logic for adding the MT/WT suffix
    as.numeric(gsub("sample_", "", Sample_name)) %in% c(4, 7, 9, 10, 11),
    paste0(Sample_name, "MT"),
    paste0(Sample_name, "WT")
  )) %>%
  pivot_wider(names_from = Gene_name, # pivot to wide format
              values_from = Gene_fraction,
              values_fill = 0)
# or in base R
df2 <- df
# Modify sample names
df2$Sample_name <- ifelse(as.numeric(sub("sample_", "", df2$Sample_name)) %in% c(4, 7, 9, 10, 11),
                          paste0(df2$Sample_name, "MT"),
                          paste0(df2$Sample_name, "WT"))
# Pivot to wide format
df2 <- reshape(df2, idvar = "Sample_name", timevar = "Gene_name",
               direction = "wide", v.names = "Gene_fraction")
# Strip the "Gene_fraction." prefix that reshape() adds to the new columns
colnames(df2) <- sub("^Gene_fraction\\.", "", colnames(df2))
# Replace NA with 0
df2[is.na(df2)] <- 0
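
Desired output:

Sample_name | IGHV1-11 | IGHV1-12 | IGHV1-15 | IGHV1-18 | IGHV1-19 | IGHV1-2 | IGHV1-13 | IGHV1-16
---|---|---|---|---|---|---|---|---
sample_1WT | 0.00057491 | 0 | 0 | 0 | 0 | 0 | 0 | 0
sample_2WT | 0 | 0.0044843 | 0 | 0 | 0 | 0 | 0 | 0
sample_3WT | 0 | 0 | 0.01253306 | 0 | 0 | 0 | 0 | 0
sample_4MT | 0 | 0 | 0 | 0.00942854 | 0 | 0 | 0 | 0
sample_5WT | 0 | 0 | 0 | 0 | 0.01747729 | 0 | 0 | 0
sample_6WT | 0 | 0 | 0 | 0 | 0 | 0.00034495 | 0 | 0
sample_7MT | 0.00103484 | 0 | 0 | 0 | 0 | 0 | 0 | 0
sample_8WT | 0 | 0 | 0 | 0 | 0 | 0 | 0.01517765 | 0
sample_9MT | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00758882
sample_10MT | 0 | 0 | 0 | 0.00827872 | 0 | 0 | 0 | 0
sample_11MT | 0 | 0 | 0 | 0 | 0.04679775 | 0 | 0 | 0
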
Should I iterate over each row and append the values to a new data frame?

You can do this in tidyverse or in base R.
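
On the question of iterating row by row: a loop is not required. As one more base R option, `xtabs()` can build the sample-by-gene table in a single call. The following is only a minimal sketch, assuming the same three columns as `df` above, using just its first four rows, and assuming each sample/gene pair occurs at most once (`xtabs()` sums duplicates). The names `wide_tab` and `wide` are made up for illustration, and the MT/WT suffix step is left out.

```r
# Example data with the same columns as df above (first four rows only)
df <- data.frame(
  Gene_name     = c("IGHV1-11", "IGHV1-12", "IGHV1-15", "IGHV1-18"),
  Sample_name   = c("sample_1", "sample_2", "sample_3", "sample_4"),
  Gene_fraction = c(0.00057491, 0.0044843, 0.01253306, 0.00942854)
)

# Cross-tabulate: rows are samples, columns are genes, cells hold the summed
# Gene_fraction. Sample/gene pairs that never occur are filled with 0.
wide_tab <- xtabs(Gene_fraction ~ Sample_name + Gene_name, data = df)

# Turn the table object into a plain data frame with Sample_name as a column
wide <- as.data.frame.matrix(wide_tab)
wide <- cbind(Sample_name = rownames(wide), wide)
rownames(wide) <- NULL
wide
```

Note that `xtabs()` orders rows and columns by sorted factor levels, so with the full data `sample_10` would come before `sample_2`. If the original order matters, convert `Sample_name` to a factor with explicit levels first.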