Identify unique values in a column and rename them in a new column


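I need some advice on data frame manipulation in R. I am doing a cell clonality analysis and am trying to group cells into expanded or unexpanded clones.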

I have a data frame as follows:

Cell    Ident   Count   Clonality
C1      A       5       Expanded
C2      B       3       Expanded
C3      A       5       Expanded
C4      C       2       Unexpanded
C5      A       5       Expanded
C6      B       3       Expanded
C7      C       2       Unexpanded
C8      A       5       Expanded
C9      A       5       Expanded
C10     B       3       Expanded

For the Clonality column, I wrote a loop that labels rows with Count >= 3 as Expanded and rows with Count < 3 as Unexpanded.

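However, what I actually want is to keep rows with Count < 3 labeled as Unexpanded, and to label rows with Count >= 3 as Expanded #, where # is a number assigned according to the row's Ident.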

I would like my final data frame to look like this:

Cell    Ident   Count   Clonality
C1      A       5       Expanded 1
C2      B       3       Expanded 2
C3      A       5       Expanded 1
C4      C       2       Unexpanded
C5      A       5       Expanded 1
C6      B       3       Expanded 2
C7      C       2       Unexpanded
C8      A       5       Expanded 1
C9      A       5       Expanded 1
C10     B       3       Expanded 2

I think I need to run a loop, but I am not sure how to modify my loop to do this. The loop I am currently using is:

for (n in 1:nrow(df)){
  count <- df$Count[n]
  if (count >= 3){
    df$Clonality[n] <- "Expanded"
  } else {
    df$Clonality[n] <- "Unexpanded"
  }
}

I hope someone can point me in the right direction.

r dataframe
1 Answer

You can do the following:

library(tidyverse)
df %>%
    mutate_if(is.factor, as.character) %>%
    mutate(Clonality = if_else(
        Clonality == "Expanded",
        sprintf("%s %i", Clonality, as.factor(Ident)),
        Clonality))
#   Cell Ident Count  Clonality
#1    C1     A     5 Expanded 1
#2    C2     B     3 Expanded 2
#3    C3     A     5 Expanded 1
#4    C4     C     2 Unexpanded
#5    C5     A     5 Expanded 1
#6    C6     B     3 Expanded 2
#7    C7     C     2 Unexpanded
#8    C8     A     5 Expanded 1
#9    C9     A     5 Expanded 1
#10  C10     B     3 Expanded 2

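Explanation: we modify the entries in Clonality by appending the integer code of the matching factor level of Ident (meaning A => 1, B => 2, and so on), if and only if Clonality == "Expanded".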
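To see where the numbers come from, here is a minimal sketch of the factor-to-integer mapping on a toy vector. The vector `ident` and the names `f`, `codes` and `labels` are made up for this illustration. By default the levels are the sorted unique values, and each element is stored as the integer index of its level.

```r
# Toy vector, not taken from the question's data
ident <- c("A", "B", "A", "C")

f <- factor(ident)
levels(f)                 # "A" "B" "C"

# Integer codes: position of each value in levels(f)
codes <- as.integer(f)
codes                     # 1 2 1 3

# sprintf() with %i formats those integer codes into the label
labels <- sprintf("Expanded %i", codes)
labels
```

Calling `as.integer()` explicitly is not required for the answer's code to work, but it makes clear that the level index, not the level name, ends up in the label.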


Or in base R using transform:

df <- transform(df, Clonality = ifelse(
    Clonality == "Expanded",
    sprintf("%s %i", Clonality, as.factor(Ident)),
    as.character(Clonality)))
df
#   Cell Ident Count  Clonality
#1    C1     A     5 Expanded 1
#2    C2     B     3 Expanded 2
#3    C3     A     5 Expanded 1
#4    C4     C     2 Unexpanded
#5    C5     A     5 Expanded 1
#6    C6     B     3 Expanded 2
#7    C7     C     2 Unexpanded
#8    C8     A     5 Expanded 1
#9    C9     A     5 Expanded 1
#10  C10     B     3 Expanded 2
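One caveat worth checking on real data: `as.factor(Ident)` numbers every level of `Ident`, including identities that belong to unexpanded clones. In the sample data this is harmless because the only unexpanded identity, C, sorts last. If an unexpanded identity sorted earlier, the Expanded numbers would start above 1 or have gaps. A possible sketch that numbers only the expanded identities, in order of first appearance, uses `match()`. The data frame `df2` and the helpers `is_expanded` and `expanded_ids` are hypothetical names introduced for this illustration, and the data are invented so that the unexpanded identity A sorts first.

```r
# Hypothetical data: "A" is unexpanded (Count 2) but sorts before "B" and "C"
df2 <- data.frame(
    Cell  = paste0("C", 1:8),
    Ident = c("A", "B", "C", "B", "C", "B", "C", "A"),
    Count = c(2, 3, 3, 3, 3, 3, 3, 2),
    stringsAsFactors = FALSE)

is_expanded <- df2$Count >= 3

# Unique identities among expanded rows only, in order of first appearance
expanded_ids <- unique(df2$Ident[is_expanded])

# match() gives each row the position of its Ident in expanded_ids
# (NA for unexpanded rows, which ifelse() replaces with "Unexpanded")
df2$Clonality <- ifelse(
    is_expanded,
    paste("Expanded", match(df2$Ident, expanded_ids)),
    "Unexpanded")
df2
```

Here B becomes Expanded 1 and C becomes Expanded 2, whereas the factor-based numbering would give 2 and 3. Whether numbering by first appearance or by sorted order is preferable depends on the analysis.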

Sample data

df <- read.table(text =
    "Cell    Ident   Count   Clonality
C1      A       5       Expanded
C2      B       3       Expanded
C3      A       5       Expanded
C4      C       2       Unexpanded
C5      A       5       Expanded
C6      B       3       Expanded
C7      C       2       Unexpanded
C8      A       5       Expanded
C9      A       5       Expanded
C10     B       3       Expanded", header = TRUE)
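As a side note, the for loop in the question that fills Clonality from Count can be written without explicit iteration. The following is a minimal sketch, assuming the same Cell, Ident and Count values as the sample data; the name `df_check` is hypothetical and is used only to avoid overwriting `df`.

```r
# Same values as the sample data, without the Clonality column
df_check <- data.frame(
    Cell  = paste0("C", 1:10),
    Ident = c("A", "B", "A", "C", "A", "B", "C", "A", "A", "B"),
    Count = c(5, 3, 5, 2, 5, 3, 2, 5, 5, 3),
    stringsAsFactors = FALSE)

# Vectorized equivalent of the loop: one comparison over the whole column
df_check$Clonality <- ifelse(df_check$Count >= 3, "Expanded", "Unexpanded")
df_check
```

`ifelse()` evaluates the condition for all rows at once, so no row index is needed.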