Rename phylo tip labels using a separate df

Question

I want to rename the 1290 tip labels in my phylo tree using a second df. I can rename one label at a time with the following code:

phylo$tip.label[phylo$tip.label=="e54924083c02bd088c69537d02406eb8"] <- "something"

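But this is obviously inefficient. How can I rename all of the labels at once using a second df that contains the original tip labels and the new labels? I can provide example data if needed (the files are very large).

Thanks!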

Answer 1

Let's take an example tree:

oldTree <- ape::read.tree(text = "(t6, (t5, (t4, (t3, (t2, t1)))));")
old <- c("t1", "t3", "t6")
new <- c("tip_1", "tip_3", "tip_6")
plot(oldTree)

A robust approach is:

tree <- oldTree
tree$tip.label[match(old, tree$tip.label)] <- new
plot(tree)
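One caveat with the line above: if an entry of old is not present in the tree, match() returns NA for it and the subscripted assignment fails. A minimal sketch of a guard follows, assuming plain character vectors; rename_labels is a hypothetical helper name, not part of ape, and it would be applied as tree$tip.label <- rename_labels(tree$tip.label, old, new).

```r
# Hypothetical helper (not an ape function): rename entries of `labels`
# using paired old/new vectors, skipping old labels that are absent.
rename_labels <- function(labels, old, new) {
  old <- as.character(old)
  new <- as.character(new)
  stopifnot(length(old) == length(new))
  pos <- match(old, labels)   # position of each old label in `labels`
  found <- !is.na(pos)        # TRUE where the old label actually occurs
  if (any(!found)) {
    warning("Labels not found and skipped: ",
            paste(old[!found], collapse = ", "))
  }
  labels[pos[found]] <- new[found]
  labels
}

# Tip labels in the order they appear in the example tree
labels <- c("t6", "t5", "t4", "t3", "t2", "t1")
renamed <- rename_labels(labels,
                         old = c("t1", "t3", "t6"),
                         new = c("tip_1", "tip_3", "tip_6"))
print(renamed)
```

With 1290 tips and a large lookup table, the warning makes it easier to spot rows of the df that do not correspond to any tip.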

Thomas's code works only when the labels appear in the same order in tree$tip.label and old.labels:

tree <- oldTree
my_data_frame <- data.frame(old.labels = old,
                            new.labels = new)

## Find the old labels in the tree
tree$tip.label[tree$tip.label %in% my_data_frame$old.labels] <- my_data_frame$new.labels

plot(tree)

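Here the tip labels come out wrong: the new labels are filled in following the tip order of the tree, so t6 becomes tip_1 and t1 becomes tip_6.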
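The difference between the two approaches can be checked without plotting. The sketch below assumes the tip labels are stored in the order they appear in the Newick string of the example tree, and uses plain character vectors in place of tree$tip.label; by_in and by_match are illustrative names.

```r
labels <- c("t6", "t5", "t4", "t3", "t2", "t1")  # assumed tip order of oldTree
old <- c("t1", "t3", "t6")
new <- c("tip_1", "tip_3", "tip_6")

# %in% marks matching tips in tree order, so `new` is filled in tree order
by_in <- labels
by_in[by_in %in% old] <- new

# match() returns the position of each old label, so pairs stay aligned
by_match <- labels
by_match[match(old, by_match)] <- new

print(data.frame(original = labels, by_in = by_in, by_match = by_match))
```

In the printed table, the by_in column pairs t6 with tip_1, while the by_match column pairs each tip with its intended new label.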

Answer 2

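You can use the value-matching operator %in% to first detect which tip labels appear in the "old labels" column of the data frame, and then replace them with the "new labels".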

## A random tree
my_tree <- ape::rtree(20)

## A data.frame of names to replace
my_data_frame <- data.frame(old.labels = c("t1", "t3", "t9"),
                            new.labels = c("tip_1", "tip_3", "tip_9"))

## Find the old labels in the tree
my_tree$tip.label[my_tree$tip.label %in% my_data_frame$old.labels] <- my_data_frame$new.labels

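my_tree$tip.label %in% my_data_frame$old.labels returns a logical vector (TRUE/FALSE) that indicates, for each tip, whether it matches an entry in my_data_frame$old.labels. You can then easily replace the matching tips with something of the same length of your choice (i.e. length(which(my_tree$tip.label %in% my_data_frame$old.labels)) == length(my_data_frame$new.labels)).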
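If the rows of the data frame are not in the same order as the tips, one way to keep the %in% selection while pairing each tip with its own new label is a named lookup vector. This is a sketch under that assumption, using a plain character vector in place of my_tree$tip.label; lookup and hit are illustrative names.

```r
labels <- c("t2", "t5", "t4", "t6", "t3", "t1")
my_data_frame <- data.frame(old.labels = c("t1", "t3", "t6"),
                            new.labels = c("tip_1", "tip_3", "tip_6"))

# Named vector: names are the old labels, values are the new labels
lookup <- setNames(as.character(my_data_frame$new.labels),
                   as.character(my_data_frame$old.labels))

hit <- labels %in% names(lookup)             # tips that have a replacement
labels[hit] <- unname(lookup[labels[hit]])   # look each tip up by its own name
print(labels)
```

Indexing lookup by the tip names themselves means the result does not depend on the row order of the data frame.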

Answer 3

Just adding code that makes sure the order matches as well:

# Example tree
oldTree <- ape::read.tree(text = "(t2, (t5, (t4, (t6, (t3, t1)))));")
plot(oldTree)

# Data frame pairing each old label with its new label
df <- data.frame(old.name = oldTree$tip.label,
                 new_name = c("tip_2", "tip_5", "tip_4", "tip_6", "tip_3", "tip_1"))

# Get the position of each element: match the tip labels against the data frame's labels
pos_id <- match(oldTree$tip.label, df$old.name)
pos_id # element position

newTree <- oldTree
newTree$tip.label <- df$new_name[pos_id] # here sorting by pos_id

# Plot the old and new trees side by side
par(mfrow = c(1, 2))
plot(oldTree)
plot(newTree)

Alternatively, look at rename_taxa in the treeio package; the code does the same thing. https://guangchuangyu.github.io/2018/04/rename-phylogeny-tip-labels-in-treeio/

However, I find the syntax of ggtree and treeio a bit cumbersome.
