Search for multiple keywords in a column and create a column for each keyword

Question

I have the following data.

stringstosearch <- c("to", "and", "at", "from", "is", "of")

set.seed(199)
id <- c(rnorm(5))
x  <- c("Contrary to popular belief, Lorem Ipsum is not simply random text.",
       "A Latin professor at Hampden-Sydney College in Virginia",
       "It has roots in a piece of classical Latin ", 
       "literature from 45 BC, making it over 2000 years old.", 
       "The standard chunk of Lorem Ipsum used since")
datatxt <- data.frame(id, x)


I want to search for the keywords listed in stringstosearch and create a column holding the result for each keyword.

I can test for all of them at once:

library(stringr)

datatxt$result <- str_detect(datatxt$x, paste0(stringstosearch, collapse = '|'))

datatxt$result

> datatxt$result
[1] TRUE TRUE TRUE TRUE TRUE

But this gives a single TRUE/FALSE per row, and I want a separate result for each string in stringstosearch.

The result should look like this, or similar:

          id                                                                  x    to   and    at  from    is    of
1 -1.9091427 Contrary to popular belief, Lorem Ipsum is not simply random text.  TRUE FALSE FALSE FALSE  TRUE  TRUE
2  0.5551667            A Latin professor at Hampden-Sydney College in Virginia FALSE FALSE  TRUE FALSE FALSE FALSE
3 -2.2163365                        It has roots in a piece of classical Latin  FALSE FALSE FALSE FALSE FALSE FALSE
4  0.4941455              literature from 45 BC, making it over 2000 years old. FALSE FALSE FALSE  TRUE FALSE FALSE
5 -0.5805710                       The standard chunk of Lorem Ipsum used since FALSE FALSE FALSE FALSE FALSE FALSE

Any idea how to achieve this?

Answer 1

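Here is a base R approach. We use sprintf() to wrap each pattern in \\b word-boundary anchors (the regex token \b, with the backslash doubled inside an R string). This means that, for example, "and" will not match inside "random".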

datatxt[stringstosearch] <- lapply(
    sprintf("\\b%s\\b", stringstosearch), \(x) grepl(x, datatxt$x)
)

Output:

#           id                                                                  x    to   and    at  from    is    of
# 1 -1.9091427 Contrary to popular belief, Lorem Ipsum is not simply random text.  TRUE FALSE FALSE FALSE  TRUE FALSE
# 2  0.5551667            A Latin professor at Hampden-Sydney College in Virginia FALSE FALSE  TRUE FALSE FALSE FALSE
# 3 -2.2163365                        It has roots in a piece of classical Latin  FALSE FALSE FALSE FALSE FALSE  TRUE
# 4  0.4941455              literature from 45 BC, making it over 2000 years old. FALSE FALSE FALSE  TRUE FALSE FALSE
# 5 -0.5805710                       The standard chunk of Lorem Ipsum used since FALSE FALSE FALSE FALSE FALSE  TRUE
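
To see what the anchors change, here is a minimal sketch on made-up data. The keywords and sentences below are hypothetical and not taken from the question. It compares plain substring matching with the anchored patterns, using vapply() to get one logical column per keyword.

```r
# Hypothetical toy data (not the question's data)
keywords <- c("and", "of")
txt <- c("random text", "salt and pepper", "a piece of cake", "often offline")

# sprintf() is vectorised, so this builds one anchored pattern per keyword
patterns <- sprintf("\\b%s\\b", keywords)

# Without anchors: a keyword also matches inside longer words
loose <- vapply(keywords, function(k) grepl(k, txt), logical(length(txt)))

# With anchors: only whole-word occurrences match
strict <- vapply(patterns, function(p) grepl(p, txt), logical(length(txt)))
colnames(strict) <- keywords  # label columns by keyword, not by pattern

print(loose)
print(strict)
```

In the loose matrix, "random text" counts as containing "and" and "often offline" as containing "of"; in the strict one they do not. Note that this sketch assumes the keywords are plain words: a keyword containing regex metacharacters such as "." or "(" would need escaping before being placed in the pattern.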
