Conditional replacement of special characters in R [closed]

Question  Votes: 0  Answers: 2

I would like to take a data frame df with a column df$col that contains entries such as:

I?m tired
You?re tired
You?re tired?
Are you tired?
?I am tired

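and replace the question marks that occur between letters with an apostrophe, while removing the question marks that occur at the start of a string: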

I'm tired
You're tired
You're tired?
Are you tired?
I am tired
r regex gsub
2 Answers
2
votes

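I would use sub for the question mark at the start and gsub for the others, because a string may contain several question marks between word characters but only one at the start.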

gsub("(\\w)\\?(\\w)", "\\1'\\2", sub("^\\?", "", df$col))
[1] "I'm tired"      "You're tired"   "You're tired?"  "Are you tired?"
[5] "I am tired"   

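See https://regex101.com/r/jClVPg/1 for an explanation.

Some explanation:

  • First capturing group (\\w): \\w matches any word character (equivalent to [a-zA-Z0-9_])
  • \\? matches the character ? literally (case sensitive)
  • Second capturing group (\\w): \\w matches any word character (equivalent to [a-zA-Z0-9_])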
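
To make the two steps easier to follow, here is a minimal sketch that runs them separately on a plain character vector. The names x, step1, step2, and overlap are hypothetical and only used for illustration. The last lines show one caveat: because the capturing groups consume the surrounding letters, question marks separated by a single character (as in "a?b?c") are not all replaced in one pass, and a lookaround pattern with perl = TRUE is one possible way around that.

```r
# Hypothetical input: the same strings as in the question
x <- c("I?m tired", "You?re tired", "You?re tired?",
       "Are you tired?", "?I am tired")

# Step 1: remove a single question mark at the start of the string
step1 <- sub("^\\?", "", x)

# Step 2: replace every ? that sits between two word characters
# with an apostrophe, putting the captured characters back around it
step2 <- gsub("(\\w)\\?(\\w)", "\\1'\\2", step1)
step2

# Caveat: the middle letter is consumed by the first match,
# so the second ? is left alone
overlap <- gsub("(\\w)\\?(\\w)", "\\1'\\2", "a?b?c")

# Lookarounds check the neighbours without consuming them (needs perl = TRUE)
overlap_fixed <- gsub("(?<=\\w)\\?(?=\\w)", "'", "a?b?c", perl = TRUE)
```

For the strings in the question both patterns give the same result; the lookaround form only matters when question marks are one character apart.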

0
votes

We can use sub

df$col <- sub("^'", "", sub("[?](?!$)", "'", df$col, perl = TRUE))
df$col
#[1] "I'm tired"      "You're tired"   "You're tired?"  "Are you tired?" "I am tired"    

Here we assume that each string has only a single ? to replace, as in the example. Otherwise, just replace the inner sub with gsub.
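
As a rough sketch of the gsub variant mentioned above, the following uses a hypothetical vector x2 with more than one question mark per string. Note the assumption baked into this pattern: every ? that is not at the very end of the string becomes an apostrophe, so a question mark that ends a sentence in the middle of a string would be converted as well.

```r
# Hypothetical input with several question marks per string
x2 <- c("?I?m tired", "You?re tired, aren?t you?")

# Inner gsub: replace every ? that is not the last character with '
# (the negative lookahead (?!$) requires perl = TRUE)
inner <- gsub("[?](?!$)", "'", x2, perl = TRUE)

# Outer sub: drop the apostrophe that a leading ? was turned into
res2 <- sub("^'", "", inner)
res2
```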

data

df <- structure(list(col = c("I?m tired", "You?re tired", "You?re tired?", 
"Are you tired?", "?I am tired")), .Names = "col", 
 class = "data.frame", row.names = c(NA, -5L))
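
For readers who find the structure() call hard to read, the same sample data can probably be built with data.frame(), as in the sketch below. The name df2 is hypothetical, and stringsAsFactors = FALSE is set explicitly so that the column stays character in R versions before 4.0. The last line applies the sub/gsub approach from the first answer and writes the result back to the column.

```r
# Build the sample data with data.frame() instead of structure()
df2 <- data.frame(
  col = c("I?m tired", "You?re tired", "You?re tired?",
          "Are you tired?", "?I am tired"),
  stringsAsFactors = FALSE  # keep col as character, not factor
)

# Clean the column in place: strip a leading ?, then turn
# in-word question marks into apostrophes
df2$col <- gsub("(\\w)\\?(\\w)", "\\1'\\2", sub("^\\?", "", df2$col))
df2$col
```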