Suppose I have a string like this:
My name is <<yourName>> and I am working in the <<Team Name>> team\n as a <<yourTitle>>.
How can I extract the <<placeholders>> so that I end up with a list like this:
My name is
<<yourName>>
and I am working in the
<<Team Name>>
team\n as a
<<yourTitle>>
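Notes:
The content between << and >> can be anything; in other words, the placeholders are not predefined.
There can be space characters between the << and >> characters (as in <<Team Name>>).
The string may contain \r and/or \n, and I want to preserve them.
I thought of using a regular expression in C#, but without success.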
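You can use Regex.Split with the pattern (?=<<)|(?<=>>), which uses lookarounds to split before each << and after each >>. Note that this does not check that << and >> come in matching pairs.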
private static readonly Regex _regex = new (@"(?=<<)|(?<=>>)", RegexOptions.Compiled);
...
string[] parts = _regex.Split(searchString);
Making the Regex object a static readonly field ensures that it is compiled and initialized only once.
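For reference, here is a minimal self-contained sketch of that split. The names `PlaceholderSplitter` and `Split` are made up for illustration. One addition that is not part of the snippet above: because the pattern matches zero-width positions, `Regex.Split` returns an empty string when the input starts with `<<` or ends with `>>`, so this sketch filters empty entries out.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlaceholderSplitter
{
    // Zero-width split points: just before "<<" and just after ">>".
    private static readonly Regex SplitPoints =
        new Regex(@"(?=<<)|(?<=>>)", RegexOptions.Compiled);

    // Returns text segments and placeholders in their original order.
    // Empty entries (input starting with "<<" or ending with ">>") are dropped.
    public static string[] Split(string input)
    {
        return SplitPoints.Split(input)
            .Where(part => part.Length > 0)
            .ToArray();
    }
}
```

For example, `PlaceholderSplitter.Split("<<a>> and <<b>>")` should give `<<a>>`, ` and `, `<<b>>`. Whitespace and line breaks inside the text segments are kept as they are.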
If you only want to find the placeholders themselves, the pattern <<.*?>> is enough. I assume that placeholders are not nested.
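A minimal sketch of the match-only approach could look like this. `PlaceholderFinder` and `Find` are hypothetical names. `RegexOptions.Singleline` is my own addition, not part of the answer: without it `.` does not match `\n`, so a placeholder that contains a line break would not be found. Drop the option if placeholders should never span lines.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class PlaceholderFinder
{
    // Lazy match from "<<" to the nearest following ">>".
    // Singleline (an assumption here) lets '.' also match '\n'.
    private static readonly Regex Placeholder =
        new Regex(@"<<.*?>>", RegexOptions.Compiled | RegexOptions.Singleline);

    // Returns every placeholder, including its << and >> delimiters.
    public static List<string> Find(string input)
    {
        return Placeholder.Matches(input)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }
}
```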
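Note that there is also a Regex.Replace method. The overload that accepts a MatchEvaluator delegate lets you supply a replacement value based on the placeholder name. If you want to do that, use the pattern <<(.*?)>>; the parentheses capture the placeholder name, without the surrounding << and >>, as group number 1.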
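As an illustration of the MatchEvaluator overload, here is a small sketch that fills placeholders from a dictionary. `PlaceholderFiller`, `Fill`, and the choice to leave unknown placeholders untouched are assumptions made for this example, not something the answer prescribes.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class PlaceholderFiller
{
    // Group 1 captures the placeholder name without the << and >> delimiters.
    private static readonly Regex Placeholder =
        new Regex(@"<<(.*?)>>", RegexOptions.Compiled);

    // Replaces each <<name>> with values[name].
    // Placeholders with no entry in the dictionary are left unchanged.
    public static string Fill(string template, IReadOnlyDictionary<string, string> values)
    {
        return Placeholder.Replace(template, match =>
        {
            string name = match.Groups[1].Value;
            return values.TryGetValue(name, out string value) ? value : match.Value;
        });
    }
}
```

Called with the template `"Hi <<yourName>>, <<x>>"` and a dictionary that maps `yourName` to `Ann`, this should return `"Hi Ann, <<x>>"`.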
This can also be solved without regular expressions:
using System;
using System.Collections.Generic;

static class Program {
    // Splits sc into plain-text segments and <<...>> placeholders, in order.
    static List<string> SplitByPlaceholders(string sc) {
        var result = new List<string>();
        int i, j = 0; // i: start of the current "<<"; j: where the next search begins
        while ((i = sc.IndexOf("<<", j, StringComparison.Ordinal)) > -1) {
            // Text between the previous placeholder and this one.
            if (i > j) result.Add(sc.Substring(j, i - j));
            j = sc.IndexOf(">>", i, StringComparison.Ordinal);
            if (j > -1) result.Add(sc.Substring(i, j - i + 2)); // placeholder including << and >>
            else break; // unterminated "<<"
            j += 2;
        }
        if (j > -1) { if (j < sc.Length) result.Add(sc.Substring(j)); } // trailing text
        else result.Add(sc.Substring(i)); // keep an unterminated "<<..." tail as-is
        return result;
    }

    static void Main() {
        foreach (var x in SplitByPlaceholders("My name is <<yourName>> and I am working in the <<Team Name>> team\n as a <<yourTitle>>."))
            Console.WriteLine(x);
        Console.WriteLine(new String('+', 60));
        // Input that starts with a placeholder.
        foreach (var x in SplitByPlaceholders("<<yourName>> and I am working in the <<Team Name>> team\n as a <<yourTitle>>."))
            Console.WriteLine(x);
        Console.WriteLine(new String('+', 60));
        // Input whose last placeholder is never closed.
        foreach (var x in SplitByPlaceholders("My name is <<yourName>> and I am working in the <<Team Name>> team\n as a <<yourTitle."))
            Console.WriteLine(x);
    }
}
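Output:
My name is 
<<yourName>>
 and I am working in the 
<<Team Name>>
 team
 as a 
<<yourTitle>>
.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
<<yourName>>
 and I am working in the 
<<Team Name>>
 team
 as a 
<<yourTitle>>
.
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
My name is 
<<yourName>>
 and I am working in the 
<<Team Name>>
 team
 as a 
<<yourTitle.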
You can use the regular expression for balancing characters from this SO post. Adapted to << and >>, it looks like this:
\<\<(?>\<\<(?<c>)|[^<>]+|\>\>(?<-c>))*(?(c)(?!))\>\>
It will match balanced << ... >> tokens.
Then you can use Regex.Matches and Regex.Split to get all the items you want:
var regex = new Regex(@"\<\<(?>\<\<(?<c>)|[^<>]+|\>\>(?<-c>))*(?(c)(?!))\>\>", RegexOptions.Compiled);
var searchString = "My name is <<yourName>> and I am working in the <<Team Name>> team\n as a <<yourTitle>>.";
var matches = regex.Matches(searchString);
// It will give "<<yourName>>", "<<Team Name>>", "<<yourTitle>>"
var parts = regex.Split(searchString);
// It will give "My name is ", " and I am working in the ", " team\n as a ", "."
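Regex.Matches and Regex.Split return the placeholders and the surrounding text as two separate collections. If a single ordered list is wanted, as in the question, one option is to walk the matches and copy the text between them. The sketch below assumes that approach; `BalancedSplitter` and `SplitInOrder` are hypothetical names.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BalancedSplitter
{
    // Same balancing pattern as above: group "c" counts nested "<<" openers.
    private static readonly Regex Balanced = new Regex(
        @"\<\<(?>\<\<(?<c>)|[^<>]+|\>\>(?<-c>))*(?(c)(?!))\>\>",
        RegexOptions.Compiled);

    // Returns text segments and balanced <<...>> tokens in their original order.
    public static List<string> SplitInOrder(string input)
    {
        var result = new List<string>();
        int last = 0; // index just past the previous match
        foreach (Match m in Balanced.Matches(input))
        {
            // Plain text between the previous token and this one.
            if (m.Index > last) result.Add(input.Substring(last, m.Index - last));
            result.Add(m.Value);
            last = m.Index + m.Length;
        }
        // Plain text after the last token.
        if (last < input.Length) result.Add(input.Substring(last));
        return result;
    }
}
```

With a nested input such as `"a <<x <<y>> z>> b"`, the outer token `<<x <<y>> z>>` should come back as a single item between `"a "` and `" b"`.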