Regex to ignore part of each string in a batch of strings

Question

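I would like to know whether a regular expression can be built for this case. For context, I am using patch (diff) files to find differences between certain ASCII file types, and I want to match the parts of the strings shown here.

Note: in these "ASCII file types", a comment is everything after a "!".

Should match

Example 1.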

- Content AAAA ! Comment which can be ignored
- Content BBBB ! Comment which can be ignored
- Content CCCC ! Comment which can be ignored
+ Content AAAA ! Comment which can be ignored even if it is different
+ Content BBBB ! Comment which can be ignored even if it is different
+ Content CCCC ! Comment which can be ignored even if it is different

Example 2.

- Exact same content ! Comment which can be ignored
+ Exact same content ! Comment which can be ignored even if it is different

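So, if the content parts of the (+) and (-) lines are identical, the comments are ignored.

Should not match

This should not match, because the content differs:

Example 3.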

- Content AAAA ! Comment which can be ignored
- Content BBBB ! Comment which can be ignored
- Content CCCC ! Comment which can be ignored
+ Content 1111 ! Comment which can be ignored even if it is different
+ Content 2222 ! Comment which can be ignored even if it is different
+ Content 3333 ! Comment which can be ignored even if it is different

For single-line differences (as in Example 2), I have a working regular expression:

-(.+)!.*\r?\n\+\1.+

But I have not been able to create one for multi-line blocks such as Example 1.
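To make the limitation concrete, here is a minimal Python sketch that runs the single-line pattern from the question against the examples. The use of Python's `re` module and the variable names are assumptions for illustration; the question does not say which regex engine is in use.

```python
import re

# The single-line pattern from the question, unchanged.
SINGLE_LINE = re.compile(r"-(.+)!.*\r?\n\+\1.+")

# Example 2: one removed line directly followed by one added line.
example_2 = (
    "- Exact same content ! Comment which can be ignored\n"
    "+ Exact same content ! Comment which can be ignored even if it is different\n"
)

# A single-line pair whose content differs (taken from Example 3).
different = (
    "- Content AAAA ! Comment which can be ignored\n"
    "+ Content 1111 ! Comment which can be ignored even if it is different\n"
)

# Example 1: a block of removed lines followed by a block of added lines.
example_1 = (
    "- Content AAAA ! Comment which can be ignored\n"
    "- Content BBBB ! Comment which can be ignored\n"
    "- Content CCCC ! Comment which can be ignored\n"
    "+ Content AAAA ! Comment which can be ignored even if it is different\n"
    "+ Content BBBB ! Comment which can be ignored even if it is different\n"
    "+ Content CCCC ! Comment which can be ignored even if it is different\n"
)

pair_matches = SINGLE_LINE.search(example_2) is not None
different_matches = SINGLE_LINE.search(different) is not None
block_matches = SINGLE_LINE.search(example_1) is not None

print(pair_matches, different_matches, block_matches)
```

The pattern requires the "+" line to come immediately after its "-" line, which is why it cannot pair up the lines in Example 1, where the removed and added lines are grouped into separate blocks.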

Answer 1

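Don't treat it as one block to match; match it line by line. After matching the content of - AAAA, you need to look ahead for any instance of + AAAA, and then have the pointer restored so that matching continues at - BBBB. Lookahead in regular expressions is designed for exactly this purpose. So, rewriting your regex slightly: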

/^-(.+?)(!.*?)?$(?=.*?^\+\1)/ms

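I also made the comment part optional. You need to set the single-line flag (/s) so that the dot (.) matches newlines, and the multiline flag (/m) so that the caret (^) and dollar ($) anchors match at line boundaries as well as at the string boundaries.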
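As a rough illustration, the pattern can be tried in Python, where the /s and /m flags correspond to `re.DOTALL` and `re.MULTILINE`. Choosing Python and the variable names below are assumptions; the answer itself only shows the pattern in slash-delimited form.

```python
import re

# The lookahead pattern from the answer; re.DOTALL is the /s flag and
# re.MULTILINE is the /m flag.
PER_LINE = re.compile(r"^-(.+?)(!.*?)?$(?=.*?^\+\1)", re.MULTILINE | re.DOTALL)

example_1 = (
    "- Content AAAA ! Comment which can be ignored\n"
    "- Content BBBB ! Comment which can be ignored\n"
    "- Content CCCC ! Comment which can be ignored\n"
    "+ Content AAAA ! Comment which can be ignored even if it is different\n"
    "+ Content BBBB ! Comment which can be ignored even if it is different\n"
    "+ Content CCCC ! Comment which can be ignored even if it is different\n"
)

example_3 = (
    "- Content AAAA ! Comment which can be ignored\n"
    "- Content BBBB ! Comment which can be ignored\n"
    "- Content CCCC ! Comment which can be ignored\n"
    "+ Content 1111 ! Comment which can be ignored even if it is different\n"
    "+ Content 2222 ! Comment which can be ignored even if it is different\n"
    "+ Content 3333 ! Comment which can be ignored even if it is different\n"
)

# Each match is one "-" line whose content reappears on a later "+" line.
# Group 1 holds the content in front of the "!" comment.
matched_example_1 = [m.group(1).strip() for m in PER_LINE.finditer(example_1)]
matched_example_3 = [m.group(1).strip() for m in PER_LINE.finditer(example_3)]

print(matched_example_1)
print(matched_example_3)
```

Each match covers a single "-" line, so deciding whether a whole hunk changed only in its comments still means checking that every "-" line produced a match.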

Note that this may fail if your file contains multiple instances of the same content.
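If repeated content is a concern, one alternative (not the regex approach above) is to take the line-by-line idea literally and compare the lines outside of a single pattern. The sketch below is only that, a sketch: the helper names `split_content` and `comment_only_change` are hypothetical, and it assumes that the hunk contains only "-" and "+" lines (no "---"/"+++" file headers) and that "!" never appears inside the content itself.

```python
def split_content(line: str) -> str:
    """Drop the leading +/- marker and anything from the first '!' onward."""
    return line[1:].split("!", 1)[0].strip()


def comment_only_change(hunk: str) -> bool:
    """Return True when removed and added lines carry the same content in the same order."""
    lines = hunk.splitlines()
    removed = [split_content(line) for line in lines if line.startswith("-")]
    added = [split_content(line) for line in lines if line.startswith("+")]
    # An empty hunk is not treated as a comment-only change.
    return bool(removed) and removed == added


example_1 = (
    "- Content AAAA ! Comment which can be ignored\n"
    "- Content BBBB ! Comment which can be ignored\n"
    "- Content CCCC ! Comment which can be ignored\n"
    "+ Content AAAA ! Comment which can be ignored even if it is different\n"
    "+ Content BBBB ! Comment which can be ignored even if it is different\n"
    "+ Content CCCC ! Comment which can be ignored even if it is different\n"
)

example_3 = example_1.replace("+ Content AAAA", "+ Content 1111")

# Duplicate content: two removed lines but only one added line.
duplicates = (
    "- Content AAAA ! first comment\n"
    "- Content AAAA ! second comment\n"
    "+ Content AAAA ! new comment\n"
)

result_example_1 = comment_only_change(example_1)
result_example_3 = comment_only_change(example_3)
result_duplicates = comment_only_change(duplicates)

print(result_example_1, result_example_3, result_duplicates)
```

Comparing the two lists position by position means a repeated content line has to be repeated on both sides, which is the case the note above warns about.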
