I have the following Markdown snippet:
# Glossary

This guide is aimed to familiarize the users with definitions to relevant DVC
concepts and terminologies which are frequently used.

## Workspace directory

Also abbreviated as workspace, it is the root directory of a project where DVC
is initialized by running `dvc init` command. Therefore, this directory will
contain a `.dvc` directory as well.

## Cache directory

DVC cache is a hidden storage which is found at `.dvc/cache`. This storage is
used to manage different versions of files which are under DVC control. For more
information on cache, please refer to the this
[guide](/doc/commands-reference/config#cache).
I want to split it so that the matches are:
# Glossary
...
## Workspace directory
...
## Cache directory
...
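I tried to match them with the regular expression /#{1,2}\s.+\n{2}[^(#{2}\s)]*/. My intention was to first match the heading with the #{1,2}\s.+\n{2} part and then stop matching when ##\s is found, but I failed on the second part. Can someone guide me?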
Use split with the /^(?=#+ )/m regex (demo), or match with match(/^#+ [^#]*(?:#(?!#)[^#]*)*/gm) (see another demo):
let contents = `# Glossary

This guide is aimed to familiarize the users with definitions to relevant DVC
concepts and terminologies which are frequently used.

## Workspace directory

Also abbreviated as workspace, it is the root directory of a project where DVC
is initialized by running \`dvc init\` command. Therefore, this directory will
contain a \`.dvc\` directory as well.

## Cache directory

DVC cache is a hidden storage which is found at \`.dvc/cache\`. This storage is
used to manage different versions of files which are under DVC control. For more
information on cache, please refer to the this
[guide](/doc/commands-reference/config#cache).`;
console.log(contents.split(/^(?=#+ )/m).filter(Boolean));
console.log(contents.match(/^#+ [^#]*(?:#(?!#)[^#]*)*/gm));
Output:
[
"# Glossary\n\nThis guide is aimed to familiarize the users with definitions to relevant DVC\nconcepts and terminologies which are frequently used.\n\n",
"## Workspace directory\n\nAlso abbreviated as workspace, it is the root directory of a project where DVC\nis initialized by running `dvc init` command. Therefore, this directory will\ncontain a `.dvc` directory as well.\n\n",
"## Cache directory\n\nDVC cache is a hidden storage which is found at `.dvc/cache`. This storage is\nused to manage different versions of files which are under DVC control. For more\ninformation on cache, please refer to the this\n[guide](/doc/commands-reference/config#cache)."
]
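As for why the original attempt stops early: inside square brackets, `(`, `{`, `2`, `}` and `)` are ordinary characters, so `[^(#{2}\s)]` is a single negated character class rather than "anything up to `## `". The sketch below uses short sample strings of my own (not the full snippet) to show the effect.

```javascript
// [^(#{2}\s)] means "one character that is NOT (, #, {, 2, }, ) or whitespace".
// It is not a negated group, so the run ends at the first space in the body.
const charClass = /[^(#{2}\s)]*/;
const firstRun = "some text".match(charClass)[0];
console.log(JSON.stringify(firstRun)); // "some"

// The full pattern from the question therefore keeps only the first word
// after the heading and its two newlines.
const attempted = /#{1,2}\s.+\n{2}[^(#{2}\s)]*/;
const sample = "# Glossary\n\nThis guide is aimed";
const attemptedMatch = sample.match(attempted)[0];
console.log(JSON.stringify(attemptedMatch)); // "# Glossary\n\nThis"
```

That is why a lookahead such as (?=#+ ) or (?!#) is needed here: it can test for a sequence of characters without consuming it, which a character class cannot do.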
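I know this is an old post, but the topic is still relevant, and I hope someone with more regex knowledge than me will see this comment and provide an update.
I have been using Wiktor's match regex to find a heading together with the text that follows it, up to the next heading.
It works well unless there is an h1 (#) heading somewhere in the body of the text. If there is, it gets "swallowed" and becomes part of the previous section, because the regex only actually stops when it sees two or more consecutive # characters, and a single # does not meet that condition.
This fails: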
## header 2
some text
# header 1
some more text
## header 2b
The first match will be:
## header 2
some text
# header 1
some more text
Instead of:
## header 2
some text
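The assumption seems to be that there is only one h1 (#) heading and that no other heading comes before it; in that case I don't see any problems.
To be honest, this is not really a problem for me in practice. I only noticed it while trying to understand the regex on regex101.com.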
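For anyone who does hit this case, here is a minimal sketch of two ways around it. The split regex from the answer already breaks at every heading level, so it is not affected. The second pattern, anyLevel, is my own assumed alternative to the match regex, not something from the answer: it takes the heading line and then every following line that does not itself start with one or more # and a space. The sample text is the failing example from above.

```javascript
// The failing example from the comment: an h1 appears after an h2.
const text = [
  "## header 2",
  "some text",
  "# header 1",
  "some more text",
  "## header 2b",
].join("\n");

// Option 1: the split from the answer cuts before every line that starts
// with one or more # followed by a space, whatever the heading level.
const bySplit = text.split(/^(?=#+ )/m).filter(Boolean);

// Option 2 (assumed alternative): match a heading line, then any following
// lines that do not begin a new heading. The lookahead checks the start of
// the next line without consuming it.
const anyLevel = /^#+ .*(?:\n(?!#+ ).*)*/gm;
const byMatch = text.match(anyLevel);

console.log(bySplit);
// [ "## header 2\nsome text\n", "# header 1\nsome more text\n", "## header 2b" ]
console.log(byMatch);
// [ "## header 2\nsome text", "# header 1\nsome more text", "## header 2b" ]
```

Note that Option 2 leaves out the newline before the next heading, while the split keeps it. Neither approach knows about fenced code blocks, so a line such as "# comment" inside a code sample would still be treated as a heading.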