Split a Markdown fragment wherever a heading occurs

Question

I have the following Markdown fragment:

# Glossary

This guide is aimed to familiarize the users with definitions to relevant DVC
concepts and terminologies which are frequently used.

## Workspace directory

Also abbreviated as workspace, it is the root directory of a project where DVC
is initialized by running `dvc init` command. Therefore, this directory will
contain a `.dvc` directory as well.

## Cache directory

DVC cache is a hidden storage which is found at `.dvc/cache`. This storage is
used to manage different versions of files which are under DVC control. For more
information on cache, please refer to the this
[guide](/doc/commands-reference/config#cache).

I want to split it so that the matches are:

# Glossary
...

## Workspace directory
...

## Cache directory
...

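I tried to match them with a regular expression:

/#{1,2}\s.+\n{2}[^(#{2}\s)]*/

My intention was to first match the heading with this part

#{1,2}\s.+\n{2}

and then to end the match when

##\s

is found. But the second part fails. Can anyone guide me?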

Answer 1

Use

split

with the regex

/^(?=#+ )/m

(demo), or match with

match(/^#+ [^#]*(?:#(?!#)[^#]*)*/gm)

(see another demo):

let contents = `# Glossary

This guide is aimed to familiarize the users with definitions to relevant DVC
concepts and terminologies which are frequently used.

## Workspace directory

Also abbreviated as workspace, it is the root directory of a project where DVC
is initialized by running \`dvc init\` command. Therefore, this directory will
contain a \`.dvc\` directory as well.

## Cache directory

DVC cache is a hidden storage which is found at \`.dvc/cache\`. This storage is
used to manage different versions of files which are under DVC control. For more
information on cache, please refer to the this
[guide](/doc/commands-reference/config#cache).`;

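// Split at every line start that is followed by one or more "#" and a space.
console.log(contents.split(/^(?=#+ )/m).filter(Boolean));
// Or match each heading plus the text up to the next "##" (a lone "#" is allowed in the body).
console.log(contents.match(/^#+ [^#]*(?:#(?!#)[^#]*)*/gm));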

Output:

[
  "# Glossary\n\nThis guide is aimed to familiarize the users with definitions to relevant DVC\nconcepts and terminologies which are frequently used.\n\n",
  "## Workspace directory\n\nAlso abbreviated as workspace, it is the root directory of a project where DVC\nis initialized by running `dvc init` command. Therefore, this directory will\ncontain a `.dvc` directory as well.\n\n",
  "## Cache directory\n\nDVC cache is a hidden storage which is found at `.dvc/cache`. This storage is\nused to manage different versions of files which are under DVC control. For more\ninformation on cache, please refer to the this\n[guide](/doc/commands-reference/config#cache)."
]

[Regex diagrams: pattern #1 (split) and pattern #2 (match)]
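As for why the attempt in the question stops early: `[^(#{2}\s)]` is a negated character class, not a negated group. Inside the brackets, `(`, `#`, `{`, `2`, `}` and `)` are literal characters, so the class rejects each of them individually, along with any whitespace. The body part of the match therefore ends at the first space. A minimal sketch, using a shortened `sample` string made up for illustration:

```javascript
// Shortened sample text, made up for illustration.
const sample = "# Glossary\n\nThis guide is short.\n\n## Workspace directory\n\nMore text.";

// The pattern from the question.
const attempt = /#{1,2}\s.+\n{2}[^(#{2}\s)]*/;

// [^...]* consumes "This" and then stops at the space, because \s is excluded.
const firstMatch = sample.match(attempt)[0];
console.log(JSON.stringify(firstMatch)); // "# Glossary\n\nThis"
```

Negating a multi-character sequence such as "## " needs a lookahead, which is what both patterns in this answer rely on.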

Answer 2

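I know this is an old post, but the topic is still relevant, and I hope someone with more regex knowledge than me will see this comment and provide an update.

I have been using Wiktor's match regex to find a heading together with the text that follows it, up to the next heading.

It works well unless there is an h1 (#) heading somewhere in the body of the text. If there is, it gets "swallowed" and becomes part of the previous section, because the regex only stops at a "#" that is immediately followed by another "#", and a lone "#" does not meet that condition.

This fails: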

## header 2
some text
# header 1
some more text
## header 2b

The first match will be:

## header 2
some text
# header 1
some more text

instead of:

## header 2
some text

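The assumption seems to be that there is only one h1 (#) heading and that no other heading comes before it. When that holds, I have not found any problems.

To be honest, this is not a real problem for me in practice; I only noticed it while trying to understand the regex on regex101.com.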
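One possible way around this, sketched under the assumption that every heading starts at the beginning of a line with one or more "#" followed by a space: match the heading line, then keep consuming the following lines for as long as they do not look like a heading themselves. The pattern and the names `text` and `sectionPattern` are illustrative and do not come from the answers above.

```javascript
// The failing input from above.
const text = "## header 2\nsome text\n# header 1\nsome more text\n## header 2b";

// A heading line, then any number of following lines that do not start a new heading.
const sectionPattern = /^#+ .*(?:\n(?!#+ ).*)*/gm;

const sections = text.match(sectionPattern);
console.log(sections);
// [ '## header 2\nsome text', '# header 1\nsome more text', '## header 2b' ]
```

Unlike the split approach, each match here ends before the newline that precedes the next heading. The split regex from Answer 1, `/^(?=#+ )/m`, is not affected by the h1 issue either, since it splits before every heading regardless of its level.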
