Validate a string and return a dynamic number of isolated words

Question

I want to validate my input string and extract an unpredictable number of substrings from it, using a single regex pattern.

Example string:

location in [chambre, cuisine, salle-de-bain, jardin]

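In a single step, I want to validate that the string has the shape `word in [word, word, word...]` and capture each word. (I want one step for performance reasons: the code already works in three steps, but it is too long.)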

My current regex is:

/([a-zA-Z]+)\s+in\s+\[\s*([a-zA-Z-]+)\s*(?:,\s*([a-zA-Z-]+)\s*)*\s*\]/

I capture `location`, `chambre`, and `jardin`. I do not capture `cuisine` and `salle-de-bain`.

$condition = 'location in [chambre, cuisine, salle-de-bain, jardin]';
preg_match('/([a-zA-Z]+)\s+in\s+\[\s*([a-zA-Z-]+)\s*(?:,\s*([a-zA-Z-]+)\s*)*\s*\]/', $condition, $matches);
var_dump($matches);

array:4 [
  0 => "location in [chambre, cuisine, salle-de-bain, jardin]"
  1 => "location"
  2 => "chambre"
  3 => "jardin"
]

I cannot see what in my regex prevents it from capturing the 2 missing words. I only get the first and the last list item in the array.

Answer 1

In PHP, a repeated capturing group always keeps only the last substring it captured.
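To make that behavior concrete, here is a minimal sketch using a simplified pattern and input (both are illustrative assumptions, not taken from the question). The group `([a-z]+)` runs three times, but only its final iteration is reported.

```php
<?php
// Simplified, hypothetical pattern: one capturing group inside a repeated group.
// The capture is overwritten on every iteration of the outer (?:...)+ loop.
preg_match('/(?:([a-z]+),?)+/', 'a,b,c', $m);

// $m[0] holds the full match, $m[1] holds only the last iteration's capture.
var_export($m);
```

Here `$m[1]` ends up as `'c'`; the intermediate captures `'a'` and `'b'` are not retained.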

You can use `preg_match_all` with a regex such as

[a-zA-Z]+(?=\s+in\s+\[\s*[a-zA-Z-]+(?:\s*,\s*[a-zA-Z-]+)*\s*])|(?:\G(?!^)\s*,\s*|(?<=[a-zA-Z])\s+in\s+\[\s*)\K[a-zA-Z-]+(?=(?:\s*,\s*[a-zA-Z-]+)*\s*])

Details:

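- `[a-zA-Z]+(?=\s+in\s+\[\s*[a-zA-Z-]+(?:\s*,\s*[a-zA-Z-]+)*\s*])` - one or more ASCII letters that are immediately followed by `in` surrounded by one or more whitespace characters, then `[`, zero or more whitespaces, one or more ASCII letters or hyphens, then zero or more repetitions of a comma surrounded by zero or more whitespaces and followed by one or more ASCII letters or hyphens, then zero or more whitespaces and a `]` character
- `|` - or
- `(?:\G(?!^)\s*,\s*|(?<=[a-zA-Z])\s+in\s+\[\s*)\K[a-zA-Z-]+(?=(?:\s*,\s*[a-zA-Z-]+)*\s*])`:
  - `(?:\G(?!^)\s*,\s*|(?<=[a-zA-Z])\s+in\s+\[\s*)` - either the end of the previous match followed by a comma surrounded by zero or more whitespaces, or a position immediately preceded by an ASCII letter, followed by one or more whitespaces, `in`, one or more whitespaces, `[`, and zero or more whitespaces
  - `\K` - discards the text matched so far from the reported match
  - `[a-zA-Z-]+` - one or more ASCII letters or hyphens
  - `(?=(?:\s*,\s*[a-zA-Z-]+)*\s*])` - a positive lookahead that requires zero or more repetitions of a comma surrounded by zero or more whitespaces and followed by one or more ASCII letters or hyphens, then zero or more whitespaces and a `]` character.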
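As a quick check, the pattern described above can be passed to `preg_match_all` against the string from the question. This is only a sketch: the variable names and the `~` delimiter are my own choices, and the pattern is split across two string literals purely for readability.

```php
<?php
// The pattern from this answer, wrapped in ~ delimiters (an arbitrary choice).
$pattern = '~[a-zA-Z]+(?=\s+in\s+\[\s*[a-zA-Z-]+(?:\s*,\s*[a-zA-Z-]+)*\s*])'
         . '|(?:\G(?!^)\s*,\s*|(?<=[a-zA-Z])\s+in\s+\[\s*)\K[a-zA-Z-]+(?=(?:\s*,\s*[a-zA-Z-]+)*\s*])~';

$condition = 'location in [chambre, cuisine, salle-de-bain, jardin]';

// Each word is a separate full match, so the words are collected in $matches[0].
$count = preg_match_all($pattern, $condition, $matches);
var_export($matches[0]);
```

With this input, `$matches[0]` contains `location`, `chambre`, `cuisine`, `salle-de-bain`, and `jardin`, in that order.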

Answer 2

Match the leading word, then use a lookahead to validate that the rest of the string has the required shape.

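Then use `\G` (which matches at the start of the string or at the position where the previous match ended) to continue matching. Match the unwanted characters that precede each word inside the bracketed expression.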

Use `\K` to restart the full-string match: this discards the delimiting substrings and leaves only the sought values in the output array.

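The `x` pattern modifier makes the engine ignore literal whitespace in the pattern itself, which lets developers separate subpatterns meaningfully and write explanatory comments.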

Code:

$input = 'location in [chambre, cuisine, salle-de-bain, jardin]';
$regex = <<<REGEX
/
[a-zA-Z]+                #match a word
(?=                      #lookahead
   \s+in\s+\[\s*         #match one or more whitespaces, the word "in", one or more whitespaces, an opening square bracket, zero or more whitespaces
   [a-zA-Z-]+\s*         #match a word, zero or more whitespaces
   (?:                   #open non-capturing group
      ,\s*[a-zA-Z-]+\s*  #match comma, zero or more whitespaces, a word, zero or more whitespaces
   )*                    #close non-capturing group and allow zero or more repetitions of the preceding subpattern
   \]                    #match a closing square bracket
)                        #end lookahead
|                        #or
\G(?!^)                  #continue matching but not from the start of the string
(?:                      #open non-capturing group
   \s+in\s+\[\s*         #match one or more whitespaces, the word "in", one or more whitespaces, an opening square bracket, zero or more whitespaces
   |                     #or
   \s*,\s*               #match zero or more whitespaces, comma, zero or more whitespaces
)                        #close non-capturing group
\K                       #forget previously matched characters
[a-zA-Z-]+               #match a word
/x
REGEX;
preg_match_all($regex, $input, $matches);
var_export($matches[0]);

Output:

array (
  0 => 'location',
  1 => 'chambre',
  2 => 'cuisine',
  3 => 'salle-de-bain',
  4 => 'jardin',
)
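To illustrate the validation side as well, the sketch below wraps a compact, single-line form of the same pattern in a helper and feeds it one well-formed and one malformed string. The function name `extractConditionWords` and the `null` return convention are hypothetical, chosen only for this example.

```php
<?php
// Hypothetical helper (not from the answer): returns the captured words,
// or null when no part of the input has the "word in [word, word, ...]" shape.
function extractConditionWords(string $input): ?array
{
    // Same pattern as above, written on one logical line without the x modifier.
    $regex = '/[a-zA-Z]+(?=\s+in\s+\[\s*[a-zA-Z-]+\s*(?:,\s*[a-zA-Z-]+\s*)*\])'
           . '|\G(?!^)(?:\s+in\s+\[\s*|\s*,\s*)\K[a-zA-Z-]+/';

    // preg_match_all returns the number of full matches (0 when nothing matched).
    return preg_match_all($regex, $input, $matches) ? $matches[0] : null;
}

$valid   = extractConditionWords('location in [chambre, cuisine, salle-de-bain, jardin]');
$invalid = extractConditionWords('location in [chambre, cuisine'); // closing bracket missing

var_export($valid);
var_export($invalid);
```

Because the lookahead on the leading word fails when the closing bracket is missing, the `\G` branch never gets a previous match to continue from, so the malformed string yields no matches at all. Note that the pattern is not anchored, so a well-formed condition embedded in a longer string would still be accepted; adding anchors would be needed if the whole input must match.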
