I have a PHP string that looks like this...
[line1]this is some test text[/line1][line2]This is line 2 text[/line2][line3]This is line 3 text[/line3]
I am trying to create an array that looks like this...
array(
"line1" => "this is some test text",
"line2" => "This is line 2 text",
"line3" => "This is line 3 text"
)
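The string is created dynamically, so it could contain anything from line1 to line99 and beyond.
What is the best way to do this while keeping it scalable? Could someone give me an example?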
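As far as regular expressions go, this is probably a reasonable compromise for matching the pattern.
Note: this does not handle nesting/recursion.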
\[([^\]]+)\](.*?)\[/\g{1}\]
Usage:
preg_match_all( '%\[([^\]]+)\](.*?)\[/\g{1}\]%', $subject, $matches, PREG_SET_ORDER );
var_dump( $matches );
Match the character “[” literally «\[»
Match the regex below and capture its match into backreference number 1 «([^\]]+)»
Match any character that is NOT a “]” «[^\]]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character “]” literally «\]»
Match the regex below and capture its match into backreference number 2 «(.*?)»
Match any single character that is NOT a line break character (line feed) «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “[” literally «\[»
Match the character “/” literally «/»
Match the same text that was most recently matched by capturing group number 1 (case sensitive; fail if the group did not participate in the match so far) «\g{1}»
Match the character “]” literally «\]»
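The usage snippet above only dumps the raw matches. As a minimal sketch of how they could be turned into the array from the question, assuming PHP 5.5 or later for array_column() and using the hypothetical variable name $result:

```php
<?php
// Sample input taken from the question.
$subject = '[line1]this is some test text[/line1][line2]This is line 2 text[/line2][line3]This is line 3 text[/line3]';

// Same pattern as above. With PREG_SET_ORDER each match is an array of
// [0 => full match, 1 => tag name, 2 => text between the tags].
preg_match_all('%\[([^\]]+)\](.*?)\[/\g{1}\]%', $subject, $matches, PREG_SET_ORDER);

// Take capture group 2 (the text) as the value, keyed by capture group 1 (the tag name).
$result = array_column($matches, 2, 1);

var_export($result);
```

If the same tag name appears more than once, the later occurrence overwrites the earlier one, because array keys are unique.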
You can use explode() with this approach (splitting the input string several times) to isolate the relevant data, the line-id and the line-text:
$text = "[line1]this is some test text[/line1][line2]This is line 2 text[/line2][line3]This is line 3 text[/line3]";
$text = explode( '][line', ']'.$text.'[line');
// you'll get a spurious item at index 0 and at the end, they'll be skipped
$n = count( $text );
$output = array();
for( $i = 1; $i < $n - 1; $i++ )
{
$line = explode( ']', $text[ $i] );
$id = 'line' . $line[ 0 ];
$line = explode( '[', $line[ 1 ] );
$value = $line[ 0 ];
$output[ $id ] = $value;
}
var_export( $output );
echo "\n";
You get:
array (
'line1' => 'this is some test text',
'line2' => 'This is line 2 text',
'line3' => 'This is line 3 text',
)
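Notes:
Empty "line"s are tolerated and handled properly.
Square brackets inside the line text will break the code and mess everything up.
The input format must strictly follow the form [lineN]text[/lineN].... (where N is the line number).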
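To make the first two notes concrete, here is a small sketch that wraps the same explode() steps in a function and feeds it an empty line and a line containing square brackets. The function name parseWithExplode is hypothetical, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical wrapper around the explode() steps shown above, for illustration only.
function parseWithExplode(string $text): array
{
    // Pad the input so every line is delimited by '][line' on both sides.
    $parts = explode('][line', ']' . $text . '[line');
    $n = count($parts);
    $output = [];
    // Skip the spurious items at index 0 and at the end.
    for ($i = 1; $i < $n - 1; $i++) {
        $line = explode(']', $parts[$i]);       // "1", "text[/line1", ...
        $id = 'line' . $line[0];
        $value = explode('[', $line[1])[0];     // text up to the first '['
        $output[$id] = $value;
    }
    return $output;
}

// An empty line is tolerated: its value is an empty string.
$empty = parseWithExplode('[line1][/line1][line2]text[/line2]');
var_export($empty);
echo "\n";

// A square bracket inside the text cuts the value off at the first bracket.
$broken = parseWithExplode('[line1]a [b] c[/line1]');
var_export($broken);
echo "\n";
```

In the second call the value comes back as 'a ' rather than 'a [b] c', which is the breakage the note warns about.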
If you have further requirements, the code can be adapted.
I think this could be a good starting point.
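Note (again):
This is a working solution (given the limitations mentioned above).
Another way to go is a regular expression with preg_match_all(), using capture groups to retrieve the line id and the line text.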
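A minimal sketch of that alternative might look like the following. The pattern is an assumption of mine, not something given in this answer: it only accepts tags of the form line followed by digits, and the s modifier lets the text span line breaks. The variable names $m and $output are likewise just for illustration.

```php
<?php
$text = "[line1]this is some test text[/line1][line2]This is line 2 text[/line2][line3]This is line 3 text[/line3]";

// Group 1 captures the line id (e.g. "line1"), group 2 captures the line text.
// \1 requires the closing tag to carry the same id as the opening tag.
preg_match_all('~\[(line\d+)\](.*?)\[/\1\]~s', $text, $m);

// In the default PREG_PATTERN_ORDER, $m[1] holds all ids and $m[2] all texts,
// so the two lists can be zipped into id => text pairs.
$output = array_combine($m[1], $m[2]);

var_export($output);
echo "\n";
```

Unlike the explode() version, this should tolerate square brackets inside the line text, since the lazy group keeps expanding until it reaches the matching closing tag.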