How do I group lines with a tumbling window according to whether they match [0-9]?
Desired output:
...
<record>
<name>joe</name>
<data>phone1</data>
<data>phone2</data>
</record>
...
Current output, not grouped:
<xml>
<record>
<person data="name">phone1</person>
</record>
<record>
<person data="name">phone2</person>
</record>
<record>
<person data="name">phone3sue</person>
</record>
<record>
<person data="name">cell4</person>
</record>
<record>
<person data="name">home5alice</person>
</record>
<record>
<person data="name">atrib6</person>
</record>
<record>
<person data="name">x7</person>
</record>
<record>
<person data="name">y9</person>
</record>
<record>
<person data="name">z10</person>
</record>
</xml>
Input:
<text>
<line>people</line>
<line>joe</line>
<line>phone1</line>
<line>phone2</line>
<line>phone3</line>
<line>sue</line>
<line>cell4</line>
<line>home5</line>
<line>alice</line>
<line>atrib6</line>
<line>x7</line>
<line>y9</line>
<line>z10</line>
</text>
Code:
xquery version "3.0";
<xml>
{
for tumbling window $line in db:open("foo.txt")//text()
start $s when matches($s, '[0-9]')
return
<record>
<person data="name">{$line}</person>
</record>
}
</xml>
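Looking at the output, "phone3sue" shows that some matching and grouping is taking place, though not the kind I want: "phone3" should be in its own element, nested under "joe" rather than merged with "sue". Each window starts at a line containing a digit and runs until the next such line, so a name that follows a digit line gets appended to that line's window.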
If you are using XQuery 3.0 or 3.1, use the "window" clause of the FLWOR expression. Something like
for tumbling window $w in line
start $s when matches($s, '[a-z]')
return <group key="$s">{$w}</group>
Untested; it may need correcting or adapting.
Hmm, this looks promising. However, I'm not yet sure how to distinguish the text logically, so that "joe" gets one key and "phone1" another.
Input:
<text>
<line>people</line>
<line>joe</line>
<line>phone1</line>
<line>phone2</line>
<line>phone3</line>
<line>sue</line>
<line>cell4</line>
<line>home5</line>
<line>alice</line>
<line>atrib6</line>
<line>x7</line>
<line>y9</line>
<line>z10</line>
</text>
Output:
<group key="$s">phone1</group>
<group key="$s">phone2</group>
<group key="$s">phone3</group>
<group key="$s">cell4</group>
<group key="$s">home5</group>
<group key="$s">atrib6</group>
<group key="$s">x7</group>
<group key="$s">y9</group>
<group key="$s">z10</group>
Code:
xquery version "3.0";
for $line in db:open("foo.txt")//text()
for tumbling window $w in $line
start $s when matches($s, '[0-9]')
return <group key="$s">{$w}</group>
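Two things seem to explain the output above. The outer "for $line" hands the window clause one item at a time, so every window holds a single item and lines without a digit never start one. Also, key="$s" is a literal string; an attribute value needs braces, as in key="{$s}", to be evaluated. Below is a minimal sketch of one way to get the desired grouping, assuming that a line without a digit is a name and that the digit-bearing lines after it belong to that name. The input is inlined in a $text variable (my own addition) so the query does not depend on a database; with the original setup, db:open("foo.txt")//line would presumably take its place.

```xquery
xquery version "3.0";

(: Inline copy of the input from the question, so the sketch is self-contained. :)
let $text :=
  <text>
    <line>people</line>
    <line>joe</line>
    <line>phone1</line>
    <line>phone2</line>
    <line>phone3</line>
    <line>sue</line>
    <line>cell4</line>
    <line>home5</line>
    <line>alice</line>
    <line>atrib6</line>
    <line>x7</line>
    <line>y9</line>
    <line>z10</line>
  </text>
return
<xml>
{
  (: A single window clause over the whole sequence of line elements,
     with no outer "for". A window starts at every line that contains
     no digit (assumed to be a name) and runs until the next such line. :)
  for tumbling window $w in $text/line
    start $s when not(matches($s, '[0-9]'))
  return
    <record>
      <name>{ string($s) }</name>
      {
        (: The rest of the window is the digit-bearing lines for this name. :)
        for $d in tail($w)
        return <data>{ string($d) }</data>
      }
    </record>
}
</xml>
```

With this start condition, "joe" should collect phone1 to phone3, "sue" cell4 and home5, and "alice" the remaining lines. The header line "people" also contains no digit, so it ends up as a record with a name and no data; it would need to be skipped separately if that is unwanted.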