I am using Saxon-EE 11, and my platform language is en-us.
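I am trying to get custom sort behavior from the <xsl:sort> instruction by specifying a UCA collation. Setting aside the details of the XML document and looking only at the core string-to-string comparison, I want these strings: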
ABSENTEES
ABSENTEE VOTING
MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)
MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT
MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD
MINNEAPOLIS
MINNEAPOLIS PORT AUTHORITY
to sort in this order:
ABSENTEE VOTING
ABSENTEES
MINNEAPOLIS
MINNEAPOLIS PORT AUTHORITY
MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD
MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT
MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)
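Trying to put the rule into words: ABSENTEE VOTING comes before ABSENTEES (the space sorts ahead of the letter S).
The UCA collation http://www.w3.org/2013/collation/UCA?alternate=shifted handles the MINNEAPOLIS* strings correctly, but puts ABSENTEES before ABSENTEE VOTING.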
The bare UCA collation http://www.w3.org/2013/collation/UCA handles ABSENTEES and ABSENTEE VOTING correctly, but puts the MINNEAPOLIS/SAINT PAUL and MINNEAPOLIS-SAINT PAUL strings after everything that has MINNEAPOLIS followed by a space character.
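I have tried several other parameter combinations, but none of them got any closer to what I am looking for. I am about ready to give up and either implement custom preprocessing before the collation is applied, or do the whole thing in Java.
If what I am looking for really cannot be achieved with a UCA collation, that would be good to know.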
Using the following input:
XML
<root>
  <string>ABSENTEES</string>
  <string>ABSENTEE VOTING</string>
  <string>MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)</string>
  <string>MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT</string>
  <string>MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD</string>
  <string>MINNEAPOLIS</string>
  <string>MINNEAPOLIS PORT AUTHORITY</string>
</root>
and the following stylesheet:
XSLT 2.0
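<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/root">
    <output>
      <xsl:perform-sort select="string">
        <!-- Sort key: '-' and '/' each map to a space, so the third argument needs two spaces;
             with a single space, translate() would delete the '/' instead of replacing it. -->
        <xsl:sort select="translate(., '-/', '  ')"/>
      </xsl:perform-sort>
    </output>
  </xsl:template>
</xsl:stylesheet>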
I get:
Result
<?xml version="1.0" encoding="UTF-8"?>
<output>
  <string>ABSENTEE VOTING</string>
  <string>ABSENTEES</string>
  <string>MINNEAPOLIS</string>
  <string>MINNEAPOLIS PORT AUTHORITY</string>
  <string>MINNEAPOLIS/SAINT PAUL HOUSING FINANCE BOARD</string>
  <string>MINNEAPOLIS-SAINT PAUL INTERNATIONAL AIRPORT</string>
  <string>MINNEAPOLIS TEACHERS RETIREMENT FUND ASSOCIATION (MTRFA)</string>
</output>
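This ordering depends on which collation the processor picks when xsl:sort has no collation attribute, and on a space sorting ahead of letters in that collation. If you would rather not rely on the default, one option is to name the Unicode codepoint collation explicitly. The following is only a sketch: pinning the collation is my assumption, not something the question asked for, and codepoint order is not linguistic order. It suits the all-uppercase ASCII strings shown here, but accented or mixed-case data would sort differently.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/root">
    <output>
      <xsl:perform-sort select="string">
        <!-- Sort key: '-' and '/' each become a space (two spaces in the third argument). -->
        <!-- Assumption: pin the codepoint collation instead of relying on the processor default,
             so that U+0020 (space) always sorts ahead of the letters A to Z. -->
        <xsl:sort select="translate(., '-/', '  ')"
                  collation="http://www.w3.org/2005/xpath-functions/collation/codepoint"/>
      </xsl:perform-sort>
    </output>
  </xsl:template>
</xsl:stylesheet>
```

Because only the sort key is translated, the output still carries the original strings with their hyphens and slashes intact.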