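While trying to solve a separate issue, I switched to the Saxon XSLT processor and have been struggling to get the syntax of my code working. The purpose of the code is to walk a list of HTML files, find the first instance of any heading in each page, and convert it to an H1 (we have to use H2 for PDF output but need H1 for HTML output).
I start with a batch file: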
set outputDir=%1
@set Saxon=C:\Users\%username%\saxon\saxon9he.jar
REM Create filelist
dir %outputDir%\*.htm /b /s /A-D > file_list.txt
@echo ^<filelist^>^</filelist^> > pre_filelist.xml
REM XML-ize filelist
java -cp %Saxon% net.sf.saxon.Transform -s:pre_filelist.xml -xsl:convert_filelist.xsl -o:pre_list.xml
REM Replace starting h2 tags with h1 tags
java -cp %Saxon% net.sf.saxon.Transform -s:pre_list.xml -xsl:h2toh1.xsl -o:null.xml
REM Garbage collection
DEL pre_list.xml
DEL pre_filelist.xml
DEL file_list.txt
pause
This finds all the output HTML files and formats them into a list using convert_filelist.xsl:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Set output style. XML with no indentations -->
<xsl:output indent="no" method="xml" omit-xml-declaration="yes"/>
<!-- Reads the file list text file into memory as a global variable. -->
<xsl:variable name="fileList">file_list.txt</xsl:variable>
<!-- Parses the file list text file to create an XML list of files that can be fed to the transformer -->
<xsl:template match="filelist">
<!-- Create a variable that can be parsed -->
<xsl:variable name="filelist_raw"><xsl:value-of select="unparsed-text($fileList,'UTF-8')"/></xsl:variable>
<!-- Create a open and close file tags for each line in the list -->
<xsl:variable name="driveLetter"><xsl:value-of select="substring-before(unparsed-text($fileList,'UTF-8'),':')"/>:<xsl:text disable-output-escaping="yes">\\</xsl:text></xsl:variable>
<xsl:variable name="driveLetterReplacement"><xsl:text disable-output-escaping="yes">&lt;file&gt;file:///</xsl:text><xsl:value-of select="$driveLetter"/></xsl:variable>
<!-- Generate an xml tree. The value-of is doing a text-level replacement. Looking for the drive letter and replacing it -->
<!-- with the file open tag and drive letter. Looking for the file extension and replacing with the extension and file close tag. -->
<file_list><xsl:value-of select="replace(replace(replace($filelist_raw,'.htm','.htm&lt;/file&gt;'),$driveLetter,$driveLetterReplacement),'\\','/')" disable-output-escaping="yes"/></file_list>
</xsl:template>
</xsl:stylesheet>
Then h2toh1.xsl converts the first heading to an H1:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
xmlns:MadCap="http://www.madcapsoftware.com/Schemas/MadCap.xsd"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!-- Set output style. XML with no indentations. Normally no. -->
<xsl:output method="xml" indent="yes" omit-xml-declaration="no"/>
<!-- Begin traversing the list of files in the output folder. -->
<xsl:template match="file_list">
<xsl:for-each select="*">
<xsl:variable name="filename" select="."/>
<xsl:variable name="content" select="document($filename)"/>
<!-- Generate a new output file to replace the Flare generated file. Uses the same file name. Transparent to the end user. -->
<xsl:result-document href="{$filename}" method="xml">
<xsl:apply-templates select="document($filename)">
<xsl:with-param name="content" select="$content"/>
</xsl:apply-templates>
</xsl:result-document>
</xsl:for-each>
</xsl:template>
<!-- Recreate each node as it appears in the generated document -->
<xsl:template match="*">
<xsl:param name="content"/>
<xsl:variable name="name" select="name(.)"/>
<xsl:element name="{$name}">
<xsl:for-each select="@*">
<xsl:copy-of select="."/>
</xsl:for-each>
<xsl:apply-templates/>
</xsl:element>
</xsl:template>
<!-- Select the first header and change it to an h1. -->
<xsl:template match="*[matches(name(), 'h\d')][1]">
<xsl:element name="h1">
<xsl:for-each select="@*|node()">
<xsl:copy-of select="."/>
</xsl:for-each>
</xsl:element>
</xsl:template>
</xsl:stylesheet>
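However, for every file in the list I get instances of these errors:
Warning: at char 9 in xsl:variable/@select on line 63 column 63 of h2toh1.xsl: XTRE1500: Cannot read a document that was written during the same transformation: file:///C:/TechDocs/Projects/ScriptTest/Output/JPittman/Docs11/Default.htm
Warning: at char 9 in xsl:apply-templates/@select on line 55 column 55 of h2toh1.xsl: XTRE1500: Cannot read a document that was written during the same transformation: file:///C:/TechDocs/Projects/ScriptTest/Output/JPittman/Docs11/Default.htm
I understand the cause of the problem, but I don't know how to fix it. I also tried using the collection function, since rewriting every page regardless seems cumbersome, but I couldn't work out how to implement it. Any help?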
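The reason the specification defines this as an error is that the order of execution is not defined, so if you read and write the same file within one transformation, you cannot in principle predict whether the read takes place before or after the write. (In practice, of course, this is often not true, because there will be a functional dependency.)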
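One way to sidestep XTRE1500 is to make sure the URI that is written is never the URI that is read. The following is only a sketch of a reworked h2toh1.xsl, under stated assumptions: the `suffix` parameter and the resulting `.h1` file names are hypothetical and not part of the original setup, the input pages are assumed to be well-formed XML, and each `file` element is assumed to hold an absolute `file:///` URI as produced by convert_filelist.xsl. It also replaces the element-rebuilding template with an identity template and selects the first heading in document order rather than the first heading child of each parent.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="no"/>

  <!-- Hypothetical parameter: appended to each input URI so that the
       output URI always differs from the URI being read. -->
  <xsl:param name="suffix" select="'.h1'"/>

  <!-- Walk the file list; read each page and write the result to a new URI. -->
  <xsl:template match="file_list">
    <xsl:for-each select="file">
      <xsl:variable name="uri" select="normalize-space(.)"/>
      <xsl:result-document href="{$uri}{$suffix}" method="xml">
        <xsl:apply-templates select="document($uri)"/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <!-- Identity template: copy every node and attribute unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- The first h1..h6 element in document order: it has no preceding
       heading element anywhere in the page. Rename it to h1, keeping
       its namespace, attributes and content. -->
  <xsl:template match="*[matches(local-name(), '^h[1-6]$')]
                        [not(preceding::*[matches(local-name(), '^h[1-6]$')])]">
    <xsl:element name="h1" namespace="{namespace-uri()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>

</xsl:stylesheet>
```

Because the transformation never reads a file it has written, the order of execution no longer matters. If the results need to end up under the original file names, the batch file could move the `.h1` files over the originals once the Java process has finished.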