I have some OAI-PMH XML files in the following format:
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/bundles/ojsoai/oai2.xsl" ?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2015-12-26T23:54:31Z</responseDate>
<request verb="ListRecords" metadataPrefix="oai_dc">http://oai_dc/</request>
<ListRecords>
<record>
<header>
<identifier>identifier</identifier>
<datestamp>2015-12-01T00:00:00Z</datestamp>
<setSpec>iksad</setSpec>
</header>
<metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ http://www.openarchives.org/OAI/2.0/oai_dc.xsd">
<dc:title xml:lang="en-US">title en</dc:title>
<dc:title xml:lang="fr-FR">title fr</dc:title>
<dc:creator>creator</dc:creator>
<dc:subject>subject</dc:subject>
<dc:description>description1</dc:description>
<dc:description>description2</dc:description>
<dc:publisher>publisher</dc:publisher>
<dc:date>2015-12-01T00:00:00Z</dc:date>
<dc:type>type</dc:type>
<dc:format>application/pdf</dc:format>
<dc:identifier>identifier</dc:identifier>
<dc:identifier/>
<dc:source xml:lang="en-US">source</dc:source>
<dc:source>source</dc:source>
<dc:source>source</dc:source>
<dc:language>en</dc:language>
<dc:relation>relation</dc:relation>
</oai_dc:dc>
</metadata>
</record>
<record>
<header>
..
..
I tried
xml sel -T -t -m '/OAI-PMH/ListRecords/record/metadata' test.xml
and
xml sel -t -v "//*[local-name()='metadata']" test.xml
I just want to convert the files to CSV like this, but have had no luck:
title lang us; title lang fr; description1; description2
Can anyone help me?
Thanks for your help...
There are a lot of namespaces here. There is almost certainly a better way to do this (xmlstarlet supports namespaces), but this should work:
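As one namespace-aware alternative, here is a minimal sketch in Python using only the standard library (`xml.etree.ElementTree` and `csv`). It assumes that each `record` should become one row holding every `dc:title` followed by every `dc:description`, joined with `;`. The names `NS`, `SAMPLE`, `records_to_rows`, and `rows_to_csv` are made up for illustration, and `SAMPLE` is a trimmed copy of the XML from the question.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Prefix-to-URI map; the prefixes are local to this script (an assumption),
# only the URIs have to match the ones declared in the XML.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Trimmed-down sample based on the XML in the question.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
<ListRecords><record><metadata>
<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title xml:lang="en-US">title en</dc:title>
<dc:title xml:lang="fr-FR">title fr</dc:title>
<dc:description>description1</dc:description>
<dc:description>description2</dc:description>
</oai_dc:dc>
</metadata></record></ListRecords></OAI-PMH>"""


def records_to_rows(root):
    """Return one list per record: all titles, then all descriptions."""
    rows = []
    path = "./oai:ListRecords/oai:record/oai:metadata/oai_dc:dc"
    for dc in root.findall(path, NS):
        titles = [t.text or "" for t in dc.findall("dc:title", NS)]
        descriptions = [d.text or "" for d in dc.findall("dc:description", NS)]
        rows.append(titles + descriptions)
    return rows


def rows_to_csv(rows, delimiter=";"):
    """Serialize rows to CSV text; the csv module quotes fields when needed."""
    buf = io.StringIO()
    csv.writer(buf, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(records_to_rows(ET.fromstring(SAMPLE))), end="")
```

To run it against a real file, replace `ET.fromstring(SAMPLE)` with `ET.parse("test.xml").getroot()`. The default namespace on `<OAI-PMH>` is the reason an unprefixed path such as `/OAI-PMH/ListRecords/record/metadata` matches nothing, so every step in the path carries an explicit prefix here.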