In an XML / XSD schema file, how can we extract (parse) certain information from the elements?

Question  Votes: 0  Answers: 3

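Is there a way to parse certain data out of an XML file and write it to an Excel (CSV) file? I would like to automate this in a C# .NET application instead of going through the XML file by hand and updating an Excel sheet. That way, if I have many XML files, I can run the application to get the information I need.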

For example, suppose the XML file contains this element:

<element type="xpowercomponent_FeaturedTiles" UID="40204fc9b5424b349e03134d777d29bc" label="Featured Tiles" readonly="false" hidden="false" default="" required="false" Component="" CompTypes="ThemedTileSetComponent;ThemedTileSetElectedComponent;" AutoEmbed="" WrappedUp="" AllowWrappingChange="" />

The data I am trying to extract from each element in the XML file is these attributes: type, label, CompTypes, readonly, hidden, required.

I have tried the following, but since I am not very familiar with working with XML, I am not sure whether this approach is viable.

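using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

class Program
{
    static void Main(string[] args)
    {
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.Add("", XmlReader.Create(new StringReader(File.ReadAllText("Schema.xsd"))));
        schemas.Compile();
        AnalyseSchema(schemas);
        Console.ReadLine();
    }

    static void AnalyseSchema(XmlSchemaSet set)
    {
        // Keeps a reference to the last schema in the set.
        XmlSchema customerSchema = null;
        foreach (XmlSchema schema in set.Schemas())
        {
            customerSchema = schema;
        }
    }
}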

Here is the complete XML schema file:

<Schema xmlns:dt="urn:schemas-microsoft-com:datatypes" name="TopTier" xmlns="urn:schemas-microsoft-com:xml-data">
  <AttributeType name="ID" dt:type="id" required="yes" />
  <AttributeType name="Name" dt:type="string" required="yes" />
  <AttributeType name="UniqueID" dt:type="int" default="schemas/107" />
  <AttributeType name="Icon" dt:type="int" default="2" />
  <AttributeType name="FriendlyName" dt:type="string" default="Top Tier" />
  <AttributeType name="SS" dt:type="boolean" default="true" />
  <AttributeType name="DSS" dt:type="boolean" default="true" />
  <AttributeType name="ViewName" dt:type="" default="" />
  <AttributeType name="DefaultCategorization" dt:type="string" default="" />
  <AttributeType name="Version" dt:type="int" default="9" />
  <ElementType name="Exports" content="textOnly" />
  <ElementType name="Title" dt:type="string" content="textOnly" />
  <ElementType name="Excerpt" dt:type="string" content="textOnly">
    <description>This info is used to display on pages that link to this page.</description>
  </ElementType>
  <ElementType name="ThumbnailImage" dt:type="imagefile" content="textOnly">
    <description>Thumbnail dimension should be 75 x 75 pixels.</description>
  </ElementType>
  <ElementType name="xhtml_BodyCopy" content="textOnly" />
  <ElementType name="xpowerlist_LeftSidebarComponentsOverride" content="eltOnly" />
  <ElementType name="xpowercomponent_LeftSidebarComponent" content="eltOnly" />
  <ElementType name="xpowercomponent_FeaturedTiles" content="eltOnly" />
  <ElementType name="SecondaryTitle" dt:type="string" content="textOnly" />
  <ElementType name="xpowerlist_SecondaryComponentList" content="eltOnly" />
  <ElementType name="xpowercomponent_SecondaryItem" content="eltOnly" />
  <ElementType name="xpowergroupstart_SEOGroup" content="eltOnly">
    <description>Search Engine Optimization Group</description>
  </ElementType>
  <ElementType name="BrowserTitle" dt:type="string" content="textOnly" />
  <ElementType name="MetaDescription" dt:type="string" content="textOnly" />
  <ElementType name="NoIndex" dt:type="boolean" content="textOnly" />
  <ElementType name="NoFollow" dt:type="boolean" content="textOnly" />
  <ElementType name="NoODP" dt:type="boolean" content="textOnly" />
  <ElementType name="Priority" dt:type="enumeration" values="0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0" content="textOnly" />
  <ElementType name="PublishDate" dt:type="localdate" content="textOnly" />
  <ElementType name="xpowergroupend_" content="eltOnly" />
  <ElementType name="xpowernavigation_BreadcrumbNavigation" content="eltOnly" />
  <ElementType name="xpowercomponent_SectionControl" content="eltOnly" />
  <ElementType name="TopTier" content="eltOnly" order="seq">
    <attribute type="ID" />
    <attribute type="Name" />
    <element type="Title" UID="2bbcb2816ba743e2bd935d526f47d688" label="Title" readonly="false" hidden="false" default="" required="true" />
    <element type="Excerpt" UID="2ac3ade7d9c34cae8a7f304d9754db86" label="Excerpt" readonly="false" hidden="false" default="" required="true" />
    <element type="ThumbnailImage" UID="f4940dbfe27f4783a97c356f3ba0f959" label="Thumbnail Image" readonly="false" hidden="false" default="" required="false" Expanded="false" Width="" Height="" Border="" HSpace="" VSpace="" Alignment="" AlternateText="" />
    <element type="xhtml_BodyCopy" UID="055d4709ffb24402a14f9d13af417401" label="Body Copy" readonly="false" hidden="false" default="" required="false" Height="" />
    <element type="xpowerlist_LeftSidebarComponentsOverride" UID="1d281fa780b6419089c52913640ade93" label="Left Sidebar Components Override" readonly="false" hidden="false" default="" required="false" Enclosed="" AllowEnclosureChange="" List="xpowercomponent_LeftSidebarComponent" Component="" CompTypes="" AutoEmbed="" WrappedUp="" AllowWrappingChange="" />
    <element type="xpowercomponent_FeaturedTiles" UID="40204fc9b5424b349e03134d777d29bc" label="Featured Tiles" readonly="false" hidden="false" default="" required="false" Component="" CompTypes="ThemedTileSetComponent;ThemedTileSetElectedComponent;" AutoEmbed="" WrappedUp="" AllowWrappingChange="" />
    <element type="SecondaryTitle" UID="3d00fbe341ff4f2481c6199deb046997" label="Secondary Title" readonly="false" hidden="false" default="" required="false" />
    <element type="xpowerlist_SecondaryComponentList" UID="3bc6293a826a4f2081dca1c5e1415978" label="Secondary Component List" readonly="false" hidden="false" default="" required="false" Enclosed="" AllowEnclosureChange="" List="xpowercomponent_SecondaryItem" Component="" CompTypes="TaxonomyTileComponent;ServiceComponent;ContactComponent;TitledCollection" AutoEmbed="" WrappedUp="" AllowWrappingChange="" />
    <element type="xpowergroupstart_SEOGroup" UID="bf312bbd47d74c758d13695b8fa011c0" label="SEO Group" readonly="false" hidden="true" default="" required="false" Enclosed="" AllowEnclosureChange="" />
    <element type="BrowserTitle" UID="c777f36e6f094b50a04692d3bfef387e" label="Browser Title" readonly="false" hidden="false" default="" required="false" />
    <element type="MetaDescription" UID="05bab416415541beb05e90b94f7e5248" label="Meta Description" readonly="false" hidden="false" default="" required="false" />
    <element type="NoIndex" UID="e224c1a3aaa846d9b19579d59d9b1af7" label="No Index" readonly="false" hidden="false" default="" required="false" />
    <element type="NoFollow" UID="b40ae068c0d749c4bb8ecd589388bb1b" label="No Follow" readonly="false" hidden="false" default="" required="false" />
    <element type="NoODP" UID="2557027d711d41c3b3ccbd3aab4650dd" label="No ODP" readonly="false" hidden="false" default="" required="false" />
    <element type="Priority" UID="f49eec4e94864e03aedbd0e2be6a2822" label="Priority" readonly="false" hidden="false" default="" required="false" values="0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0" />
    <element type="PublishDate" UID="5fdba65915744addb710631476822112" label="Publish Date" readonly="true" hidden="true" default="" required="false" />
    <element type="xpowergroupend_" />
    <element type="xpowernavigation_BreadcrumbNavigation" UID="140b1582e50e4a098e190942980bd728" label="Breadcrumb Navigation" readonly="false" hidden="true" default="" required="true" Expanded="false" Navigation="Ancestors" GenerationOrder="down" StartPage="x5" MaxNodes="" MaxDepth="" Query="">
      <element type="Exports" default="" />
    </element>
    <element type="xpowercomponent_SectionControl" UID="56564fa9c2bd4e9ba44a417960146376" label="SectionControl" readonly="false" hidden="true" default="" required="true" Component="" CompTypes="SectionControl" AutoEmbed="" WrappedUp="" AllowWrappingChange="" />
  </ElementType>
</Schema>
c# xml linq xsd xml-parsing
3 Answers
0
votes

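Use this code. You need to convert the XmlDocument to an XDocument so that you can easily get at each element and its data. I used the same file you provided. The example also shows how to loop over elements and read their child elements.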

using System.Xml;
using System.Xml.Linq;

class Program
{
    static void Main(string[] args)
    {
        Parse();
    }

    public static void Parse()
    {

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load(@"D:\New Text Document.xsd");

        var xdocument = xmlDoc.ToXDocument();
        foreach (var element in xdocument.Elements())
        {
            foreach (var node in element.Elements()) // child elements
            {
                if (node.Name.LocalName.Equals("objectId"))
                {
                    // obj.Id = node.Value;
                }

                if (node.Name.LocalName.Equals("name"))
                {
                    //obj.Name = node.Value;
                }

                if (node.Name.LocalName.Equals("scope"))
                {
                    foreach (var scopeNode in node.Elements())  // nested child elements
                    {
                        if (scopeNode.Name.LocalName.Equals("id"))
                        {
                            //  obj.ScopeId = scopeNode.Value;
                        }

                    }
                }
            }
        }
    }
}
public static class DocumentExtensions
{
    public static XmlDocument ToXmlDocument(this XDocument xDocument)
    {
        var xmlDocument = new XmlDocument();
        using (var xmlReader = xDocument.CreateReader())
        {
            xmlDocument.Load(xmlReader);
        }
        return xmlDocument;
    }

    public static XDocument ToXDocument(this XmlDocument xmlDocument)
    {
        using (var nodeReader = new XmlNodeReader(xmlDocument))
        {
            nodeReader.MoveToContent();
            return XDocument.Load(nodeReader);
        }
    }
}
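
Applied to the schema file from the question, the same LINQ to XML idea might look like the sketch below. It assumes the elements live in the `urn:schemas-microsoft-com:xml-data` namespace declared on the root `Schema` element. The `ElementRows` class and its `Extract` method are hypothetical names chosen for illustration, and a missing attribute is returned as an empty string.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ElementRows
{
    // Default namespace declared on the root <Schema> element of the sample file.
    private static readonly XNamespace Ns = "urn:schemas-microsoft-com:xml-data";

    // Attribute names the question asks for, in output order.
    private static readonly string[] Wanted =
        { "type", "label", "CompTypes", "readonly", "hidden", "required" };

    // Returns one string[] per <element> node; absent attributes become "".
    public static List<string[]> Extract(XDocument doc)
    {
        return doc.Descendants(Ns + "element")
                  .Select(e => Wanted.Select(a => (string)e.Attribute(a) ?? "").ToArray())
                  .ToList();
    }

    public static void Main()
    {
        // Small inline sample in the same shape as the schema file.
        var doc = XDocument.Parse(
            "<Schema xmlns='urn:schemas-microsoft-com:xml-data'>" +
            "<ElementType name='TopTier'>" +
            "<element type='Title' label='Title' readonly='false' hidden='false' required='true' />" +
            "</ElementType></Schema>");

        foreach (var row in Extract(doc))
            Console.WriteLine(string.Join(",", row)); // prints: Title,Title,,false,false,true
    }
}
```

For a real file, `XDocument.Load(path)` can replace the inline sample, which also avoids the round trip through `XmlDocument`.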

0
votes

This is very easy to do in XSLT. You don't need the schema. Unless there are special characters that need escaping and so on, it is simply:

<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="3.0" expand-text="yes"
    xpath-default-namespace="urn:schemas-microsoft-com:xml-data">
<xsl:mode on-no-match="shallow-skip"/>
<xsl:output method="text"/>
<xsl:template match="element"
>{@type},{@label},{@CompTypes},{@readonly},{@hidden},{@required}
</xsl:template>
</xsl:transform>

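This is an XSLT 3.0 solution; if you would rather use the XSLT 1.0 processor that ships with .NET, it is somewhat more verbose but still very simple. I have not included a header row, but adding one is easy.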

In XSLT 3.0 you can even add automation to apply this to a whole collection of XML files (in 1.0, you would need to do that in the calling script).
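
The XSLT 1.0 route could be sketched in C# with `System.Xml.Xsl.XslCompiledTransform`, the processor that ships with .NET. The embedded stylesheet is an assumed 1.0 equivalent, not the answerer's own. It binds the hypothetical prefix `x` to the namespace of the sample schema, since XSLT 1.0 can only match namespaced elements through a prefix. `XsltCsv` and `Transform` are illustrative names.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class XsltCsv
{
    // XSLT 1.0 stylesheet: one CSV line per <element>, everything else ignored.
    private const string Stylesheet = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:x='urn:schemas-microsoft-com:xml-data'>
  <xsl:output method='text'/>
  <xsl:template match='/'>
    <xsl:for-each select='//x:element'>
      <xsl:value-of select=""concat(@type, ',', @label, ',', @CompTypes, ',', @readonly, ',', @hidden, ',', @required, '&#10;')""/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to the XML read from `input` and returns the CSV text.
    public static string Transform(TextReader input)
    {
        var xslt = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(styleReader);
        }

        using (var source = XmlReader.Create(input))
        using (var output = new StringWriter())
        {
            // Writing to a TextWriter lets the stylesheet's text output method take effect.
            xslt.Transform(source, null, output);
            return output.ToString();
        }
    }

    public static void Main()
    {
        // Small inline sample in the same shape as the schema file.
        string sample =
            "<Schema xmlns='urn:schemas-microsoft-com:xml-data'>" +
            "<ElementType name='TopTier'>" +
            "<element type='Title' label='Title' readonly='false' hidden='false' required='true' />" +
            "</ElementType></Schema>";
        Console.Write(Transform(new StringReader(sample)));
    }
}
```

Looping over many files then happens in the calling C# code, for example by calling `Transform` once per file path. Note that `//x:element` also picks up `element` nodes nested inside other `element` nodes.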


0
votes

In C#, use System.Xml.XmlDocument and XPath syntax with SelectNodes:

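XmlDocument xml = new XmlDocument();
xml.Load(strFile);
// The sample file declares a default namespace, so the XPath needs a prefix bound to it.
XmlNamespaceManager ns = new XmlNamespaceManager(xml.NameTable);
ns.AddNamespace("x", "urn:schemas-microsoft-com:xml-data");
foreach (XmlElement ndRow in xml.SelectNodes("//x:element", ns)) {
    string strType = ndRow.GetAttribute("type");
    string strLabel = ndRow.GetAttribute("label");
}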
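
To get from these attribute reads to the CSV file the question asks for, something along these lines may work. It is only a sketch: `CsvExport`, `ToCsv`, and `Quote` are hypothetical names, the namespace URI is taken from the sample schema, and fields are quoted only when they contain a comma, a double quote, or a line break.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Xml;

public static class CsvExport
{
    // Attributes to export, in column order.
    private static readonly string[] Columns =
        { "type", "label", "CompTypes", "readonly", "hidden", "required" };

    // Wraps a field in quotes when it contains a comma, quote, or line break.
    public static string Quote(string field)
    {
        if (field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) < 0) return field;
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Builds CSV text (header plus one row per <element>) from a loaded document.
    public static string ToCsv(XmlDocument xml)
    {
        var ns = new XmlNamespaceManager(xml.NameTable);
        ns.AddNamespace("x", "urn:schemas-microsoft-com:xml-data");

        var sb = new StringBuilder();
        sb.AppendLine(string.Join(",", Columns));
        foreach (XmlElement row in xml.SelectNodes("//x:element", ns))
        {
            // GetAttribute returns "" when the attribute is absent.
            sb.AppendLine(string.Join(",", Columns.Select(c => Quote(row.GetAttribute(c)))));
        }
        return sb.ToString();
    }

    public static void Main(string[] args)
    {
        // Each file path given on the command line gets a .csv written next to it.
        foreach (string path in args)
        {
            var xml = new XmlDocument();
            xml.Load(path);
            File.WriteAllText(path + ".csv", ToCsv(xml));
        }
    }
}
```

Running the compiled program with several schema files as arguments covers the "many XML files" case, and Excel can open the resulting `.csv` files directly.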