How do I convert hierarchical node data to a data frame in R?

Question (0 votes, 1 answer)

I have the following XML file.

<?xml version="1.0" encoding="UTF-8"?>

<gudid xmlns="http://www.fda.gov/cdrh/gudid" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" version="1.0" xsi:schemaLocation="http://www.fda.gov/cdrh/gudid gudid.xsd">

  <header>
  <database frequency="monthly" id="5460" type="FULL">
  <downloadFile part="1" totalParts="174"/>
  <numberRecordXML>25000</numberRecordXML>
  <numberRecordsDatabase>4334252</numberRecordsDatabase>
  </database>
  <creationDate>2024-04-01T03:30:00</creationDate>
  <period end="2024-04-01T03:30:00" start="2014-09-24T00:00:00"/>
  </header>

  <device xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.fda.gov/cdrh/gudid">
    <publicDeviceRecordKey>33db3dc9-1c5e-4670-a3e1-b52ae0e8c7f0</publicDeviceRecordKey>
    <publicVersionStatus>Update</publicVersionStatus>
    <deviceRecordStatus>Published</deviceRecordStatus>
    <!--Note: publicVersionNumber is system generated and assigned to all device records to track public release updates to a device record-->
    <publicVersionNumber>3</publicVersionNumber>
    <publicVersionDate>2018-07-06</publicVersionDate>
    <devicePublishDate>2016-09-01</devicePublishDate>
    <deviceCommDistributionEndDate xsi:nil="true"></deviceCommDistributionEndDate>
    <deviceCommDistributionStatus>In Commercial Distribution</deviceCommDistributionStatus>
    <identifiers>
      <identifier>
        <deviceId>M991NM82330A243</deviceId>
        <deviceIdType>Package</deviceIdType>
        <deviceIdIssuingAgency>HIBCC</deviceIdIssuingAgency>
        <containsDINumber>M991NM82330A242</containsDINumber>
        <pkgQuantity>4</pkgQuantity>
        <pkgDiscontinueDate xsi:nil="true"></pkgDiscontinueDate>
        <pkgStatus>In Commercial Distribution</pkgStatus>
        <pkgType>Case</pkgType>
      </identifier>
      <identifier>
        <deviceId>M991NM82330A242</deviceId>
        <deviceIdType>Package</deviceIdType>
        <deviceIdIssuingAgency>HIBCC</deviceIdIssuingAgency>
        <containsDINumber>M991NM82330A241</containsDINumber>
        <pkgQuantity>8</pkgQuantity>
        <pkgDiscontinueDate xsi:nil="true"></pkgDiscontinueDate>
        <pkgStatus>In Commercial Distribution</pkgStatus>
        <pkgType>Box</pkgType>
      </identifier>
      <identifier>
        <deviceId>M991NM82330A241</deviceId>
        <deviceIdType>Primary</deviceIdType>
        <deviceIdIssuingAgency>HIBCC</deviceIdIssuingAgency>
        <containsDINumber xsi:nil="true"></containsDINumber>
        <pkgQuantity xsi:nil="true"></pkgQuantity>
        <pkgDiscontinueDate xsi:nil="true"></pkgDiscontinueDate>
        <pkgStatus xsi:nil="true"></pkgStatus>
        <pkgType xsi:nil="true"></pkgType>
      </identifier>
    </identifiers>
    <brandName>Clear-View MAX &quot;Sub-Q&quot; Infusion Set</brandName>
    <versionModelNumber>ClearView™MAX</versionModelNumber>
    <catalogNumber>NM82330A-24</catalogNumber>
    <dunsNumber>013861471</dunsNumber>
    <companyName>NORFOLK MEDICAL</companyName>
    <deviceCount>1</deviceCount>
    <deviceDescription>24G x 12mm x 24&quot; Clear-View MAX &quot;Sub-Q&quot; Infusion Set</deviceDescription>
    <DMExempt>true</DMExempt>
    <premarketExempt>false</premarketExempt>
    <deviceHCTP>false</deviceHCTP>
    <deviceKit>false</deviceKit>
    <deviceCombinationProduct>false</deviceCombinationProduct>
    <singleUse>true</singleUse>
    <lotBatch>true</lotBatch>
    <serialNumber>false</serialNumber>
    <manufacturingDate>true</manufacturingDate>
    <expirationDate>true</expirationDate>
    <donationIdNumber>false</donationIdNumber>
    <labeledContainsNRL>false</labeledContainsNRL>
    <labeledNoNRL>true</labeledNoNRL>
    <MRISafetyStatus>Labeling does not contain MRI Safety Information</MRISafetyStatus>
    <rx>true</rx>
    <otc>false</otc>
    <contacts>
      <customerContact>
        <phone>+1(847)674-7075</phone>
        <phoneExtension>102</phoneExtension>
        <email>[email protected]</email>
      </customerContact>
    </contacts>
    <premarketSubmissions>
      <premarketSubmission>
        <submissionNumber>K870188</submissionNumber>
        <supplementNumber>000</supplementNumber>
      </premarketSubmission>
    </premarketSubmissions>
    <gmdnTerms>
      <gmdn>
        <gmdnCode>35833</gmdnCode>
        <gmdnPTName>Electric infusion pump administration set, single-use</gmdnPTName>
        <gmdnPTDefinition>A collection of sterile devices (e.g., plastic tubing, check valve, roller clamp, Y-site connector, Luer, needle/catheter) intended to be used in combination with an electrically-powered infusion pump for the intravenous (IV), subcutaneous, intramuscular, or epidural administration of medication. This is a single-use device.</gmdnPTDefinition>
        <implantable>false</implantable>
        <gmdnCodeStatus>Active</gmdnCodeStatus>
      </gmdn>
    </gmdnTerms>
    <productCodes>
      <fdaProductCode>
        <productCode>FPA</productCode>
        <productCodeName>Set, administration, intravascular</productCodeName>
      </fdaProductCode>
    </productCodes>
    <deviceSizes>
      <deviceSize>
        <sizeType>Needle Gauge</sizeType>
        <size value="24" unit="Gauge"/>
        <sizeText xsi:nil="true"></sizeText>
      </deviceSize>
    </deviceSizes>
    <environmentalConditions/>
    <sterilization>
      <deviceSterile>true</deviceSterile>
      <sterilizationPriorToUse>true</sterilizationPriorToUse>
      <methodTypes>
        <sterilizationMethod>Ethylene Oxide</sterilizationMethod>
      </methodTypes>
    </sterilization>
  </device>

</gudid>

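I want to convert it to a data frame. The identifiers element is hierarchical and contains three identifier entries. I wrote the code below, but it puts all of the identifiers into a single cell of the identifiers column, so the individual values cannot be told apart.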

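library(XML)  # xmlParse() and xmlToDataFrame() come from the XML package

setwd('D:/')

# Parse the file directly; xmlParse() expects a file path or XML text, not an xml2 document
xml <- xmlParse('xmltest.xml')

df <- xmlToDataFrame(xml)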

Also, I only want to convert the device nodes into a data frame, leaving the header node as it is.

Tags: r, xml, dataframe
1 Answer
0 votes

Here is a workaround that uses the jsonlite library to convert the list into a data frame.

library(xml2)
library(jsonlite)
library(magrittr)  # provides the %>% pipe

page <- read_xml('xmltest.xml')

# Strip the default namespaces (modifies page in place) so plain XPath names match
xml_ns_strip(page)

# Convert each <identifier> node to a list, then round-trip through JSON to get a data frame
xml_find_all(page, ".//identifier") %>%
  as_list() %>%
  jsonlite::toJSON() %>%
  jsonlite::fromJSON(simplifyDataFrame = TRUE)
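
If you also need each identifier tied back to its device, and want to leave the header untouched, one possible alternative is to skip the JSON round trip and build the rows with xml2 alone. The following is only a sketch under some assumptions: the inline XML is a trimmed-down stand-in for xmltest.xml, the `field()` helper and the choice of three identifier columns are mine for illustration, and empty `xsi:nil` elements are treated as `NA`.

```r
library(xml2)

# Trimmed-down stand-in for xmltest.xml (assumption: same element names as the real file)
xml_text_input <- '<?xml version="1.0" encoding="UTF-8"?>
<gudid xmlns="http://www.fda.gov/cdrh/gudid" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <header><creationDate>2024-04-01T03:30:00</creationDate></header>
  <device>
    <publicDeviceRecordKey>33db3dc9-1c5e-4670-a3e1-b52ae0e8c7f0</publicDeviceRecordKey>
    <identifiers>
      <identifier>
        <deviceId>M991NM82330A243</deviceId>
        <deviceIdType>Package</deviceIdType>
        <pkgQuantity>4</pkgQuantity>
      </identifier>
      <identifier>
        <deviceId>M991NM82330A241</deviceId>
        <deviceIdType>Primary</deviceIdType>
        <pkgQuantity xsi:nil="true"></pkgQuantity>
      </identifier>
    </identifiers>
  </device>
</gudid>'

page <- read_xml(xml_text_input)
xml_ns_strip(page)  # drop default namespaces so plain element names work in XPath

# Hypothetical helper: text of one child element per node, with "" turned into NA
field <- function(nodes, name) {
  txt <- xml_text(xml_find_first(nodes, paste0("./", name)))
  txt[!nzchar(txt)] <- NA_character_
  txt
}

# Only <device> nodes are visited, so <header> is never touched
devices <- xml_find_all(page, ".//device")

# One row per <identifier>; the parent device key is repeated on every row
rows <- lapply(devices, function(dev) {
  ids <- xml_find_all(dev, "./identifiers/identifier")
  data.frame(
    publicDeviceRecordKey = rep(field(dev, "publicDeviceRecordKey"), length(ids)),
    deviceId     = field(ids, "deviceId"),
    deviceIdType = field(ids, "deviceIdType"),
    pkgQuantity  = field(ids, "pkgQuantity"),
    stringsAsFactors = FALSE
  )
})

identifiers_df <- do.call(rbind, rows)
print(identifiers_df)
```

To run this against the real file, replace `xml_text_input` with `'xmltest.xml'` in the `read_xml()` call and add the remaining identifier fields (for example `pkgType` or `pkgStatus`) as further `field()` columns. All columns come back as character, so convert `pkgQuantity` with `as.integer()` if you need numbers.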