XML schema to Hive schema

Question (votes: 1, answers: 1)

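I am trying to load XML files into a Hive table, using the XML SerDe (com.ibm.spss.hive.serde2.xml.XmlSerDe). I can load simple flat XML files. When the XML has nested elements, I use Hive complex data types to store them (for example, array<struct>). Below is a sample of the XML I am trying to load. My goal is to load all elements, attributes, and content into the Hive table.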

<description action="up">
    <name action="aorup" ln="te">
    this is name1
    </name>
    <name action="aorup" ln="tm">
    this is name2
    </name>
    <name action="aorup" ln="hi">
    this is name2
    </name>
</description>

The Hive output I want to get is:

{action:"up", name:[{action:"aorup", ln:"te", content:"this is name1"}, {action:"aorup", ln:"tm", content:"this is name2"}, {action:"aorup", ln:"hi", content:"this is name3"}]}
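To make the target shape concrete, here is a minimal sketch in plain Python (not Hive, and not how the SerDe itself works) that builds the same nested record from the sample XML: the root attributes become top-level fields, and each <name> child becomes one entry holding its attributes plus its trimmed text under content. The helper name description_to_record is hypothetical and used only for illustration. Note that the third entry comes out as "this is name2", because that is what the sample XML above contains.

```python
import xml.etree.ElementTree as ET

# Sample document copied from the question.
SAMPLE_XML = """<description action="up">
    <name action="aorup" ln="te">
    this is name1
    </name>
    <name action="aorup" ln="tm">
    this is name2
    </name>
    <name action="aorup" ln="hi">
    this is name2
    </name>
</description>"""


def description_to_record(xml_text):
    """Return root attributes plus a list with one dict per <name> child."""
    root = ET.fromstring(xml_text)
    names = []
    for child in root.findall("name"):
        entry = dict(child.attrib)                      # attributes: action, ln
        entry["content"] = (child.text or "").strip()   # element text, whitespace trimmed
        names.append(entry)
    record = dict(root.attrib)                          # root attribute: action
    record["name"] = names
    return record


record = description_to_record(SAMPLE_XML)
print(record)
```

This corresponds to a Hive type of the form STRUCT<action, name:ARRAY<STRUCT<action, ln, content>>>; how the SerDe maps element text and repeated children onto struct field names is a separate matter that this sketch does not model.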

I want to load the entire XML into a single Hive column. I tried the following:

CREATE TABLE description(
description STRUCT<
Action:STRING, 
name:ARRAY<STRUCT<
    Action:STRING, ln:STRING, content:STRING
    >>
>)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
"xml.processor.class"="com.ximpleware.hive.serde2.xml.vtd.XmlProcessor",
"column.xpath.description"="/description")
STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
TBLPROPERTIES ("xmlinput.start"="<description ","xmlinput.end"= "</description>");

But I get null values for the Label field. Can anyone help me?

Thanks

xml hadoop hive hive-serde
1 Answer
Votes: 1
create external table description
(
    description struct<action:string,description:array<struct<action:string,ln:string,name:string>>>
)
row format serde 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
with serdeproperties 
(
    "column.xpath.description" = "/description"
)
stored as 
inputformat     'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
outputformat    'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
tblproperties   
(
    "xmlinput.start" = "<description "
   ,"xmlinput.end"   = "</description>"
)
;

select * from description
;

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|                                                                                         description                                                                                         |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| {"action":"up","description":[{"action":"aorup","ln":"te","name":"this is name1"},{"action":"aorup","ln":"tm","name":"this is name2"},{"action":"aorup","ln":"hi","name":"this is name2"}]} |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+