我有父子关系最大15级。我需要找出子节点的每个父节点的所有层次结构表。我曾尝试递归查询,但它不是在蜂巢和Impala的工作。请建议一些查询来解决这个说法。
+---------------+----------------+
|parent_node_id | child_node_id |
+---------------+----------------+
| C1 | C11 |
+---------------+----------------+
| C11 | C12 |
+---------------+----------------+
| C12 | 123 |
+---------------+----------------+
| C12 | 456 |
+---------------+----------------+
| P1 | C1 |
+---------------+----------------+
| P1 | C2 |
+---------------+----------------+
| P2 | C3 |
+---------------+----------------+
+-------+----------+
| node | balance |
+-------+----------+
| 123 | 100 |
+-------+----------+
| 456 | 150 |
+-------+----------+
| C2 | 400 |
+-------+----------+
| C3 | 200 |
+-------+----------+
输出数据:
+---------------+-------------+---------+----------+----------+
|Parent_node_id | Level 1 | Level 2 | Level 2 | Level3 |
+---------------+-------------+---------+----------+----------+
| P1 | C1 | C11 | C12 | 123 |
+---------------+-------------+---------+----------+----------+
| P1 | C1 | C11 | C12 | 456 |
+---------------+-------------+---------+----------+----------+
| P1 | C2 | NULL | NULL | NULL |
+---------------+-------------+---------+----------+----------+
| P2 | C3 | NULL | NULL | NULL |
+---------------+-------------+---------+----------+----------+
+-------+----------+
| node | balance |
+-------+----------+
| 123 | 100 |
+-------+----------+
| 456 | 150 |
+-------+----------+
| C12 | 250 |
+-------+----------+
| C11 | 250 |
+-------+----------+
| C1 | 250 |
+-------+----------+
| P1 | 650 |
+-------+----------+
| C2 | 200 |
+-------+----------+
| P2 | 200 |
+-------+----------+
既然你知道,可以在层次结构中的最大关系,
下面是查询 -
hive> select * from hier;
OK
parent child child_desc
C1 C11 Party
C1 C111 Placement
C111 C1111 XM
C12 C121 Brand
C2 C12 Equity
XN12 C1 Private
XN12 C2 Public
Time taken: 0.123 seconds, Fetched: 7 row(s)
查询 -
with lvl1 as
(
select p1.parent, p1.child child1, p1.child_desc child_desc1, p2.child child2, p2.child_desc child_desc2
from hier p1 left outer join hier p2
on p1.child = p2.parent
)
, lvl2 as
(
select lvl1.parent, lvl1.child1, lvl1.child_desc1, lvl1.child2 child2, lvl1.child_desc2, p3.parent p2,
p3.child as child3, p3.child_desc child_desc3
from lvl1 left outer join hier p3 on nvl(lvl1.child2,'A') = nvl(p3.parent,'A')
)
, lvl3 as
(
select lvl2.parent, lvl2.child1 child1, lvl2.child_desc1, lvl2.child2 child2, lvl2.child_desc2,
lvl2.child3 hild3, lvl2.child_desc3, p3.child as child4, p3.child as child_desc4
from lvl2 left outer join hier p3 on nvl(lvl2.child3,'A') = nvl(p3.parent,'A')
)
select * from lvl3;
C1 C11 Party NULL NULL NULL NULL NULL NULL
C1 C111 Placement C1111 XM NULL NULL NULL NULL
C111 C1111 XM NULL NULL NULL NULL NULL NULL
C12 C121 Brand NULL NULL NULL NULL NULL NULL
C2 C12 Equity C121 Brand NULL NULL NULL NULL
XN12 C1 Private C11 Party NULL NULL NULL NULL
XN12 C1 Private C111 Placement C1111 XM NULL NULL
XN12 C2 Public C12 Equity C121 Brand NULL NULL
以上查询是3级,你可以把它复制到级别15。
我已经测试了上面的查询3级。
hive> select * from hier;
OK
parent child
C1 C11
C11 C12
C12 123
P1 C1
P2 C2
O / P -
C1 C11 C12 123 NULL
C11 C12 123 NULL NULL
C12 123 NULL NULL NULL
P1 C1 C11 C12 123
P2 C2 NULL NULL NULL
如果你想为正的水平的查询,你不想硬编码查询,这将需要一些脚本(具有良好的工作量)来创建查询或多次重复相同的查询。
希望这可以帮助