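I am trying to update the values in table (final) from table (first_stage). The two tables have the same field names but different values. The table columns are: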
(Min_Date) date,
(Max_Date) date,
(NoofDays) int,
(IMSI) string,
(Site) string,
(Down_Link) int,
(Up_Link) int,
(Connection) int
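The match is based on IMSI and Site. If a row with the same (IMSI, Site) already exists in table (final), keep the smallest date as Min_Date and the largest date as Max_Date, and compute
min(Min_Date), max(Max_Date), sum(NoofDays), sum(Down_Link), sum(Up_Link), sum(Connection).
If the row's (IMSI, Site) does not match any row in table (final), insert the row into the final table. I am still new to SQL.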
table first_stage:
Min_Date    Max_Date    NoofDays  IMSI  Site      Down_Link  Up_Link  Connection
2019-03-22  2019-03-26  1         222   google    1          1        1
2019-03-26  2019-03-27  3         222   youtube   1          1        1
2019-03-02  2019-03-27  5         333   facebook  2          3        1
2019-03-02  2019-03-27  5         111   facebook  20         33       11
table final:
Min_Date    Max_Date    NoofDays  IMSI  Site      Down_Link  Up_Link  Connection
2019-03-01  2019-03-27  1         222   google    2          2        1
2019-03-12  2019-03-25  1         222   youtube   2          2        2
2019-03-25  2019-03-27  4         333   facebook  3          6        1
A row must match on both IMSI and Site for the update to apply. After the update, the final table must look like this:
table final:
Min_Date    Max_Date    NoofDays  IMSI  Site      Down_Link  Up_Link  Connection
2019-03-01  2019-03-27  2         222   google    3          3        2
2019-03-12  2019-03-27  4         222   youtube   3          3        3
2019-03-02  2019-03-27  9         333   facebook  5          9        2
2019-03-02  2019-03-27  5         111   facebook  20         33       11
I have never used Vertica, but I think this might work:
MERGE INTO FINAL
USING FIRST_STAGE
-- Both tables share column names, so every reference is qualified.
ON FINAL.IMSI = FIRST_STAGE.IMSI AND FINAL.Site = FIRST_STAGE.Site
-- Key already in FINAL: widen the date range and add up the counters.
WHEN MATCHED THEN UPDATE SET
    Min_Date   = LEAST(FIRST_STAGE.Min_Date, FINAL.Min_Date),
    Max_Date   = GREATEST(FIRST_STAGE.Max_Date, FINAL.Max_Date),
    NoofDays   = FIRST_STAGE.NoofDays + FINAL.NoofDays,
    Down_Link  = FIRST_STAGE.Down_Link + FINAL.Down_Link,
    Up_Link    = FIRST_STAGE.Up_Link + FINAL.Up_Link,
    Connection = FIRST_STAGE.Connection + FINAL.Connection
-- Key not in FINAL yet: copy the staged row over unchanged.
WHEN NOT MATCHED THEN INSERT ( Min_Date,
    Max_Date,
    NoofDays,
    IMSI,
    Site,
    Down_Link,
    Up_Link,
    Connection )
VALUES ( FIRST_STAGE.Min_Date,
    FIRST_STAGE.Max_Date,
    FIRST_STAGE.NoofDays,
    FIRST_STAGE.IMSI,
    FIRST_STAGE.Site,
    FIRST_STAGE.Down_Link,
    FIRST_STAGE.Up_Link,
    FIRST_STAGE.Connection );
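
Since I cannot run this against Vertica, here is a small sketch that checks the same merge logic against the sample data from the question, using Python's built-in sqlite3 module. This is not Vertica syntax: SQLite has no MERGE, so the sketch assumes an INSERT ... ON CONFLICT DO UPDATE (upsert, SQLite 3.24 or later) as a stand-in, with SQLite's two-argument min()/max() playing the role of LEAST/GREATEST. The unique index on (IMSI, Site), the TEXT date columns, and the helper names build_example, merge_first_stage_into_final and final_rows are my own assumptions for illustration. The staged rows are also grouped by (IMSI, Site) first, which covers the min/max/sum the question asks for if first_stage ever holds several rows for one key.

```python
import sqlite3

# Column order shared by both tables, as listed in the question.
COLUMNS = ("Min_Date, Max_Date, NoofDays, IMSI, Site, "
           "Down_Link, Up_Link, Connection")

# Sample rows copied from the question.
FIRST_STAGE_ROWS = [
    ("2019-03-22", "2019-03-26", 1, "222", "google", 1, 1, 1),
    ("2019-03-26", "2019-03-27", 3, "222", "youtube", 1, 1, 1),
    ("2019-03-02", "2019-03-27", 5, "333", "facebook", 2, 3, 1),
    ("2019-03-02", "2019-03-27", 5, "111", "facebook", 20, 33, 11),
]
FINAL_ROWS = [
    ("2019-03-01", "2019-03-27", 1, "222", "google", 2, 2, 1),
    ("2019-03-12", "2019-03-25", 1, "222", "youtube", 2, 2, 2),
    ("2019-03-25", "2019-03-27", 4, "333", "facebook", 3, 6, 1),
]


def build_example():
    """Create both tables in an in-memory database and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    for table in ("first_stage", "final"):
        conn.execute(
            f"CREATE TABLE {table} ("
            "Min_Date TEXT, Max_Date TEXT, NoofDays INTEGER, IMSI TEXT, "
            "Site TEXT, Down_Link INTEGER, Up_Link INTEGER, Connection INTEGER)"
        )
    # Assumption: (IMSI, Site) is unique in final; ON CONFLICT needs this index.
    conn.execute("CREATE UNIQUE INDEX final_key ON final (IMSI, Site)")
    insert = "INSERT INTO {} ({}) VALUES (?, ?, ?, ?, ?, ?, ?, ?)"
    conn.executemany(insert.format("first_stage", COLUMNS), FIRST_STAGE_ROWS)
    conn.executemany(insert.format("final", COLUMNS), FINAL_ROWS)
    conn.commit()
    return conn


def merge_first_stage_into_final(conn):
    """Upsert the staged rows into final, keyed on (IMSI, Site)."""
    conn.execute(f"""
        INSERT INTO final ({COLUMNS})
        SELECT MIN(Min_Date), MAX(Max_Date), SUM(NoofDays), IMSI, Site,
               SUM(Down_Link), SUM(Up_Link), SUM(Connection)
        FROM first_stage
        WHERE true  -- lets SQLite parse ON CONFLICT after a SELECT
        GROUP BY IMSI, Site
        ON CONFLICT (IMSI, Site) DO UPDATE SET
            Min_Date   = min(final.Min_Date, excluded.Min_Date),
            Max_Date   = max(final.Max_Date, excluded.Max_Date),
            NoofDays   = final.NoofDays + excluded.NoofDays,
            Down_Link  = final.Down_Link + excluded.Down_Link,
            Up_Link    = final.Up_Link + excluded.Up_Link,
            Connection = final.Connection + excluded.Connection
    """)
    conn.commit()


def final_rows(conn):
    """Return the rows of final, ordered by key for a stable comparison."""
    query = f"SELECT {COLUMNS} FROM final ORDER BY IMSI, Site"
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    conn = build_example()
    merge_first_stage_into_final(conn)
    for row in final_rows(conn):
        print(row)
```

The ISO yyyy-mm-dd strings compare correctly as text, which is why plain min()/max() is enough for the dates here. With the sample data, the four rows printed should be the same as the expected final table in the question, sorted by IMSI and Site.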