ColA | ColB | ColC |
---|---|---|
A | 1 | 12 |
A | 2 | 12 |
A | 3 | 12 |
B | 1 | 12 |
B | 2 | 12 |
B | 3 | 12 |
C | 1 | 12 |
C | 2 | 12 |
C | 3 | 12 |
I have table data like this. The expected result is
ColA | ColB | ColC |
---|---|---|
A | 1 | 12 |
B | 2 | 12 |
C | 3 | 12 |
or
ColA | ColB | ColC |
---|---|---|
A | 3 | 12 |
B | 2 | 12 |
C | 1 | 12 |
or
ColA | ColB | ColC |
---|---|---|
A | 1 | 12 |
B | 3 | 12 |
C | 2 | 12 |
If 4 records are inserted, 16 (4*4) records end up in the final temp table. How can I delete the unwanted rows in this case?
This is what I tried:
WITH cte AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ColA ORDER BY ColB) AS rn
    FROM table_1
)
DELETE FROM cte WHERE rn <> 1;
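I understand why this is wrong: after the delete, ColB values must not repeat, and each record must be unique. The query above instead leaves: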
ColA | ColB | ColC |
---|---|---|
A | 1 | 12 |
B | 1 | 12 |
C | 1 | 12 |
You can combine two partitioned row numbers and delete the rows where they differ:
WITH cte AS (
SELECT ColA, ColB, ColC,
ROW_NUMBER() OVER (PARTITION BY ColA ORDER BY ColB) AS Row_count1,
ROW_NUMBER() OVER (PARTITION BY ColB ORDER BY ColA) AS Row_count2
FROM table_1
)
DELETE FROM cte WHERE Row_count1<>Row_count2
See the DB Fiddle here: https://dbfiddle.uk/aLsQLHIt
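As a rough local check of the two-row-number idea, here is a minimal sketch that runs the same window functions through Python's built-in `sqlite3` module. SQLite stands in for the original database purely as an assumption (it needs SQLite 3.25 or later for window functions), and because SQLite cannot delete through a CTE, the sketch selects the rows that would survive instead of deleting the others. The helper name `diagonal_rows` and the generated labels are hypothetical.

```python
import sqlite3


def diagonal_rows(n):
    """Build an n x n cross product and return the rows where both row numbers match."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_1 (ColA TEXT, ColB INTEGER, ColC INTEGER)")
    labels = [chr(ord("A") + i) for i in range(n)]  # 'A', 'B', 'C', ...
    conn.executemany(
        "INSERT INTO table_1 VALUES (?, ?, 12)",
        [(a, b) for a in labels for b in range(1, n + 1)],
    )
    rows = conn.execute(
        """
        WITH cte AS (
            SELECT ColA, ColB, ColC,
                   ROW_NUMBER() OVER (PARTITION BY ColA ORDER BY ColB) AS Row_count1,
                   ROW_NUMBER() OVER (PARTITION BY ColB ORDER BY ColA) AS Row_count2
            FROM table_1
        )
        SELECT ColA, ColB, ColC
        FROM cte
        WHERE Row_count1 = Row_count2   -- the rows the DELETE would leave behind
        ORDER BY ColA
        """
    ).fetchall()
    conn.close()
    return rows


print(diagonal_rows(3))  # [('A', 1, 12), ('B', 2, 12), ('C', 3, 12)]
print(len(diagonal_rows(4)))  # 4
```

With 4 values of ColA and 16 rows, the same filter keeps 4 rows, one per ColA, each with a different ColB.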
I would rank the rows twice: once on colB, colC and once on colA, so the intermediate result is:
ColA | ColB | ColC | rank_bc | rank_a |
---|---|---|---|---|
A | 1 | 12 | 1 | 1 |
A | 2 | 12 | 2 | 1 |
A | 3 | 12 | 3 | 1 |
B | 3 | 12 | 3 | 2 |
B | 2 | 12 | 2 | 2 |
B | 1 | 12 | 1 | 2 |
C | 1 | 12 | 1 | 3 |
C | 2 | 12 | 2 | 3 |
C | 3 | 12 | 3 | 3 |
After that, all that is left is to keep the rows where rank_bc = rank_a:
with test_data_ranked as (
select ColA, ColB, ColC,
dense_rank() over (order by colB, colC) rank_bc,
dense_rank() over (order by colA) rank_a
from test_data)
select colA, colB, colC
from test_data_ranked
where rank_bc = rank_a;
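Because this version compares ranks and not raw values, it should not depend on ColB holding exactly 1, 2, 3. The sketch below tries that with ColB values 10, 20, 30, which are made-up test values and not part of the question. It again uses Python's `sqlite3` module (SQLite 3.25 or later) as a stand-in engine, and ColC is kept constant as in the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_data (ColA TEXT, ColB INTEGER, ColC INTEGER)")
# Hypothetical data: same 3 x 3 shape as the question, but ColB is 10, 20, 30.
conn.executemany(
    "INSERT INTO test_data VALUES (?, ?, 12)",
    [(a, b) for a in "ABC" for b in (10, 20, 30)],
)

ranked = conn.execute(
    """
    WITH test_data_ranked AS (
        SELECT ColA, ColB, ColC,
               DENSE_RANK() OVER (ORDER BY ColB, ColC) AS rank_bc,
               DENSE_RANK() OVER (ORDER BY ColA) AS rank_a
        FROM test_data
    )
    SELECT ColA, ColB, ColC
    FROM test_data_ranked
    WHERE rank_bc = rank_a
    ORDER BY ColA
    """
).fetchall()
conn.close()

print(ranked)  # [('A', 10, 12), ('B', 20, 12), ('C', 30, 12)]
```

If ColC varied between rows, rank_bc could take more distinct values than rank_a, so the one-row-per-ColA result is only checked here for a constant ColC.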
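For this sample data, you only need the DENSE_RANK() window function to get a group number for each value of ColA, then delete the rows whose ColB differs from that number: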
WITH cte AS (SELECT *, DENSE_RANK() OVER (ORDER BY ColA) rn FROM t)
DELETE FROM cte WHERE ColB <> rn;
See the demo.
You can solve this with a CTE, a subquery, and window functions:
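To try the single DENSE_RANK() delete without a SQL Server instance, here is a small sketch using Python's `sqlite3` module (SQLite 3.25 or later, assumed as a stand-in). SQLite cannot delete through a CTE directly, so this variant carries `rowid` through the CTE and deletes by it. That rewrite is an adaptation for SQLite, not the statement from the answer. Like the answer, it assumes ColB holds exactly the values 1..n.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ColA TEXT, ColB INTEGER, ColC INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, 12)",
    [(a, b) for a in "ABC" for b in (1, 2, 3)],
)

# Delete every row whose ColB differs from the group number of its ColA.
conn.execute(
    """
    WITH cte AS (
        SELECT rowid AS rid, ColB,
               DENSE_RANK() OVER (ORDER BY ColA) AS rn
        FROM t
    )
    DELETE FROM t
    WHERE rowid IN (SELECT rid FROM cte WHERE ColB <> rn)
    """
)

remaining = conn.execute("SELECT ColA, ColB, ColC FROM t ORDER BY ColA").fetchall()
conn.close()

print(remaining)  # [('A', 1, 12), ('B', 2, 12), ('C', 3, 12)]
```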
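;WITH cte AS (
    -- Rwn numbers the rows of each Rw group by ColB
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Rw ORDER BY ColB) AS Rwn
    FROM (
        -- Rw numbers the ColA values within each ColB
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY ColB ORDER BY ColA) AS Rw
        FROM Ta
    ) d
)
-- Rwn > 1 would keep ColB = 1 for every ColA, so keep the rows where both numbers match
DELETE FROM cte WHERE Rwn <> Rw;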
You can insert the base data with the following statements:
drop table if exists Ta
create table Ta(
ColA varchar(100), ColB int, ColC int)
insert into Ta(ColA, ColB,ColC) values ('A', 1, 12)
insert into Ta(ColA, ColB,ColC) values ('A', 2, 12)
insert into Ta(ColA, ColB,ColC) values ('A', 3, 12)
insert into Ta(ColA, ColB,ColC) values ('B', 1, 12)
insert into Ta(ColA, ColB,ColC) values ('B', 2, 12)
insert into Ta(ColA, ColB,ColC) values ('B', 3, 12)
insert into Ta(ColA, ColB,ColC) values ('C', 1, 12)
insert into Ta(ColA, ColB,ColC) values ('C', 2, 12)
insert into Ta(ColA, ColB,ColC) values ('C', 3, 12)