Keeping entire sets of duplicates with the highest year (per name)


I have this table in SQL:

CREATE TABLE books_returned 
(
    name VARCHAR(50),
    book_id VARCHAR(50),
    year_book_returned INT
);

INSERT INTO books_returned (name, book_id, year_book_returned) 
VALUES
('john', 'julius ceasar', 2010),
('john', 'julius caesar', 2010),
('john', 'hamlet', 2010),
('john', 'hamlet', 2010),
('john', 'othello', 2009),
('john', 'othello', 2009),
('kevin', 'macbeth', 2015),
('kevin', 'tempest', 2020),
('david', 'romeojuliet', 2010),
('david', 'romeojuliet', 2010),
('david', 'romeojuliet', 2010),
('david', 'king lear', 2005);

 name       book_id           year_book_returned
 ------------------------------------------------
  john     julius ceasar      2010
  john     julius caesar      2010
  john     hamlet             2010
  john     hamlet             2010
  john     othello            2009
  john     othello            2009
 kevin     macbeth            2015
 kevin     tempest            2020
 david     romeojuliet        2010
 david     romeojuliet        2010
 david     romeojuliet        2010
 david     king lear          2005

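For each name, I want to keep all columns and all rows for the book with the greatest year.

When there are ties, I don't care which set of duplicates is chosen. Either of the following two outputs is fine with me:

Output #1: acceptable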

 name      book_id      year_book_returned
 ------------------------------------------
 john      hamlet               2010
 john      hamlet               2010
 kevin     tempest              2020
 david     romeojuliet          2010
 david     romeojuliet          2010
 david     romeojuliet          2010

Output #2: acceptable

 name       book_id year_book_returned
 --------------------------------------------------
  john julius ceasar               2010
  john julius caesar               2010
 kevin       tempest               2020
 david   romeojuliet               2010
 david   romeojuliet               2010
 david   romeojuliet               2010

I tried this query, which first finds the maximum year per name in a subquery and then joins it back to the original table:

SELECT br.*
FROM books_returned br
JOIN (
    SELECT name, MAX(year_book_returned) as max_year
    FROM books_returned
    GROUP BY name
) as subquery
ON br.name = subquery.name AND br.year_book_returned = subquery.max_year;

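But this is not correct: it returns every book john returned in his maximum year (2010), rather than a single set of duplicates: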

 name       book_id year_book_returned
  john julius ceasar               2010
  john julius caesar               2010
  john        hamlet               2010
  john        hamlet               2010
 kevin       tempest               2020
 david   romeojuliet               2010
 david   romeojuliet               2010
 david   romeojuliet               2010

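Can someone show me how to do this correctly? I was thinking of using a partition row_number order by random() statement to create a ranking variable and then selecting the lowest rank?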

sql db2
1 Answer

I just figured out how to do this using two RANK() passes:

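WITH LatestYearBooks AS (
    -- Pass 1: rank rows per name by year; every row in the latest year ties at year_rank = 1
    SELECT name, book_id, year_book_returned,
           RANK() OVER (
               PARTITION BY name
               ORDER BY year_book_returned DESC
           ) AS year_rank
    FROM books_returned
),
RankedBooks AS (
    -- Pass 2: among latest-year rows, rank by book_id; identical book_ids tie,
    -- so the whole duplicate set with the smallest book_id gets book_rank = 1
    SELECT name, book_id, year_book_returned,
           RANK() OVER (
               PARTITION BY name, year_book_returned
               ORDER BY book_id
           ) AS book_rank
    FROM LatestYearBooks
    WHERE year_rank = 1
)
-- Keep only that one duplicate set per name
SELECT name, book_id, year_book_returned
FROM RankedBooks
WHERE book_rank = 1;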

This gives the desired result:

  name     book_id year_book_returned
 david romeojuliet               2010
 david romeojuliet               2010
 david romeojuliet               2010
  john      hamlet               2010
  john      hamlet               2010
 kevin     tempest               2020
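
The two RANK() passes can probably be collapsed into one by ordering on both columns at once: rows that tie on year and book_id share rank 1, so the whole duplicate set survives the filter. The sketch below is only a cross-check, not a Db2 test. It assumes Python's sqlite3 module with an SQLite build that supports window functions (3.25 or later), and the helper name kept_rows is made up for illustration. The window-function syntax should carry over to Db2, but that was not verified here.

```python
import sqlite3

# Sample data copied from the question (spellings left exactly as posted).
ROWS = [
    ("john", "julius ceasar", 2010),
    ("john", "julius caesar", 2010),
    ("john", "hamlet", 2010),
    ("john", "hamlet", 2010),
    ("john", "othello", 2009),
    ("john", "othello", 2009),
    ("kevin", "macbeth", 2015),
    ("kevin", "tempest", 2020),
    ("david", "romeojuliet", 2010),
    ("david", "romeojuliet", 2010),
    ("david", "romeojuliet", 2010),
    ("david", "king lear", 2005),
]

# Single pass: latest year first, then smallest book_id as the tie-breaker.
# RANK() gives identical (year, book_id) rows the same rank, so all
# duplicates of the chosen book are kept.
QUERY = """
SELECT name, book_id, year_book_returned
FROM (
    SELECT name, book_id, year_book_returned,
           RANK() OVER (
               PARTITION BY name
               ORDER BY year_book_returned DESC, book_id
           ) AS rnk
    FROM books_returned
) AS ranked
WHERE rnk = 1
ORDER BY name, book_id
"""


def kept_rows():
    """Hypothetical helper: load the sample rows and run the single-pass query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE books_returned ("
        "name VARCHAR(50), book_id VARCHAR(50), year_book_returned INT)"
    )
    conn.executemany("INSERT INTO books_returned VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in kept_rows():
        print(row)
```

On the sample data this returns the same six rows as the query above. Note that the tie-breaker is deterministic (smallest book_id) rather than random; since the question says either set is acceptable, that seemed like the simpler choice.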