I have this table in SQL:
CREATE TABLE books_returned
(
    name VARCHAR(50),
    book_id VARCHAR(50),
    year_book_returned INT
);
INSERT INTO books_returned (name, book_id, year_book_returned)
VALUES
    ('john', 'julius ceasar', 2010),
    ('john', 'julius caesar', 2010),
    ('john', 'hamlet', 2010),
    ('john', 'hamlet', 2010),
    ('john', 'othello', 2009),
    ('john', 'othello', 2009),
    ('kevin', 'macbeth', 2015),
    ('kevin', 'tempest', 2020),
    ('david', 'romeojuliet', 2010),
    ('david', 'romeojuliet', 2010),
    ('david', 'romeojuliet', 2010),
    ('david', 'king lear', 2005);
name   book_id        year_book_returned
----------------------------------------
john   julius ceasar  2010
john   julius caesar  2010
john   hamlet         2010
john   hamlet         2010
john   othello        2009
john   othello        2009
kevin  macbeth        2015
kevin  tempest        2020
david  romeojuliet    2010
david  romeojuliet    2010
david  romeojuliet    2010
david  king lear      2005
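For each name, I want to keep all columns and all rows for the book with the largest year.
Because there can be ties, I don't care which group of duplicates is chosen. Either of the following two outputs is fine for me:
Output #1: acceptable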
name   book_id        year_book_returned
----------------------------------------
john   hamlet         2010
john   hamlet         2010
kevin  tempest        2020
david  romeojuliet    2010
david  romeojuliet    2010
david  romeojuliet    2010
Output #2: acceptable
name   book_id        year_book_returned
----------------------------------------
john   julius ceasar  2010
john   julius caesar  2010
kevin  tempest        2020
david  romeojuliet    2010
david  romeojuliet    2010
david  romeojuliet    2010
I tried this query, which first finds the maximum year per name in a subquery and then joins it back to the original table:
SELECT br.*
FROM books_returned br
JOIN (
    SELECT name, MAX(year_book_returned) AS max_year
    FROM books_returned
    GROUP BY name
) AS subquery
    ON br.name = subquery.name
    AND br.year_book_returned = subquery.max_year;
But this is not correct: it returns every book john returned in his latest year (2010) instead of a single title:
name   book_id        year_book_returned
john   julius ceasar  2010
john   julius caesar  2010
john   hamlet         2010
john   hamlet         2010
kevin  tempest        2020
david  romeojuliet    2010
david  romeojuliet    2010
david  romeojuliet    2010
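The extra rows can be explained by checking how many distinct book_id values share each name's latest year: the join keeps every row from that year, so all tied titles survive. Below is a minimal sketch of that check. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in, since the actual database is not stated here; the alias titles_in_max_year is only illustrative.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books_returned (
    name VARCHAR(50),
    book_id VARCHAR(50),
    year_book_returned INT
);
INSERT INTO books_returned (name, book_id, year_book_returned) VALUES
    ('john', 'julius ceasar', 2010), ('john', 'julius caesar', 2010),
    ('john', 'hamlet', 2010), ('john', 'hamlet', 2010),
    ('john', 'othello', 2009), ('john', 'othello', 2009),
    ('kevin', 'macbeth', 2015), ('kevin', 'tempest', 2020),
    ('david', 'romeojuliet', 2010), ('david', 'romeojuliet', 2010),
    ('david', 'romeojuliet', 2010), ('david', 'king lear', 2005);
""")

# Same join as in the question, but counting the distinct titles
# that fall in each name's maximum year.
tie_query = """
SELECT br.name, COUNT(DISTINCT br.book_id) AS titles_in_max_year
FROM books_returned br
JOIN (
    SELECT name, MAX(year_book_returned) AS max_year
    FROM books_returned
    GROUP BY name
) AS m
    ON br.name = m.name
    AND br.year_book_returned = m.max_year
GROUP BY br.name
ORDER BY br.name
"""
ties = conn.execute(tie_query).fetchall()
print(ties)
```

On the sample data john has three distinct book_id values in 2010 (the spellings 'julius ceasar' and 'julius caesar' are different strings), while david and kevin have one each, which is why only john's part of the result contains more than one title.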
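Can someone show me how to do this correctly? I was thinking of using a
partition row_number order by random()
statement to create a ranking variable and then selecting the lowest rank?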
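Here is a rough sketch of that random tie-break idea. One caveat: ROW_NUMBER() applied directly to the rows would keep only a single row per name, so this version numbers the distinct book_id values in each name's latest year and then joins the chosen book_id back to recover all of its rows. It assumes SQLite 3.25 or later (through Python's built-in sqlite3 module) as a stand-in for the unspecified database; the CTE names latest and picked and the column rn are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books_returned (
    name VARCHAR(50),
    book_id VARCHAR(50),
    year_book_returned INT
);
INSERT INTO books_returned (name, book_id, year_book_returned) VALUES
    ('john', 'julius ceasar', 2010), ('john', 'julius caesar', 2010),
    ('john', 'hamlet', 2010), ('john', 'hamlet', 2010),
    ('john', 'othello', 2009), ('john', 'othello', 2009),
    ('kevin', 'macbeth', 2015), ('kevin', 'tempest', 2020),
    ('david', 'romeojuliet', 2010), ('david', 'romeojuliet', 2010),
    ('david', 'romeojuliet', 2010), ('david', 'king lear', 2005);
""")

random_pick_query = """
WITH latest AS (
    -- all rows from each name's maximum year
    SELECT br.name, br.book_id, br.year_book_returned
    FROM books_returned br
    JOIN (
        SELECT name, MAX(year_book_returned) AS max_year
        FROM books_returned
        GROUP BY name
    ) AS m
        ON br.name = m.name
        AND br.year_book_returned = m.max_year
),
picked AS (
    -- number the distinct titles of each name in random order
    SELECT name, book_id,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY random()) AS rn
    FROM (SELECT DISTINCT name, book_id FROM latest) AS d
)
SELECT l.name, l.book_id, l.year_book_returned
FROM latest l
JOIN picked p
    ON l.name = p.name
    AND l.book_id = p.book_id
WHERE p.rn = 1
ORDER BY l.name
"""
result = conn.execute(random_pick_query).fetchall()
for row in result:
    print(row)
```

Which of john's titles is returned changes from run to run, but each name should end up with all rows of exactly one book_id from its latest year.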
I just figured out how to do this with a two-step ranking approach:
WITH LatestYearBooks AS (
    -- rank each name's rows by year, latest year first
    SELECT name, book_id, year_book_returned,
           RANK() OVER (
               PARTITION BY name
               ORDER BY year_book_returned DESC
           ) AS year_rank
    FROM books_returned
),
RankedBooks AS (
    -- within the latest year, rank by book_id so one title wins the tie
    SELECT name, book_id, year_book_returned,
           RANK() OVER (
               PARTITION BY name, year_book_returned
               ORDER BY book_id
           ) AS book_rank
    FROM LatestYearBooks
    WHERE year_rank = 1
)
SELECT name, book_id, year_book_returned
FROM RankedBooks
WHERE book_rank = 1;
This gives the desired result:
name   book_id        year_book_returned
david  romeojuliet    2010
david  romeojuliet    2010
david  romeojuliet    2010
john   hamlet         2010
john   hamlet         2010
kevin  tempest        2020
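The two ranking passes can probably be collapsed into one, because RANK() gives the same rank to every row that ties on the full ORDER BY list. Ordering by year descending and then by book_id puts all copies of one title from the latest year at rank 1. This is a sketch rather than a tested production query: it assumes SQLite 3.25 or later (through Python's built-in sqlite3 module) as a stand-in for the unspecified database, and the names ranked and rnk are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE books_returned (
    name VARCHAR(50),
    book_id VARCHAR(50),
    year_book_returned INT
);
INSERT INTO books_returned (name, book_id, year_book_returned) VALUES
    ('john', 'julius ceasar', 2010), ('john', 'julius caesar', 2010),
    ('john', 'hamlet', 2010), ('john', 'hamlet', 2010),
    ('john', 'othello', 2009), ('john', 'othello', 2009),
    ('kevin', 'macbeth', 2015), ('kevin', 'tempest', 2020),
    ('david', 'romeojuliet', 2010), ('david', 'romeojuliet', 2010),
    ('david', 'romeojuliet', 2010), ('david', 'king lear', 2005);
""")

single_rank_query = """
SELECT name, book_id, year_book_returned
FROM (
    SELECT name, book_id, year_book_returned,
           -- rows tied on both year and book_id share the same rank
           RANK() OVER (
               PARTITION BY name
               ORDER BY year_book_returned DESC, book_id
           ) AS rnk
    FROM books_returned
) AS ranked
WHERE rnk = 1
ORDER BY name
"""
result = conn.execute(single_rank_query).fetchall()
for row in result:
    print(row)
```

As with the two-CTE version, the tie is broken alphabetically by book_id, so john's 'hamlet' rows are kept. Swapping book_id for random() in the ORDER BY would not work here, since each row would get its own random value and duplicates would no longer tie.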