Bad DESC sorting performance for specific data - PostgreSQL 10.3

Question

I have a question about a strange (?) case I found with sorting in PostgreSQL (specifically 10.3).

I have a table users with the following columns:

id - varchar(36) - the id is in UUID format,
firstname - varchar(255),
lastname - varchar(255).

The following indexes were created:

create unique index users_pkey on users (id);  
create index user_firstname on users (firstname);  
create index user_lastname on users (lastname);

Now, let's consider two queries for each data set.

For the first data set, I loaded ~100k rows into the table, where firstname is a random 10-character string.

1a)

select id, firstname from users order by firstname asc, id asc limit 50;

And the execution plan for this query:

Limit  (cost=7665.06..7665.18 rows=50 width=48) (actual time=105.012..105.016 rows=50 loops=1)
  ->  Sort  (cost=7665.06..7915.07 rows=100003 width=48) (actual time=105.012..105.014 rows=50 loops=1)
        Sort Key: firstname, id
        Sort Method: top-N heapsort  Memory: 31kB
        ->  Seq Scan on users  (cost=0.00..4343.03 rows=100003 width=48) (actual time=0.009..21.510 rows=100003 loops=1)
Planning time: 0.066 ms
Execution time: 105.031 ms

1b) The sort direction is changed: desc instead of asc.

select id, firstname from users order by firstname desc, id desc limit 50;

And the execution plan for this query:

Limit  (cost=7665.06..7665.18 rows=50 width=48) (actual time=105.586..105.590 rows=50 loops=1)
  ->  Sort  (cost=7665.06..7915.07 rows=100003 width=48) (actual time=105.586..105.589 rows=50 loops=1)
        Sort Key: firstname DESC, id DESC
        Sort Method: top-N heapsort  Memory: 31kB
        ->  Seq Scan on users  (cost=0.00..4343.03 rows=100003 width=48) (actual time=0.010..21.670 rows=100003 loops=1)
Planning time: 0.068 ms
Execution time: 105.606 ms
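A data set of this shape could be generated roughly as follows. This is only a sketch, not the script used for the measurements above: it assumes the users table and its indexes already exist and the table is empty, it builds the id by casting an md5 hash to uuid (UUID-formatted, but not a standards-compliant UUID), and it takes the first 10 hex characters of another hash as the "random" firstname, so the strings contain only 0-9 and a-f.

```sql
-- Sketch: ~100k rows with a random 10-character firstname.
-- md5(...)::uuid::text yields a 36-character, UUID-formatted string.
insert into users (id, firstname)
select md5(i::text || random()::text)::uuid::text,  -- assumed id generator
       substr(md5(random()::text), 1, 10)            -- assumed random string
from generate_series(1, 100000) as i;

-- Refresh planner statistics before running explain analyze.
analyze users;
```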

So far so good. Sorting takes a similar amount of time in both directions.

Now consider the second data set. I loaded ~100k rows into the table, where firstname is a string of the form JohnXXXXX, with XXXXX being a sequence number, i.e. John00000, John00001, John00002, John00003, ..., John99998, John99999.

2a)

select id, firstname from users order by firstname asc, id asc limit 50;

And the execution plan for this query:

Limit  (cost=7665.06..7665.18 rows=50 width=43) (actual time=99.572..99.577 rows=50 loops=1)
  ->  Sort  (cost=7665.06..7915.07 rows=100003 width=43) (actual time=99.572..99.573 rows=50 loops=1)
        Sort Key: firstname, id
        Sort Method: top-N heapsort  Memory: 29kB
        ->  Seq Scan on users  (cost=0.00..4343.03 rows=100003 width=43) (actual time=0.009..23.660 rows=100003 loops=1)
Planning time: 0.064 ms
Execution time: 99.592 ms

2b) The sort direction is changed: desc instead of asc.

select id, firstname from users order by firstname desc, id desc limit 50;

And the execution plan for this query:

Limit  (cost=7665.06..7665.18 rows=50 width=43) (actual time=659.786..659.791 rows=50 loops=1)
  ->  Sort  (cost=7665.06..7915.07 rows=100003 width=43) (actual time=659.785..659.786 rows=50 loops=1)
        Sort Key: firstname DESC, id DESC
        Sort Method: top-N heapsort  Memory: 32kB
        ->  Seq Scan on users  (cost=0.00..4343.03 rows=100003 width=43) (actual time=0.010..21.510 rows=100003 loops=1)
Planning time: 0.066 ms
Execution time: 659.804 ms
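The second data set could be reproduced along these lines. Again this is a sketch under assumptions: the users table and its indexes already exist and the table is empty, the md5-based id is an illustrative stand-in for whatever UUID generator was actually used, and the rows are inserted in ascending firstname order, which the question does not state explicitly.

```sql
-- Sketch: 100k rows named John00000 .. John99999.
insert into users (id, firstname)
select md5(i::text || random()::text)::uuid::text,  -- assumed id generator
       'John' || lpad(i::text, 5, '0')              -- zero-padded sequence number
from generate_series(0, 99999) as i;

-- Refresh planner statistics before running explain analyze.
analyze users;
```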

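For the second data set, the second query (2b) is roughly 6.6 times slower than the first (659.804 ms vs. 99.592 ms).

To summarize: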

+------------------+------------+------------+
| Query \ Data set |     1      |     2      |
+------------------+------------+------------+
| a (asc)          | 105.031 ms | 99.592 ms  |
| b (desc)         | 105.606 ms | 659.804 ms |
+------------------+------------+------------+

Finally, my question: why is the second query on the second data set 6-7 times slower than the other cases?

Answer 1

After adding the extra 50k rows, did you rebuild the indexes? Check for fragmentation.
