Poor DESC sort performance for specific data - PostgreSQL 10.3

Question (votes: 1, answers: 1)

I have a question about a strange (?) sorting case I ran into in PostgreSQL (specifically 10.3).

I have a table users with the following columns:

  • id - varchar(36) - the id is in UUID format
  • firstname - varchar(255)
  • lastname - varchar(255)

The following indexes were created:

create unique index users_pkey on users (id);  
create index user_firstname on users (firstname);  
create index user_lastname on users (lastname);  

Now, let's consider two queries against each of two data sets.

  1. I loaded ~100k rows into the table, where firstname is a random 10-character string.

     1a) select id, firstname from users order by firstname asc, id asc limit 50;

     The execution plan for this query:

     Limit (cost=7665.06..7665.18 rows=50 width=48) (actual time=105.012..105.016 rows=50 loops=1)
       -> Sort (cost=7665.06..7915.07 rows=100003 width=48) (actual time=105.012..105.014 rows=50 loops=1)
            Sort Key: firstname, id
            Sort Method: top-N heapsort Memory: 31kB
            -> Seq Scan on users (cost=0.00..4343.03 rows=100003 width=48) (actual time=0.009..21.510 rows=100003 loops=1)
     Planning time: 0.066 ms
     Execution time: 105.031 ms

     1b) select id, firstname from users order by firstname desc, id desc limit 50;

     The sort direction is changed: desc instead of asc. The execution plan for this query:

     Limit (cost=7665.06..7665.18 rows=50 width=48) (actual time=105.586..105.590 rows=50 loops=1)
       -> Sort (cost=7665.06..7915.07 rows=100003 width=48) (actual time=105.586..105.589 rows=50 loops=1)
            Sort Key: firstname DESC, id DESC
            Sort Method: top-N heapsort Memory: 31kB
            -> Seq Scan on users (cost=0.00..4343.03 rows=100003 width=48) (actual time=0.010..21.670 rows=100003 loops=1)
     Planning time: 0.068 ms
     Execution time: 105.606 ms

So far, so good. Sorting takes a similar amount of time in both directions.

  2. Now consider the second data set. I loaded ~100k rows into the table, where firstname is a string of the form JohnXXXXX and XXXXX is a sequence number, i.e. John00000, John00001, John00002, John00003, ..., John99998, John99999.

     2a) select id, firstname from users order by firstname asc, id asc limit 50;

     The execution plan for this query:

     Limit (cost=7665.06..7665.18 rows=50 width=43) (actual time=99.572..99.577 rows=50 loops=1)
       -> Sort (cost=7665.06..7915.07 rows=100003 width=43) (actual time=99.572..99.573 rows=50 loops=1)
            Sort Key: firstname, id
            Sort Method: top-N heapsort Memory: 29kB
            -> Seq Scan on users (cost=0.00..4343.03 rows=100003 width=43) (actual time=0.009..23.660 rows=100003 loops=1)
     Planning time: 0.064 ms
     Execution time: 99.592 ms

     2b) select id, firstname from users order by firstname desc, id desc limit 50;

     The sort direction is changed: desc instead of asc. The execution plan for this query:

     Limit (cost=7665.06..7665.18 rows=50 width=43) (actual time=659.786..659.791 rows=50 loops=1)
       -> Sort (cost=7665.06..7915.07 rows=100003 width=43) (actual time=659.785..659.786 rows=50 loops=1)
            Sort Key: firstname DESC, id DESC
            Sort Method: top-N heapsort Memory: 32kB
            -> Seq Scan on users (cost=0.00..4343.03 rows=100003 width=43) (actual time=0.010..21.510 rows=100003 loops=1)
     Planning time: 0.066 ms
     Execution time: 659.804 ms

For the second data set, the desc query (2b) is about 6.6 times slower than the asc query 2a (659.804 ms vs. 99.592 ms).
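For anyone who wants to reproduce this, the two data sets could be generated roughly as follows. This is only a sketch under assumptions: the question does not say how the rows were loaded, so the md5-based id, the placeholder lastname 'Doe', the exact row count, and the ascending insertion order are all hypothetical.

```sql
-- Data set 1: ~100k rows with a random 10-character firstname.
-- md5(...)::uuid::text is just an extension-free way to get a UUID-shaped
-- id; the question does not say how the ids were actually produced.
truncate users;
insert into users (id, firstname, lastname)
select md5(random()::text || i::text)::uuid::text,
       substr(md5(random()::text), 1, 10),
       'Doe'  -- placeholder value, not from the question
from generate_series(1, 100000) as i;
analyze users;

-- Data set 2: John00000 .. John99999, inserted in ascending order
-- (the insertion order is an assumption).
truncate users;
insert into users (id, firstname, lastname)
select md5(random()::text || i::text)::uuid::text,
       'John' || lpad(i::text, 5, '0'),
       'Doe'  -- placeholder value, not from the question
from generate_series(0, 99999) as i;
analyze users;
```

Run each half separately and execute queries a) and b) with explain (analyze) after each load.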

To summarize:

+------------------+------------+------------+
| Query \ Data set |     1      |     2      |
+------------------+------------+------------+
| a (asc)          | 105.031 ms | 99.592 ms  |
| b (desc)         | 105.606 ms | 659.804 ms |
+------------------+------------+------------+

Finally, my question: why is the desc query on the second data set 6-7 times slower than the other three cases?
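One way to narrow this down, offered as a diagnostic sketch rather than an answer: in all four plans the Seq Scan finishes in about 21-24 ms, so the extra time in 2b is spent in the Sort node. The check below tests one hypothesis that is not stated in the question, namely that locale-aware string comparison accounts for part of that cost. It re-runs 2b with byte-wise "C" collation on both sort keys so the timings can be compared.

```sql
-- Which collation does the database use for string comparisons?
show lc_collate;

-- Query 2b again, but with byte-wise "C" comparison on both sort keys.
explain (analyze)
select id, firstname
from users
order by firstname collate "C" desc, id collate "C" desc
limit 50;
```

If the execution time falls back to roughly the level of 2a, the comparison function is a likely contributor; if it does not, the cause lies elsewhere.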

sql database postgresql sorting postgresql-10
1 answer

0 votes

Did you rebuild the indexes after adding the extra 50k rows? Check for fragmentation.
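A minimal sketch of what that check could look like, assuming the pgstattuple contrib extension is available on the server (the question does not say whether it is installed):

```sql
-- Inspect leaf density and fragmentation of the firstname index.
create extension if not exists pgstattuple;
select avg_leaf_density, leaf_fragmentation
from pgstatindex('user_firstname');

-- Rebuild all indexes on the table, then refresh planner statistics.
reindex table users;
analyze users;
```

Note that the plans in the question show a Seq Scan followed by a top-N heapsort, so none of these indexes is read by the queries as posted.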
