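I'm running two Postgres 15 (PostgreSQL 15.7 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit) RDS instances in AWS, one for my "staging" environment and one for my "production" environment. We are running a query that takes far longer in "production" than in "staging". Production even has less data than staging (at least in the selected/joined tables). Also, production is not heavily used: we are in the early testing stage, so basically only one person uses it, in the evenings, for testing.
Here is the query: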
SELECT arenas.id,
arenas.display_name,
arenas.cover_image_path,
arenas.slug,
addresses.zip_code,
addresses.street,
addresses.number,
addresses.complement,
addresses.district,
addresses.latitude,
addresses.longitude,
cities.NAME AS city_name,
states.NAME AS state_name,
states.uf AS state_uf,
Array_to_json(Array_agg(DISTINCT ss2.sport_code)) AS available_sports,
Earth_distance(Ll_to_earth (addresses.latitude, addresses.longitude),
Ll_to_earth (-10.5555, -41.2751)) AS
meters_distance_between_user_and_arena
FROM "arenas"
INNER JOIN "addresses"
ON "arenas"."id" = "addresses"."addressable_id"
AND "addresses"."addressable_type" = 'App\Models\Arena'
AND "addresses"."type" = 'COMMERCIAL'
AND Earth_distance(Ll_to_earth (addresses.latitude,
addresses.longitude),
Ll_to_earth (-10.5555, -41.2751)) < 20000
INNER JOIN "services"
ON "services"."arena_id" = "arenas"."id"
AND "services"."status" = 'A'
AND "services"."deleted_at" IS NULL
AND "is_private" = false
INNER JOIN "service_sport"
ON "service_sport"."service_id" = "services"."id"
AND "service_sport"."sport_code" = 'BEACH_TENNIS'
INNER JOIN "service_prices"
ON "service_prices"."service_id" = "services"."id"
AND "service_prices"."is_default" = true
INNER JOIN "field_service"
ON "field_service"."service_id" = "services"."id"
INNER JOIN "fields"
ON "fields"."arena_id" = "arenas"."id"
AND "fields"."status" = 'A'
AND "fields"."deleted_at" IS NULL
INNER JOIN "contacts"
ON "contacts"."contactable_id" = "arenas"."id"
AND "contacts"."contactable_type" = 'App\Models\Arena'
AND "contacts"."is_main" = true
INNER JOIN "field_time_slots"
ON "field_time_slots"."arena_id" = "arenas"."id"
INNER JOIN "cities"
ON "cities"."ibge_code" = "addresses"."city_ibge_code"
INNER JOIN "states"
ON "states"."ibge_code" = "cities"."state_ibge_code"
INNER JOIN "sports"
ON "sports"."code" = "service_sport"."sport_code"
INNER JOIN "service_sport" AS "ss2"
ON "ss2"."arena_id" = "arenas"."id"
WHERE "approved_at" IS NOT NULL
AND EXISTS (SELECT *
FROM "subscriptions"
WHERE "arenas"."id" = "subscriptions"."arena_id"
AND "type" = 'access'
AND ( "ends_at" IS NULL
OR ( "ends_at" IS NOT NULL
AND "ends_at" > '2024-10-06 01:31:18' ) )
AND "stripe_status" != 'incomplete_expired'
AND "stripe_status" != 'unpaid'
AND "stripe_status" != 'past_due'
AND "stripe_status" != 'incomplete')
AND "business_hours_data" IS NOT NULL
AND "arenas"."deleted_at" IS NULL
GROUP BY "arenas"."id",
"arenas"."cover_image_path",
"addresses"."latitude",
"addresses"."longitude",
"addresses"."zip_code",
"addresses"."street",
"addresses"."number",
"addresses"."complement",
"addresses"."district",
"cities"."name",
"states"."name",
"states"."uf"
ORDER BY "meters_distance_between_user_and_arena" ASC;
Here is the EXPLAIN ANALYZE output from the production environment:
Sort (cost=55657.12..55795.57 rows=55380 width=315) (actual time=563.084..563.104 rows=1 loops=1)
Sort Key: (sec_to_gc(cube_distance((ll_to_earth((addresses.latitude)::double precision, (addresses.longitude)::double precision))::cube, '(3491544.0649759113, -4339378.172513269, -3108045.069568795)'::cube)))
Sort Method: quicksort Memory: 25kB
-> GroupAggregate (cost=12417.08..43152.98 rows=55380 width=315) (actual time=563.077..563.097 rows=1 loops=1)
Group Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
-> Sort (cost=12417.08..12555.53 rows=55380 width=286) (actual time=222.049..445.141 rows=102240 loops=1)
Sort Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
Sort Method: external merge Disk: 28144kB
-> Hash Join (cost=17.39..668.95 rows=55380 width=286) (actual time=0.709..15.847 rows=102240 loops=1)
Hash Cond: (arenas.id = field_time_slots.arena_id)
-> Hash Join (cost=9.60..37.48 rows=260 width=382) (actual time=0.604..1.425 rows=480 loops=1)
Hash Cond: (arenas.id = ss2.arena_id)
-> Nested Loop (cost=7.39..32.21 rows=52 width=339) (actual time=0.523..1.121 rows=96 loops=1)
-> Seq Scan on sports (cost=0.00..1.16 rows=1 width=9) (actual time=0.006..0.009 rows=1 loops=1)
Filter: (code = 'BEACH_TENNIS'::text)
Rows Removed by Filter: 12
-> Hash Join (cost=7.39..30.53 rows=52 width=350) (actual time=0.515..1.070 rows=96 loops=1)
Hash Cond: (cities.state_ibge_code = states.ibge_code)
-> Nested Loop (cost=5.78..28.77 rows=52 width=340) (actual time=0.488..0.953 rows=96 loops=1)
-> Nested Loop (cost=5.49..11.03 rows=52 width=332) (actual time=0.466..0.640 rows=96 loops=1)
Join Filter: (arenas.id = services.arena_id)
-> Nested Loop (cost=2.05..4.72 rows=4 width=305) (actual time=0.422..0.444 rows=4 loops=1)
Join Filter: (arenas.id = fields.arena_id)
-> Nested Loop (cost=2.05..3.62 rows=1 width=289) (actual time=0.412..0.417 rows=1 loops=1)
Join Filter: (arenas.id = addresses.addressable_id)
-> Merge Join (cost=2.05..2.08 rows=1 width=174) (actual time=0.023..0.027 rows=1 loops=1)
Merge Cond: (arenas.id = contacts.contactable_id)
-> Sort (cost=1.02..1.02 rows=1 width=158) (actual time=0.012..0.013 rows=1 loops=1)
Sort Key: arenas.id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on arenas (cost=0.00..1.01 rows=1 width=158) (actual time=0.006..0.006 rows=1 loops=1)
Filter: ((approved_at IS NOT NULL) AND (business_hours_data IS NOT NULL) AND (deleted_at IS NULL))
-> Sort (cost=1.03..1.04 rows=1 width=16) (actual time=0.008..0.009 rows=1 loops=1)
Sort Key: contacts.contactable_id
Sort Method: quicksort Memory: 25kB
-> Seq Scan on contacts (cost=0.00..1.02 rows=1 width=16) (actual time=0.006..0.006 rows=1 loops=1)
Filter: (is_main AND ((contactable_type)::text = 'App\Models\Arena'::text))
Rows Removed by Filter: 1
-> Seq Scan on addresses (cost=0.00..1.52 rows=1 width=115) (actual time=0.386..0.387 rows=1 loops=1)
Filter: (((addressable_type)::text = 'App\Models\Arena'::text) AND ((type)::text = 'COMMERCIAL'::text) AND (sec_to_gc(cube_distance((ll_to_earth((latitude)::double precision, (longitude)::double precision))::cube, '(3491544.0649759113, -4339378.172513269, -3108045.069568795)'::cube)) < '20000'::double precision))
-> Seq Scan on fields (cost=0.00..1.05 rows=4 width=16) (actual time=0.009..0.020 rows=4 loops=1)
Filter: ((deleted_at IS NULL) AND ((status)::text = 'A'::text))
-> Materialize (cost=3.43..5.57 rows=13 width=27) (actual time=0.011..0.035 rows=24 loops=4)
-> Hash Join (cost=3.43..5.50 rows=13 width=27) (actual time=0.041..0.092 rows=24 loops=1)
Hash Cond: (service_prices.service_id = service_sport.service_id)
-> Seq Scan on service_prices (cost=0.00..1.75 rows=36 width=8) (actual time=0.006..0.032 rows=36 loops=1)
Filter: is_default
Rows Removed by Filter: 39
-> Hash (cost=3.41..3.41 rows=2 width=51) (actual time=0.030..0.036 rows=3 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Hash Join (cost=2.20..3.41 rows=2 width=51) (actual time=0.025..0.034 rows=3 loops=1)
Hash Cond: (service_sport.service_id = services.id)
-> Hash Join (cost=1.07..2.27 rows=3 width=27) (actual time=0.014..0.019 rows=3 loops=1)
Hash Cond: (field_service.service_id = service_sport.service_id)
-> Seq Scan on field_service (cost=0.00..1.13 rows=13 width=8) (actual time=0.003..0.004 rows=13 loops=1)
-> Hash (cost=1.06..1.06 rows=1 width=19) (actual time=0.005..0.006 rows=1 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on service_sport (cost=0.00..1.06 rows=1 width=19) (actual time=0.003..0.004 rows=1 loops=1)
Filter: (sport_code = 'BEACH_TENNIS'::text)
Rows Removed by Filter: 4
-> Hash (cost=1.07..1.07 rows=4 width=24) (actual time=0.008..0.009 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on services (cost=0.00..1.07 rows=4 width=24) (actual time=0.004..0.006 rows=4 loops=1)
Filter: ((deleted_at IS NULL) AND (NOT is_private) AND ((status)::text = 'A'::text))
Rows Removed by Filter: 2
-> Memoize (cost=0.29..8.31 rows=1 width=24) (actual time=0.001..0.001 rows=1 loops=96)
Cache Key: addresses.city_ibge_code
Cache Mode: logical
Hits: 95 Misses: 1 Evictions: 0 Overflows: 0 Memory Usage: 1kB
-> Index Scan using cities_ibge_code_unique on cities (cost=0.28..8.30 rows=1 width=24) (actual time=0.016..0.016 rows=1 loops=1)
Index Cond: (ibge_code = addresses.city_ibge_code)
-> Hash (cost=1.27..1.27 rows=27 width=16) (actual time=0.020..0.020 rows=27 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 10kB
-> Seq Scan on states (cost=0.00..1.27 rows=27 width=16) (actual time=0.006..0.010 rows=27 loops=1)
-> Hash (cost=2.15..2.15 rows=5 width=43) (actual time=0.076..0.078 rows=5 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Nested Loop (cost=1.03..2.15 rows=5 width=43) (actual time=0.070..0.074 rows=5 loops=1)
Join Filter: (ss2.arena_id = subscriptions.arena_id)
-> HashAggregate (cost=1.03..1.04 rows=1 width=16) (actual time=0.060..0.061 rows=1 loops=1)
Group Key: subscriptions.arena_id
Batches: 1 Memory Usage: 24kB
-> Seq Scan on subscriptions (cost=0.00..1.02 rows=1 width=16) (actual time=0.008..0.009 rows=1 loops=1)
Filter: (((ends_at IS NULL) OR ((ends_at IS NOT NULL) AND (ends_at > '2024-10-06 01:31:18'::timestamp without time zone))) AND ((stripe_status)::text <> 'incomplete_expired'::text) AND ((stripe_status)::text <> 'unpaid'::text) AND ((stripe_status)::text <> 'past_due'::text) AND ((stripe_status)::text <> 'incomplete'::text) AND ((type)::text = 'access'::text))
-> Seq Scan on service_sport ss2 (cost=0.00..1.05 rows=5 width=27) (actual time=0.006..0.007 rows=5 loops=1)
-> Hash (cost=5.13..5.13 rows=213 width=16) (actual time=0.098..0.099 rows=213 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 18kB
-> Seq Scan on field_time_slots (cost=0.00..5.13 rows=213 width=16) (actual time=0.023..0.055 rows=213 loops=1)
Planning Time: 8.780 ms
Execution Time: 568.033 ms
Here is the EXPLAIN ANALYZE output from the staging environment:
Sort (cost=102.30..102.31 rows=2 width=405) (actual time=85.416..85.430 rows=1 loops=1)
Sort Key: (sec_to_gc(cube_distance((ll_to_earth((addresses.latitude)::double precision, (addresses.longitude)::double precision))::cube, '(3491544.0649759113, -4339378.172513269, -3108045.069568795)'::cube)))
Sort Method: quicksort Memory: 25kB
-> GroupAggregate (cost=101.18..102.29 rows=2 width=405) (actual time=85.406..85.420 rows=1 loops=1)
Group Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
-> Sort (cost=101.18..101.19 rows=2 width=397) (actual time=65.212..66.575 rows=10800 loops=1)
Sort Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
Sort Method: quicksort Memory: 3448kB
-> Nested Loop (cost=72.78..101.17 rows=2 width=397) (actual time=34.249..43.485 rows=10800 loops=1)
-> Index Only Scan using sports_pkey on sports (cost=0.15..8.17 rows=1 width=32) (actual time=0.019..0.024 rows=1 loops=1)
Index Cond: (code = 'BEACH_TENNIS'::text)
Heap Fetches: 1
-> Hash Join (cost=72.63..92.98 rows=2 width=429) (actual time=34.228..41.163 rows=10800 loops=1)
Hash Cond: (ss2.arena_id = arenas.id)
-> Seq Scan on service_sport ss2 (cost=0.00..17.50 rows=750 width=48) (actual time=0.004..0.011 rows=5 loops=1)
-> Hash (cost=72.62..72.62 rows=1 width=493) (actual time=34.210..34.221 rows=2700 loops=1)
Buckets: 4096 (originally 1024) Batches: 1 (originally 1) Memory Usage: 1053kB
-> Nested Loop (cost=50.45..72.62 rows=1 width=493) (actual time=0.641..31.447 rows=2700 loops=1)
-> Nested Loop (cost=50.30..72.44 rows=1 width=452) (actual time=0.629..23.566 rows=2700 loops=1)
-> Nested Loop Semi Join (cost=50.01..64.14 rows=1 width=468) (actual time=0.617..7.491 rows=2700 loops=1)
Join Filter: (arenas.id = subscriptions.arena_id)
-> Hash Join (cost=49.88..61.77 rows=8 width=452) (actual time=0.605..2.421 rows=2700 loops=1)
Hash Cond: (field_time_slots.arena_id = arenas.id)
-> Seq Scan on field_time_slots (cost=0.00..10.23 rows=423 width=16) (actual time=0.004..0.047 rows=423 loops=1)
-> Hash (cost=49.86..49.86 rows=1 width=436) (actual time=0.589..0.596 rows=18 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 14kB
-> Nested Loop (cost=9.47..49.86 rows=1 width=436) (actual time=0.317..0.582 rows=18 loops=1)
Join Filter: (arenas.id = contacts.contactable_id)
Rows Removed by Join Filter: 18
-> Nested Loop (cost=9.47..48.81 rows=1 width=420) (actual time=0.312..0.543 rows=18 loops=1)
Join Filter: (services.id = service_sport.service_id)
-> Nested Loop (cost=9.32..48.58 rows=1 width=412) (actual time=0.299..0.474 rows=58 loops=1)
Join Filter: (services.id = field_service.service_id)
-> Nested Loop (cost=0.57..34.38 rows=1 width=404) (actual time=0.289..0.377 rows=34 loops=1)
Join Filter: (arenas.id = fields.arena_id)
Rows Removed by Join Filter: 30
-> Nested Loop (cost=0.42..26.21 rows=1 width=388) (actual time=0.279..0.335 rows=16 loops=1)
Join Filter: (services.id = service_prices.service_id)
Rows Removed by Join Filter: 76
-> Nested Loop (cost=0.42..25.01 rows=1 width=380) (actual time=0.272..0.305 rows=4 loops=1)
Join Filter: (addresses.addressable_id = arenas.id)
-> Nested Loop (cost=0.28..16.84 rows=1 width=268) (actual time=0.262..0.288 rows=4 loops=1)
Join Filter: (addresses.addressable_id = services.arena_id)
Rows Removed by Join Filter: 4
-> Index Scan using addresses_addressable_type_addressable_id_index on addresses (cost=0.14..8.67 rows=1 width=244) (actual time=0.254..0.272 rows=2 loops=1)
Index Cond: ((addressable_type)::text = 'App\Models\Arena'::text)
Filter: (((type)::text = 'COMMERCIAL'::text) AND (sec_to_gc(cube_distance((ll_to_earth((latitude)::double precision, (longitude)::double precision))::cube, '(3491544.0649759113, -4339378.172513269, -3108045.069568795)'::cube)) < '20000'::double precision))
-> Index Scan using services_arena_id_name_deleted_at_unique on services (cost=0.14..8.16 rows=1 width=24) (actual time=0.004..0.006 rows=4 loops=2)
Filter: ((NOT is_private) AND ((status)::text = 'A'::text))
Rows Removed by Filter: 1
-> Index Scan using arenas_pkey on arenas (cost=0.14..8.16 rows=1 width=112) (actual time=0.003..0.003 rows=1 loops=4)
Index Cond: (id = services.arena_id)
Filter: ((approved_at IS NOT NULL) AND (business_hours_data IS NOT NULL) AND (deleted_at IS NULL))
-> Seq Scan on service_prices (cost=0.00..1.12 rows=6 width=8) (actual time=0.002..0.004 rows=23 loops=4)
Filter: is_default
-> Index Scan using fields_arena_id_name_deleted_at_unique on fields (cost=0.14..8.16 rows=1 width=16) (actual time=0.001..0.002 rows=4 loops=16)
Filter: ((status)::text = 'A'::text)
-> Bitmap Heap Scan on field_service (cost=8.76..14.14 rows=5 width=8) (actual time=0.001..0.001 rows=2 loops=34)
Recheck Cond: (service_id = service_prices.service_id)
Heap Blocks: exact=34
-> Bitmap Index Scan on field_service_arena_id_service_id_field_id_unique (cost=0.00..8.76 rows=5 width=0) (actual time=0.001..0.001 rows=2 loops=34)
Index Cond: (service_id = service_prices.service_id)
-> Index Only Scan using service_sport_service_id_sport_code_unique on service_sport (cost=0.15..0.22 rows=1 width=40) (actual time=0.001..0.001 rows=0 loops=58)
Index Cond: ((service_id = field_service.service_id) AND (sport_code = 'BEACH_TENNIS'::text))
Heap Fetches: 18
-> Seq Scan on contacts (cost=0.00..1.04 rows=1 width=16) (actual time=0.001..0.001 rows=2 loops=18)
Filter: (is_main AND ((contactable_type)::text = 'App\Models\Arena'::text))
Rows Removed by Filter: 1
-> Index Scan using subscriptions_arena_id_stripe_status_index on subscriptions (cost=0.14..0.28 rows=1 width=16) (actual time=0.001..0.001 rows=1 loops=2700)
Index Cond: (arena_id = field_time_slots.arena_id)
Filter: (((ends_at IS NULL) OR ((ends_at IS NOT NULL) AND (ends_at > '2024-10-06 01:31:18'::timestamp without time zone))) AND ((stripe_status)::text <> 'incomplete_expired'::text) AND ((stripe_status)::text <> 'unpaid'::text) AND ((stripe_status)::text <> 'past_due'::text) AND ((stripe_status)::text <> 'incomplete'::text) AND ((type)::text = 'access'::text))
-> Index Scan using cities_ibge_code_unique on cities (cost=0.28..8.30 rows=1 width=24) (actual time=0.005..0.005 rows=1 loops=2700)
Index Cond: (ibge_code = addresses.city_ibge_code)
-> Index Scan using states_pkey on states (cost=0.15..0.18 rows=1 width=56) (actual time=0.002..0.002 rows=1 loops=2700)
Index Cond: (ibge_code = cities.state_ibge_code)
Planning Time: 6.426 ms
Execution Time: 85.641 ms
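Any idea why this is happening? I haven't paid attention to performance, because we are at an early stage and I assumed it wouldn't matter yet with so little data. We already tried upgrading the instance from db.t3.micro to db.t3.small, with no change. We also tried restoring it in another availability zone, but nothing changed. I tried restoring a production dump locally and running the query there: its cost is 6000, which is still far below 50000. When I run the query in my local development environment, its cost is likewise 100.
The main difference is here:
Production: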
-> Sort (cost=12417.08..12555.53 rows=55380 width=286) (actual time=222.049..445.141 rows=102240 loops=1)
Sort Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
Sort Method: external merge Disk: 28144kB
Staging:
-> Sort (cost=101.18..101.19 rows=2 width=397) (actual time=65.212..66.575 rows=10800 loops=1)
Sort Key: arenas.id, addresses.latitude, addresses.longitude, addresses.zip_code, addresses.street, addresses.number, addresses.complement, addresses.district, cities.name, states.name, states.uf
Sort Method: quicksort Memory: 3448kB
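The sort has to process ten times as many rows on the production system (102240 versus 10800). On top of that, sorting 10800 rows on the staging system can be done in memory, while the production system has to write and read temporary files, because 100000 rows don't fit in work_mem.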
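As a rough cross-check, the row counts in the production plan multiply out to the number of rows that reach the sort. The figures below are read straight from the plan. Treating rows times the estimated width (286 bytes) as an approximation of the sort's size is my own assumption, since the on-disk format differs from the in-memory one.

```sql
-- Row counts taken from the production EXPLAIN ANALYZE output.
SELECT 96 * 5                        AS rows_after_ss2_join,          -- plan reports rows=480
       96 * 5 * 213                  AS rows_after_field_time_slots,  -- plan reports rows=102240
       round(102240 * 286 / 1024.0)  AS approx_sort_kb;               -- compare with "Disk: 28144kB"
```

The estimate lands in the same range as the 28144kB the sort spilled to disk, whereas the staging sort needed only 3448kB and stayed in memory.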
You can improve performance on the production system by increasing work_mem (but be careful not to run out of memory).
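A minimal sketch of how that could be tried for a single session before changing anything globally. The value 64MB is an assumption chosen to sit above the roughly 28 MB that spilled to disk (an in-memory sort usually needs more memory than its on-disk size), not a recommendation.

```sql
-- Show the current limit that applies to each sort or hash operation.
SHOW work_mem;

-- Raise it for this session only (64MB is an assumed value, see above).
SET work_mem = '64MB';

-- Now re-run the query under EXPLAIN (ANALYZE, BUFFERS) and check whether the
-- Sort node reports "Sort Method: quicksort" instead of "external merge".

-- Return to the configured default afterwards.
RESET work_mem;
```

Because work_mem applies per sort or hash operation and per connection, a session-level or per-query setting is the more cautious choice on a small instance such as db.t3.micro.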
From PostgreSQL v16 on, the fast plan might become even faster, because commit b592422095 added the ability to use an "incremental sort" for GROUP BY.