Transaction runs 20x slower on the production server

Question

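A test transaction (a series of updates, etc.) runs in about 2 minutes on my development server. On the production server it takes about 25 minutes.

The server reads a file and inserts records. It starts out fast but gets slower and slower as it progresses. Each inserted record triggers an update of an aggregate table, and it is that update that progressively slows down. The aggregate update queries the table that the inserts are writing to.

The configuration differs only in max_worker_processes (dev 8, prod 16), shared_buffers (dev 128MB, prod 512MB), and wal_buffers (dev 4MB, prod 16MB).

I have tried adjusting some of the configuration, and I also dumped the whole database and re-ran initdb in case it had not been upgraded (to 9.6) correctly. Nothing worked.

I am hoping someone with experience can tell me what to look for.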
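One way to double-check that claim is to list every setting that is not at its built-in default on each server and diff the two outputs. This is only a sketch: it relies on the standard pg_settings view, and it is assumed to be run with the same role on both machines.

```sql
-- Run on both dev and prod, then diff the two result sets.
-- pg_settings is a built-in view; "source" shows where each value comes from.
SELECT name, setting, unit, source
FROM   pg_settings
WHERE  source <> 'default'
ORDER  BY name;
```

Settings that only appear in one output, or whose values differ, are the candidates to look at beyond the three listed above.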

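Edit: After receiving some comments I was able to work out what is happening and have made a start on it, but I think there must be a better way. What happens first is this:

Initially the tables hold no data matching the relevant index conditions, and PostgreSQL works out the plan below. Note that the tables do contain data, just none that relates to the relevant "businessIdentifier" or "transactionNumber".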

 Aggregate  (cost=16.63..16.64 rows=1 width=4) (actual time=0.031..0.031 rows=1 loops=1)
   ->  Nested Loop  (cost=0.57..16.63 rows=1 width=4) (actual time=0.028..0.028 rows=0 loops=1)
         ->  Index Scan using transactionlinedateindex on "transactionLine" ed  (cost=0.29..8.31 rows=1 width=5) (actual time=0.028..0.028 rows=0 loops=1)
               Index Cond: ((("businessIdentifier")::text = '36'::text) AND ("reconciliationNumber" = 4519))
         ->  Index Scan using transaction_pkey on transaction eh  (cost=0.29..8.31 rows=1 width=9) (never executed)
               Index Cond: ((("businessIdentifier")::text = '36'::text) AND (("transactionNumber")::text = (ed."transactionNumber")::text))
               Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
 Planning time: 0.915 ms
 Execution time: 0.100 ms

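Then, as data is inserted, it turns into a very bad plan: 474 ms in this example. The query has to run thousands of times, depending on what is uploaded, so 474 ms per execution is too slow.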

 Aggregate  (cost=16.44..16.45 rows=1 width=4) (actual time=474.222..474.222 rows=1 loops=1)
   ->  Nested Loop  (cost=0.57..16.44 rows=1 width=4) (actual time=474.218..474.218 rows=0 loops=1)
         Join Filter: ((eh."transactionNumber")::text = (ed."transactionNumber")::text)
         ->  Index Scan using transaction_pkey on transaction eh  (cost=0.29..8.11 rows=1 width=9) (actual time=0.023..0.408 rows=507 loops=1)
               Index Cond: (("businessIdentifier")::text = '37'::text)
               Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
         ->  Index Scan using transactionlineprovdateindex on "transactionLine" ed  (cost=0.29..8.31 rows=1 width=5) (actual time=0.934..0.934 rows=0 loops=507)
               Index Cond: (("businessIdentifier")::text = '37'::text)
               Filter: ("reconciliationNumber" = 4519)
               Rows Removed by Filter: 2520
 Planning time: 0.848 ms
 Execution time: 474.278 ms

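VACUUM ANALYZE fixes it. However, you cannot run VACUUM ANALYZE until the transaction has been committed. After VACUUM ANALYZE, PostgreSQL uses a different plan and the query drops back to about 0.1 ms.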

 Aggregate  (cost=16.63..16.64 rows=1 width=4) (actual time=0.072..0.072 rows=1 loops=1)
   ->  Nested Loop  (cost=0.57..16.63 rows=1 width=4) (actual time=0.069..0.069 rows=0 loops=1)
         ->  Index Scan using transactionlinedateindex on "transactionLine" ed  (cost=0.29..8.31 rows=1 width=5) (actual time=0.067..0.067 rows=0 loops=1)
               Index Cond: ((("businessIdentifier")::text = '37'::text) AND ("reconciliationNumber" = 4519))
         ->  Index Scan using transaction_pkey on transaction eh  (cost=0.29..8.31 rows=1 width=9) (never executed)
               Index Cond: ((("businessIdentifier")::text = '37'::text) AND (("transactionNumber")::text = (ed."transactionNumber")::text))
               Filter: ("transactionStatus" = 'posted'::"transactionStatusItemType")
 Planning time: 1.134 ms
 Execution time: 0.141 ms

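My workaround is to commit after about 100 inserts, run VACUUM ANALYZE, and then continue. The only problem is that if something in the rest of the data fails and is rolled back, those 100 records remain inserted.

Is there a better way to handle this? Should I upgrade to version 10 or 11 of PostgreSQL, and would that help?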

Answer 1

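"Each inserted record triggers an update of an aggregate table, and it is that update that progressively slows down."

Here is an idea: change the workflow to (1) import the external data into a table using the COPY interface, (2) index and ANALYZE the data, and (3) run one final UPDATE with all the required joins/grouping to do the actual transformation and update the aggregate table.

All of this can be done in one long transaction, if need be.

Only if the whole thing locks some important database objects for too long should you consider splitting it into separate transactions/batches (partitioning the data in some generic way, for example by date/time or by ID).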
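The three steps above could look roughly like the following psql script. Treat it as a sketch under assumptions, not the asker's actual schema: the tables staging_line and reconciliation_totals, the columns amount, line_count and total, and the two sample rows are all hypothetical. Only "businessIdentifier", "transactionNumber" and "reconciliationNumber" come from the plans in the question.

```sql
BEGIN;

-- Hypothetical stand-in for the real aggregate table, seeded with one row.
CREATE TEMP TABLE reconciliation_totals (
    "businessIdentifier"   text,
    "reconciliationNumber" integer,
    line_count             bigint  NOT NULL DEFAULT 0,
    total                  numeric NOT NULL DEFAULT 0
) ON COMMIT DROP;
INSERT INTO reconciliation_totals VALUES ('37', 4519, 0, 0);

-- (1) Bulk-load the raw rows into a staging table with COPY.
CREATE TEMP TABLE staging_line (
    "businessIdentifier"   text,
    "transactionNumber"    text,
    "reconciliationNumber" integer,
    amount                 numeric
) ON COMMIT DROP;

-- Made-up sample rows; in psql the data block ends with "\." on its own line.
COPY staging_line FROM STDIN WITH (FORMAT csv);
37,T-0001,4519,10.00
37,T-0002,4519,2.50
\.

-- (2) Index the loaded data and refresh planner statistics.
CREATE INDEX ON staging_line ("businessIdentifier", "reconciliationNumber");
ANALYZE staging_line;

-- (3) One set-based UPDATE instead of one aggregate update per inserted row.
UPDATE reconciliation_totals AS t
SET    line_count = t.line_count + s.n,
       total      = t.total + s.total
FROM  (SELECT "businessIdentifier", "reconciliationNumber",
              count(*) AS n, sum(amount) AS total
       FROM   staging_line
       GROUP  BY "businessIdentifier", "reconciliationNumber") AS s
WHERE  t."businessIdentifier"   = s."businessIdentifier"
AND    t."reconciliationNumber" = s."reconciliationNumber";

COMMIT;
```

Because everything sits between BEGIN and COMMIT, a failure at any step rolls back the whole load, which avoids the partially committed batch described in the question.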

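"However, you cannot run VACUUM ANALYZE until the transaction has been committed."

To get updated costs for the query plan, you only need ANALYZE, not VACUUM. Unlike VACUUM, ANALYZE can be run inside a transaction block.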
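A minimal sketch of that, assuming the whole load runs in a single transaction and using the two table names that appear in the plans above. The placement of the ANALYZE calls (after a first batch of inserts) is an assumption borrowed from the asker's "about 100 inserts" workaround.

```sql
BEGIN;

-- ... first batch of INSERTs into "transactionLine" and "transaction" ...

-- Refresh planner statistics mid-transaction. ANALYZE is allowed inside a
-- transaction block; VACUUM would raise an error here.
ANALYZE "transactionLine";
ANALYZE "transaction";

-- ... remaining INSERTs and aggregate updates, re-running ANALYZE as needed ...

COMMIT;  -- or ROLLBACK to discard everything, including the first batch
```

This keeps the all-or-nothing behaviour that the intermediate commit in the workaround gives up.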
