Optimizing a MariaDB query over 4+ million rows

Problem description

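I'm developing a chat application that runs on MariaDB. I'm trying to write a query that fetches all recent conversations, that is, the last message of each conversation and the user it came from, similar to what iMessage/Messenger/WhatsApp show when you open them.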

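The chatmessages table contains 4M+ rows. I tried ROW_NUMBER() OVER (PARTITION BY ...) to get the desired result, but performance was very poor (1.5 s). I then tried another approach based on grouping, but after spending days optimizing it I ended up with an even worse time 🤡 (1.7 s).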

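What am I doing wrong, and how can I optimize this? I'll share the two variants I created. Ideally, I'd like the query to execute in under 0.2 s. The main tables are chatmessages, which stores the messages; sessions, which each chat message references by session ID; and visitors, which represents the visitors. Each session is linked to one visitor.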

First attempt - using OVER (PARTITION BY ...)

There's a lot going on here, but the PARTITION BY subquery alone takes 1 s to execute.

DELIMITER $$

DROP PROCEDURE IF EXISTS `GetRecentConversations`$$

CREATE DEFINER=`root`@`localhost` PROCEDURE `GetRecentConversations`(
    `limit` INT,
    `offset` INT,
    `agent` CHAR(95),
    `property` VARCHAR(100),
    `closed` TINYINT
    )
BEGIN
    SET `limit` = IFNULL(`limit`, 30);
    SET `offset` = IFNULL(`offset`, 0);
    SET `agent` = IFNULL(`agent`, '');
    SET `property` = IFNULL(`property`, 0);
    SET `closed` = IFNULL(`closed`, 0);
    SELECT * 
    FROM
    (
        SELECT A.VisitorId, V.Name AS 'VisitorName', V.Picture AS 'ProfilePic', GROUP_CONCAT(DISTINCT CT.`TagsId`) AS 'TagsId' , 
            V.IsBlocked, A.MessageId, A.Message, A.CreatedAt AS 'MessageCreatedAt' , A.Attachment, A.ArticleId, A.SessionId, 
            V.PropertyId, P.Name AS 'PropertyName', AG.Agents, IF(AG.Agents IS NULL, 0,1) AS 'IsAttended',
            MAX( CASE WHEN A.FromAgentId IS NULL THEN A.CreatedAt END) AS 'LastSeen', A.rn
        FROM (
          SELECT S.Id AS 'SessionId', S.`VisitorId`, C.Message, C.Attachment, C.ArticleId, C.Id AS 'MessageId', C.CreatedAt, S.EndedOn, C.FromAgentId,
                ROW_NUMBER() OVER (PARTITION BY S.`VisitorId` ORDER BY  C.`CreatedAt` DESC) AS rn
          FROM Sessions AS S
          LEFT JOIN ChatMessages C ON S.Id = C.SessionId
          WHERE S.IsDeleted = 0
        ) AS A
        LEFT JOIN Visitors V ON A.VisitorId = V.Id
        LEFT JOIN Properties P ON V.PropertyId = P.Id
        LEFT JOIN `contacttagvisitor` CT ON V.Id = CT.VisitorsId
        LEFT JOIN (
                SELECT SessionId, GROUP_CONCAT(AgentId) AS 'Agents'
                FROM sessionagents SA 
                LEFT JOIN aspnetusers U ON SA.AgentId = U.Id
                WHERE U.IsBot = 0
                GROUP BY SessionId
            ) AS AG ON A.SessionId =  AG.SessionId
        WHERE     ( (`agent`    = '') OR ( FIND_IN_SET( `agent`, AG.Agents ) > 0  )  ) AND 
              ( (`property` = 0 ) OR ( FIND_IN_SET( CAST(V.PropertyId AS CHAR),`property`) > 0)  ) AND 
              ( ( (`closed` = 0) AND (A.EndedOn IS NULL) ) OR 
                ( (`closed` = 1) AND (A.EndedOn IS NOT NULL) )
                )
        GROUP BY A.VisitorId
        HAVING (rn = 1) /* get only the last message in the session, done after grouping (HAVING) because we need the MAX LastSeen to work correctly */
        
    ) AS F
    ORDER BY MessageCreatedAt DESC LIMIT `limit` OFFSET `offset`;
END$$

DELIMITER ;

Second attempt - trying to avoid window functions


DELIMITER $$

CREATE DEFINER=`root`@`localhost` PROCEDURE `GetRecentConversationsOptimized`(
    `limit` INT,
    `offset` INT,
    `agent` CHAR(95),
    `property` VARCHAR(100),
    `closed` TINYINT
    )
BEGIN

    SET `limit` = IFNULL(`limit`, 30);
    SET `offset` = IFNULL(`offset`, 0);
    SET `agent` = IFNULL(`agent`, '');
    SET `property` = IFNULL(`property`, 0);
    SET `closed` = IFNULL(`closed`, 0);

    WITH agents AS (
        SELECT SessionId AS 'AgentsSessionId', GROUP_CONCAT(AgentId) AS 'Agents'
        FROM sessionagents SA 
        LEFT JOIN aspnetusers U ON SA.AgentId = U.Id
        WHERE U.IsBot = 0
        GROUP BY SessionId
    ),
    wtags AS (
        SELECT vs.Id AS 'VId', GROUP_CONCAT(DISTINCT CT.`TagsId`) AS 'TagsId' 
        FROM visitors vs
        INNER JOIN `contacttagvisitor` CT ON CT.VisitorsId = vs.Id
        GROUP BY vs.Id  
    ),
    ms AS (
        SELECT s.VisitorId, s.`Id` AS 'SessionId',  m.Message, m.`Attachment`, m.`ArticleId`, m.`Id` AS 'MessageId', m.`CreatedAt` AS 'MessageCreatedAt', s.`EndedOn`, m.`FromAgentId`,
            V.Name, V.Picture, V.IsBlocked, 
            P.Name AS 'PropertyName', P.Id AS 'PropertyId'
        FROM chatmessages m
        LEFT JOIN sessions S ON s.Id = m.SessionId
        LEFT JOIN visitors V ON S.VisitorId = V.Id
        LEFT JOIN properties P ON V.`PropertyId` = P.Id

        /*LEFT JOIN (
                SELECT SessionId, GROUP_CONCAT(AgentId) AS 'Agents'
                FROM sessionagents SA 
                LEFT JOIN aspnetusers U ON SA.AgentId = U.Id
                WHERE U.IsBot = 0
                GROUP BY SessionId
            ) AS AG ON S.Id =  AG.SessionId*/
        WHERE     /*( (@agent    = '') OR ( FIND_IN_SET( @agent, AG.Agents ) > 0  )  ) AND */
              ( (`property` = 0 ) OR ( FIND_IN_SET( CAST(V.PropertyId AS CHAR),`property`) > 0)  ) AND 
              ( ( (`closed` = 0) AND (S.EndedOn IS NULL) ) OR 
                ( (`closed` = 1) AND (S.EndedOn IS NOT NULL) )
                )   
        
    ) 
    
    SELECT ms.VisitorId, ms.Name AS 'VisitorName' , ms.Picture AS 'ProfilePic', wt.TagsId, 
        ms.IsBlocked, ms.MessageId, ms.Message, ms.MessageCreatedAt, ms.Attachment, ms.ArticleId, ms.SessionId, 
        ms.PropertyId, ms.PropertyName,  ag.Agents, IF(ag.Agents IS NULL, 0,1) AS 'IsAttended', LS.LastSeen
    FROM ms 
    LEFT JOIN wtags wt ON ms.VisitorId = wt.VId
    LEFT JOIN agents ag ON ms.SessionId = ag.AgentsSessionId
    LEFT JOIN ( 
        SELECT VisitorId AS 'VLSId', MAX( CASE WHEN ms.FromAgentId IS NULL THEN ms.MessageCreatedAt END) AS 'LastSeen'
        FROM ms
        GROUP BY VisitorId
    ) LS ON ms.VisitorId = LS.VLSId
    WHERE (VisitorId,MessageCreatedAt) IN 
        ( SELECT VisitorId, MAX(MessageCreatedAt)
          FROM ms
          GROUP BY VisitorId
        )
    ORDER BY MessageCreatedAt DESC LIMIT `limit` OFFSET `offset`;

    
END$$

DELIMITER ;

Relevant table definitions for reference

  1. Chat messages
CREATE TABLE `chatmessages` (
  `Id` char(36) NOT NULL,
  `SessionId` char(36) NOT NULL DEFAULT '00000000-0000-0000-0000-000000000000',
  `Message` longtext DEFAULT NULL,
  `FromAgentId` varchar(95) DEFAULT NULL,
  `isDeleted` tinyint(1) NOT NULL,
  `CreatedAt` datetime(6) NOT NULL,
  `CreatedBy` varchar(36) NOT NULL,
  `LastUpdatedAt` datetime(6) NOT NULL,
  `LastUpdatedBy` varchar(36) NOT NULL,
  `isRead` tinyint(1) NOT NULL DEFAULT 0,
  `Attachment` char(36) DEFAULT NULL,
  `ArticleId` int(11) DEFAULT NULL,
  `isPrivate` tinyint(1) NOT NULL DEFAULT 0,
  PRIMARY KEY (`Id`),
  KEY `IX_ChatMessages_FromAgentId` (`FromAgentId`),
  KEY `IX_ChatMessages_SessionId` (`SessionId`),
  KEY `IX_ChatMessages_ArticleId` (`ArticleId`),
  KEY `IX_ChatMessages_CreatedAt` (`CreatedAt`),
  CONSTRAINT `FK_ChatMessages_AspNetUsers_FromAgentId` FOREIGN KEY (`FromAgentId`) REFERENCES `aspnetusers` (`Id`),
  CONSTRAINT `FK_ChatMessages_KbArticles_ArticleId` FOREIGN KEY (`ArticleId`) REFERENCES `kbarticles` (`Id`),
  CONSTRAINT `FK_ChatMessages_Sessions_SessionId` FOREIGN KEY (`SessionId`) REFERENCES `sessions` (`Id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
  2. Sessions
CREATE TABLE `sessions` (
  `Id` char(36) NOT NULL,
  `VisitorId` char(36) NOT NULL DEFAULT '00000000-0000-0000-0000-000000000000',
  `PropertyId` bigint(20) NOT NULL DEFAULT 0,
  `EndedOn` datetime(6) DEFAULT NULL,
  `isDeleted` tinyint(1) NOT NULL,
  `CreatedAt` datetime(6) NOT NULL,
  `CreatedBy` varchar(36) NOT NULL,
  `LastUpdatedAt` datetime(6) NOT NULL,
  `LastUpdatedBy` varchar(36) NOT NULL,
  `FacebookPageId` int(11) DEFAULT NULL,
  `isOfflineHours` tinyint(1) NOT NULL DEFAULT 0,
  `isHandedOver` tinyint(1) NOT NULL DEFAULT 0,
  PRIMARY KEY (`Id`),
  KEY `IX_Sessions_PropertyId` (`PropertyId`),
  KEY `IX_Sessions_VisitorId` (`VisitorId`),
  KEY `IX_Sessions_FacebookPageId` (`FacebookPageId`),
  CONSTRAINT `FK_Sessions_FacebookPages_FacebookPageId` FOREIGN KEY (`FacebookPageId`) REFERENCES `facebookpages` (`Id`),
  CONSTRAINT `FK_Sessions_Properties_PropertyId` FOREIGN KEY (`PropertyId`) REFERENCES `properties` (`Id`) ON DELETE CASCADE,
  CONSTRAINT `FK_Sessions_Visitors_VisitorId` FOREIGN KEY (`VisitorId`) REFERENCES `visitors` (`Id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci
  3. Visitors
CREATE TABLE `visitors` (
  `Id` char(36) NOT NULL,
  `Email` varchar(255) DEFAULT NULL,
  `Name` varchar(255) DEFAULT NULL,
  `Phone` varchar(255) DEFAULT NULL,
  `IP` varchar(45) DEFAULT NULL,
  `CountryCode` varchar(2) DEFAULT NULL,
  `Location` varchar(100) DEFAULT NULL,
  `Platform` varchar(100) DEFAULT NULL,
  `PropertyId` bigint(20) NOT NULL DEFAULT 0,
  `LastSeen` datetime(6) NOT NULL,
  `isDeleted` tinyint(1) NOT NULL,
  `CreatedAt` datetime(6) NOT NULL,
  `CreatedBy` varchar(36) NOT NULL,
  `LastUpdatedAt` datetime(6) NOT NULL,
  `LastUpdatedBy` varchar(36) NOT NULL,
  `isBlocked` tinyint(1) NOT NULL DEFAULT 0,
  `Picture` char(36) DEFAULT NULL,
  `ExternalId` varchar(128) DEFAULT NULL,
  PRIMARY KEY (`Id`),
  KEY `IX_Visitors_PropertyId` (`PropertyId`),
  CONSTRAINT `FK_Visitors_Properties_PropertyId` FOREIGN KEY (`PropertyId`) REFERENCES `properties` (`Id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci

Expected output

VisitorId                             VisitorName   ProfilePic  IsBlocked  MessageId                             MessageCreatedAt            Attachment  ArticleId  SessionId                             PropertyId  IsAttended  LastSeen                    
83788605-2db3-49d5-bff7-472ad944ab21  extreme                   0          2318b732-88d0-47af-a388-3540743e49bd  2024-12-08 17:43:32.487863                         845df654-7268-484f-b6c0-63b4bcd827c0  1           1           2024-12-08 17:43:14.654195  
cbd82eba-9d5b-438f-a26b-0cfb37e24ca2  kai                       0          df9af340-26e6-4b8b-a009-199d43c631d6  2024-12-08 16:41:19.302200                         b5c0931d-f22e-4a62-bb93-d59f87d244c8  1           1           2024-12-08 16:41:19.302200  
5adcfa73-6223-444f-ab7e-80890b9e0367  moncoachdata              0          023a6de9-14be-4aeb-91c5-0ebc859a4da6  2024-12-08 15:15:54.533801                         554c322e-7b23-4238-95c2-41bd524b604f  1           1           2024-12-08 15:15:09.801974  
0100c4c3-74bb-4950-832f-1016471fdf8e                            0          458894d4-655d-46db-8bfb-b4734572e542  2024-12-08 14:48:37.568726                         b74496a2-ac0a-4bae-9923-bc049bfb3ad2  1           1           2024-12-08 13:28:12.729363  
c19cc8e5-b66f-423a-b3d0-39c9e3289980  support                   0          b6dc6a4b-e47b-45c2-87db-67152c803130  2024-12-08 12:33:59.969425              290        c66d114d-028a-4c9f-a66f-638c3b03d58d  1           1           2024-12-08 12:33:22.281014  
6baf0caa-d5dd-4620-ab3a-53fac178d2e7                            0          8f80518d-845f-4f82-a0e6-f5c39b8a9682  2024-12-08 08:59:44.332164                         6ba3ad53-e7dc-4272-a1c0-54f72e787ce8  1           1           2024-12-08 08:59:29.475363  
2b32a463-7498-4c7b-91d2-e4f5da55dbfc  sara                      0          895dc0da-b714-4e4f-bd24-945bd306b291  2024-12-08 08:46:07.301215                         e47a2432-cc94-4c8b-8b75-4c05b55ae114  1           1           2024-12-08 08:45:09.945227  
7dd08158-e88c-4596-8bf0-cbb04fe66131                            0          5aaaddad-85d6-41d0-b31f-fbeb6960c880  2024-12-08 08:01:25.456433                         f96abf69-cca1-4fc7-885f-079b8604d7ae  1           0           2024-12-08 08:01:25.456433  
c1ebbafe-a3fb-4f30-b431-7c956b917cb5                            0          5c75b60d-66cd-446b-9cbf-c4ee90aaaa46  2024-12-08 07:56:45.233012                         b776ecc1-f232-4f5d-9b9f-beaf029ceb58  1           0           2024-12-08 07:56:45.233012  
d975b3f5-beca-4368-b1cd-a816bcb0f9ba                            0          70044674-0da0-410d-8e04-97ab0568c63b  2024-12-08 07:06:23.526073                         cc468f1f-5be8-4f6f-8e0b-dcfbde1626a8  1           0           2024-12-08 07:06:23.526073  

EXPLAIN results

  1. EXPLAIN output for the PARTITION BY subquery of the first attempt (the slowest part of that query)
id  select_type  table  type  possible_keys              key                        key_len  ref                        rows   Extra                         
1   SIMPLE       S      ALL                                                                                             17232  Using where; Using temporary  
1   SIMPLE       C      ref   IX_ChatMessages_SessionId  IX_ChatMessages_SessionId  144      db_prod_mirror.S.Id  2                                    
  2. EXPLAIN output for the second approach (the entire query)
id  select_type      table        type    possible_keys                                        key                         key_len  ref                                 rows   Extra                                                                
1   PRIMARY          <subquery6>  ALL     distinct_key                                                                                                                  17232  Using temporary; Using filesort                                      
1   PRIMARY          m            ref     IX_ChatMessages_SessionId,IX_ChatMessages_CreatedAt  IX_ChatMessages_CreatedAt   8        <subquery6>.MAX(MessageCreatedAt)   1      Using index condition                                                
1   PRIMARY          S            eq_ref  PRIMARY,IX_Sessions_VisitorId                        PRIMARY                     144      db_prod_mirror.m.SessionId    1      Using where                                                          
1   PRIMARY          V            eq_ref  PRIMARY                                              PRIMARY                     144      <subquery6>.VisitorId               1      Using where                                                          
1   PRIMARY          P            eq_ref  PRIMARY                                              PRIMARY                     8        db_prod_mirror.V.PropertyId   1                                                                           
1   PRIMARY          <derived3>   ref     key0                                                 key0                        145      <subquery6>.VisitorId               10     Using where                                                          
1   PRIMARY          <derived2>   ref     key0                                                 key0                        145      db_prod_mirror.m.SessionId    2                                                                           
1   PRIMARY          <derived5>   ref     key0                                                 key0                        145      <subquery6>.VisitorId               2      Using where                                                          
6   MATERIALIZED     S            ALL     PRIMARY,IX_Sessions_VisitorId                                                                                                 17232  Using where; Using temporary                                         
6   MATERIALIZED     V            eq_ref  PRIMARY                                              PRIMARY                     144      db_prod_mirror.S.VisitorId    1      Using where                                                          
6   MATERIALIZED     m            ref     IX_ChatMessages_SessionId                            IX_ChatMessages_SessionId   144      db_prod_mirror.S.Id           2                                                                           
5   LATERAL DERIVED  S            ref     PRIMARY,IX_Sessions_VisitorId                        IX_Sessions_VisitorId       144      db_prod_mirror.S.VisitorId    1      Using index condition; Using where; Using temporary; Using filesort  
5   LATERAL DERIVED  V            eq_ref  PRIMARY                                              PRIMARY                     144      db_prod_mirror.S.VisitorId    1      Using where                                                          
5   LATERAL DERIVED  m            ref     IX_ChatMessages_SessionId                            IX_ChatMessages_SessionId   144      db_prod_mirror.S.Id           2                                                                           
3   DERIVED          CT           index   IX_ContactTagVisitor_VisitorsId                      PRIMARY                     148                                          467    Using index; Using temporary; Using filesort                         
3   DERIVED          vs           eq_ref  PRIMARY                                              PRIMARY                     144      db_prod_mirror.CT.VisitorsId  1      Using index                                                          
2   LATERAL DERIVED  SA           ref     IX_SessionAgents_AgentId,IX_SessionAgents_SessionId  IX_SessionAgents_SessionId  144      db_prod_mirror.S.Id           1      Using index condition                                                
2   LATERAL DERIVED  U            eq_ref  PRIMARY                                              PRIMARY                     382      db_prod_mirror.SA.AgentId     1      Using where                                                          
mariadb query-optimization
1 answer

Cause

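I haven't gone through your queries thoroughly, but the problem appears to be a missing index. The first EXPLAIN output you posted shows the value ALL in the type column of its first row (table S). According to the MariaDB documentation (https://mariadb.com/kb/en/explain/, section on the type column):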

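ALL: A full table scan is done for the table (all rows are read). This is bad if the table is large and the table is joined against a previous table! This happens when the optimizer could not find any usable index to access rows.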
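
To see which indexes the optimizer has to choose from, and to re-check the plan of the slow part on its own, something like the following could be run. This is a minimal sketch: the table and column names are taken from the question, the column list is trimmed for brevity, and the statements only inspect the schema and the plan without changing anything.

```sql
-- List the indexes that already exist on the two tables used by the slow subquery.
SHOW INDEX FROM sessions;
SHOW INDEX FROM chatmessages;

-- Re-run EXPLAIN on just the window-function subquery from the first attempt
-- (column list trimmed; names as in the question).
EXPLAIN
SELECT S.Id AS SessionId,
       S.VisitorId,
       C.Id AS MessageId,
       C.CreatedAt,
       ROW_NUMBER() OVER (PARTITION BY S.VisitorId ORDER BY C.CreatedAt DESC) AS rn
FROM sessions AS S
LEFT JOIN chatmessages AS C ON S.Id = C.SessionId
WHERE S.isDeleted = 0;
```

In the output, a row with type ALL and an empty key column means that table is read in full, while rows with type ref or eq_ref are being accessed through an index.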

Fix

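I assume you want to look up chat messages by their creation time (for example, to fetch the ten most recent messages written by a given user). For that use case, I would add an index on CreatedAt to the chatmessages table, unless one already exists (the posted schema lists IX_ChatMessages_CreatedAt on that column):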

create index chatmessages_by_creation_idx on chatmessages (CreatedAt);

An index can cover more than one column; perhaps this would suit your use case better:

-- Index by whole pair (CreatedAt, CreatedBy):
-- Good if these two are usually used in conjunction
create index chatmessages_by_creation_author_idx on chatmessages (CreatedAt, CreatedBy);

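You can try creating both of these indexes. You may want to use different columns, and including more columns may work better. If this is not a production database, I suggest experimenting with indexes directly; otherwise, create a copy of the database with test data on your own machine and try the indexes there.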
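
One way to run such an experiment without touching the original table is sketched below. It assumes a scratch copy named chatmessages_copy and an index named ix_copy_session_created, both hypothetical. The composite (SessionId, CreatedAt) column choice is only a guess based on the join and ORDER BY in the question, not a verified fix, and copying 4M+ rows will take time and disk space.

```sql
-- Scratch copy with the same columns and indexes
-- (CREATE TABLE ... LIKE does not copy foreign key constraints).
CREATE TABLE chatmessages_copy LIKE chatmessages;
INSERT INTO chatmessages_copy SELECT * FROM chatmessages;

-- Candidate composite index: join column first, then the sort column.
CREATE INDEX ix_copy_session_created
    ON chatmessages_copy (SessionId, CreatedAt);

-- Compare this plan with the plan of the same query against chatmessages.
EXPLAIN
SELECT S.VisitorId, C.Id, C.CreatedAt
FROM sessions AS S
LEFT JOIN chatmessages_copy AS C ON S.Id = C.SessionId
WHERE S.isDeleted = 0
ORDER BY C.CreatedAt DESC;

-- Clean up when done.
DROP TABLE chatmessages_copy;
```

If the plan or timing on the copy does not improve, the candidate index can simply be discarded along with the copy, and a different column combination tried.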

Resources

More information about indexes (ordered from simplest to most detailed):

About MariaDB's EXPLAIN
