Query tuning with big data

Question

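I need help with a very slow query.

My data model:

2379k Category (nodes)
1746315k Customer (nodes)
376900k Product (nodes)
40m VIEW (Customer -> VIEW (date) -> Product) (edges)
2m BELONG (Product -> BELONG -> Category) (edges)

I created the following indexes: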

CREATE CONSTRAINT ON (product: Product) ASSERT product.idProduct IS UNIQUE;
CREATE CONSTRAINT ON (customer: Customer) ASSERT customer.idCustomer IS UNIQUE;
CREATE CONSTRAINT ON (category: Category) ASSERT category.idCategory IS UNIQUE;

I want to show "who viewed this also viewed" suggestions:

I have two basic queries, one with a category filter and one without.

Query without the filter:

MATCH (p:Product {idProduct: "178293"})<-[:VIEW]-(c:Customer)-[:VIEW]->(rec:Product)
WHERE not (rec.idProduct = "178293") 
WITH rec.idProduct AS recommendation, count(*) as views
ORDER BY views DESC LIMIT 25
RETURN recommendation, views;

It takes about 10 seconds to run.

Query with the filter:

MATCH (p:Product {idProduct: "178293"})<-[:VIEW]-(c:Customer)-[:VIEW]->(rec:Product)-[BELONG]->(ca:Category {idCategory: "173"})
WHERE not (rec.idProduct = "178293") 
WITH rec.idProduct AS recommendation, count(*) as views
ORDER BY views DESC LIMIT 25
RETURN recommendation, views;

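It takes about 60 seconds to run.

I would like some tips for tuning this query.

I am using Neo4j 3.3.3 Community Edition.

My computer is an i7 with 8 GB of RAM and an SSD, running Ubuntu 14.04.

The queries are executed in the browser.

Thanks!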

Answer 1

Try giving the Cypher planner two index hints to encourage it to use both indexes. By default, the planner may use only one of them.

MATCH (p:Product {idProduct: "178293"})<-[:VIEW]-(c:Customer)-[:VIEW]->(rec:Product)-[:BELONG]->(ca:Category {idCategory: "173"})
USING INDEX p:Product(idProduct)
USING INDEX ca:Category(idCategory)
WHERE NOT (rec.idProduct = "178293")
WITH rec.idProduct AS recommendation, count(*) as views
ORDER BY views DESC
LIMIT 25
RETURN recommendation, views;

You can find out whether the planner produces a plan that uses both indexes by prefixing the query with PROFILE.
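
As a rough sketch of that check, the hinted query can be run with PROFILE in front of it. This assumes the relationship type is BELONG (written here as [:BELONG]). In the resulting plan, look for an index seek operator (such as NodeUniqueIndexSeek or NodeIndexSeek) on both p and ca; the exact operator names shown may vary with the Neo4j version.

```cypher
// PROFILE runs the query and shows the executed plan with row and db-hit counts.
PROFILE
MATCH (p:Product {idProduct: "178293"})<-[:VIEW]-(c:Customer)-[:VIEW]->(rec:Product)-[:BELONG]->(ca:Category {idCategory: "173"})
USING INDEX p:Product(idProduct)
USING INDEX ca:Category(idCategory)
WHERE NOT (rec.idProduct = "178293")
WITH rec.idProduct AS recommendation, count(*) AS views
ORDER BY views DESC
LIMIT 25
RETURN recommendation, views;
```

If you only want to inspect the plan without executing the query, EXPLAIN can be used in place of PROFILE, although it reports estimates rather than actual row counts.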

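By the way, if a Product node can have only one VIEW relationship with a given Customer node, you may not need your WHERE NOT (rec.idProduct = "178293") clause (because the MATCH clause automatically filters out matches that reuse the same relationship). Alternatively, if the WHERE test is needed, you can simplify it to WHERE rec <> p, since Products have unique idProduct values.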
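
For illustration, here is how the unfiltered query from the question might look with the simplified node comparison. This is a sketch only; whether it changes the run time would need to be confirmed with PROFILE on your data.

```cypher
// Compare the nodes themselves rather than their idProduct property values.
MATCH (p:Product {idProduct: "178293"})<-[:VIEW]-(c:Customer)-[:VIEW]->(rec:Product)
WHERE rec <> p
WITH rec.idProduct AS recommendation, count(*) AS views
ORDER BY views DESC
LIMIT 25
RETURN recommendation, views;
```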
