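I have more than 1,000,000 rows in a single table (example table name: test1). The query below takes over 10 seconds to return its results. Can you suggest how to speed it up, or whether a temporary table could be created and used for this?
My query: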
select distinct(email),'Prev Year Exist' as status
from test1
where created_date>='2024-04-01' and created_date<='2025-03-31'
AND ((pan in (select pan from test1 where created_date>='2023-04-01' and created_date<='2024-03-31'))
OR (mobileno in (select mobileno from test1 where created_date>='2023-04-01' and created_date<='2024-03-31'))
OR (email in (select email from test1 where created_date>='2023-04-01' and created_date<='2024-03-31')));
The data loads very slowly, so I need this query to run faster.
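You don't need several subqueries. In your case a simple join makes more sense, because all three subqueries use the same condition. I would use the query below: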
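SELECT DISTINCT t1.email, 'Prev Year Exist' AS status
FROM test1 t1
-- INNER JOIN keeps only current-year rows that have a previous-year match.
-- A LEFT JOIN filtered on t2.email IS NOT NULL would also drop rows that
-- matched on pan or mobileno when the previous-year email is NULL.
INNER JOIN test1 t2 ON (
(t1.pan = t2.pan OR t1.mobileno = t2.mobileno OR t1.email = t2.email)
AND t2.created_date >= '2023-04-01' AND t2.created_date <= '2024-03-31'
)
WHERE t1.created_date >= '2024-04-01'
AND t1.created_date <= '2025-03-31';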
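An OR across three different columns in a join condition often stops the optimizer from using one index per column, so the join above may not be faster on every database. One possible alternative, shown here only as a sketch, is to split the match into three separate joins and combine them with UNION. UNION removes duplicates, so it plays the role of DISTINCT. Whether this is actually faster depends on your DBMS and indexes, so compare the execution plans before adopting it.

```sql
-- Sketch: one join per matching column, combined with UNION (which de-duplicates).
-- Uses only the table and columns from the question; not tested against your data.
SELECT t1.email, 'Prev Year Exist' AS status
FROM test1 t1
INNER JOIN test1 t2 ON t1.pan = t2.pan
WHERE t1.created_date >= '2024-04-01' AND t1.created_date <= '2025-03-31'
  AND t2.created_date >= '2023-04-01' AND t2.created_date <= '2024-03-31'
UNION
SELECT t1.email, 'Prev Year Exist' AS status
FROM test1 t1
INNER JOIN test1 t2 ON t1.mobileno = t2.mobileno
WHERE t1.created_date >= '2024-04-01' AND t1.created_date <= '2025-03-31'
  AND t2.created_date >= '2023-04-01' AND t2.created_date <= '2024-03-31'
UNION
SELECT t1.email, 'Prev Year Exist' AS status
FROM test1 t1
INNER JOIN test1 t2 ON t1.email = t2.email
WHERE t1.created_date >= '2024-04-01' AND t1.created_date <= '2025-03-31'
  AND t2.created_date >= '2023-04-01' AND t2.created_date <= '2024-03-31';
```

Each branch joins on a single column, which gives the optimizer a chance to use an index on that column if one exists.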
To improve efficiency further, you can also create an index on the created_date column.
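As a rough sketch of that suggestion, the statements below create an index on created_date and, as an additional assumption on my part, on each of the three columns used for matching. The index names are made up for illustration. The CREATE INDEX syntax shown is accepted by most common databases, but check your own DBMS documentation.

```sql
-- Hypothetical index names; adjust to your own naming convention.
-- Supports the date-range filters on both sides of the join.
CREATE INDEX idx_test1_created_date ON test1 (created_date);

-- Assumption: indexes on the matching columns may also help the lookups.
CREATE INDEX idx_test1_pan ON test1 (pan);
CREATE INDEX idx_test1_mobileno ON test1 (mobileno);
CREATE INDEX idx_test1_email ON test1 (email);
```

After creating the indexes, look at the query's execution plan (for example with EXPLAIN, where your database supports it) to confirm they are actually being used. Extra indexes also slow down inserts and updates a little, so keep only the ones that help.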