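I've been trying to run this query to find matching accounts, and every time I run it, the Hue environment gets to 75% complete and then just sits there. I'm not sure how to troubleshoot this; I've been reading forums and other sources trying to figure out what the problem is. I can't use COMPUTE STATS because it isn't allowed on views.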
SELECT
d.eid,
d.loyaltyProgramId,
d.playerAccountNumber,
dcp1.PhoneNumber,
dcp1.IsPrimary,
dcp1.IsPreferredContactNumber,
d.firstName,
d.LastName,
d.BirthDate,
d.Gender,
d.IsBanned,
d.bancode,
dc2_cust.eid,
dc2_cust.loyaltyprogramid,
dc2_cust.playerAccountNumber,
dcp2_cust.PhoneNumber,
dc2_cust.FirstName,
dc2_cust.LastName,
dc2_cust.BirthDate,
dc2_cust.Gender,
dc2_cust.IsBanned,
dc2_cust.BanCode,
CONCAT(d.playeraccountnumber, '-', d.LoyaltyProgramId, ',', dc2_cust.playeraccountnumber, '-', dc2_cust.loyaltyprogramid) AS killkey
FROM gmscompliance_ref.ballybi_dcustomer d
JOIN gmscompliance_ref.ballybi_dcustomerphone dcp1 ON d.customerkey = dcp1.customerkey AND d.loyaltyprogramid = dcp1.loyaltyprogramid
JOIN gmscompliance_ref.ballybi_dcustomerphone dcp2_cust ON dcp1.customerkey < dcp2_cust.customerkey AND (translate(dcp1.PhoneNumber, '-', ' ') = translate(dcp2_cust.PhoneNumber, '-', ' ') OR dcp1.PhoneNumber = dcp2_cust.PhoneNumber)
JOIN gmscompliance_ref.ballybi_dcustomer dc2_cust ON dcp2_cust.customerkey = dc2_cust.customerkey AND dcp2_cust.loyaltyprogramid = dc2_cust.loyaltyprogramid
WHERE
d.PlayerAccountStatus = 'Active'
AND dc2_cust.PlayerAccountStatus = 'Active'
AND d.eid <> 0
AND d.LoyaltyProgramId <> 'GEO'
AND d.FirstName = dc2_cust.FirstName
AND d.eid <> dc2_cust.Eid
ORDER BY d.eid;
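When you run Impala through Hue, the 75% shown in the status bar reflects the time Impala spends reading files, not 75% of all the steps in the query. Try opening the query plan (click the query ID to the right of the status bar, for example fb4a404290538b7d:9c46dee100000000) to see which step is slow. Looking at your query, I think you should pay attention to the join against "dcp2_cust", which may cause the data to explode.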
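To check whether the "dcp2_cust" join really is the culprit, something like the following might help. It is only a sketch using the table and column names from the question; the aliases normalized_phone, row_count, and max_pairs are made up for illustration. The first statement counts how many phone rows share the same normalized number: a number shared by n rows can produce up to n * (n - 1) / 2 pairs under the dcp1.customerkey < dcp2_cust.customerkey condition. The second asks Impala for the plan of the phone self-join alone. It drops the OR branch on the assumption that it is redundant, since two equal phone numbers also have equal translate() results, so a plain equality remains that the planner can likely use for a hash join instead of a nested loop join.

```sql
-- 1) How many rows share each normalized phone number?
--    max_pairs is an upper bound on the pairs the self-join emits for that number.
SELECT
  translate(PhoneNumber, '-', ' ') AS normalized_phone,
  COUNT(*) AS row_count,
  COUNT(*) * (COUNT(*) - 1) / 2 AS max_pairs
FROM gmscompliance_ref.ballybi_dcustomerphone
GROUP BY translate(PhoneNumber, '-', ' ')
HAVING COUNT(*) > 1
ORDER BY row_count DESC
LIMIT 20;

-- 2) Plan for the phone self-join on its own, with the OR branch removed
--    (assumed redundant: identical numbers also match after translate()).
EXPLAIN
SELECT COUNT(*)
FROM gmscompliance_ref.ballybi_dcustomerphone dcp1
JOIN gmscompliance_ref.ballybi_dcustomerphone dcp2_cust
  ON translate(dcp1.PhoneNumber, '-', ' ') = translate(dcp2_cust.PhoneNumber, '-', ' ')
 AND dcp1.customerkey < dcp2_cust.customerkey;
```

If the first query shows a few numbers with very large row counts (placeholder values such as empty or all-zero numbers are a common cause), filtering those out before the join is probably the first thing to try.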