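I'm trying to run the query below. Without any indexes it takes about 76 seconds; after adding indexes it takes about 80 seconds.
The table currently has 24k records.
Here is the DDL: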
CREATE TABLE personal_details (
    pno VARCHAR(20) PRIMARY KEY, -- Personal number or unique ID
    first_name VARCHAR(20), -- First name
    middle_name VARCHAR(20), -- Middle name (nullable if not always used)
    last_name VARCHAR(20), -- Last name
    dob DATE, -- Date of birth (YYYY-MM-DD format)
    gender CHAR(1), -- Gender (M/F/O)
    email VARCHAR(50), -- Email address
    phone_number VARCHAR(20), -- Phone number (variable format, max 20 chars)
    address VARCHAR(255), -- Address (full address string)
    city VARCHAR(50), -- City
    state VARCHAR(50), -- State
    zip_code VARCHAR(10), -- Postal or zip code
    country VARCHAR(50), -- Country
    marital_status VARCHAR(20), -- Marital status (Single/Married/etc.)
    nationality VARCHAR(50), -- Nationality
    occupation VARCHAR(50), -- Job title or occupation
    salary DECIMAL(10, 2), -- Salary (up to 10 digits, 2 decimal places)
    hire_date DATE, -- Hire date (YYYY-MM-DD format)
    department VARCHAR(50), -- Department name
    is_active BOOLEAN -- Status flag for active/inactive (True/False)
);
This is the query that takes 76 seconds:
SELECT
    pd1.pno AS pno1,
    pd1.first_name AS first_name1,
    pd1.last_name AS last_name1,
    pd2.pno AS pno2,
    pd2.first_name AS first_name2,
    pd2.last_name AS last_name2,
    pd3.pno AS pno3,
    pd3.first_name AS first_name3,
    pd3.last_name AS last_name3
FROM
    personal_details pd1
JOIN
    personal_details pd2 ON pd1.city = pd2.city
JOIN
    personal_details pd3 ON pd2.state = pd3.state
WHERE
    pd1.dob BETWEEN '1980-01-01' AND '1990-12-31'
    AND pd2.gender = 'M'
    AND pd3.salary > 50000
ORDER BY
    pd1.pno, pd2.pno, pd3.pno;
These are the indexes I created (I tried and tested a few combinations):
CREATE INDEX idx_pno ON personal_details (pno);
CREATE INDEX idx_composite_query ON personal_details (city, state, dob, gender, salary);
CREATE INDEX idx_city ON personal_details (city);
CREATE INDEX idx_state ON personal_details (state);
CREATE INDEX idx_dob ON personal_details (dob);
CREATE INDEX idx_gender ON personal_details (gender);
CREATE INDEX idx_salary ON personal_details (salary);
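Even after creating these indexes, the query still takes a long time (about 80 seconds, longer than the query without indexes).
I'm trying to understand how indexes work, which is why I wrote this self-join query. I expected the execution time to go down, but it went up.
I tried both single-column and composite indexes, and both gave almost the same result.
I read several articles and made changes based on them, but nothing helped, and I'm a bit lost here.
I also tried analyzing the query with EXPLAIN ANALYZE: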
EXPLAIN
"-> Sort: pd1.pno, pd2.pno, pd3.pno (actual time=19130..19461 rows=2.25e+6 loops=1)
    -> Stream results (cost=346436 rows=105833) (actual time=129..1905 rows=2.25e+6 loops=1)
        -> Inner hash join (pd3.state = pd2.state) (cost=346436 rows=105833) (actual time=129..634 rows=2.25e+6 loops=1)
            -> Filter: (pd3.salary > 50000) (cost=50.4 rows=813) (actual time=0.0498..23.2 rows=22872 loops=1)
                -> Table scan on pd3 (cost=50.4 rows=24394) (actual time=0.0433..19.1 rows=24000 loops=1)
            -> Hash
                -> Nested loop inner join (cost=7124 rows=3905) (actual time=0.105..62.5 rows=4916 loops=1)
                    -> Filter: ((pd2.gender = 'M') and (pd2.city is not null)) (cost=2633 rows=2439) (actual time=0.058..20.3 rows=7968 loops=1)
                        -> Table scan on pd2 (cost=2633 rows=24394) (actual time=0.0541..17.6 rows=24000 loops=1)
                    -> Index lookup on pd1 using idx_composite_query (city=pd2.city), with index condition: (pd1.dob between '1980-01-01' and '1990-12-31') (cost=0.4 rows=1.6) (actual time=0.00432..0.00515 rows=0.617 loops=7968)
"
This is what it showed me. The query is purely for educational purposes; I'm using it to teach myself.
You can try adding these three indexes. The other indexes can be dropped; they are not useful.
CREATE INDEX idx_dob_pno_state_city ON personal_details (dob, pno, state, city);
CREATE INDEX idx_city_gender_pno ON personal_details (city, gender, pno);
CREATE INDEX idx_state_salary_pno ON personal_details (state, salary, pno);
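To check whether this actually helps, something along these lines could be run after creating the three indexes. It is only a sketch: it assumes MySQL 8.0.18 or later (the tree-style EXPLAIN ANALYZE output in the question suggests this) and that the existing index names are exactly the ones listed in the question.

```sql
-- Drop the indexes from the question (names assumed to match those listed there).
-- Dropping idx_pno does not affect the PRIMARY KEY on pno, which has its own index.
DROP INDEX idx_pno ON personal_details;
DROP INDEX idx_composite_query ON personal_details;
DROP INDEX idx_city ON personal_details;
DROP INDEX idx_state ON personal_details;
DROP INDEX idx_dob ON personal_details;
DROP INDEX idx_gender ON personal_details;
DROP INDEX idx_salary ON personal_details;

-- Re-run the same query under EXPLAIN ANALYZE and compare the plan and the
-- "actual time" figures with the output posted in the question.
EXPLAIN ANALYZE
SELECT
    pd1.pno AS pno1, pd1.first_name AS first_name1, pd1.last_name AS last_name1,
    pd2.pno AS pno2, pd2.first_name AS first_name2, pd2.last_name AS last_name2,
    pd3.pno AS pno3, pd3.first_name AS first_name3, pd3.last_name AS last_name3
FROM personal_details pd1
JOIN personal_details pd2 ON pd1.city = pd2.city
JOIN personal_details pd3 ON pd2.state = pd3.state
WHERE pd1.dob BETWEEN '1980-01-01' AND '1990-12-31'
  AND pd2.gender = 'M'
  AND pd3.salary > 50000
ORDER BY pd1.pno, pd2.pno, pd3.pno;
```

In the plan posted in the question, the join itself finishes at roughly 634 ms, while the Sort step over 2.25 million rows ends at about 19,461 ms, so the Sort line is the one to watch when comparing the before and after output.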