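In SQLite, I want to remove the duplicates from this result (gist, sqlime.org demo):
Some features apply to all products, others only to certain products. The result contains incorrect rows, such as the second one:
2024 B12XBE A1C B1 B1 hardened steel UT gehärteter Stahl
Both feature_text values should start with UT.
The result also contains duplicate rows. How can I avoid them? I would like to understand the underlying logic (Cartesian product or similar).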
Sample data:
DROP TABLE IF EXISTS product_features;
DROP TABLE IF EXISTS feature_text;
CREATE TABLE product_features (
year INTEGER, -- (4)
product TEXT, -- (6)
feature TEXT -- (3)
);
CREATE TABLE feature_text (
year INTEGER, -- (4)
product_group TEXT,
-- (2) usually matches the first two digits of product
-- but is sometimes empty / NULL or blank ' '
feature TEXT, -- (3)
language INTEGER, -- (2)
description1 TEXT -- up to 40
);
INSERT INTO product_features (year, product, feature) VALUES
(2024, 'B12XBE', 'A1C'), (2024, 'B12XBE', '7B0'),
(2024, 'B12XBE', 'DUL'), (2024, 'UTB2EF', 'A1C'),
(2024, 'UTB2EF', '7B0'), (2024, 'UTB2EF', 'DUL'),
(2024, 'X9Y8Z7', 'DUH'), (2024, 'X9Y8Z7', '7B0');
INSERT INTO feature_text
(year, product_group, feature, language, description1)
VALUES
(2024, 'B1', 'A1C', 1, 'B1 hardened steel'),
(2024, 'B1', '7B0', 1, 'B1 diamond tip'),
(2024, 'B1', 'DSP', 1, 'display 1 inch'),
(2024, 'B1', 'DSP', 1, 'Anzeige 1,5 cm'),
(2024, 'UT', 'A1C', 1, 'UT hardened steel'),
(2024, 'UT', '7B0', 1, 'UT diamond tip'),
(2024, 'UT', 'DSP', 1, 'display 1,5 inch'),
(2024, 'UT', 'DSP', 1, 'Anzeige 2,25 cm'),
(2024, ' ', 'DUL', 1, '10mm for light duty'),
(2024, 'X9', '7B0', 1, 'X9 diamond tip'),
(2024, ' ', 'DUH', 1, '13mm for heavy duty'),
(2024, 'B1', 'A1C', 2, 'B1 gehärteter Stahl'),
(2024, 'B1', '7B0', 2, 'B1 Diamant Spitze'),
(2024, 'UT', 'A1C', 2, 'UT gehärteter Stahl'),
(2024, 'UT', '7B0', 2, 'UT Diamant Spitze'),
(2024, ' ', 'DUL', 2, '10mm für leichte Aufgaben'),
(2024, 'X9', '7B0', 2, 'X9 Diamant Spitze'),
(2024, ' ', 'DUH', 2, '13mm für schwere Aufgaben'),
(2024, NULL, 'DSP', 1, 'display'),
(2024, NULL, 'DSP', 1, 'Anzeige');
My approach:
WITH prft_en as (
SELECT
pf.year, pf.product, pf.feature,
COALESCE(ft1.product_group, ft2.product_group) AS product_group,
COALESCE(ft1.description1, ft2.description1) AS feature_text_en,
COALESCE(ft1.language, ft2.language) AS language_en
FROM product_features pf
LEFT JOIN feature_text ft1
ON pf.year = ft1.year
AND pf.feature = ft1.feature
AND substr(pf.product, 1,2) = ft1.product_group
LEFT JOIN feature_text ft2
ON pf.year = ft2.year
AND pf.feature = ft2.feature
AND ft2.product_group is ' ' OR ft2.product_group is NULL
WHERE ft1.language = 1 OR ft2.language = 1
), prft_de AS (
SELECT
pf.year, pf.product, pf.feature,
COALESCE(ft1.product_group, ft2.product_group) AS product_group,
COALESCE(ft1.description1, ft2.description1) AS feature_text_de,
COALESCE(ft1.language, ft2.language) AS language_de
FROM product_features pf
LEFT JOIN feature_text ft1
ON pf.year = ft1.year
AND pf.feature = ft1.feature
AND substr(pf.product, 1,2) = ft1.product_group
LEFT JOIN feature_text ft2
ON pf.year = ft2.year
AND pf.feature = ft2.feature
AND ft2.product_group is ' ' OR ft2.product_group is NULL
WHERE ft1.language = 2 OR ft2.language = 2
)
Querying the CTEs:
SELECT prft_en.year, prft_en.product,
prft_en.feature, prft_en.product_group,
prft_en.feature_text_en, prft_de.feature_text_de
FROM prft_en
LEFT JOIN prft_de
ON prft_en.year = prft_de.year
AND prft_en.feature = prft_de.feature
AND SUBSTR(prft_en.product, 1,2) = prft_en.product_group
I don't think the whole CTE approach is really what you want. I think you want one row per product/feature pair, so start from that:
select
f.*,
substr(f.product, 1, 2) as product_group
from product_features f
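Then, for each of those rows, you want to fetch the corresponding description if one exists. That can be done with a simple subquery column. For English: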
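(
select t.description1
from feature_text t
where f.year = t.year
-- match the product's own group, or a generic (blank / NULL) group;
-- an unqualified product_group here would resolve to t.product_group
and (substr(f.product, 1, 2) = t.product_group or t.product_group = ' ' or t.product_group is null)
and f.feature = t.feature
and t.language = 1
) as feature_text_en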
And for German:
(
select t.description1
from feature_text t
where f.year = t.year
-- same matching rule as for English, only the language differs
and (substr(f.product, 1, 2) = t.product_group or t.product_group = ' ' or t.product_group is null)
and f.feature = t.feature
and t.language = 2
) as feature_text_de
Here are both on sqlime, producing this result table:
year | product | feature | product_group | feature_text_en | feature_text_de |
---|---|---|---|---|---|
2024 | B12XBE | A1C | B1 | B1 hardened steel | B1 gehärteter Stahl |
2024 | B12XBE | 7B0 | B1 | B1 diamond tip | B1 Diamant Spitze |
2024 | B12XBE | DUL | B1 | 10mm for light duty | 10mm für leichte Aufgaben |
2024 | UTB2EF | A1C | UT | UT hardened steel | UT gehärteter Stahl |
2024 | UTB2EF | 7B0 | UT | UT diamond tip | UT Diamant Spitze |
2024 | UTB2EF | DUL | UT | 10mm for light duty | 10mm für leichte Aufgaben |
2024 | X9Y8Z7 | DUH | X9 | 13mm for heavy duty | 13mm für schwere Aufgaben |
2024 | X9Y8Z7 | 7B0 | X9 | X9 diamond tip | X9 Diamant Spitze |
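On the "underlying logic" part of the question: one likely contributor to the extra rows is operator precedence in the ON clause. In SQL, AND binds tighter than OR, so `a AND b AND c OR d` is evaluated as `(a AND b AND c) OR d`. With the ON clause as written in the question, every feature_text row whose product_group is NULL therefore matches every product_features row, regardless of year and feature. Here is a minimal sketch using Python's sqlite3 module; the tables hold only a small illustrative subset of the sample data, not the full set:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product_features (year INTEGER, product TEXT, feature TEXT);
CREATE TABLE feature_text (year INTEGER, product_group TEXT, feature TEXT,
                           language INTEGER, description1 TEXT);
INSERT INTO product_features VALUES (2024, 'B12XBE', 'A1C');
INSERT INTO feature_text VALUES
  (2024, 'B1', 'A1C', 1, 'B1 hardened steel'),
  (2024, NULL, 'DSP', 1, 'display'),
  (2024, NULL, 'DSP', 1, 'Anzeige');
""")

# ON clause as in the question: the trailing OR is not parenthesized,
# so "product_group IS NULL" alone is enough for a row to match.
unparenthesized = con.execute("""
    SELECT ft2.feature, ft2.description1
    FROM product_features pf
    LEFT JOIN feature_text ft2
      ON pf.year = ft2.year
     AND pf.feature = ft2.feature
     AND ft2.product_group IS ' ' OR ft2.product_group IS NULL
""").fetchall()

# Same join with the OR group parenthesized: year and feature must match too.
parenthesized = con.execute("""
    SELECT ft2.feature, ft2.description1
    FROM product_features pf
    LEFT JOIN feature_text ft2
      ON pf.year = ft2.year
     AND pf.feature = ft2.feature
     AND (ft2.product_group IS ' ' OR ft2.product_group IS NULL)
""").fetchall()

print(unparenthesized)
print(parenthesized)
```

The first query attaches both unrelated DSP rows to the A1C feature, while the second finds no generic text for A1C and returns a single row of NULLs. Because the question's CTEs join ft1 and ft2 without filtering the language inside the ON clauses, these extra matches multiply with the ft1 matches.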
I think all you need is a left join with a compound condition. Like this:
select
product_features.year,
product_features.product,
product_features.feature,
substr(product_features.product, 1, 2) AS product_group,
text_en.description1 AS en,
text_de.description1 AS de
FROM product_features
LEFT JOIN feature_text AS text_en
ON text_en.year = product_features.year
AND text_en.feature = product_features.feature
AND (substr(product_features.product, 1, 2) = text_en.product_group OR text_en.product_group IS NULL OR TRIM(text_en.product_group) = '')
AND text_en.language = 1
LEFT JOIN feature_text AS text_de
ON text_de.year = product_features.year
AND text_de.feature = product_features.feature
AND (substr(product_features.product, 1, 2) = text_de.product_group OR text_de.product_group IS NULL OR TRIM(text_de.product_group) = '')
AND text_de.language = 2
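Note that if a feature has both a description for the specific product group and a generic one (blank or NULL product_group), you will still get duplicates. To resolve those, each LEFT JOIN has to be done twice (once for the specific group, once for the generic rows), and COALESCE is then used to pick the appropriate description.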
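A minimal sketch of that remaining duplication, using Python's sqlite3 module. The generic A1C row is hypothetical and not part of the sample data; it is added only so that one feature has both a group-specific and a generic description:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product_features (year INTEGER, product TEXT, feature TEXT);
CREATE TABLE feature_text (year INTEGER, product_group TEXT, feature TEXT,
                           language INTEGER, description1 TEXT);
INSERT INTO product_features VALUES (2024, 'B12XBE', 'A1C');
INSERT INTO feature_text VALUES
  (2024, 'B1', 'A1C', 1, 'B1 hardened steel'),
  -- hypothetical generic row, not in the original sample data
  (2024, ' ',  'A1C', 1, 'hardened steel (generic)');
""")

# Single LEFT JOIN whose condition accepts the specific OR the generic group.
rows = con.execute("""
    SELECT pf.product, pf.feature, text_en.description1 AS en
    FROM product_features pf
    LEFT JOIN feature_text AS text_en
      ON text_en.year = pf.year
     AND text_en.feature = pf.feature
     AND (substr(pf.product, 1, 2) = text_en.product_group
          OR text_en.product_group IS NULL
          OR TRIM(text_en.product_group) = '')
     AND text_en.language = 1
""").fetchall()

for row in rows:
    print(row)
```

The single product_features row comes back twice, once with each description, because both feature_text rows satisfy the join condition.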
Edit: an example of how to avoid the latter kind of duplicates:
select
product_features.year,
product_features.product,
product_features.feature,
substr(product_features.product, 1, 2) AS product_group,
COALESCE(text_en.description1, text_en_gen.description1) AS en,
COALESCE(text_de.description1, text_de_gen.description1) AS de
FROM product_features
LEFT JOIN feature_text AS text_en
ON text_en.year = product_features.year
AND text_en.feature = product_features.feature
AND (substr(product_features.product, 1, 2) = text_en.product_group)
AND text_en.language = 1
LEFT JOIN feature_text AS text_en_gen
ON text_en_gen.year = product_features.year
AND text_en_gen.feature = product_features.feature
AND (text_en_gen.product_group IS NULL OR TRIM(text_en_gen.product_group) = '')
AND text_en_gen.language = 1
LEFT JOIN feature_text AS text_de
ON text_de.year = product_features.year
AND text_de.feature = product_features.feature
AND (substr(product_features.product, 1, 2) = text_de.product_group)
AND text_de.language = 2
LEFT JOIN feature_text AS text_de_gen
ON text_de_gen.year = product_features.year
AND text_de_gen.feature = product_features.feature
AND (text_de_gen.product_group IS NULL OR TRIM(text_de_gen.product_group) = '')
AND text_de_gen.language = 2
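To check how the specific/generic join pair behaves, here is a minimal sketch with Python's sqlite3 module, reduced to the English columns. The generic A1C row is again hypothetical (it is not in the sample data) and is there to show that the group-specific text takes priority:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE product_features (year INTEGER, product TEXT, feature TEXT);
CREATE TABLE feature_text (year INTEGER, product_group TEXT, feature TEXT,
                           language INTEGER, description1 TEXT);
INSERT INTO product_features VALUES
  (2024, 'B12XBE', 'A1C'), (2024, 'B12XBE', 'DUL');
INSERT INTO feature_text VALUES
  (2024, 'B1', 'A1C', 1, 'B1 hardened steel'),
  -- hypothetical generic row, not in the original sample data
  (2024, ' ',  'A1C', 1, 'hardened steel (generic)'),
  (2024, ' ',  'DUL', 1, '10mm for light duty');
""")

# One join for the group-specific text, one for the generic text;
# COALESCE prefers the specific text and falls back to the generic one.
rows = con.execute("""
    SELECT pf.product, pf.feature,
           COALESCE(text_en.description1, text_en_gen.description1) AS en
    FROM product_features pf
    LEFT JOIN feature_text AS text_en
      ON text_en.year = pf.year
     AND text_en.feature = pf.feature
     AND substr(pf.product, 1, 2) = text_en.product_group
     AND text_en.language = 1
    LEFT JOIN feature_text AS text_en_gen
      ON text_en_gen.year = pf.year
     AND text_en_gen.feature = pf.feature
     AND (text_en_gen.product_group IS NULL
          OR TRIM(text_en_gen.product_group) = '')
     AND text_en_gen.language = 1
    ORDER BY pf.feature
""").fetchall()

for row in rows:
    print(row)
```

Each product_features row appears once: A1C gets the B1-specific text, DUL falls back to the generic one. This only holds while there is at most one specific and one generic row per year, feature, and language; the sample data's DSP entries, which have two rows for the same group and language, would still multiply if a product used that feature.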