我使用 SQL API 在 CosmosDB 中拥有数百万个文档,我需要从所有文档中找到唯一的类别。
文档如下所示,您可以在描述下方看到类别数组,我不关心它们的顺序,我只需要知道集合中所有文档中的所有唯一文档,我需要这个,以便稍后我可以在类别上创建查询,但这是稍后的问题,我首先需要将它们全部列出来,以便我知道所有可能的选项是什么,但我无法找出执行此操作的查询,以便我只获得类别名称。
{
"id": "56d934d3-90bf-4f5a-b602-e515fefa599f",
"_id": "5bf6705f9568cf00013cd13c",
"vendor": "XXX",
"updatedAt": "2018-11-23T03:55:30.044Z",
"locales": [
{
"title": "Cold shoulder t-shirt",
"description": "Because collar bones. Trending cold shoulder t-shirt in 100% organic cotton. Classic, wide and boxy t-shirt fit with cut-out details. In black, because black tees and fashion are like this (insert friendly hand gesture). This style is online exclusive.",
"categories": [
"Women",
"clothing",
"tops"
],
"brand": null,
"images": [
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_102],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_203],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[01_0659881_001_301],type[ECOMLOOK],device[hdpi],quality[80],ImageVersion[2018081]&call=url[file:/product/main]",
"https://lp.xxx.com/app002prod?set=source[02_0659881_001_101],type[PRODUCT],device[hdpi],quality[80],ImageVersion[1.0]&call=url[file:/product/main]"
],
"country": "SE",
"currency": "SEK",
"language": "en",
"variants": [
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XXS",
"color": "Black magic"
}
},
{
"artno": "xxx",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XS",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "XL",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "S",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 1,
"attributes": {
"size": "M",
"color": "Black magic"
}
},
{
"artno": "0659881001",
"urls": [
"https://click.linksynergy.com/link?id=INtcw3sexSw&offerid=491018&type=2&murl=https%3A%2F%2Fwww.xxx.com%2Fen_sek%2Fclothing%2Ftops%2Fproduct.cold-shoulder-t-shirt-black-magic.0659881001.html"
],
"price": 80,
"stock": 0,
"attributes": {
"size": "L",
"color": "Black magic"
}
}
]
}
],
"_rid": "QEwcALNbIz8GAAAAAAAAAA==",
"_self": "dbs/QEwcAA==/colls/QEwcALNbIz8=/docs/QEwcALNbIz8GAAAAAAAAAA==/",
"_etag": "\"6a0003c6-0000-0000-0000-5bf7958c0000\"",
"_attachments": "attachments/",
"_ts": 1542952332
}
请看我的测试,它可以获得所有唯一的类别名称。
样本文件:
[
{
"id": "1",
"locales": [
{
"categories": [
"Women",
"clothing",
"tops"
]
}
]
},
{
"id": "2",
"locales": [
{
"categories": [
"Men",
"test",
"tops"
]
}
]
}
]
SQL:
SELECT distinct cat FROM c
join l in c.locales
join cat in l.categories
输出:
[
{
"cat": "Women"
},
{
"cat": "clothing"
},
{
"cat": "tops"
},
{
"cat": "Men"
},
{
"cat": "test"
}
]
如果不想区分大小写,只需在sql中使用
LOWER
函数即可。
SELECT distinct Lower(cat) FROM c
join l in c.locales
join cat in l.categories
如果你想得到
["Women","clothing","tops","Men","test"]
,它不能直接在单个sql中解析为数组,你可以使用存储过程来解析输出数组。
例如,在存储过程中添加以下代码。
var returnArray = [];
for(var i=0 ;i<array.size;i++){
returnArray.push(array[i].value)
}
return returnArray;
选择不同的”: 查询的这一部分指示 CosmosDB 仅返回指定字段中的唯一值。 “c.类别”: 这是指每个文档中的“类别”字段,这是从中提取唯一值的地方。 要记住的要点: 字段选择: 确保您在“DISTINCT”之后指定的字段包含您要从中检索唯一值的类别。 性能考虑: 根据数据的大小,使用“DISTINCT”可能需要扫描大量文档,因此请考虑对“类别”字段建立索引以获得更好的性能。 欲了解更多信息并购买,请访问:https://biancaspender.com/