How do I extract data from a large JSON document in Cosmos DB using a query?

Question (votes: 0, answers: 1)

I need to extract certain information from a large JSON document.

My JSON looks like this:

{
    "id": "abc",
    "doc1": {
        "documentName": "xyz",
        "documentUrl": "xyz",
        "documentType": ".xlsx",
        "contents": [
            {
                "Date": "2022-06-10T00:00:00",
                "Type": "Interaction",
                "Subject": "ABC",
                "Description": "My name is ABC."
            },
            {
                "Date": "2022-12-01T00:00:00",
                "Type": "Interaction",
                "Subject": "DEF",
                "Description": "I live in a town named DEF."
            },
            {
                "Date": "2023-03-15T00:00:00",
                "Type": "Interaction",
                "Subject": "IJK",
                "Description": "He is known as IJK."
            }
        ]
    },
    "doc2": {
        "documentName": "wyc",
        "documentUrl": "wyc",
        "documentType": ".xlsx",
        "contents": [
            {
                "Date": "2023-12-05T00:00:00",
                "Type": "Task",
                "Subject": "KLM",
                "Description": "She has a friend who is called as KLM.",
                "Status": "Completed"
            },
            {
                "Date": "2023-03-15T00:00:00",
                "Type": "Task",
                "Subject": "ROQ",
                "Description": "The dessert is named as ROQ.",
                "Status": "Completed"
            },
            {
                "Date": "2023-07-15T00:00:00",
                "Type": "Task",
                "Subject": "VDI",
                "Description": "We need to know the name of the school that VDI goes to.",
                "Status": "Open"
            }
        ]
    },
    "doc3": {
        "documentName": "ckl",
        "documentUrl": "ckl",
        "documentType": ".pdf",
        "contents": [
            {
                "pageNo": 1,
                "pageText": "Hi this place is known to have awesome desserts."
            },
            {
                "pageNo": 2,
                "pageText": "Hello World."
            },
            {
                "pageNo": 3,
                "pageText": "It is a beautiful day."
            },
            {
                "pageNo": 4,
                "pageText": "Sorry I think you have reached the wrong number."
            }
        ]
    }
}

I am trying to extract "Date", "Subject", and "Description" from doc1; "Date", "Subject", "Description", and "Status" from doc2; and "documentUrl" and "pageText" from doc3 (only for "pageNo" 2 and 3).

sql azure nosql azure-cosmosdb azure-cosmosdb-sqlapi
1 Answer
0 votes

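Note that I applied an array filter only for pages 2 and 3 of doc3. You can of course change this and put the data into its own fields rather than the way I have done it below: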

SELECT
    c.id,
    ARRAY(SELECT VALUE t.Date FROM t IN c.doc1.contents) AS doc1Date,
    ARRAY(SELECT VALUE t.Subject FROM t IN c.doc1.contents) AS doc1Subject,
    ARRAY(SELECT VALUE t.Description FROM t IN c.doc1.contents) AS doc1Description,
    ARRAY(SELECT VALUE t.Date FROM t IN c.doc2.contents) AS doc2Date,
    ARRAY(SELECT VALUE t.Status FROM t IN c.doc2.contents) AS doc2Status,
    c.doc3.documentUrl AS doc3url,
    ARRAY(SELECT VALUE t.pageText FROM t IN c.doc3.contents WHERE t["pageNo"] IN (2, 3)) AS doc3pageText
FROM c
WHERE c.id = "abc"

It gives a response like the following.

[
    {
        "id": "abc",
        "doc1Date": [
            "2022-06-10T00:00:00",
            "2022-12-01T00:00:00",
            "2023-03-15T00:00:00"
        ],
        "doc1Subject": [
            "ABC",
            "DEF",
            "IJK"
        ],
        "doc1Description": [
            "My name is ABC.",
            "I live in a town named DEF.",
            "He is known as IJK."
        ],
        "doc2Date": [
            "2023-12-05T00:00:00",
            "2023-03-15T00:00:00",
            "2023-07-15T00:00:00"
        ],
        "doc2Status": [
            "Completed",
            "Completed",
            "Open"
        ],
        "doc3url": "ckl",
        "doc3pageText": [
            "Hello World.",
            "It is a beautiful day."
        ]
    }
]
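
The query above does not return "Subject" or "Description" for doc2, although the question asks for them; the same ARRAY(SELECT VALUE ...) pattern used for doc1 should cover those two fields. As a cross-check of the expected result, here is a minimal client-side sketch in Python that applies the same projection to an item that has already been fetched. The function name `project`, the helper `pick`, and the trimmed sample item are illustrative assumptions, not part of the original answer, and the output groups fields per entry rather than in parallel arrays.

```python
import json

# Trimmed copy of the sample item from the question (assumption: real items
# have the same shape; only two entries are kept for doc1 and doc2).
item = {
    "id": "abc",
    "doc1": {"documentUrl": "xyz", "contents": [
        {"Date": "2022-06-10T00:00:00", "Type": "Interaction",
         "Subject": "ABC", "Description": "My name is ABC."},
        {"Date": "2022-12-01T00:00:00", "Type": "Interaction",
         "Subject": "DEF", "Description": "I live in a town named DEF."},
    ]},
    "doc2": {"documentUrl": "wyc", "contents": [
        {"Date": "2023-12-05T00:00:00", "Type": "Task", "Subject": "KLM",
         "Description": "She has a friend who is called as KLM.",
         "Status": "Completed"},
        {"Date": "2023-07-15T00:00:00", "Type": "Task", "Subject": "VDI",
         "Description": "We need to know the name of the school that VDI goes to.",
         "Status": "Open"},
    ]},
    "doc3": {"documentUrl": "ckl", "contents": [
        {"pageNo": 1, "pageText": "Hi this place is known to have awesome desserts."},
        {"pageNo": 2, "pageText": "Hello World."},
        {"pageNo": 3, "pageText": "It is a beautiful day."},
        {"pageNo": 4, "pageText": "Sorry I think you have reached the wrong number."},
    ]},
}


def pick(rows, keys):
    """Keep only the requested keys from each row (missing keys become None)."""
    return [{key: row.get(key) for key in keys} for row in rows]


def project(doc, pages=(2, 3)):
    """Hypothetical client-side version of the projection asked for in the question."""
    return {
        "id": doc["id"],
        "doc1": pick(doc["doc1"]["contents"], ("Date", "Subject", "Description")),
        "doc2": pick(doc["doc2"]["contents"],
                     ("Date", "Subject", "Description", "Status")),
        "doc3url": doc["doc3"]["documentUrl"],
        # Same filter as the WHERE clause in the query: only pages 2 and 3.
        "doc3pageText": [page["pageText"] for page in doc["doc3"]["contents"]
                         if page["pageNo"] in pages],
    }


result = project(item)
print(json.dumps(result, indent=2))
```

Doing the projection in the query, as in the answer, is usually preferable for large items because less data leaves the server; the client-side version is mainly useful for verifying the shape of the result.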