My items in Cosmos DB contain a lot of unstructured data. Part of each item contains information like the following:
"object_results": {
    "read_file": {
        "step_name": "ReadFile",
        "step_id": "read_file",
        "start_time": "2024-05-23 16:33:56",
        "end_time": null,
        "status": "success"
    },
    "provide_note": {
        "error_message": "Missing category_name for note_key daily_dashboard",
        "step_name": "ProvideNote",
        "step_id": "provide_note",
        "start_time": "2024-05-23 16:35:27",
        "end_time": null,
        "status": "failed"
    }
}
Sometimes the number of entries in the object_results dictionary changes dynamically. It can contain more sub-items with different key names, like this:
"object_results": {
    "display_result": {
        "step_name": "DisplayResult",
        "step_id": "display_result",
        "start_time": "2024-05-15 18:54:27",
        "end_time": null,
        "status": "failed"
    },
    "provide_note": {
        "error_message": "Missing category_name for note_key daily_dashboard",
        "step_name": "ProvideNote",
        "step_id": "provide_note",
        "start_time": "2024-05-15 18:58:14",
        "end_time": null,
        "status": "success"
    },
    "get_response": {
        "error_message": "Missing response_subject for response_key consumer_complaints",
        "step_name": "GetResponse",
        "step_id": "get_response",
        "start_time": "2024-05-15 20:13:45",
        "end_time": null,
        "status": "failed"
    }
}
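What I want to extract from this data is every item in which some step has the status "failed". The problem is that I can only do this by writing each WHERE condition by hand, like this: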
select * from response_results res
where res.object_results.provide_note.status = 'failed'
or res.object_results.display_result.status = 'failed'
or res.object_results.get_response.status = 'failed'
Is there a way to make the WHERE condition generic, so that I get every item containing a failed status regardless of which key it sits under?
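You've introduced a data-model anti-pattern: using property names to identify specific object types. On top of that, you have a variable number of sub-documents. The real solution is to remodel properly, using either an array of sub-documents or a separate document per object result.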
As it turns out, you already store the object type in step_name. With that in mind, moving to an array would look like:
{
    "object_results": [
        {
            "step_name": "ReadFile",
            "step_id": "read_file",
            "start_time": "2024-05-23 16:33:56",
            "end_time": null,
            "status": "success"
        },
        {
            "error_message": "Missing category_name for note_key daily_dashboard",
            "step_name": "ProvideNote",
            "step_id": "provide_note",
            "start_time": "2024-05-23 16:35:27",
            "end_time": null,
            "status": "failed"
        }
    ]
}
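If you already have items stored in the keyed layout, the move to the array layout can be done client-side before writing the items back. The following is only a minimal sketch, assuming each item is available as a plain Python dict; the function names `to_array_model` and `has_failed_step` and the `id` value in the example are hypothetical, and reading from or writing to Cosmos DB is deliberately left out.

```python
from copy import deepcopy


def to_array_model(item: dict) -> dict:
    """Return a copy of `item` whose object_results is a list of step objects.

    Hypothetical helper: assumes object_results is a dict keyed by step name,
    as in the question.
    """
    converted = deepcopy(item)
    results = converted.get("object_results") or {}
    if isinstance(results, dict):
        steps = []
        for key, step in results.items():
            # Keep the old property name as step_id if the step lacks one.
            step.setdefault("step_id", key)
            steps.append(step)
        converted["object_results"] = steps
    return converted


def has_failed_step(item: dict) -> bool:
    """Local equivalent of the 'any step failed' check on the array layout."""
    return any(step.get("status") == "failed" for step in item["object_results"])


# Example item in the keyed layout (the "id" value is made up for illustration).
item = {
    "id": "example-1",
    "object_results": {
        "read_file": {"step_name": "ReadFile", "step_id": "read_file", "status": "success"},
        "provide_note": {"step_name": "ProvideNote", "step_id": "provide_note", "status": "failed"},
    },
}

converted = to_array_model(item)
```

The original item is left untouched, so the conversion can be checked locally before anything is written back.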
Now your search, for any number of objects, reduces to:
SELECT *
FROM res
WHERE ARRAY_CONTAINS(res.object_results, {status:"failed"},true)
Similarly, if you store each object result as its own document rather than in an array, the query would look something like:
SELECT *
FROM res
WHERE res.object_results.status="failed"
Or, if your separate documents keep everything at the top level:
SELECT *
FROM res
WHERE res.status="failed"
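For the separate-document variant with everything at the top level, one possible way to derive the per-step documents from an existing item is sketched below. This is an illustration under assumptions, not a prescribed layout: the function name `split_into_step_documents`, the `parent_id` field, and the `id` format are all hypothetical, and the choice of partition key is not addressed.

```python
from typing import List


def split_into_step_documents(item: dict) -> List[dict]:
    """Turn one item into one flat document per step.

    Hypothetical helper: accepts object_results as either the keyed dict
    from the question or the array layout.
    """
    parent_id = item["id"]
    results = item["object_results"]
    steps = results.values() if isinstance(results, dict) else results

    documents = []
    for step in steps:
        doc = dict(step)  # shallow copy; all step fields move to the top level
        # Assumed id scheme: parent id plus step_id, to keep ids unique.
        doc["id"] = f"{parent_id}:{step['step_id']}"
        # Assumed back-reference so the steps of one run can be grouped again.
        doc["parent_id"] = parent_id
        documents.append(doc)
    return documents


# Example item (the "id" value is made up for illustration).
step_docs = split_into_step_documents({
    "id": "run-1",
    "object_results": {
        "read_file": {"step_name": "ReadFile", "step_id": "read_file", "status": "success"},
        "provide_note": {"step_name": "ProvideNote", "step_id": "provide_note", "status": "failed"},
    },
})

failed_docs = [d for d in step_docs if d["status"] == "failed"]
```

Each resulting document carries `status` at the top level, which is the shape the last query above filters on.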