I have a MongoDB database whose data structure has changed over the years. Sample data:
{ entry_id: 111, entrant_company:"Magic", year:2021, credit_name:"Bob Jones", credit_role:"Designer"}
{ entry_id: 111, entrant_company:"Magic", year:2021, credit_name:"Allan Jeggs", credit_role:"Director"}
{ entry_id: 111, entrant_company:"Magic", year:2021, credit_name:"Sarah Jones", credit_role:"Artist"}
{ entry_id: 348, year:2022, credit_name:"Glitter", credit_role:"Advertising Agency"}
{ entry_id: 348, year:2022, credit_name:"Jim Jones", credit_role:"Designer"}
{ entry_id: 348, year:2022, credit_name:"Allan Yi", credit_role:"Publisher"}
{ entry_id: 555, year:2023, credit_name:"Sparkle", credit_role:"Advertising Agency"}
{ entry_id: 555, year:2023, credit_name:"Josh James", credit_role:"Designer"}
{ entry_id: 555, year:2023, credit_name:"Ellen Dodd", credit_role:"Publisher"}
...
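For years after 2021, the entrant_company key is missing; instead, this information is provided in credit_name on the document whose credit_role equals "Advertising Agency".
Grouping my data by entry_id, how can I add a key named entrant_company and set it equal to the credit_name of the document where credit_role is "Advertising Agency" and year is greater than 2021?
To be clear, my goal is for the data to look like this: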
{ entry_id: 111, entrant_company:"Magic", year:2021, credit_name:"Bob Jones", credit_role:"Designer"}
{ entry_id: 111, entrant_company:"Magic", year:2021, credit_name:"Allan Jeggs", credit_role:"Director"}
{ entry_id: 111, entrant_company:"Magic", year:2021, credit_name:"Sarah Jones", credit_role:"Artist"}
{ entry_id: 348, entrant_company:"Glitter", year:2022, credit_name:"Glitter", credit_role:"Advertising Agency"}
{ entry_id: 348, entrant_company:"Glitter", year:2022, credit_name:"Jim Jones", credit_role:"Designer"}
{ entry_id: 348, entrant_company:"Glitter", year:2022, credit_name:"Allan Yi", credit_role:"Publisher"}
{ entry_id: 555, entrant_company:"Sparkle", year:2023, credit_name:"Sparkle", credit_role:"Advertising Agency"}
{ entry_id: 555, entrant_company:"Sparkle", year:2023, credit_name:"Josh James", credit_role:"Designer"}
{ entry_id: 555, entrant_company:"Sparkle", year:2023, credit_name:"Ellen Dodd", credit_role:"Publisher"}
I can update an individual entry like this, but I want to set entrant_company on every document that shares a given entry_id:
db.entries.updateMany(
  { year: { $gt: 2021 } },
  [{
    $set: {
      entrant_company: {
        $cond: {
          if: { $eq: ["$credit_role", "Advertising Agency"] },
          then: "$credit_name",
          else: "$$REMOVE"
        }
      }
    }
  }])
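You need to join the entries collection to itself on entry_id with $lookup, matching the document where credit_role: "Advertising Agency". However, the $lookup aggregation stage cannot be used in an .update() operation.
So you need to switch to an .aggregate() query with a $merge stage to replace the documents.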
db.entries.aggregate([
  // Select post-2021 documents that have no entrant_company field.
  {
    $match: {
      $expr: {
        $and: [
          { $gt: ["$year", 2021] },
          { $eq: [{ $type: "$entrant_company" }, "missing"] }
        ]
      }
    }
  },
  // Self-join on entry_id to find the "Advertising Agency" credit.
  {
    $lookup: {
      from: "entries",
      let: { entry_id: "$entry_id" },
      pipeline: [
        {
          $match: {
            $expr: {
              $and: [
                { $eq: ["$entry_id", "$$entry_id"] },
                { $eq: ["$credit_role", "Advertising Agency"] }
              ]
            }
          }
        }
      ],
      as: "joined_entrant_company"
    }
  },
  // Copy the agency's credit_name into entrant_company.
  {
    $set: {
      entrant_company: { $first: "$joined_entrant_company.credit_name" }
    }
  },
  // Drop the temporary join array.
  { $unset: "joined_entrant_company" },
  // Write the modified documents back over the originals.
  {
    $merge: {
      into: "entries",
      on: "_id",
      whenMatched: "replace"
    }
  }
])
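If you want to sanity-check the expected result before running $merge against real data, the pipeline's logic can be modeled in plain JavaScript. This is only a rough sketch, not something MongoDB provides: fillEntrantCompany is a hypothetical helper name, the sample array is a subset of the documents from the question, and it assumes that when an entry has more than one "Advertising Agency" credit, the first one encountered wins.

```javascript
// Plain-JavaScript model of the pipeline's logic (no MongoDB required).
// fillEntrantCompany is a hypothetical helper, not a MongoDB API.
function fillEntrantCompany(docs) {
  // Counterpart of the $lookup: map each entry_id to its agency's credit_name.
  const agencyByEntry = new Map();
  for (const d of docs) {
    if (d.credit_role === "Advertising Agency" && !agencyByEntry.has(d.entry_id)) {
      agencyByEntry.set(d.entry_id, d.credit_name);
    }
  }

  return docs.map((d) => {
    // Counterpart of the $match: year > 2021 and no entrant_company field.
    const needsFill = d.year > 2021 && !("entrant_company" in d);
    // With no agency credit to copy from, the document is left unchanged.
    if (!needsFill || !agencyByEntry.has(d.entry_id)) return d;
    // Counterpart of the $set: copy the agency name onto the document.
    return { ...d, entrant_company: agencyByEntry.get(d.entry_id) };
  });
}

// Subset of the sample documents from the question.
const sample = [
  { entry_id: 111, entrant_company: "Magic", year: 2021, credit_name: "Bob Jones", credit_role: "Designer" },
  { entry_id: 348, year: 2022, credit_name: "Glitter", credit_role: "Advertising Agency" },
  { entry_id: 348, year: 2022, credit_name: "Jim Jones", credit_role: "Designer" },
  { entry_id: 348, year: 2022, credit_name: "Allan Yi", credit_role: "Publisher" }
];

const result = fillEntrantCompany(sample);
console.log(result);
```

Every entry_id 348 document should come back with entrant_company: "Glitter", while the 2021 document keeps "Magic". The helper returns new objects and leaves the input array untouched.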