我在mongodb集合中有一些从2013年7月13日到2013年7月18日的文档。但收集过程中遗漏了几天的数据。我需要在最后五个文档的帮助下插入该数据。
我的数据看起来像
{
_id: 1,
"timestamp": "2023-07-13T10:00:00Z"
"code": "bc",
"energy": 2333
},
{
_id: 2,
"timestamp": "2023-07-13T10:10:00Z"
"code": "abc",
"energy": 2340
},
{
_id: 3,
"timestamp": "2023-07-14T10:30:00Z"
"code": "abc",
"energy": 2349
},
{
_id: 4,
"timestamp": "2023-07-14T10:40:00Z"
"code": "ah",
"energy": 2355
},
{
_id: 5,
"timestamp": "2023-07-14T10:50:00Z"
"code": "kk",
"energy": 2360
},
{
_id: 6,
"timestamp": "2023-07-14T11:00:00Z"
"code": "la",
"energy": 2370
},
{
_id: 7,
"timestamp": "2023-07-14T10:00:00Z"
"code": "as",
"energy": 3455
},
{
_id: 8,
"timestamp": "2023-07-17T10:10:00Z"
"code": "uj",
"energy": 4567659
},
{
_id: 9,
"timestamp": "2023-07-17T10:30:00Z"
"code": "la",
"energy": 564546
},
{
_id: 10,
"timestamp": "2023-07-17T10:40:00Z"
"code": "ws",
"energy": 5654348
},
{
_id: 11,
"timestamp": "2023-07-18T10:50:00Z"
"code": "lk",
"energy": 6765436
},
{
_id: 12,
"timestamp": "2023-07-18T11:00:00Z"
"code": "pl",
"energy": 7654223
}
这里缺少第 15 个和第 16 个数据,所以我需要插入这些文档并更新能量场值,取最后 5 个文档能量场值的平均值,然后在第 15 个日期文档中提及该值,就像下一个文档一样,请对此提供帮助。
输出:
{
_id: 1,
"timestamp": "2023-07-13T10:00:00Z"
"code": "bc",
"energy": 2333
},
{
_id: 2,
"timestamp": "2023-07-13T10:10:00Z"
"code": "abc",
"energy": 2340
},
{
_id: 3,
"timestamp": "2023-07-14T10:30:00Z"
"code": "abc",
"energy": 2349
},
{
_id: 4,
"timestamp": "2023-07-14T10:40:00Z"
"code": "ah",
"energy": 2355
},
{
_id: 5,
"timestamp": "2023-07-14T10:50:00Z"
"code": "kk",
"energy": 2360
},
{
_id: 6,
"timestamp": "2023-07-14T11:00:00Z"
"code": "la",
"energy": 2370
},
{
_id: 7,
"timestamp": "2023-07-14T10:00:00Z"
"code": "as",
"energy": 2380
},
{
_id: 8,
"timestamp": "2023-07-15T10:00:00Z"
"code": "as",
"energy": 9910
},
{
_id: 9,
"timestamp": "2023-07-15T10:15:00Z"
"code": "as",
"energy": 11447
},
{
_id: 10,
"timestamp": "2023-07-15T10:30:00Z"
"code": "as",
"energy": 19309.4
},
{
_id: 11,
"timestamp": "2023-07-15T10:45:00Z"
"code": "as",
"energy": 29968
},
{
_id: 12,
"timestamp": "2023-07-15T11:00:00Z"
"code": "as",
"energy": 49040
},
{
_id: 13,
"timestamp": "2023-07-16T10:00:00Z"
"code": "as",
"energy": 80442
},
{
_id: 14,
"timestamp": "2023-07-16T10:15:00Z"
"code": "as",
"energy": 1025852
},
{
_id: 16,
"timestamp": "2023-07-17T10:10:00Z"
"code": "uj",
"energy": 4567659
},
{
_id: 17,
"timestamp": "2023-07-17T10:30:00Z"
"code": "la",
"energy": 564546
},
{
_id: 18,
"timestamp": "2023-07-17T10:40:00Z"
"code": "ws",
"energy": 5654348
},
{
_id: 19,
"timestamp": "2023-07-18T10:50:00Z"
"code": "lk",
"energy": 6765436
},
{
_id: 20,
"timestamp": "2023-07-18T11:00:00Z"
"code": "pl",
"energy": 7654223
}
你可以试试这个:
db.collection.aggregate([
{
$set: {
day: {
$dateTrunc: {
date: "$timestamp",
unit: "day",
timezone: "Europe/Zurich"
}
}
}
},
{
$setWindowFields: {
sortBy: { timestamp: 1 },
output: {
average: {
$avg: "$energy",
window: { documents: [-5, "current"] }
}
}
}
},
{
$densify: {
field: "day",
range: {
step: 1,
unit: "day",
bounds: "full"
}
}
},
{
$fill: {
sortBy: { day: 1, timestamp: 1 },
output: {
average: { method: "locf" }
}
}
},
{
$set: {
timestamp: { $ifNull: ["$timestamp", "$day"] },
average: "$$REMOVE",
energy: { $ifNull: ["$energy", "$average"] },
day: "$$REMOVE",
}
}
])
注意,
$fill
还提供了线性回归函数,可能拟合得更好。