3.6 聚合操作 🎉
...大约 2 分钟
3.6 聚合操作 🎉
通过聚合操作可以处理多个文档,并返回计算后的结果。
- 对多个文档进行分组
- 对分组的文档执行操作并返回单个结果
- 分析数据变化
聚合管道 💎
分别由多个阶段来处理文档,每个阶段的输出是下个阶段的输入,返回的是一组文档的处理结果,例如,total、average、maxmium、minimium。
测试数据
db.j3.insertMany( [
{ _id: 0, name: "Pepperoni", size: "small", price: 19,
quantity: 10, date: ISODate( "2030-03-13T08:14:30Z" ) },
{ _id: 1, name: "Pepperoni", size: "medium", price: 20,
quantity: 20, date : ISODate( "2030-03-13T09:13:24Z" ) },
{ _id: 2, name: "Pepperoni", size: "large", price: 21,
quantity: 30, date : ISODate( "2030-03-17T09:22:12Z" ) },
{ _id: 3, name: "Cheese", size: "small", price: 12,
quantity: 15, date : ISODate( "2030-03-13T11:21:39.736Z" ) },
{ _id: 4, name: "Cheese", size: "medium", price: 13,
quantity:50, date : ISODate( "2031-01-12T21:23:13.331Z" ) },
{ _id: 5, name: "Cheese", size: "large", price: 14,
quantity: 10, date : ISODate( "2031-01-12T05:08:13Z" ) },
{ _id: 6, name: "Vegan", size: "small", price: 17,
quantity: 10, date : ISODate( "2030-01-13T05:08:13Z" ) },
{ _id: 7, name: "Vegan", size: "medium", price: 18,
quantity: 10, date : ISODate( "2030-01-13T05:10:13Z" ) }
] )
计算尺寸为 medium 的订单中,每种类型的订单数量
db.j3.aggregate([
// Stage 1: 匹配size:"medium"的文档
{
$match: { size: "medium" }
},
// Stage 2: 根据name统计过滤后的文档,并把"quantity"值相加
{
$group: { _id: "$name", totalQuantity: { $sum: "$quantity" } }
}
])
输出结果:
[
{ _id: 'Cheese', totalQuantity: 50 },
{ _id: 'Vegan', totalQuantity: 10 },
{ _id: 'Pepperoni', totalQuantity: 20 }
]
更复杂的例子:
db.j3.aggregate([
// Stage 1: 根据日期范围过滤
{
$match:{
"date": { $gte: new ISODate( "2030-01-01" ), $lt: new ISODate( "2030-01-30" ) }
}
},
// Stage 2: 对过滤后文档以日期为条件进行分组并计算
{
$group:{
_id: { $dateToString: { format: "%Y-%m-%d", date: "$date" } },
totalOrderValue: { $sum: { $multiply: [ "$price", "$quantity" ] } },
averageOrderQuantity: { $avg: "$quantity" }
}
},
// Stage 3: 按照订单价值倒序排列文档
{
$sort: { totalOrderValue: -1 }
}
])
输出结果:
[
{ _id: '2030-01-13', totalOrderValue: 350, averageOrderQuantity:10 }
]
统计集合中文档数量
db.j3.count()
根据指定的字段进行过滤,去掉重复的文档
db.j3.distinct("name")
聚合管道顺序优化 💎
聚合管道在执行的过程中有一个优化的阶段,以提高性能。
$addFields: {
maxTime: { $max: "$times" },
minTime: { $min: "$times" }
} },
{ $project: {
_id: 1, name: 1, times: 1, maxTime: 1, minTime: 1,
avgTime: { $avg: ["$maxTime", "$minTime"] }
} },
{ $match: {
name: "Joe Schmoe",
maxTime: { $lt: 20 },
minTime: { $gt: 5 },
avgTime: { $gt: 7 }
} }
优化思路:优化器把$match
阶段分成了 4 个独立的过滤器,尽可能把过滤器放在 $project
操作前面,优化后的聚合管道如下:
{ $match: { name: "Joe Schmoe" } },
{ $addFields: {
maxTime: { $max: "$times" },
minTime: { $min: "$times" }
} },
{ $match: { maxTime: { $lt: 20 }, minTime: { $gt: 5 } } },
{ $project: {
_id: 1, name: 1, times: 1, maxTime: 1, minTime: 1,
avgTime: { $avg: ["$maxTime", "$minTime"] }
} },
{ $match: { avgTime: { $gt: 7 } } }
再例如:
{ $sort: { age : -1 } },
{ $match: { status: 'A' } }
优化后:
{ $match: { status: 'A' } },
{ $sort: { age : -1 } }
限制事项
- 回结果集不能超过 16M 字节
- 单个管道中的 stage 数量不能超过 1000 个(MongoDB 5.0)
Powered by Waline v2.15.7