Cause: a MongoDB document is limited to 16MB by default, and the result of an aggregation is a single BSON document, so once the result grows past that size the server reports a memory error such as:
exceeded memory limit for $group, but didn't allow external sort.
Allowing the aggregation to spill to disk works around the memory limit. For example:
db.flowlog.aggregate([{$group:{_id:"$_id"}}], {allowDiskUse: true})
Java code snippet (Spring Data MongoDB):
AggregationOptions options = new AggregationOptions.Builder().allowDiskUse(true).build();
Aggregation agg = Aggregation.newAggregation(/* pipeline stages */).withOptions(options);
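For orientation, a minimal sketch of how these options are passed through a complete MongoTemplate call; the collection name "flowlog" follows the shell example above, while the injected mongoTemplate and the grouping key are assumptions:
import org.springframework.data.mongodb.core.aggregation.Aggregation;
import org.springframework.data.mongodb.core.aggregation.AggregationOptions;
import org.springframework.data.mongodb.core.aggregation.AggregationResults;
import com.mongodb.BasicDBObject;

// build the pipeline with disk use enabled, then run it against "flowlog"
AggregationOptions options = new AggregationOptions.Builder().allowDiskUse(true).build();
Aggregation agg = Aggregation.newAggregation(Aggregation.group("_id")).withOptions(options);
AggregationResults<BasicDBObject> results =
        mongoTemplate.aggregate(agg, "flowlog", BasicDBObject.class);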
However, even with allowDiskUse enabled, the aggregation still fails once the result set itself exceeds 16MB.
Use an aggregation of the following form instead:
Aggregation agg = Aggregation.newAggregation(
        Aggregation.group(field1, field2, field3)
                .sum(field4).as("sampleField1")
                .sum(field5).as("sampleField2"),
        // project the aliases produced by the group stage
        Aggregation.project("sampleField1", "sampleField2"),
        // write the results into the "test" collection instead of returning them
        new AggregationOperation() {
            @Override
            public DBObject toDBObject(AggregationOperationContext context) {
                return new BasicDBObject("$out", "test");
            }
        }).withOptions(options);
mongo.aggregate(agg, sourceCollection, Test.class);
The key part is the final $out stage: built this way, the aggregation writes its result straight into the "test" collection rather than returning it to the client, so the 16MB result limit no longer applies.
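Because $out writes the documents into the "test" collection instead of returning them, the aggregation output is read back afterwards with an ordinary query; a minimal sketch, assuming the same mongo template and Test class as above:
// the grouped documents now live in the "test" collection produced by $out;
// read them back with a normal query
List<Test> grouped = mongo.findAll(Test.class, "test");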
To add a constant field while aggregating, the following form can be used:
Aggregation agg = Aggregation.newAggregation(
        Aggregation.group("buildingId", OnofflineUserHistoryField.MAC, StalogField.UTC_CODE)
                .sum(OnofflineUserHistoryField.WIFI_UP_DOWN).as(OnofflineUserHistoryField.WIFI_UP_DOWN)
                .sum(OnofflineUserHistoryField.ACTIVE_TIME).as(OnofflineUserHistoryField.ACTIVE_TIME),
        Aggregation.project("mac", "buildingId", "utcCode",
                OnofflineUserHistoryField.ACTIVE_TIME, OnofflineUserHistoryField.WIFI_UP_DOWN)
                // constant field via $cond: both branches return 20161114,
                // so the expression always evaluates to that constant
                .and(new AggregationExpression() {
                    @Override
                    public DBObject toDbObject(AggregationOperationContext context) {
                        return new BasicDBObject("$cond", new Object[]{
                                new BasicDBObject("$eq", new Object[]{"$tenantId", 0}),
                                20161114,
                                20161114});
                    }
                }).as("day").andExclude("_id"),
        // write the day summaries into the "dayStaInfoTmp" collection
        new AggregationOperation() {
            @Override
            public DBObject toDBObject(AggregationOperationContext context) {
                return new BasicDBObject("$out", "dayStaInfoTmp");
            }
        }).withOptions(options);
Alternatively, the constant can be produced with a single-operand $add expression in place of the $cond expression above:
                .and(new AggregationExpression() {
                    @Override
                    public DBObject toDbObject(AggregationOperationContext context) {
                        // $add with a single operand evaluates to that operand, i.e. the constant 20141114
                        return new BasicDBObject("$add", new Object[]{20141114});
                    }
                }).as("day").andExclude("_id"),
The $cond and single-operand $add expressions are the two ways of adding a constant to an aggregation shown here; I have not found a more convenient way to do it.
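The second aggregation is executed the same way as the first; a usage sketch, where the source collection name "onofflineUserHistory" is an assumption:
// run the pipeline; $out sends every result document to "dayStaInfoTmp",
// so the output collection is inspected instead of the returned results
mongo.aggregate(agg, "onofflineUserHistory", BasicDBObject.class);
long written = mongo.getCollection("dayStaInfoTmp").count();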