Category : aggregation-framework

What is the most efficient way to group items in a MongoDB collection and set their count as a new field? I want to find the counts of documents that share a field value (let's say the same brand) and set that count on those documents as a new field. I find the counts with the following: result = collection.aggregate( [ { ..
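One possible approach, sketched under assumptions: count the documents per brand with `$group`, then write each count back with `update_many`. The field names `brand` and `brand_count` and both helper functions are hypothetical, since the excerpt's own pipeline is cut off. The helpers only build plain dicts, so they run without a server.

```python
def brand_count_pipeline(field="brand"):
    """Build a pipeline that counts documents per distinct value of `field`."""
    return [{"$group": {"_id": f"${field}", "count": {"$sum": 1}}}]


def count_updates(group_rows, field="brand", target="brand_count"):
    """Turn $group output rows into (filter, update) pairs for update_many.

    `target` is a hypothetical name for the new field that stores the count.
    """
    return [
        ({field: row["_id"]}, {"$set": {target: row["count"]}})
        for row in group_rows
    ]


# Stand-in for rows that collection.aggregate(brand_count_pipeline()) might yield.
sample_rows = [{"_id": "acme", "count": 3}, {"_id": "zen", "count": 1}]
updates = count_updates(sample_rows)
```

With a live pymongo collection, the pairs could be applied as `for flt, upd in count_updates(collection.aggregate(brand_count_pipeline())): collection.update_many(flt, upd)`, which issues one update per distinct brand.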


I need to find the count of matching groups (like the count of products with the same colour or the same price), and I am trying to do the aggregation using '$group'. result = collection.aggregate( [ { "$group" : {"_id": group_aggregation_format, "count": {"$sum": 1}} } ]) print(result) group_aggregation_format is a dict like {'title': '$title', 'colour': '$colour'}. Then I get this error | INFO:dill:# ..
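As a rough sketch of the compound `_id` described above, the group key can be built from a list of field names. The helper names are hypothetical, and because the error message in the excerpt is cut off, this does not attempt to diagnose it.

```python
def group_key(fields):
    """Map each field name to its '$field' reference for use as a $group _id."""
    return {name: f"${name}" for name in fields}


def count_by_fields_pipeline(fields):
    """Build a pipeline that counts documents sharing the same values in `fields`."""
    return [{"$group": {"_id": group_key(fields), "count": {"$sum": 1}}}]


pipeline = count_by_fields_pipeline(["title", "colour"])
```

Note that `collection.aggregate(pipeline)` returns a cursor, so `print(result)` shows the cursor object; iterating it (for example `for row in result: print(row)`) shows the grouped rows.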


I am having trouble understanding aggregations and their results with pymongo. I'm trying to send an aggregation pipeline, and it seems that I get all the expected data, but somehow I can't use it. This is my script: import pymongo, json from bson.json_util import dumps client = pymongo.MongoClient('URL') results = client['analytics']['metrics'].aggregate([ { '$match': { 'name': 'metricName' ..


I have a MongoDB database containing word frequencies at the document level, as shown below. I have about 175k documents in the same format, totaling about 2.5GB. { "_id": xxx, "title": "zzz", "vectors": { "word1": 28, "word2": 22, "word3": 12, "word4": 7, "word5": 4 } } Now I want to iterate through all documents, calculate ..


I have a MongoDB collection with documents like: {'city': 'NYC', 'value': 'blue'}, {'city': 'NYC', 'value': 'red'}, {'city': 'Boston', 'value': 'blue'}, {'city': 'Boston', 'value': 'green'} I want to aggregate distinct values of city with a list of distinct values of value, like: {'city': 'NYC', 'values': ['blue', 'red']}, {'city': 'Boston', 'values': ['blue', 'green']} How can I do ..
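A minimal sketch of one way to get that shape: group on `city` and collect `value` with the `$addToSet` accumulator, then rename `_id` back to `city`. The function names are hypothetical, and the in-memory helper is only a reference for the expected result, not something MongoDB runs. `$addToSet` does not guarantee element order, so the in-memory version sorts each list to stay deterministic.

```python
def distinct_values_pipeline(key="city", value="value", out="values"):
    """Build a pipeline that lists the distinct `value`s seen for each `key`."""
    return [
        {"$group": {"_id": f"${key}", out: {"$addToSet": f"${value}"}}},
        {"$project": {"_id": 0, key: "$_id", out: 1}},
    ]


def group_distinct_in_memory(docs, key="city", value="value", out="values"):
    """Pure-Python reference for the same grouping, with each list sorted."""
    grouped = {}
    for doc in docs:
        grouped.setdefault(doc[key], set()).add(doc[value])
    return [{key: k, out: sorted(v)} for k, v in grouped.items()]


docs = [
    {"city": "NYC", "value": "blue"},
    {"city": "NYC", "value": "red"},
    {"city": "Boston", "value": "blue"},
    {"city": "Boston", "value": "green"},
]
reference = group_distinct_in_memory(docs)
```

With pymongo, the pipeline would be passed as `collection.aggregate(distinct_values_pipeline())`; a `$sort` stage could be appended if a stable group order matters.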


I have a MongoDB collection with documents like: {'date': 2020-01-01T00:00:00.000+00:00, 'population': 110, 'state': 'NY', 'start': 2020-01-01T00:00:00.000+00:00, 'end': 2021-05-26T00:00:00.000+00:00}, {'date': 2020-01-02T00:00:00.000+00:00, 'population': 112, 'state': 'NY', 'start': 2020-01-01T00:00:00.000+00:00, 'end': 2021-05-26T00:00:00.000+00:00}, … {'date': 2020-03-15T00:00:00.000+00:00, 'population': 119, 'state': 'NY', 'start': 2020-01-01T00:00:00.000+00:00, 'end': 2021-05-26T00:00:00.000+00:00}, {'date': 2020-03-16T00:00:00.000+00:00, 'population': 131, 'state': 'NY', 'start': 2020-01-01T00:00:00.000+00:00, 'end': 2021-05-26T00:00:00.000+00:00}, {'date': 2020-03-17T00:00:00.000+00:00, 'population': 138, 'state': 'NY', ..


I have a MongoDB collection with optional raw fields Age, YOB and DOB, and standardized versions of any of these fields when present: {'_id': 123, 'Age': '30.0', 'DOB': '1990-01-01', 'Standardized_Age_from_Age_field': 30, 'Standardized_Age_from_DOB_field': 30}, {'_id': 123, 'Age': '21', 'DOB': 'xx/xx/xxxx', 'Standardized_Age_from_Age_field': 21, 'Standardized_Age_from_DOB_field': null}, {'_id': 123, 'Age': '19', 'YOB': '119', 'Standardized_Age_from_Age_field': 19, 'Standardized_Age_from_YOB_field': 119}, … I ..
