Django accumulating data before storing to DB

  django, gunicorn, python

In my Django website I need a tracker that measures how long a user performs a particular activity each day.
For this purpose, the browser sends an AJAX request to the server every 30 seconds while the user is performing the activity. On receiving such a request, the server increments that user’s activity counter by 30 seconds. These counters are stored in the database.
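
Simplified, the straightforward version of this would look roughly like the sketch below (ActivityCounter, its fields, and track_activity are placeholder names, not my real code):

    # views.py -- simplified sketch; ActivityCounter and its fields
    # are placeholder names.
    from django.db.models import F
    from django.http import HttpResponse
    from django.utils import timezone
    from django.views.decorators.http import require_POST

    from .models import ActivityCounter

    @require_POST
    def track_activity(request):
        # Make sure today's row exists, then add 30 seconds atomically
        # with an F() expression to avoid read-modify-write races.
        counter, _ = ActivityCounter.objects.get_or_create(
            user=request.user, date=timezone.now().date()
        )
        ActivityCounter.objects.filter(pk=counter.pk).update(
            seconds=F("seconds") + 30
        )
        return HttpResponse(status=204)

That is one UPDATE (plus an occasional INSERT) per user every 30 seconds, which is exactly the load I’d like to avoid.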

I thought it would be quite inefficient to update the database every 30 seconds for every website user. So my idea was to accumulate all tracked time in a global dictionary of {user_id: seconds}: when the AJAX activity request is received, I could just find the user_id in the dictionary and increase the corresponding seconds value.
This dictionary could then be flushed to the database every 10 minutes.
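
Roughly what I have in mind (again just a sketch; module and model names are placeholders, and the lock is there because gunicorn workers can also run multiple threads):

    # tracker.py -- sketch of the accumulate-then-flush idea.
    import threading
    from collections import defaultdict

    from django.db.models import F
    from django.utils import timezone

    from .models import ActivityCounter

    _lock = threading.Lock()
    _pending = defaultdict(int)  # {user_id: seconds not yet in the DB}

    def add_seconds(user_id, seconds=30):
        # Called from the AJAX view instead of hitting the database.
        with _lock:
            _pending[user_id] += seconds

    def flush():
        # Swap the accumulated data out under the lock, then write it
        # to the database in one pass; meant to run every 10 minutes.
        # (Assumes today's rows already exist; a real version would
        # get_or_create them first.)
        with _lock:
            snapshot = dict(_pending)
            _pending.clear()
        today = timezone.now().date()
        for user_id, seconds in snapshot.items():
            ActivityCounter.objects.filter(user_id=user_id, date=today).update(
                seconds=F("seconds") + seconds
            )

How flush() actually gets called every 10 minutes (a timer thread, a periodic task) is part of what I’m unsure about.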

I realise that this scheme is not super reliable, and if the server crashes I will lose up to the last 10 minutes of activity for all users; I’m OK with that.

What bothers me is:

  1. As far as I understand, Django running under gunicorn can have many worker processes, so I won’t be able to have a single global dictionary. I can’t even be sure that the same user will always be handled by the same process.

  2. I’d like to flush the dictionary to the database when the worker process is about to be stopped, which doesn’t seem to be a trivial task (see the sketch below).
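
For the second point, the nearest thing I can see is gunicorn’s server hooks. Something like this in gunicorn.conf.py might work for clean shutdowns (a sketch; worker_exit is a real gunicorn hook, while the tracker module is my placeholder from above), but it obviously wouldn’t run on a crash:

    # gunicorn.conf.py -- sketch only.

    def worker_exit(server, worker):
        # gunicorn calls this in the worker process just after it has
        # exited, so the worker's in-memory dictionary is still reachable.
        from myapp import tracker  # placeholder import path
        tracker.flush()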

The two problems above and the lack of an obvious solution look like a hint that I’m doing something wrong.
Maybe I’m over-optimizing and it’s fine to write to the database every 30 seconds for each user, or maybe there is a well-known practice for caching data in memory before submitting it to the database that I’m missing?

Could someone point me in the right direction, please?

Source: Python Questions
