numpy.random.choice with percentages not working in practice

  flask, numpy, numpy-random, python, random

I’m running Python code similar to the following:

import numpy

def get_user_group(user, groups):
    # Assign a group only once; later calls reuse the stored assignment.
    if not user.group_id:
        user.group_id = assign(groups)
    return user.group_id

def assign(groups):
    ids = []
    percentages = []
    for group in groups:
        ids.append(group.id)
        percentages.append(group.percentage)  # e.g. .33

    # Weighted draw using the global numpy random state.
    assignment = numpy.random.choice(ids, p=percentages)
    return assignment

We are running this in production against tens of thousands of users, and I’ve noticed that the assignments do not respect the configured group percentages. For example, if our percentages are [.9, .1], we see a consistent hour-over-hour split of 80% and 20%. We’ve confirmed that the inputs to the choice function are correct, so the mismatch is between those inputs and the assignments we actually observe.

Does anyone have a clue why this could be happening? Is it because we are using the global numpy random state? Some sets of groups are split [.9, .1] while others are split [.33, .34, .33], etc. Is it possible that different sets of groups are interfering with each other?
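
To rule out interference between sets of groups, a check along these lines could be run locally. This is only a sketch: the ids, the fixed seed, and the name interleaved_split are made up for illustration, and it exercises the global numpy.random state in a single process, not our Flask setup.

```python
import collections
import numpy

numpy.random.seed(12345)  # fixed seed only so the check is repeatable

def interleaved_split(n_users=10000):
    """Alternate draws between two unrelated weight sets on the global state."""
    counts = collections.Counter()
    for _ in range(n_users):
        # First set of groups: the 90/10 split from the question (ids 1 and 2 are made up).
        counts[int(numpy.random.choice([1, 2], p=[0.9, 0.1]))] += 1
        # Second, unrelated set of groups drawn in between; its result is ignored.
        numpy.random.choice([10, 11, 12], p=[0.33, 0.34, 0.33])
    # Fraction of users that landed in group 1.
    return counts[1] / n_users

share_group_1 = interleaved_split()
print(f"share of group 1: {share_group_1:.3f}")
```

If the printed share stays close to 0.9, interleaving calls with different p values is probably not the cause by itself, and the process or deployment setup would be the next thing to look at.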

We are running this code in a Python Flask web application on a number of nodes.

Any recommendations on how to get reliable "random" weighted choice?
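
One variation we could try is a dedicated generator per process instead of the global state, since a forked worker starts with a copy of its parent's global random state. The sketch below is an assumption about what that might look like, not a confirmed fix: Group is a hypothetical stand-in for our real group objects, _rng is a made-up module-level name, and the weights are normalized defensively because numpy raises a ValueError when p does not sum to 1.

```python
import collections
import numpy

# Hypothetical stand-in for the real group objects (only id and percentage are used).
Group = collections.namedtuple("Group", ["id", "percentage"])

# One generator per process, seeded from OS entropy.
# Assumption: this module is imported after any worker fork.
_rng = numpy.random.default_rng()

def assign(groups, rng=_rng):
    """Return the id of one group, chosen with probability proportional to its percentage."""
    ids = [group.id for group in groups]
    weights = numpy.array([group.percentage for group in groups], dtype=float)
    probabilities = weights / weights.sum()  # guard against rounding in the inputs
    # Draw an index rather than an id so the original id object is returned unchanged.
    index = rng.choice(len(ids), p=probabilities)
    return ids[index]

groups = [Group(id=1, percentage=0.9), Group(id=2, percentage=0.1)]
counts = collections.Counter(assign(groups) for _ in range(10000))
print(counts)
```

Passing rng explicitly also makes it possible to hand in a seeded generator when testing the split offline.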

Source: Python Questions
