Python: Calculate normalized weights with constraints that no weight is greater than 1/sqrt(N)

  dataframe, normalization, numpy, pandas, python

I have the following dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [53, 54.06, 53.654, 55.2], 'b': [np.nan, 54.1121, 53.98, 55.12], 'c': [np.nan, 2, 53.322, 54.99],
                   'd': [np.nan, 53.1, 53.212, 55.002], 'e': [np.nan, 53, 53.2, 55.021], 'f': [np.nan, 53.11, 53.120, 55.3]})

df:
            a        b       c       d       e      f
    0  53.000      NaN     NaN     NaN     NaN    NaN
    1  54.060  54.1121   2.000  53.100  53.000  53.11
    2  53.654  53.9800  53.322  53.212  53.200  53.12
    3  55.200  55.1200  54.990  55.002  55.021  55.30

I have N=5. I want to normalize the dataframe using

df2 = df.div(df.sum(axis=1), axis=0)

and then I want to apply the constraint that "If any weight is greater than 1/sqrt(N), it should be clipped to 1/sqrt(N) and the remaining weight is distributed in a pro-rata fashion based on their relative weights."
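One possible reading of this constraint is an iterative "clip and rescale": clip every weight above the cap, rescale the still-uncapped weights so the total is 1 again, and repeat until nothing exceeds the cap. The sketch below is only an illustration under several assumptions: the function name `cap_weights_prorata` is hypothetical, the input is a single row of positive weights that already sums to 1 (NaN entries are ignored), and a row with too few weights to reach a total of 1 under the cap is treated as an error.

```python
import numpy as np


def cap_weights_prorata(weights, cap, tol=1e-12):
    """Clip weights at `cap`; spread the excess pro rata over uncapped weights.

    Assumes positive weights that sum to 1 (NaNs are ignored and kept as NaN).
    """
    w = np.asarray(weights, dtype=float).copy()
    valid = ~np.isnan(w)

    # With k valid weights the largest reachable total is k * cap.
    if valid.sum() * cap < 1.0 - tol:
        raise ValueError("too few weights to sum to 1 under this cap")

    capped = np.zeros(w.shape, dtype=bool)
    while True:
        over = valid & ~capped & (w > cap + tol)
        if not over.any():
            return w
        capped |= over
        w[capped] = cap
        free = valid & ~capped
        if free.any():
            # Rescaling keeps the relative sizes of the uncapped weights,
            # so equal weights receive equal shares of the excess.
            remaining = 1.0 - cap * capped.sum()
            w[free] *= remaining / w[free].sum()


cap = 1 / np.sqrt(5)
print(cap_weights_prorata([0.6, 0.2, 0.2], cap))   # the two 0.2 weights stay equal
print(cap_weights_prorata([0.5, 0.42, 0.04, 0.04], cap))  # needs a second pass
```

Because the rescaling step can push another weight over the cap, the loop is needed; a single clip followed by one rescale is not enough in general. A row such as row 0 of the frame above, which has a single weight of 1.0, cannot satisfy the cap at all, which is why this sketch raises instead of returning a result.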

For example, with N=5, 1/sqrt(N) ≈ 0.4472. If a weight is 0.6, it would be clipped to 0.4472, and the excess of 0.6-0.4472 = 0.1528 would be passed to the next largest weight, as long as that weight does not exceed 0.4472. If the next largest weight is 0.4, we would add 0.0472 of the 0.1528 to it and still have 0.1056 left to distribute.
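The worked example describes a "waterfall" rather than a strict pro-rata split: the excess moves to the next largest weight, fills it up to the cap, and whatever is left moves on. A minimal sketch of that reading follows; the name `cap_weights_waterfall` is hypothetical, and the input is assumed to be one row of weights without NaNs.

```python
import numpy as np


def cap_weights_waterfall(weights, cap):
    """Clip at `cap`, passing each excess on to the next largest weight.

    Returns the capped weights and any excess that could not be placed.
    Assumes a 1-D array of weights without NaNs.
    """
    w = np.asarray(weights, dtype=float).copy()
    excess = 0.0
    # Visit weights from largest to smallest, carrying the excess along.
    for i in np.argsort(-w):
        w[i] += excess
        excess = max(w[i] - cap, 0.0)
        w[i] = min(w[i], cap)
    return w, excess


cap = 1 / np.sqrt(5)
capped, leftover = cap_weights_waterfall([0.6, 0.4], cap)
print(capped, leftover)  # both weights end at the cap; leftover is about 0.1056

capped, leftover = cap_weights_waterfall([0.6, 0.3, 0.1], cap)
print(capped, leftover)  # the smallest weight absorbs what the second cannot
```

Note that this reading does not split the excess evenly between tied weights, since it fills them one at a time, so it gives different results from a pro-rata redistribution whenever more than one uncapped weight remains.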

Also, if the remaining weights are equal to one another, then the clipped weight should be evenly distributed to them. How can this be done?

Thank you.

Source: Python Questions
