Capped / Constrained Weights in Python Pandas

I have a dataframe of weights in which I want to constrain the maximum weight of any one element to 30%. However, capping makes the sum of the weights less than 1, so the weights of all other elements should be uniformly increased and then repeatedly capped at 30% until the sum of all weights is 1.


If my data is in a pandas data frame, how can I do this efficiently?
Note: in reality I have about 20 elements which I want to cap at 10%, so there is much more processing involved. I also intend to run this step thousands of times.

Thank you,
CWSE

2 Answers

Here's one vectorised solution. The idea is to calculate an adjustment and distribute it proportionately among the non-capped values.

    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'Elements': list('ABCDE'),
                       'Uncon': [0.53, 0.34, 0.06, 0.03, 0.03]})

    # Cap every weight at 30%
    df['Con'] = np.minimum(0.30, df['Uncon'])

    # Rows that were not capped
    nonmax = df['Con'].ne(0.30)

    # Distribute the shortfall proportionately among the non-capped rows
    adj = (1 - df['Con'].sum()) * df['Uncon'].loc[nonmax] / df['Uncon'].loc[nonmax].sum()
    df['Con'] = df['Con'].mask(nonmax, df['Uncon'] + adj)

    print(df)

      Elements  Uncon  Con
    0        A   0.53  0.3
    1        B   0.34  0.3
    2        C   0.06  0.2
    3        D   0.03  0.1
    4        E   0.03  0.1

Thank you, jpp. However, this was a simple example. What about when the adjusted values breach the constraint? We need to iteratively re-solve the weights until they all sum to 1 and no individual value is > 0.3. If there are many weights and the cap is lower, the smaller weights will often breach the cap on each iteration. How can I do this efficiently?
– cwse
Aug 21 at 21:52

@cwse, Sorry, I don't understand the problem. The above answers your original question as stated, right? If you have another, please ask a new question with a more appropriate example.
– jpp
Aug 21 at 23:16

jpp, please see my draft answer ^
– cwse
Aug 22 at 4:07

@jpp

The following is a rough approach, modified from your answer, that iteratively solves and re-caps. It doesn't produce a perfect answer, though, and the while loop makes it inefficient. Any ideas on how this could be improved?

    import pandas as pd
    import numpy as np

    cap = 0.1
    df = pd.DataFrame({'Elements': list('ABCDEFGHIJKLMNO'),
                       'Values': [17, 11, 7, 5, 4, 4, 3, 2, 1.5, 1, 1, 1, 0.8, 0.6, 0.5]})
    df['Uncon'] = df['Values'] / df['Values'].sum()
    df['Con'] = np.minimum(cap, df['Uncon'])

    # Loop while the weights sum to less than 1 or any weight exceeds the cap
    while df['Con'].sum() < 1 or len(df['Con'][df['Con'] > cap]) >= 1:
        df['Con'] = np.minimum(cap, df['Con'])   # re-cap
        nonmax = df['Con'].ne(cap)               # rows not sitting at the cap
        adj = (1 - df['Con'].sum()) * df['Con'].loc[nonmax] / df['Uncon'].loc[nonmax].sum()
        df['Con'] = df['Con'].mask(nonmax, df['Con'] + adj)

    print(df)
    print(df['Con'].sum())
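One possible way to tighten the loop above is sketched below. This is not from either answer in the thread: `cap_weights` is a hypothetical helper name, and the sketch assumes strictly positive weights and a feasible cap (`cap * n >= 1`). Instead of comparing the running sum against 1, it pins every weight that has ever breached the cap and rescales only the remaining "free" weights to fill the leftover budget. Each pass pins at least one more weight, so it needs at most `n` passes.

```python
import numpy as np
import pandas as pd


def cap_weights(weights, cap):
    """Cap weights at `cap`, sharing the excess proportionally (hypothetical helper)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalise; this also makes a copy
    if cap * w.size < 1 - 1e-12:
        raise ValueError("cap is too low for the weights to sum to 1")

    capped = np.zeros(w.size, dtype=bool)
    for _ in range(w.size):              # at most one pass per element
        over = w > cap
        if not over.any():
            break
        capped |= over                   # once capped, a weight stays at the cap
        w[capped] = cap
        free = ~capped
        if not free.any():
            break
        # Rescale the free weights so that the total is 1 again
        # (assumes the free weights are not all zero)
        w[free] *= (1 - cap * capped.sum()) / w[free].sum()
    return w


cap = 0.1
df = pd.DataFrame({'Elements': list('ABCDEFGHIJKLMNO'),
                   'Values': [17, 11, 7, 5, 4, 4, 3, 2, 1.5, 1, 1, 1, 0.8, 0.6, 0.5]})
df['Con'] = cap_weights(df['Values'], cap)
print(df)
print(df['Con'].sum())
```

Because the helper works on a plain NumPy array, the per-pass cost is small, which may matter if the step is run thousands of times. Whether proportional redistribution is the intended rule (as opposed to adding an equal amount to each free weight) is an assumption carried over from the first answer.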
