Python: Column and Row operations without using loop

I have a the below df1:

df1

Date Tickers Qty 01-01-2018 ABC 25 02-01-2018 BCD 25 02-01-2018 XYZ 31 05-01-2018 XYZ 25

and another df2 as below

df2

Date ABC BCD XYZ 01-01-2018 123 5 78 02-01-2018 125 7 79 03-01-2018 127 6 81 04-01-2018 126 7 82 05-01-2018 124 6 83

I want a resultant column in df1 which is the product of the correct column and row in df2 - getting the right ticker's rate on the given date and let the other dates have nan within df1

df1

df2

df1

Date df1['Product'] 01-01-2018 3075 02-01-2018 175 02-01-2018 2449 03-01-2018 nan 04-01-2018 nan 05-01-2018 2075

This seems like standard python operation, but I just am unable to achieve this without writing a loop - which is taking a very long time to execute:

I merged the above 2 tables on Date and then ran the below loop

Date

for i in range(len(df1)): try: df1['Product'][i] = df1[df1['Ticker'][i]][i] except ValueError: df['Product'][i] = np.nan

Is there any better pythonic way of achieving this and not writing this loop pls?

I would like to add that i could have more than one tickers on a given date and so using date as a pure index may not help?

– spiff
Sep 12 '18 at 6:53

2 Answers
2

Use:

df11 = df1.pivot('Date', 'Tickers','Qty') df22 = df2.set_index('Date') s = df22.mul(df11).bfill(axis=1).iloc[:, 0] print (s) Date 01-01-2018 3075.0 02-01-2018 175.0 03-01-2018 NaN 04-01-2018 NaN 05-01-2018 2075.0 Name: ABC, dtype: float64

Solution for add new column to df1:

df1

df11 = df1.pivot('Date', 'Tickers','Qty') df22 = df2.set_index('Date') df = df1.join(df22.mul(df11).stack().rename('new'), on=['Date','Tickers'], how='left') print (df) Date Tickers Qty new 0 01-01-2018 ABC 25 3075.0 1 02-01-2018 BCD 25 175.0 2 05-01-2018 XYZ 25 2075.0

EDIT:

If pairs Dates with Tickers are duplicated, solution above is not possible use.

Date

Tickers

print (df1) Date Tickers Qty 0 01-01-2018 ABC 25 1 01-01-2018 ABC 20 <-added duplicated pairs 01-01-2018 and ABC 2 02-01-2018 XYZ 31 3 02-01-2018 BCD 25 4 05-01-2018 XYZ 25 df3 = df1[['Date']].copy() #add new values to column df3['new'] = df2.set_index('Date').lookup(df1['Date'], df1['Tickers']) * df1['Qty'] #add missing values to duplicated Dates df3 = df2[['Date']].drop_duplicates().merge(df3, how='left') print (df3) Date new 0 01-01-2018 3075.0 1 01-01-2018 2460.0 2 02-01-2018 2449.0 3 02-01-2018 175.0 4 03-01-2018 NaN 5 04-01-2018 NaN 6 05-01-2018 2075.0

df11 = df1.pivot('Date', 'Tickers','Qty') gives ValueError: Index contains duplicate entries, cannot reshape

– spiff
Sep 12 '18 at 6:57

df11 = df1.pivot('Date', 'Tickers','Qty')

ValueError: Index contains duplicate entries, cannot reshape

apologies for wasting your time so far by not providing the info that there can be multiple tickers on a single date..

– spiff
Sep 12 '18 at 6:58

could you help with the updated info pls?

– spiff
Sep 12 '18 at 7:38

@spiff - Yes, sure. Give me some time.

– jezrael
Sep 12 '18 at 7:38

ah ofcourse - thank you very much!

– spiff
Sep 12 '18 at 7:39

you need to set 'Date' as index and multiply,

df1=df1.set_index('Date') df2=df2.set_index('Date') df3=(df2['ABC']*df1['Qty']).reset_index() print(df3) Date 0 0 01-01-2018 3075.0 1 02-01-2018 3125.0 2 03-01-2018 NaN 3 04-01-2018 NaN 4 05-01-2018 3100.0

thanks vm - but i think this may not work if I have more than one tickers on the same date? Edit: maybe I should have explained that in my example itself, apologies for that

– spiff
Sep 12 '18 at 6:38

Thanks for contributing an answer to Stack Overflow!

But avoid …

To learn more, see our tips on writing great answers.

Required, but never shown

By clicking "Post Your Answer", you acknowledge that you have read our updated terms of service, privacy policy and cookie policy, and that your continued use of the website is subject to these policies.

搜尋此網誌

Dfyjkt