Pandas Replace 1st Result in a DataFrame
Let's say I have a dataframe that looks like this:
import pandas as pd
df4 = pd.DataFrame({'Q': ['apple', 'apple', 'orange', 'Apple', 'orange'], 'R': ['a.txt', 'a.txt', 'a.txt', 'b.txt', 'b.txt']})
>>> df4
        Q      R
0   apple  a.txt
1   apple  a.txt
2  orange  a.txt
3   Apple  b.txt
4  orange  b.txt
What I would like to output is this:
           Q      R
0  breakfast  a.txt
1      apple  a.txt
2     orange  a.txt
3  breakfast  b.txt
4     orange  b.txt
In other words, I want to search every row of the dataframe, case-insensitively, find the first occurrence of certain words (in this case, apple), and replace it with another word.
Is there a way to do this?
2 Answers
Here's a vectorised solution with groupby and idxmin:
v = df4.Q.str.lower().eq('apple')   # case-insensitive match mask
v2 = (~v).cumsum().where(v)         # run id for matching rows, NaN elsewhere
df4.loc[v2.groupby(v2).idxmin().values, 'Q'] = 'breakfast'
df4
           Q      R
0  breakfast  a.txt
1      apple  a.txt
2     orange  a.txt
3  breakfast  b.txt
4     orange  b.txt
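To see why this works, it can help to print the intermediate values. The sketch below rebuilds the frame from the question and walks through each step; the name first_rows is my own and not part of the answer.

```python
import pandas as pd

df4 = pd.DataFrame({'Q': ['apple', 'apple', 'orange', 'Apple', 'orange'],
                    'R': ['a.txt', 'a.txt', 'a.txt', 'b.txt', 'b.txt']})

# Step 1: case-insensitive boolean mask of the matching rows.
v = df4.Q.str.lower().eq('apple')
print(v.tolist())           # [True, True, False, True, False]

# Step 2: count the non-matching rows seen so far. The count stays
# constant inside a consecutive run of matches, so it acts as a run id.
# where(v) blanks out the non-matching rows with NaN.
v2 = (~v).cumsum().where(v)
print(v2.tolist())          # [0.0, 0.0, nan, 1.0, nan]

# Step 3: group by run id (NaN keys are dropped) and take the index
# label of the first row in each run.
first_rows = v2.groupby(v2).idxmin()
print(first_rows.tolist())  # [0, 3]
```

Rows 0 and 3 are the ones that get overwritten with 'breakfast'; row 1 belongs to the same run as row 0, so it is left alone.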
@LunchBox Change eq to str.contains?
– coldspeed
Aug 27 at 4:45
That did not work, that was the first thing I tried :(
– LunchBox
Aug 27 at 4:46
"Series object has no attribute contains"
– LunchBox
Aug 27 at 4:46
This is perfect if I could make it work with contains
– LunchBox
Aug 27 at 4:49
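The error quoted above is what you would expect when contains is called directly on a Series: the string methods live under the .str accessor. Below is a minimal sketch of a substring version, assuming the goal is to match any cell that contains "apple" regardless of case. The 'apple pie' value is made-up data added only to show the substring match.

```python
import pandas as pd

# Same shape as the question's frame; 'apple pie' is hypothetical data.
df4 = pd.DataFrame({'Q': ['apple pie', 'apple', 'orange', 'Apple', 'orange'],
                    'R': ['a.txt', 'a.txt', 'a.txt', 'b.txt', 'b.txt']})

# str.contains treats the pattern as a regex by default;
# case=False makes the match case-insensitive.
v = df4.Q.str.contains('apple', case=False)

# The rest is unchanged from the answer: label each consecutive run of
# matches, then overwrite the first row of every run.
v2 = (~v).cumsum().where(v)
df4.loc[v2.groupby(v2).idxmin().values, 'Q'] = 'breakfast'

print(df4.Q.tolist())
# ['breakfast', 'apple', 'orange', 'breakfast', 'orange']
```

If the column can hold missing values, passing na=False to str.contains keeps the mask strictly boolean so that ~v still works.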
I just really wanted to answer this question.
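def swap_first(s):
    swap = True          # True while the next match should be replaced
    luk4 = {'apple'}     # words to look for, lower-cased
    for x in s:
        match = x.lower() in luk4
        if match and swap:
            yield 'breakfast'
            swap = False     # leave the rest of this run untouched
        else:
            yield x
            if not match:
                swap = True  # a non-matching row ends the run
df4.assign(Q=[*swap_first(df4.Q)])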
           Q      R
0  breakfast  a.txt
1      apple  a.txt
2     orange  a.txt
3  breakfast  b.txt
4     orange  b.txt
This is perfect. Instead of ".eq" apple, which I assume means "equals apple", I wonder if there is a way to use a regex to say "contains apple".
– LunchBox
Aug 27 at 4:33