How to select the first child of each element in list in Beautiful Soup




I want to get the text from the first inner div in each outer div


<body>
<div class="outer">
<div class="inner">text1</div>
<div class="inner">text2</div>
<div class="inner">text3</div>
</div>
<div class="outer">
<div class="inner">text4</div>
</div>
<div class="outer">
<div class="inner">text5</div>
<div class="inner">text6</div>
</div>
</body>



This means retrieving text1, text4, and text5.



I've experimented with the code below, but can't get it to work:


outers = soup.select('body > .outer')
for outer in outers:
    inners = outer.select_one('.inner')
    for inner in inners:
        print(inner.text)



Thanks in advance.




2 Answers



Maybe this works:


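from bs4 import BeautifulSoup

soup = BeautifulSoup(text, 'html.parser')  # text holds the HTML from the question
for outer in soup.find_all('div', class_='outer'):
    inners = outer.find('div', class_='inner')  # first matching Tag only
    # Iterating over a Tag yields its children, here the text node
    for inner in inners:
        print(inner)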


# Output as:
# text1
# text4
# text5



Or you can do it this way:


soup = BeautifulSoup(text, 'html.parser')
for outer in soup.find_all('div', class_='outer'):
    inners = outer.find('div', class_='inner')
    print(inners.get_text())
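The same idea can be written with the CSS selectors from the question. This is only a sketch: the variable names `html` and `first_texts` are made up for illustration, and the `None` check is an added precaution for outer divs with no inner div.

```python
from bs4 import BeautifulSoup

# The HTML from the question, stored in a hypothetical variable
html = """
<body>
<div class="outer">
<div class="inner">text1</div>
<div class="inner">text2</div>
<div class="inner">text3</div>
</div>
<div class="outer">
<div class="inner">text4</div>
</div>
<div class="outer">
<div class="inner">text5</div>
<div class="inner">text6</div>
</div>
</body>
"""

soup = BeautifulSoup(html, "html.parser")

first_texts = []
for outer in soup.select("body > .outer"):
    # select_one() returns a single Tag (or None), so no inner loop is needed
    inner = outer.select_one(".inner")
    if inner is not None:
        first_texts.append(inner.get_text())

print(first_texts)  # ['text1', 'text4', 'text5']
```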





This gives TypeError: 'NoneType' object is not iterable after the for inner in inners: line
– tokism
Aug 27 at 14:00








@tokism Hope this works in your case
– utks009
Aug 27 at 14:06





The edited solution works great! I was trying to iterate over the non-iterable inners as it is only a single element. Thanks
– tokism
Aug 27 at 14:26
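For reference, here is a small sketch of how `find` behaves in the two cases discussed in these comments. The two-div HTML string and the variable names are assumptions made up for the example, not the asker's data. The point it illustrates: `find` returns one `Tag` or `None`; iterating a `Tag` walks its children, while iterating `None` raises the `TypeError` quoted above.

```python
from bs4 import BeautifulSoup

# Hypothetical input: the second outer div has no inner div
html = (
    '<div class="outer"><div class="inner">text1</div></div>'
    '<div class="outer"></div>'
)
soup = BeautifulSoup(html, "html.parser")
outers = soup.find_all("div", class_="outer")

found = outers[0].find("div", class_="inner")    # a single Tag
missing = outers[1].find("div", class_="inner")  # None, nothing matched

# Iterating a Tag yields its children (here one text node)
children = [str(child) for child in found]

# Iterating None raises the TypeError from the comment above
try:
    for inner in missing:
        pass
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

print(children)
print(error_message)
```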





Welcome to StackOverflow!



This code worked for me:


[div.find("div", {"class": "inner"}) for div in soup.findAll("div", {"class": "outer"})]



That is, a one-line version of the same thing.





I can't figure out how to print each section of text. Any ideas?
– tokism
Aug 27 at 14:03





Do you mean you want to print the text in the div? If not, have you tried putting my code in a print() command? If yes, maybe try div.findAll("div", {"class": "inner"})[0].text instead of div.findAll("div", {"class": "inner"})[0].
– Josh Friedlander
Aug 27 at 14:05
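Along the lines of that suggestion, the one-liner can collect the text of each first inner div directly. A minimal sketch, assuming the question's HTML (compacted here into a hypothetical `html` variable) and assuming every outer div has at least one inner div; otherwise `.text` would be called on `None`.

```python
from bs4 import BeautifulSoup

# Same structure as the question's HTML, written compactly
html = (
    '<body>'
    '<div class="outer"><div class="inner">text1</div>'
    '<div class="inner">text2</div><div class="inner">text3</div></div>'
    '<div class="outer"><div class="inner">text4</div></div>'
    '<div class="outer"><div class="inner">text5</div>'
    '<div class="inner">text6</div></div>'
    '</body>'
)
soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching inner div; .text extracts its string
texts = [
    outer.find("div", {"class": "inner"}).text
    for outer in soup.find_all("div", {"class": "outer"})
]

print("\n".join(texts))
```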







@JoshFriedlander, find_all(...)[0] is equivalent to find(...).
– Keyur Potdar
Aug 27 at 14:08







Thanks Keyur! I'll update my answer.
– Josh Friedlander
Aug 27 at 14:09





Also, you don't have to convert it to a list. find_all returns a list.
– Keyur Potdar
Aug 27 at 14:09
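The two points in these comments can be checked with a short sketch. The HTML string and variable names are invented for the example. One caveat worth noting: `find_all(...)[0]` and `find(...)` agree when there is a match, but they differ when there is none, since `find` returns `None` while indexing an empty result raises `IndexError`.

```python
from bs4 import BeautifulSoup

# Hypothetical input: one outer div with two inner divs, one empty outer div
html = (
    '<div class="outer"><div class="inner">text1</div>'
    '<div class="inner">text2</div></div>'
    '<div class="outer"></div>'
)
soup = BeautifulSoup(html, "html.parser")
full, empty = soup.find_all("div", class_="outer")

matches = full.find_all("div", class_="inner")
is_list = isinstance(matches, list)  # the result is a list subclass
same_first = matches[0] == full.find("div", class_="inner")

# With no match the two forms behave differently
no_match = empty.find("div", class_="inner")  # None
try:
    empty.find_all("div", class_="inner")[0]
    raised_index_error = False
except IndexError:
    raised_index_error = True

print(is_list, same_first, no_match, raised_index_error)
```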







