Python Beautiful Soup scrape page containing JSP/JS
I am trying to scrape the price from this page: url = https://www.renodepot.com/en/steph-round-base-shower-kit-69375118
The price is given in a span tag, but I am not able to scrape it. The simple code I am using is:
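from requests import get
from bs4 import BeautifulSoup

url = 'https://www.renodepot.com/en/steph-round-base-shower-kit-69375118'
response = get(url)
html_soup = BeautifulSoup(response.text, 'html.parser')
# Look for the div that wraps the price; find() returns None if it is absent
ProductPrice = html_soup.find('div', class_='product_price_wrapper')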
but this returns nothing. I think the comment
BEGIN RenoProdDetailPriceSnippet.jsp
which appears just above the price div tag, is causing the information to be protected.
I even tried doing it with selenium but was not successful.
I tried many other combinations to get the price, but none of them worked.
So I am looking for ideas on how to solve this.
Thanks
requests
selenium
Possible dupe of stackoverflow.com/questions/8049520/…
– DYZ
Sep 16 '18 at 23:28
Possible duplicate of Parse the JavaScript returned from BeautifulSoup
– Chris
Sep 16 '18 at 23:30
I tried doing this with selenium too, but was not able to get the required information. I even tried headless web drivers but was not successful.
– Jaskaran Singh
Sep 17 '18 at 5:07
1 Answer
You cannot scrape the page because it requires the completion of a reCAPTCHA to access. This is specifically designed to stop bots.
If you examine html_soup you will find that you are actually searching the reCAPTCHA page, not the desired product page.
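One way to confirm this before parsing is to look at the raw response text for signs of a CAPTCHA page. The following is only a rough sketch: the marker strings in CAPTCHA_MARKERS and the helper names are assumptions made for illustration, and the real block page may use different wording.

```python
from typing import Iterable

# Hypothetical marker strings; the actual block page may contain different text.
CAPTCHA_MARKERS = ("recaptcha", "captcha")


def looks_like_captcha(html: str, markers: Iterable[str] = CAPTCHA_MARKERS) -> bool:
    """Return True if any marker occurs in the HTML, ignoring case."""
    lowered = html.lower()
    return any(marker in lowered for marker in markers)


def has_price_wrapper(html: str) -> bool:
    """Return True if the class name searched for in the question is present."""
    return "product_price_wrapper" in html


if __name__ == "__main__":
    # Stand-in strings; in practice you would pass response.text here.
    blocked = '<html><div class="g-recaptcha"></div></html>'
    product = '<html><div class="product_price_wrapper">...</div></html>'
    print(looks_like_captcha(blocked), has_price_wrapper(blocked))
    print(looks_like_captcha(product), has_price_wrapper(product))
```

If looks_like_captcha(response.text) is true and has_price_wrapper(response.text) is false, the request most likely never reached the product page, so changing the Beautiful Soup selector will not help.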
I opened the page and did not find any CAPTCHAs.
– DYZ
Sep 16 '18 at 23:39
That's interesting, perhaps it's location-based? When I examine html_soup it contains the CAPTCHA page.
– Joon-Ho Son
Sep 16 '18 at 23:45
You cannot scrape dynamically generated pages with requests. Use selenium or a similar web driver.
– DYZ
Sep 16 '18 at 23:25
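For reference, a minimal sketch of the web driver approach might look like the following. It assumes Selenium 4 with Chrome and a matching driver installed, and it reuses the product_price_wrapper class from the question as the element to wait for. The function name fetch_price_text and the timeout value are made up for illustration. If the site serves a CAPTCHA page to the browser as well, the wait will simply time out.

```python
# Assumption: the price lives in the div the question searches for.
PRICE_SELECTOR = "div.product_price_wrapper"


def fetch_price_text(url: str, timeout: float = 10.0) -> str:
    """Load the page in headless Chrome and return the price element's text.

    Raises selenium's TimeoutException if the element never appears,
    for example when a CAPTCHA page is served instead.
    """
    # Imported here so this file can be loaded without selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Wait until JavaScript has inserted the price element into the DOM.
        element = WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, PRICE_SELECTOR))
        )
        return element.text
    finally:
        # Always close the browser, even if the wait times out.
        driver.quit()
```

Calling fetch_price_text(url) with the URL from the question would return whatever text the element contains, which may still need cleaning up before it can be used as a number.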