Scrape a web page using jsoup










I need to scrape the postcode from the HTML below using jsoup. I only need the postcode, W2, which is part of the href attribute of an a tag:




<a href="/properties-for-sale/w2/chpk3848653" class="property_photo_holder" style="backgroundimage:url(https://assets.foxtons.co.uk/w/480/1523289105/chpk3848653-23.jpg)"></a>



This is the HTML:



</div>

<div id="property_1062067" class="property_summary">

<h6><a href="/properties-for-sale/w2/chpk3848653">Lancaster Gate, <span class="property_address_location_name">Bayswater,</span> W2</a></h6>


Can anyone help?
Thank you.










  • What do you mean by "I only need postcode which is W2"? Also, could you post something you tried?

    – Subhasish Bhattacharjee
    Nov 11 '18 at 9:18











  • I just tried to show exactly what data I want to scrape. Please see below.

    – Hakan
    Nov 11 '18 at 9:48











  • >Bayswater,</span> W2</a></h6>

    – Hakan
    Nov 11 '18 at 9:48











  • This is the code I tried:

    – Hakan
    Nov 11 '18 at 9:51











  • Elements postcodes = doc.select("span.property_address_location_name"); for (Element postcode : postcodes) System.out.println(postcode.text());

    – Hakan
    Nov 11 '18 at 9:51
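A side note on the attempt in the comment above: the selector span.property_address_location_name matches only the span, so text() yields "Bayswater," and never the postcode, which sits in the enclosing a element. Below is a rough sketch of one way around this. It assumes the text of the h6 > a link (for example from jsoup's Element.text()) comes back as "Lancaster Gate, Bayswater, W2". The class and method names are hypothetical, and only plain Java string handling is shown.

```java
final class LinkTextPostcode {
    // Returns the text after the last comma, trimmed.
    // Assumes the postcode is the final comma-separated piece of the link text.
    static String lastCommaPart(String linkText) {
        int lastComma = linkText.lastIndexOf(',');
        // With no comma, lastIndexOf returns -1 and the whole text is returned.
        return linkText.substring(lastComma + 1).trim();
    }

    public static void main(String[] args) {
        System.out.println(lastCommaPart("Lancaster Gate, Bayswater, W2")); // prints "W2"
    }
}
```

This relies on the visible link text keeping the same layout for every listing, so reading the postcode from the href may be the sturdier option.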















java html parsing web-scraping jsoup






edited Nov 11 '18 at 11:25 by Dinko Pehar

asked Nov 11 '18 at 8:46 by Hakan

1 Answer
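You can use jsoup for that; you just need to retrieve the href attribute value as follows:



// Needs org.jsoup.Jsoup, org.jsoup.nodes.Document and org.jsoup.select.Elements
Document document = Jsoup.connect(URL).userAgent("Mozilla/5.0").get();

// Select the property link inside the heading, not every a element on the page
Elements elements = document.select("h6 > a[href]");

// Elements.attr returns the value from the first matched element that has the attribute
String href = elements.attr("href");


Now that you have the href attribute as a String, apply a regular expression to get the field you want, in this case the postcode contained in "/properties-for-sale/w2/chpk3848653":



// Needs java.util.regex.Matcher and java.util.regex.Pattern
// Group 1 captures the path segment that follows /properties-for-sale/
String regex = "/properties-for-sale/([a-zA-Z0-9]+)/";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(href);

// find() returns a boolean, so check it before reading the group
String postalCode = matcher.find() ? matcher.group(1).toUpperCase() : null;


That's all. If you need anything else, feel free to ask. Hope this helps!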
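
To make the regex step concrete, here is a small self-contained sketch that runs without jsoup and uses the href from the question as input. The class name HrefPostcode and the method extractPostcode are made up for illustration. The pattern assumes the postcode is always the path segment right after /properties-for-sale/, which the question suggests but does not guarantee.

```java
import java.util.Locale;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HrefPostcode {
    // Group 1 captures the path segment that follows /properties-for-sale/
    private static final Pattern POSTCODE =
            Pattern.compile("/properties-for-sale/([a-zA-Z0-9]+)/");

    // Returns the upper-cased postcode, or empty if the href has a different shape.
    static Optional<String> extractPostcode(String href) {
        Matcher matcher = POSTCODE.matcher(href);
        if (!matcher.find()) {
            return Optional.empty();
        }
        return Optional.of(matcher.group(1).toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String href = "/properties-for-sale/w2/chpk3848653";
        System.out.println(extractPostcode(href).orElse("not found")); // prints "W2"
    }
}
```

Returning an Optional keeps the no-match case explicit, rather than calling group() on a matcher that found nothing, which throws an IllegalStateException.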






  • Something is wrong with this code. Thanks anyway.

    – Hakan
    Nov 13 '18 at 19:07











  • @Hakan No problem! Ask me if you need anything else; that was just sample code as a guide. +1 if you found it useful!

    – alvarobartt
    Nov 14 '18 at 8:35











  • This is the code I used to scrape all the other attributes, etc.

    – Hakan
    Nov 15 '18 at 10:52











  • //Get the location of property Elements locations = items.get(i).getElementsByTag("h6"); //Get the post code of property Elements postcodes = items.get(i).getElementsByTag("h6.a[href]"); //Get the longitude Elements longitude = items.get(i).select("div");

    – Hakan
    Nov 15 '18 at 10:52











  • foxtons.co.uk/… This is the link to the page I am scraping.

    – Hakan
    Nov 15 '18 at 10:53










answered Nov 13 '18 at 13:06 by alvarobartt