Targeting URLs with parameters




I want to grab the URL with the highest pg value:



$html ='
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=3"></a>
';



I use this regex to locate the appropriate links:


preg_match_all('/<a.*href="\.\/\?pg=(\d+)".*>(?:.*)<\/a>/U', $html, $preg_matches);



Sometimes, the links include another parameter:


http://example.com/?pg=3&test=1



My question is, how do I adjust my regex so links with the added parameters are included as well?





You have already asked it here, isn't that the same question?
– Wiktor Stribiżew
Aug 27 at 14:11





@WiktorStribiżew No, this question is targeting URLs with multiple parameters.
– Henrik Petterson
Aug 27 at 14:12





\. matches a dot. You must match any chars other than " with [^"].
– Wiktor Stribiżew
Aug 27 at 14:12








@WiktorStribiżew It does not include URLs with multiple parameters. Try adding <a href="http://example.com/?pg=4&test=1">a</a> to the $html variable and you will see.
– Henrik Petterson
Aug 27 at 14:13







@WiktorStribiżew Can you please post an answer to demonstrate this? Thanks.
– Henrik Petterson
Aug 27 at 14:15




2 Answers



Example:


$html = '
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=3"></a>
';

// Parse the markup with DOMDocument instead of a regex.
$dom = new DOMDocument;
$dom->loadHTML($html);
$anchors = $dom->getElementsByTagName('a');

foreach ($anchors as $anchor) {
    $url = $anchor->getAttribute('href');
    // Extract the query string, then split it into an associative array.
    $query = parse_url($url, PHP_URL_QUERY);
    parse_str($query ?? '', $output);
    // null when the link has no pg parameter
    $pg = $output['pg'] ?? null;
    //do something
}
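
To connect this back to the original goal (the URL with the highest pg value), the loop can track a running maximum. The following is only a sketch: the function name highestPgUrl is hypothetical and not part of the answer above, and it assumes pg is always a plain non-negative integer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the href
// with the largest numeric pg value, or null if no link carries one.
function highestPgUrl(string $html): ?string
{
    // Collect libxml warnings internally; real-world markup (for example
    // an unescaped & inside an href) would otherwise emit PHP warnings.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();

    $bestUrl = null;
    $bestPg = -1;
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $url = $anchor->getAttribute('href');
        // parse_url() returns null when the URL has no query string.
        $query = parse_url($url, PHP_URL_QUERY);
        parse_str($query ?? '', $params);

        $pg = $params['pg'] ?? null;
        // Assumption: only plain digit strings count as valid pg values.
        if (is_string($pg) && ctype_digit($pg) && (int) $pg > $bestPg) {
            $bestPg = (int) $pg;
            $bestUrl = $url;
        }
    }
    return $bestUrl;
}

$html = '
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=4&test=1"></a>
';
echo highestPgUrl($html), "\n";
```

Because the query string is parsed rather than pattern-matched, extra parameters such as test=1 need no special handling.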



Here's a helpful tutorial on parsing HTML with PHP: http://htmlparsing.com/php.html



Also see why you should not use regex to parse HTML: https://stackoverflow.com/a/1732454/81785





Thank you for that example code! =)
– Henrik Petterson
Aug 27 at 14:32


$html = '
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=4&test=1"></a>
';
// Capture every href value; the closing tag's slash must be escaped
// because / is also the pattern delimiter.
preg_match_all('/<a[^>]+href="(.*?)"[^>]*>(.*)?<\/a>/', $html, $out);

$result = [];
foreach ($out[1] as $link) {
    // Let PHP parse the query string instead of matching pg by regex.
    parse_str(parse_url($link, PHP_URL_QUERY), $atr);
    $result[$link] = $atr['pg'];
}

print_r($result);

// "http://example.com/?pg=1" => "1"
// "http://example.com/?pg=2" => "2"
// "http://example.com/?pg=4&test=1" => "4"
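
For a regex-only variant, the [^"] suggestion from the comments can be sketched as follows. This is an illustration, not code posted by any commenter: the pattern and the variable names $pattern, $bestUrl and $bestPg are assumptions, and it only handles double-quoted href attributes.

```php
<?php
$html = '
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=4&test=1"></a>
';

// [^"]* keeps every part of the match inside the href value:
//   group 1 = the whole URL, group 2 = the digits after pg=
// [?&] requires pg to start a parameter, so e.g. "xpg=9" is ignored.
$pattern = '/<a\b[^>]*\bhref="([^"]*[?&]pg=(\d+)[^"]*)"/';
preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

// Track the link whose pg value is the largest seen so far.
$bestUrl = null;
$bestPg = -1;
foreach ($matches as $m) {
    if ((int) $m[2] > $bestPg) {
        $bestPg = (int) $m[2];
        $bestUrl = $m[1];
    }
}

echo $bestUrl, "\n"; // http://example.com/?pg=4&test=1
```

The trailing [^"]* is what lets links with additional parameters match, since it accepts anything after the pg digits up to the closing quote.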





