Targeting URLs with parameters



I want to grab the URL with the highest pg value:


$html ='
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=3"></a>
';



I use this regex to locate the appropriate links:


preg_match_all('/<a.*href="\.\/\?pg=(\d+)".*>(?:.*)<\/a>/U', $html, $preg_matches);



Sometimes, the links include another parameter:


http://example.com/?pg=3&test=1



My question is, how do I adjust my regex so links with the added parameters are included as well?





You have already asked it here, isn't that the same question?
– Wiktor Stribiżew
Aug 27 at 14:11





@WiktorStribiżew No, this question is targeting URLs with multiple parameters.
– Henrik Petterson
Aug 27 at 14:12





\. matches a dot. You must match any chars other than " with [^"].
– Wiktor Stribiżew
Aug 27 at 14:12
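
For illustration, the [^"] suggestion could look roughly like the sketch below. The pattern is an assumption about what the comment intends, not code posted in the thread: [^"]* consumes everything up to the closing quote, so an extra parameter such as &test=1 no longer prevents the match. The ~ delimiter is used so that / needs no escaping.

```php
<?php
// Sample markup based on the question, plus a link with an extra parameter.
$html = '
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=4&test=1">a</a>
';

// [?&]pg= finds the parameter wherever it sits in the query string;
// the [^"]* on both sides keeps the match inside the href attribute value.
preg_match_all('~<a\b[^>]*\bhref="[^"]*[?&]pg=(\d+)[^"]*"~', $html, $matches);

// $matches[1] holds the captured digits as strings; convert them to integers.
$pages = array_map('intval', $matches[1]);
print_r($pages);
```

This only covers double-quoted href attributes, which is one reason the answers below lean on a real parser instead.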







@WiktorStribiżew It does not include URLs with multiple parameters. Try adding <a href="http://example.com/?pg=4&test=1">a</a> to the $html variable and you will see.
– Henrik Petterson
Aug 27 at 14:13






@WiktorStribiżew Can you please post an answer to demonstrate this? Thanks.
– Henrik Petterson
Aug 27 at 14:15




2 Answers


Example:


$html ='
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=3"></a>
';

// Parse the markup first, then query the resulting DOM.
$dom = new DOMDocument;
$dom->loadHTML($html);
$anchors = $dom->getElementsByTagName('a');

foreach ($anchors as $anchor) {
    $url = $anchor->getAttribute('href');
    // Extract the query string (e.g. "pg=3") and split it into an array.
    $query = parse_url($url, PHP_URL_QUERY);
    parse_str($query ?? '', $output);
    $pg = $output['pg'] ?? null;
    //do something
}
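
To get from the loop above to what the question asks for (the URL with the highest pg value), one possible sketch follows. The helper name findHighestPage() and the extra sample links are assumptions made for illustration; they are not part of the original answer or of any library.

```php
<?php
// Sample markup: in valid HTML the ampersand in an attribute is written &amp;
// and the DOM parser decodes it back to & when the attribute is read.
$html = '
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=4&amp;test=1"></a>
<a href="http://example.com/about"></a>
';

// Hypothetical helper: returns ['url' => ..., 'pg' => ...] for the link with
// the highest numeric pg parameter, or null when no link carries one.
function findHighestPage(string $html): ?array
{
    $dom = new DOMDocument;
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $best = null;
    foreach ($dom->getElementsByTagName('a') as $anchor) {
        $url = $anchor->getAttribute('href');
        $query = parse_url($url, PHP_URL_QUERY);
        if (!is_string($query)) {
            continue; // no query string, or the URL could not be parsed
        }
        parse_str($query, $params);
        $pg = $params['pg'] ?? null;
        if (!is_string($pg) || !ctype_digit($pg)) {
            continue; // pg missing or not a plain non-negative integer
        }
        if ($best === null || (int) $pg > $best['pg']) {
            $best = ['url' => $url, 'pg' => (int) $pg];
        }
    }
    return $best;
}

$best = findHighestPage($html);
print_r($best);
```

Links without a pg parameter are skipped rather than treated as errors, so additional parameters or unrelated links do not affect the result.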



Here's a helpful tutorial for PHP: http://htmlparsing.com/php.html



Also see why you should not use regex for parsing HTML: https://stackoverflow.com/a/1732454/81785





Thank you for that example code! =)
– Henrik Petterson
Aug 27 at 14:32


$html ='
<a href="http://example.com/?pg=1"></a>
<a href="http://example.com/?pg=2"></a>
<a href="http://example.com/?pg=4&test=1"></a>
';
// Capture every href value, whatever parameters it carries.
preg_match_all('/<a[^>]+href="(.*?)"[^>]*>(.*)?<\/a>/', $html, $out);

$result = [];
foreach ($out[1] as $link) {
    // Split the query string into an array and keep the pg value per link.
    parse_str(parse_url($link, PHP_URL_QUERY), $atr);
    $result[$link] = $atr['pg'];
}

print_r($result);

// "http://example.com/?pg=1" => "1"
// "http://example.com/?pg=2" => "2"
// "http://example.com/?pg=4&test=1" => "4"
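
From a map like this, the URL with the highest pg value can be picked in a couple of lines. This is a minimal sketch that assumes $result has the shape shown in the comments above (URL as key, pg as a numeric string); the variable names $pages and $highestUrl are introduced here for illustration.

```php
<?php
// Assumed input: the URL => pg map produced by the loop above.
$result = [
    'http://example.com/?pg=1' => '1',
    'http://example.com/?pg=2' => '2',
    'http://example.com/?pg=4&test=1' => '4',
];

// Compare as integers so that "10" ranks above "9"; string keys are preserved.
$pages = array_map('intval', $result);

// max() finds the highest page number, array_search() returns its URL key.
$highestUrl = array_search(max($pages), $pages, true);

echo $highestUrl, PHP_EOL;
```

max() raises an error on an empty array in PHP 8, so a real script would check that $result is not empty first.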





