html - PHP Regex to match everything between <body style=...> and </body> tag
I have found a cURL function that fetches everything on a given page, but I only want the elements between <body> and </body>.
To match everything between them I found a nifty regex that worked, but then I realized that the pages I need to fetch with cURL actually have a body tag with style information inside it, so what I really want to match is everything between <body style=...> and </body>.
Does anyone know the regex for that match? Here is all my code:
<?php
error_reporting(E_ALL);
ini_set("display_errors", "1");

$pageToLoad = $_POST['Load'];

// Fetch the raw HTML of $url with cURL and return it as a string.
function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13');
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}

$html = get_data($pageToLoad);
$newHtml = preg_match("~
Things get trickier when you include attributes as part of your search pattern. An attribute value can be either single- or double-quoted, and most parsers will cope even if someone forgot the quotes or the quotes don't match. Since you are only looking for one specific attribute name, it is easier, but there are still pitfalls, such as the attribute name you are searching for appearing inside the value of another attribute.
(Heck, your original simple regex will match incorrectly on some malformed strings.)
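To make the quoting variants and the pitfall concrete, here is a small sketch. The sample tags are hypothetical, and the check is deliberately naive: it only looks for the word "style" anywhere in the tag.

```php
<?php
// Hypothetical opening tags illustrating the cases described above.
$tags = [
    '<body style="margin:0">',     // double-quoted value
    "<body style='margin:0'>",     // single-quoted value
    '<body style=margin:0>',       // unquoted value
    '<body title="no style set">', // "style" only appears inside another attribute's value
];

// Naive check: is the attribute name present anywhere in the tag?
$found = [];
foreach ($tags as $tag) {
    $found[] = stripos($tag, 'style') !== false;
}

// Every tag is reported as a hit, including the last one,
// which has no style attribute at all.
var_dump($found);
```

The last tag is the false positive: a search for the bare attribute name cannot tell an attribute from text inside another attribute's value.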
Since a style attribute is almost always followed by an equals sign, I will use that fact to find it. I will also make sure that I match the body element, and not some impossible mutant whose name merely begins with "body".
<body\s[^>]*style\s*=[^>]*>(.*?)</body>
It is essentially your original regex, but with \s[^>]*style\s*= added in between.
- \s ensures that there is whitespace after the element name "body", so that it can only be a body element.
- [^>]* matches any character but > 0 or more times
- style matches the string "style"
- \s* allows whitespace between style and =
- = matches the string "="
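The pieces listed above can be tried out with preg_match. This is only a sketch: the sample HTML is made up, and the ~ delimiters and the s and i modifiers are assumptions, since the flags used in the original regex are not shown.

```php
<?php
// Hypothetical sample page; in the question this string would come from get_data().
$html = '<html><head><title>t</title></head>'
      . '<body style="margin:0"><p>Hello</p></body></html>';

// s: let . match newlines as well; i: ignore case in tag and attribute names.
$pattern = '~<body\s[^>]*style\s*=[^>]*>(.*?)</body>~si';

if (preg_match($pattern, $html, $match)) {
    $bodyContent = $match[1];   // everything between the two tags
} else {
    $bodyContent = null;        // no styled body element found
}

echo $bodyContent; // prints: <p>Hello</p>
```

Because of the \s right after "body", a tag such as <bodyfoo style="x"> is not accepted as a body element.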
I'm hard-pressed to think of an example that would fool this regex without also causing problems for a parser. I suppose someone could add whitespace between < and body in the opening tag, or put whitespace or some other characters at the end of the closing tag. Plus, someone could leave out the closing body tag altogether. You could extend the regex to handle those cases, but for what you are likely to encounter in the wild, the one I have given should work fine.
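If those edge cases do matter, one possible extension is sketched below. It is an assumption about how the regex could be loosened, not part of the answer's pattern: it allows whitespace after <, extra characters before the > of the closing tag, and falls back to the end of the input when the closing tag is missing. The helper name extract_body is hypothetical.

```php
<?php
// Hypothetical helper: returns the body content, or null when no styled body tag is found.
function extract_body(string $html): ?string
{
    // <\s*body         tolerate whitespace between < and body
    // </body[^>]*>     tolerate extra characters before the closing >
    // |\z              if there is no closing tag, capture up to the end of the input
    $tolerant = '~<\s*body\s[^>]*style\s*=[^>]*>(.*?)(?:</body[^>]*>|\z)~si';

    return preg_match($tolerant, $html, $match) ? $match[1] : null;
}

echo extract_body('< body style="a">X</body >'), "\n"; // prints: X
echo extract_body('<body style="a">Y'), "\n";          // prints: Y
```

The trade-off is that a looser pattern also accepts more malformed input, so it is worth adding only the cases that actually occur on the pages being fetched.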