
PDF regexp match in PHP

Regexps aren't one of my strong skills, so I need a bit of help with this. I have this regexp to get a PDF URL from a site's source code:

if (preg_match("/http\:\/\/.*?\.pdf/i", $source)) {

It works OK most of the time, but, for example, when I get sites with link URLs like

http://doc.pdfsomething.com/somemore/name.pdf

the match I get is http://doc.pdf and not the complete PDF URL.

Any regexp guru, your help is appreciated.

You can try matching on a word boundary:

preg_match("/http:\/\/.*?\.pdf\b/i", $source)

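This means .pdf will only match when it is followed by a non-word character (such as ", whitespace, etc.) or by the end of the string. In doc.pdfsomething.com the "pdf" is followed by the letter "s", so the lazy .*? keeps going until it reaches name.pdf.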
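To see the difference, here is a minimal sketch that runs both patterns against the URL from the question. The $source string is made up for illustration, and the variable names $without and $with are only placeholders.

```php
<?php
// Hypothetical page source, used only for illustration.
$source = '<a href="http://doc.pdfsomething.com/somemore/name.pdf">Manual</a>';

// Original pattern: the lazy .*? stops at the first ".pdf",
// which here sits inside the host name.
preg_match('/http:\/\/.*?\.pdf/i', $source, $without);

// With \b, ".pdf" must be followed by a non-word character
// (or the end of the input), so "pdfsomething" is skipped.
preg_match('/http:\/\/.*?\.pdf\b/i', $source, $with);

echo $without[0], "\n"; // http://doc.pdf
echo $with[0], "\n";    // http://doc.pdfsomething.com/somemore/name.pdf
```

Passing a third argument to preg_match is what makes the matched text available. The if (preg_match(...)) form in the question only tells you whether a match exists.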

Alternatively, if you know the URL is always going to be followed by a specific character (a double quote ", for example), then you could use

preg_match("/http:\/\/.*?\.pdf\"/i", $source)
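One thing to keep in mind with this variant is that the trailing " becomes part of the matched text. The sketch below is a possible variation, not part of the pattern above: it assumes the URLs sit inside double-quoted attributes, swaps the literal quote for a lookahead (?=") so the quote is checked but not captured, and uses [^"]*? so a match cannot run past the end of one attribute. The $source string and the $pdfUrls name are invented for the example.

```php
<?php
// Hypothetical page source with two PDF links, used only for illustration.
$source = '<a href="http://doc.pdfsomething.com/somemore/name.pdf">A</a> '
        . '<a href="http://example.com/files/report.PDF">B</a>';

// (?=") requires a closing double quote right after ".pdf" without
// including it in the match; [^"]*? keeps the match inside one attribute.
preg_match_all('/http:\/\/[^"]*?\.pdf(?=")/i', $source, $matches);

// $matches[0] holds every full-pattern match, in document order.
$pdfUrls = $matches[0];
print_r($pdfUrls);
```

preg_match_all is used here only to show that the same pattern picks up every PDF link on the page. With preg_match you would get just the first one.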
