
PHP special regular expression pattern to match URLs after a specific string

I'm trying to develop a word-counting application that supports .pdf, .docx, .doc, .txt and similar documents. So far I can read .doc files with PHP and load the plain text into a variable.

I'm using the following code to collapse extra whitespace in the string:

$str = trim(preg_replace('/\s+/', ' ', $str));

My issue is that Word documents containing hyperlinks are parsed like this:

Some dummy text here.. HYPERLINK "http://domain.com/directory/page" other dummy text is here..

So I want to remove the HYPERLINK "http://domain.com/directory/page" part, or replace it with a space.

Since I'm not a regular expression expert, I'm looking for help to solve this problem. Thanks!

HYPERLINK "http://domain.com/directory/page" will be matched by:

HYPERLINK "[^"]*"

That is: the literal text HYPERLINK, a space, an opening double quote, any run of characters that are not a double quote, then the closing double quote.
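As a minimal sketch of how this pattern could be plugged into the whitespace cleanup from the question: the function name stripHyperlinkFields is hypothetical, and the sketch assumes the field always appears as HYPERLINK followed by one space and a double-quoted URL, as in the sample text.

```php
<?php
// Hypothetical helper (not from the original post): removes Word field
// residue of the form HYPERLINK "url" and then normalizes whitespace.
function stripHyperlinkFields(string $text): string
{
    // Replace each HYPERLINK "..." field with a single space so the
    // words on either side do not get glued together.
    $text = preg_replace('/HYPERLINK "[^"]*"/', ' ', $text);

    // Same cleanup as in the question: collapse runs of whitespace.
    return trim(preg_replace('/\s+/', ' ', $text));
}

$str = 'Some dummy text here.. HYPERLINK "http://domain.com/directory/page" other dummy text is here..';
echo stripHyperlinkFields($str), PHP_EOL;
// prints: Some dummy text here.. other dummy text is here..
```

Running the hyperlink replacement before the whitespace collapse means the space left behind by the removed field is merged with its neighbours in the same pass.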

