
PHP special regular expression pattern to match URLs after a specific string

I'm trying to develop a word-counting application that supports .pdf, .docx, .doc, .txt, and similar documents. So far I am able to read .doc files with PHP and load the plain text into a variable.

I'm using the following code to collapse extra whitespace in the string:

$str = trim(preg_replace('/\s+/', ' ', $str));

My issue is that Word documents containing hyperlinks are parsed as: Some dummy text here.. HYPERLINK "http://domain.com/directory/page" other dummy text is here..

So I want to remove the HYPERLINK "http://domain.com/directory/page" part, or replace it with a space or something similar.

Since I'm not a regular expression expert, I'm looking for help solving this problem. Thanks!

The HYPERLINK "http://domain.com/directory/page" part will be matched by:

HYPERLINK "[^"]*"

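That is: the literal text HYPERLINK and a space, then a double quote, then any run of characters other than a double quote, then the closing double quote. To use it with preg_replace, wrap the pattern in delimiters, for example '/HYPERLINK "[^"]*"/'.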
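As a minimal sketch of how this pattern could be combined with the whitespace cleanup from the question: the function name stripHyperlinkFields is hypothetical, and using \s+ instead of a single literal space after HYPERLINK is an assumption, made to tolerate extra spacing in the extracted text.

```php
<?php
// Hypothetical helper (the name is not from the original post).
// Removes Word HYPERLINK "..." field codes, then collapses whitespace.
function stripHyperlinkFields(string $str): string
{
    // Replace each HYPERLINK "..." field with a single space.
    // \s+ (an assumption) allows one or more whitespace characters
    // between the keyword and the opening quote.
    $str = preg_replace('/HYPERLINK\s+"[^"]*"/', ' ', $str);

    // Same cleanup as in the question: collapse runs of whitespace and trim.
    return trim(preg_replace('/\s+/', ' ', $str));
}

$text = 'Some dummy text here.. HYPERLINK "http://domain.com/directory/page" other dummy text is here..';
echo stripHyperlinkFields($text), PHP_EOL;
// prints: Some dummy text here.. other dummy text is here..
```

Removing the field before collapsing whitespace means the gap left behind is folded into a single space, so the word count is not affected by the removed link.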
