Regular expression - matching preceding characters

Question

I understand the concepts of RegEx, but this is more or less the first time I've actually tried to write one myself.

As part of a project, I'm attempting to pick out strings that match a certain domain (actually an array of domains, but let's keep it simple).

At first I started out with this:

url.match('www.example.com')

But I noticed I was also getting input like this:

http://www.someothersite.com/page?ref=http://www.example.com

These rows will of course match www.example.com, but I wish to exclude them. So I was thinking along these lines: only match rows that contain www.example.com, but not after a ? character. This is what I came up with:

var reg = new RegExp("[^\\?]*" + url + "(\\.*)", "gi");

This does not seem to work, however. Any suggestions would be greatly appreciated, as I fear I've used up what little knowledge I possess on the matter.

Edit: Some clarifications.

The input is logged GET requests. From these I wish to keep only the requests for a few domains. The pattern should handle 0-1 arbitrary subdomains (example.com, www.example.org, www.somethirdsite.com and web.example.net should all be valid); the domains will be stored in a variable.
I specifically found a request like the one mentioned above, but I would also like to be able to handle http://www.someothersite.com/page?ref=https://www.example.com and http://www.someothersite.com/page?ref=www.example.com, i.e. if my needle is not part of the request domain but part of the request data, I do not want the match.

Answer 1

Edit: here is the modified regex for an arbitrary domain:

RegExp("(^|\\s)(https?://)?(\\w+\\.)?" + url, "gi");

The idea here is that you only match url when it sits at the start of the input or is preceded by a whitespace character (optionally followed by a protocol and one subdomain), which makes it impossible for the match to be inside the query string.
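As a rough illustration of the pattern above, here is a minimal sketch run against the sample requests from the question. The helpers escapeRegExp and buildDomainRegex are hypothetical names introduced only for this example; they are not part of the original answer. The sketch also assumes two small changes: the dots in the domain are escaped so that "." only matches a literal dot, and the "g" flag is dropped because a global regex carries lastIndex state between test() calls.

```javascript
// Hypothetical helper: escape regex metacharacters so that "." in a
// domain name matches only a literal dot, not any character.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build the answer's pattern for one domain.
// (^|\s)        start of input or a whitespace character
// (https?://)?  optional protocol
// (\w+\.)?      zero or one subdomain
function buildDomainRegex(domain) {
  return new RegExp("(^|\\s)(https?://)?(\\w+\\.)?" + escapeRegExp(domain), "i");
}

var reg = buildDomainRegex("example.com");

// Needle is the request domain: these match.
console.log(reg.test("http://www.example.com/index.html")); // true
console.log(reg.test("example.com/page"));                  // true

// Needle only appears in the query string: these do not match.
console.log(reg.test("http://www.someothersite.com/page?ref=http://www.example.com")); // false
console.log(reg.test("http://www.someothersite.com/page?ref=www.example.com"));        // false
```

Note that this sketch only checks what comes before the domain. A host such as www.example.com.someothersite.com would still match, so a check on the character following the domain may be needed depending on the input.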

1 2011-01-17 13:54:55
