
JavaScript: find the protocol, domain, and first slash in a src value with a regexp, and replace them with an empty string

I tried to construct a regex for this task, but I'm afraid I still don't have an intuitive understanding of regexps.

The problem is that the regex matches up to the last slash in the string. I want it to stop at the first slash after the domain.

My pathetic attempt at a regex:

/^http(s?):\/\/.+\/{1}/

Test subject:

http://foo.com/bar/test/foo.jpeg

The goal is to obtain bar/test/foo.jpeg, so that I can then split the string, pop the last element, and join the remainder, which gives me the path to the JavaScript file.

Example

var str = 'http://foo.com/bar/test/foo.jpeg';
str.replace(regexp,'');

While the other answer shows how to match a part of a string, I think a replace solution is more appropriate for the current task.

The issue is that .+ greedily matches one or more characters other than a newline. That is, the rest of the string is grabbed in one go, and then the regex engine starts backtracking (moving backwards along the input string, looking for a / to accommodate in the match). Thus, you get a match from http up to the last / .
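The greedy behaviour described above can be observed directly. This is a small sketch that runs the question's original pattern against the question's test string; the variable names are only illustrative.

```javascript
// The question's original pattern: .+ is greedy, so it first consumes
// everything up to the end of the string and then backtracks only as far
// as the LAST slash.
const greedyPattern = /^http(s?):\/\/.+\/{1}/;
const greedyInput = 'http://foo.com/bar/test/foo.jpeg';

// The matched prefix runs through the last slash, not the first one.
const greedyMatch = greedyInput.match(greedyPattern)[0];
console.log(greedyMatch); // http://foo.com/bar/test/

// So removing the match leaves only the file name.
const greedyResult = greedyInput.replace(greedyPattern, '');
console.log(greedyResult); // foo.jpeg
```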

To restrict the match from http to the first / , use a negated character class [^/]+ instead of .+ :

^https?:\/\/[^\/]+\/
            ^^^^^^

See the regex demo.

Note that you do not need to place s in a capturing group to make it optional: an unescaped ? is a quantifier that makes the preceding character match one or zero times. Also, {1} is a redundant quantifier, since matching once is the default behavior: c matches exactly one c , and (?:something) matches exactly one something .
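As a quick check of that note, the sketch below compares a pattern that keeps the capturing group and {1} with the simplified one. The sample URLs other than the question's test string are made up for illustration.

```javascript
// Same pattern written two ways: with the unnecessary group and {1},
// and in the simplified form.
const withGroupAndCount = /^http(s?):\/\/[^\/]+\/{1}/;
const simplified = /^https?:\/\/[^\/]+\//;

// Hypothetical sample inputs; only the first comes from the question.
const samples = [
  'http://foo.com/bar/test/foo.jpeg',
  'https://foo.com/script.js',
  'ftp://foo.com/not-matched.js'
];

// Both patterns strip exactly the same prefix from every sample.
const sameEverywhere = samples.every(function (url) {
  return url.replace(withGroupAndCount, '') === url.replace(simplified, '');
});
console.log(sameEverywhere); // true
```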

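// [^\/]+ stops at the first slash after the domain
var re = /^https?:\/\/[^\/]+\//;
var str = 'http://foo.com/bar/test/foo.jpeg';
// replace returns a new string, so keep the result
var result = str.replace(re, '');
document.getElementById("r").innerHTML = result;

<div id="r"></div>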

Note that you need to assign the result of replace to a variable, since strings are immutable in JS.
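A minimal sketch of what that means in practice, using the question's test string (the variable names are illustrative):

```javascript
const original = 'http://foo.com/bar/test/foo.jpeg';

// Calling replace without using its return value changes nothing:
// the original string is left untouched.
original.replace(/^https?:\/\/[^\/]+\//, '');
console.log(original); // http://foo.com/bar/test/foo.jpeg

// The stripped string only exists as the return value.
const stripped = original.replace(/^https?:\/\/[^\/]+\//, '');
console.log(stripped); // bar/test/foo.jpeg
```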

Regex explanation:

  • ^ - start of string
  • https? - either the http or the https substring
  • :\/\/ - a literal sequence of ://
  • [^\/]+ - 1 or more characters other than a /
  • \/ - a literal / symbol
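Putting the pattern together with the steps the question describes (strip the prefix, split, pop the last element, join the remainder) might look like the sketch below. The function name directoryOf is hypothetical and not part of the original question or answer.

```javascript
// Hypothetical helper: returns the directory part of a src value,
// relative to the domain.
function directoryOf(src) {
  // Remove protocol, domain, and the first slash.
  var path = src.replace(/^https?:\/\/[^\/]+\//, '');
  // Split into segments and drop the file name.
  var segments = path.split('/');
  segments.pop();
  // Join what is left back into a path.
  return segments.join('/');
}

console.log(directoryOf('http://foo.com/bar/test/foo.jpeg')); // bar/test
```

For a file that sits directly under the domain, this sketch returns an empty string, which may or may not be what a caller wants.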

Use a regex based on a capturing group:

> var s = "http://foo.com/bar/test/foo.jpeg"
> s.match(/^https?:\/\/[^\/]+((?:\/[^\/]*)*)/)[1]
'/bar/test/foo.jpeg'
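The captured value above keeps its leading slash, while the question asks for bar/test/foo.jpeg. One possible adjustment, shown as a sketch, is to move the group past the first slash. The second half uses the built-in URL class instead of a regex; that is a different technique from either answer and assumes an environment where URL is available (modern browsers, Node.js).

```javascript
var src = 'http://foo.com/bar/test/foo.jpeg';

// Capture everything after the first slash that follows the domain.
var m = src.match(/^https?:\/\/[^\/]+\/(.*)$/);
// Fall back to the input when the pattern does not match at all.
var relative = m ? m[1] : src;
console.log(relative); // bar/test/foo.jpeg

// Alternative without a regex: let the URL parser find the path,
// then drop its leading slash.
var viaUrl = new URL(src).pathname.slice(1);
console.log(viaUrl); // bar/test/foo.jpeg
```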
