
How to parse a url using Javascript and Regular Expression?

I want to parse some URLs that have the following format:

var url ="http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a" 

The domain name and other parts won't necessarily be the same for every URL; they can vary, so I am looking for a general solution.

Basically, I want to strip off everything else and get only this part:

/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p

I thought I would parse this using JavaScript and a regular expression.

This is what I am doing:

var mapObj = {"/^(http:\/\/)?.*?\//":"","(&mycracker.+)":"","(&ref.+)":""};
var re = new RegExp(Object.keys(mapObj).join("|"),"gi");
url = url.replace(re, function(matched){
  return mapObj[matched];
}); 

But it's returning this:

http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43pundefined

What am I doing wrong? Or is there another approach with an even easier solution?

You can use:

/(?:https?:\/\/[^\/]*)(\/.*?)(?=\&mycracker)/

Code:

var s="http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a";
var ss=/(?:https?:\/\/[^\/]*)(\/.*?)(?=\&mycracker)/;
console.log(s.match(ss)[1]);
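
The pattern can be read piece by piece. The sketch below wraps it in a helper with per-part comments; the name `extractPath` is hypothetical and only for illustration. It also returns `null` when the pattern does not match (for example when `&mycracker` is absent), since `s.match(ss)[1]` would throw in that case.

```javascript
// Hypothetical helper; the name extractPath is an assumption for illustration.
function extractPath(s) {
  // (?:https?:\/\/[^\/]*)  non-capturing: scheme plus host, up to the first "/"
  // (\/.*?)                group 1: path and query, matched lazily
  // (?=&mycracker)         lookahead: stop right before "&mycracker"
  var m = s.match(/(?:https?:\/\/[^\/]*)(\/.*?)(?=&mycracker)/);
  return m ? m[1] : null;
}

var sample = "http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a";
console.log(extractPath(sample));
```

Because the lookahead is tied to `&mycracker`, this only helps when that parameter is present and comes right after the part you want to keep.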


Why don't you just map a split array?

You don't really need a regex for the URL, but you will have to run an if statement inside the loop to remove specific GET params. In this particular case (key word: particular) you just have to take the substring up to the indexOf "&mycracker".

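var url ="http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a";
// Splitting on "/" gives ["http:", "", "www.example.com", "cooks", ...].
var x = url.split("/");
var y = [];
// Keep everything after the host (index 3 onward).
x.map(function(data,index) { if (index >= 3) y.push(data); });
var path = "/"+y.join("/");
// Cut at "&mycracker"; this assumes the marker is present (indexOf returns -1 otherwise).
path = path.substring(0,path.indexOf("&mycracker"));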
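
The general case mentioned above, where specific GET params are removed instead of cutting at a fixed marker, might be sketched without any regex using the standard `URL` and `URLSearchParams` objects. The helper name `stripParams` and the list of parameter names passed to it are assumptions for illustration. Note that `URLSearchParams` re-serializes the query string, so unusual encodings may come out normalized.

```javascript
// Hypothetical helper: returns path + query with the named parameters removed.
function stripParams(rawUrl, namesToRemove) {
  var u = new URL(rawUrl); // throws on strings that are not absolute URLs
  namesToRemove.forEach(function (name) {
    u.searchParams.delete(name);
  });
  return u.pathname + u.search;
}

var url = "http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a";
var result = stripParams(url, ["mycracker", "ref"]);
console.log(result);
```

This does not depend on the order of the parameters or on the domain name, which is what a general solution needs.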

Change the following code a little and you can retrieve any parameter:

var url = "http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a";
// Match everything up to (but not including) the "?" that starts the query string.
var re = /https?:\/\/[^?]+/;
var part1 = url.match(re)[0];
var remain = url.replace(re, '');
// Split the query string into individual parameters.
var rf = remain.split('&');
var part2 = '';
for (var i = 0; i < rf.length; i++) {
    // Keep only the parameters you want (here: p%5B%5D and sid).
    if (rf[i].match(/(p%5B%5D|sid)=/)) {
        part2 += rf[i] + '&';
    }
}
// Drop the trailing "&".
part2 = part2.replace(/&$/, '');
url = part1 + part2;
alert(url);

var url ="http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a";
var newAddr = url.substr(22,url.length);
// newAddr == "/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a"

22 is where to start slicing the string.

url.length is how many characters to include; since that is more than what remains, the rest of the string is returned.

This works as long as the domain name stays the same across the links.
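
If the domain name does vary, the fixed offset could be replaced by a computed one. The following is a minimal sketch under that assumption; the helper name `stripOrigin` is hypothetical. It looks for the first "/" after the "://" separator and returns everything from there on.

```javascript
// Hypothetical helper: drops the scheme and host, whatever their length.
function stripOrigin(url) {
  var schemeEnd = url.indexOf("://");
  // Start searching after "://" when it is present, otherwise from the beginning.
  var from = schemeEnd === -1 ? 0 : schemeEnd + 3;
  var pathStart = url.indexOf("/", from);
  // No "/" after the host means there is no path; fall back to "/".
  return pathStart === -1 ? "/" : url.substring(pathStart);
}

console.log(stripOrigin("http://www.example.com/cooks/pr?sid=bks%2C43p"));
console.log(stripOrigin("https://shop.example.org/a/b"));
```

Like the fixed-offset version, this keeps the whole query string, so unwanted parameters still have to be removed separately.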
