[英]How to parse a url using Javascript and Regular Expression?
我想解析一些具有以下格式的網址:
var url ="http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a"
域名和其他部分的所有URL都不必相同,它們可以有所不同,即我正在尋找一種通用的解決方案。
基本上,我想剝離所有其他內容並僅獲得一部分:
/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p
我想用JavaScript和正則表達式解析
我這樣做是這樣的:
var mapObj = {"/^(http:\/\/)?.*?\//":"","(&mycracker.+)":"","(&ref.+)":""};
var re = new RegExp(Object.keys(mapObj).join("|"),"gi");
url = url.replace(re, function(matched){
return mapObj[matched];
});
但是它返回了
http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43pundefined
我在哪里做不正確的事情? 還是有另一種方法甚至更簡單的解決方案?
您可以使用 :
/(?:https?:\/\/[^\/]*)(\/.*?)(?=\&mycracker)/
代碼:
var s="http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a";
var ss=/(?:https?:\/\/[^\/]*)(\/.*?)(?=\&mycracker)/;
console.log(s.match(ss)[1]);
說明:
您為什么不只映射拆分數組?
您並不需要對URL進行正則表達式,但是必須在循環內運行if語句才能從中刪除特定的GET參數。 在這種特定情況下(特定於關鍵字),您只需要對字符串進行子串化,直到indexOf
“&mycracker”
var url ="http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a"
var x = url.split("/");
var y = [];
x.map(function(data,index) { if (index >= 3) y.push(data); });
var path = "/"+y.join("/");
path = path.substring(0,path.indexOf("&mycracker"));
稍微更改以下代碼,即可檢索任何參數:
var url = "http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a"
var re = new RegExp(/http:\/\/[^?]+/);
var part1 = url.match(re);
var remain = url.replace(re, '');
//alert('Part1: ' + part1);
var rf = remain.split('&');
// alert('Part2: ' + rf);
var part2 = '';
for (var i = 0; i < rf.length; i++)
if (rf[i].match(/(p%5B%5D|sid)=/))
part2 += rf[i] + '&';
part2 = part2.replace(/&$/, '');
//alert(part2)
url = part1 + part2;
alert(url);
var url ="http://www.example.com/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a";
var newAddr = url.substr(22,url.length);
// newAddr == "/cooks/cooking-dress-wine/~no-order/pr?p%5B%5D=sort%3Dfeatured&sid=bks%2C43p&mycracker=ch_vn_clothing_subcategory_Puma&ref=b41c8097-8efe-4acf-8919-0fa81bcb590a"
22
是開始切線的地方。
url.length
是要包含的數量。
只要域名在鏈接上保持不變,此方法就起作用。
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.