
Regex pattern matching with various end points

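I want to use JavaScript to extract a substring with a specific pattern from each string in the following list.

But I'm having trouble setting up the regex pattern.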

Input string list

  1. search?w=tot&DA=YZR&t__nil_searchbox=btn&sug=&o=&q=%EB%B9%84%EC%BD%98

  2. search?q=%EB%B9%84%EC%BD%98&go=%EC%A0…4%EB%B9%84%EC%BD%98&sc=8-2&sp=-1&sk=&cvid=f05407c5bcb9496990d2874135aee8e9

  3. where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8

Expected pattern matching result

%EB%B9%84%EC%BD%98 in every case above

Regex

/(query|q)=.*/

The match should end at either $ (the end of the string) or the first & that appears after the value.

What should I write for that end point?

You can test it here. Thanks.

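Turn the first capturing group into a non-capturing group, then use a negated character class instead of .* so that the match stops at the first & (or at the end of the string):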

\b(?:query|q)=([^&\n]*)

DEMO

> var s = "where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8"
undefined
> var pat = /\b(?:query|q)=([^&\n]*)/;
> pat.exec(s)[1]
'%EB%B9%84%EC%BD%98'
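
To check the pattern against all three inputs from the question rather than just the third, a small sketch along these lines could be used. The wrapper name extractValue is hypothetical and not part of the answer; it only guards against exec() returning null when nothing matches.

```javascript
// The three input strings from the question (the "…" in the second one
// is an elision in the original and is kept as-is).
var inputs = [
  'search?w=tot&DA=YZR&t__nil_searchbox=btn&sug=&o=&q=%EB%B9%84%EC%BD%98',
  'search?q=%EB%B9%84%EC%BD%98&go=%EC%A0…4%EB%B9%84%EC%BD%98&sc=8-2&sp=-1&sk=&cvid=f05407c5bcb9496990d2874135aee8e9',
  'where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8'
];

// The pattern suggested above: a non-capturing group for the key and a
// negated character class that stops at the first '&' or newline.
var pat = /\b(?:query|q)=([^&\n]*)/;

// Hypothetical wrapper: returns the captured value, or null when the
// pattern does not match (exec() returns null in that case).
function extractValue(s) {
  var match = pat.exec(s);
  return match === null ? null : match[1];
}

var results = inputs.map(extractValue);
console.log(results);
```

One caveat: \b also matches after characters such as "-", so a key like "x-q" would match too. If that matters, anchoring on (?:^|[?&]) instead of \b is one possible tightening.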

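Personally, I'd suggest an alternative approach that uses a more procedural function to find the required parameter value, rather than a "simple" regular expression. While it may look more complicated at first glance, it extends easily if you later need to find different or additional parameter values.

That said: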

/* haystack:
     String, the string in which you're looking for the
     parameter-values,
   needles:
     Array, the parameters whose values you're looking for
*/
function queryGrab(haystack, needles) {
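  // creating a regular expression from the array of needles,
  // given an array of ['q','query'], this will result in:
  // /^(?:q|query)=/i
  // (the trailing '=' stops 'q' from also matching keys such as
  // 'query' or 'qs'; the 'g' flag is omitted because a global
  // regex carries lastIndex over between test() calls):
  var reg = new RegExp('^(?:' + needles.join('|') + ')=', 'i'),

    // finding the index of the '?' character in the haystack
    // (-1 if there is none, so the whole string is then used):
    queryIndex = haystack.indexOf('?'),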

    // getting the substring from the haystack, starting
    // after the '?' character:
    keyValues = haystack.substring(queryIndex + 1)
      // splitting that string on the '&' characters,
      // to form an array:
      .split('&')
      // filtering that array (with Array.prototype.filter()),
      // the 'keyValue' argument is the current array-element
      // from the array over which we're iterating:
      .filter(function(keyValue) {
        // if RegExp.prototype.test() returns true,
        // meaning the supplied string ('keyValue')
        // is matched by the created regular expression,
        // the current element is retained in the filtered
        // array:
        return reg.test(keyValue);
    // converting that filtered-array to a string
    // on the naive assumption each searched-string
    // should return only one match:
    }).toString();

  // returning a substring of the keyValue, from after
  // the position of the '=' character:
  return keyValues.substring(keyValues.indexOf('=') + 1);
}

// essentially irrelevant, just for the purposes of
// providing a demonstration; here we get all the
// elements of class="haystack":
var haystacks = document.querySelectorAll('.haystack'),

  // the parameters we're looking for:
  needles = ['q', 'query'],

  // an 'empty' variable for later use:
  retrieved;

// using Array.prototype.forEach() to iterate over, and
// perform a function on, each of the .haystack elements
// (using Function.prototype.call() to use the array-like
// NodeList instead of an array):
Array.prototype.forEach.call(haystacks, function(stack) {
  // like filter(), the variable is the current array-element

  // retrieved caches the found parameter-value (using
  // a variable because we're using it twice):
  retrieved = queryGrab(stack.textContent, needles);

  // setting the next-sibling's text:
  stack.nextSibling.nodeValue = '(found: ' + retrieved + ')';

  // updating the HTML of the current node, to allow for
  // highlighting:
  stack.innerHTML = stack.textContent.replace(retrieved, '<span class="found">$&</span>');
});

/* CSS for the demonstration */
ul {
  margin: 0;
  padding: 0;
}
li {
  margin: 0 0 0.5em 0;
  padding-bottom: 0.5em;
  border-bottom: 1px solid #ccc;
  list-style-type: none;
  width: 100%;
}
.haystack {
  display: block;
  color: #999;
}
.found {
  color: #f90;
}

<!-- HTML for the demonstration -->
<ul>
  <li><span class="haystack">search?w=tot&amp;DA=YZR&amp;t__nil_searchbox=btn&amp;sug=&amp;o=&amp;q=%EB%B9%84%EC%BD%98</span> </li>
  <li><span class="haystack">search?q=%EB%B9%84%EC%BD%98&amp;go=%EC%A0…4%EB%B9%84%EC%BD%98&amp;sc=8-2&amp;sp=-1&amp;sk=&amp;cvid=f05407c5bcb9496990d2874135aee8e9</span> </li>
  <li><span class="haystack">where=nexearch&amp;query=%EB%B9%84%EC%BD%98&amp;sm=top_hty&amp;fbm=0&amp;ie=utf8</span> </li>
</ul>

JS Fiddle (for easier off-site experimentation)
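
The point about the procedural approach extending to additional parameters can be illustrated with a sketch like the one below. The function name grabAll and its return shape are assumptions for illustration only and are not part of the answer; it collects the raw (still percent-encoded) value of every requested parameter instead of a single one, and needs no DOM.

```javascript
// Hypothetical helper (not from the answer above): returns an object that
// maps each requested parameter name to its raw, still-encoded value.
function grabAll(haystack, needles) {
  // position of '?', or -1 when absent (the whole string is then used)
  var queryIndex = haystack.indexOf('?'),
    // lower-cased copy of the wanted names for case-insensitive lookup
    wanted = needles.map(function(n) { return n.toLowerCase(); }),
    found = {};

  haystack.substring(queryIndex + 1).split('&').forEach(function(keyValue) {
    var eq = keyValue.indexOf('='),
      key = eq === -1 ? keyValue : keyValue.substring(0, eq),
      alreadySeen = Object.prototype.hasOwnProperty.call(found, key);

    // keep only the first occurrence of each wanted key
    if (wanted.indexOf(key.toLowerCase()) !== -1 && !alreadySeen) {
      found[key] = eq === -1 ? '' : keyValue.substring(eq + 1);
    }
  });

  return found;
}

var grabbed = grabAll('search?q=%EB%B9%84%EC%BD%98&sc=8-2&sp=-1', ['q', 'sc']);
console.log(grabbed);
```

Comparing whole key names, rather than testing a prefix, avoids treating a key such as "qs" as a match for "q".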


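Regular expressions are not the best way to parse these query strings. There are libraries and tools for that, but if you want to do it yourself: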

function parseQueryString(url) {
    // take the part after the ? (or the whole string if there is no ?)
    var query = url.indexOf('?') === -1 ? url : url.split('?')[1];
    return _.object(                   // build an object from pairs
        query
            .split('&')                // split it by &
            .map(function(str) {       // turn parts into 2-elt array
                return str.split('='); // broken at =
            })
    );
}

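This uses Underscore's _.object, which creates an object from an array of key/value pair arrays, but if you don't want to use it, you could write your own equivalent in a couple of lines.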
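
As a rough sketch of what such a hand-written replacement might look like: the names objectFromPairs and parseQueryStringPlain below are hypothetical and only mirror the behaviour described for _.object (an array of [key, value] pairs in, an object out).

```javascript
// Hypothetical stand-in for _.object: builds an object from an array
// of [key, value] pairs. Later pairs overwrite earlier ones.
function objectFromPairs(pairs) {
  var result = {};
  pairs.forEach(function(pair) {
    result[pair[0]] = pair[1];
  });
  return result;
}

// Same idea as parseQueryString, without the library dependency.
function parseQueryStringPlain(url) {
  // indexOf gives -1 when there is no '?', so substring(0) keeps it all
  var queryIndex = url.indexOf('?');
  return objectFromPairs(
    url.substring(queryIndex + 1).split('&').map(function(str) {
      return str.split('=');
    })
  );
}

var parsed = parseQueryStringPlain('where=nexearch&query=%EB%B9%84%EC%BD%98&sm=top_hty&fbm=0&ie=utf8');
console.log(parsed.query);
```

Values stay percent-encoded here, exactly as they appear in the input string.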

Now the value you're looking for is simply

params = parseQueryString(url);
return params.q || params.query;
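
Among the existing tools alluded to above, one option in modern browsers and Node.js is the built-in URLSearchParams. The sketch below is only an illustration, and getSearchTerm is a hypothetical name. Note one difference from the snippets above: URLSearchParams returns percent-decoded values, so the result is the decoded text rather than %EB%B9%84%EC%BD%98.

```javascript
// Hypothetical helper: looks up 'q' first, then 'query', using the
// built-in URLSearchParams. Returns the decoded value, or null when
// neither parameter is present.
function getSearchTerm(url) {
  // take the part after '?', or the whole string when there is no '?'
  var queryIndex = url.indexOf('?');
  var params = new URLSearchParams(url.substring(queryIndex + 1));

  var value = params.get('q'); // get() returns null for a missing key
  return value !== null ? value : params.get('query');
}

var term = getSearchTerm('search?w=tot&DA=YZR&t__nil_searchbox=btn&sug=&o=&q=%EB%B9%84%EC%BD%98');
console.log(term);
```

If the raw encoded form is needed instead, encodeURIComponent can be applied to the result, or one of the string-splitting approaches above can be used.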
