
Regexp to accept numbers, letters and special characters

NKA-198, HM-1-0022, SCIDG 133

I want a regexp for the above codes. How can I accept these codes and assign them to a variable?

Please advise, and thanks in advance.

First make sure that you have a solid understanding of the general structure of the strings you want to match, e.g. which separator symbols are permissible. Your example suggests hyphen (-) and space, but what about +? Would you want to match NKA 198 and SCIDG-133 too?
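To make the separator question concrete, here is a minimal sketch that builds an anchored pattern from whichever separator characters you decide to allow. The helper name `buildIdPattern` and its escaping step are assumptions made for illustration; they are not part of the original answer.

```javascript
// Hypothetical helper: build an anchored ID pattern from the separator
// characters allowed between the letter group and the digit groups.
function buildIdPattern(separators) {
  // Escape the characters that are special inside a character class.
  const escaped = separators.replace(/[\\\]^-]/g, "\\$&");
  return new RegExp("^[A-Z]+(?:[" + escaped + "][0-9]+)+$");
}

const strict = buildIdPattern(" -");   // space or hyphen only
const relaxed = buildIdPattern(" -+"); // additionally allow '+'

console.log(strict.test("NKA 198"));   // true
console.log(strict.test("SCIDG-133")); // true
console.log(strict.test("NKA+198"));   // false
console.log(relaxed.test("NKA+198"));  // true
```

Deciding on the separator set first keeps the pattern as strict as the real IDs allow.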

As a base for further refinement, use the following code fragment:

var orig = "some string containing ids like 'NKA-198' and 'SCIDG 133'";
var first_id = orig.replace(/^.*?([A-Z]+([ -][0-9]+)+).*/, "$1");
var last_id = orig.replace(/(?:.*[^A-Z]|^)([A-Z]+([ -][0-9]+)+).*/, "$1");
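One caveat when trying the fragment: if nothing matches, `replace` returns the input unchanged, so the variable would hold the whole string rather than an ID. The sketch below wraps the same two expressions in hypothetical helpers (`firstId` and `lastId` are names chosen here, not from the original answer) that return `null` in that case.

```javascript
// Same two expressions as in the fragment above.
const FIRST_RE = /^.*?([A-Z]+([ -][0-9]+)+).*/;
const LAST_RE = /(?:.*[^A-Z]|^)([A-Z]+([ -][0-9]+)+).*/;

// Hypothetical helpers: return the captured ID, or null when no ID is found.
function firstId(text) {
  const m = FIRST_RE.exec(text);
  return m ? m[1] : null;
}

function lastId(text) {
  const m = LAST_RE.exec(text);
  return m ? m[1] : null;
}

const sample = "some string containing ids like 'NKA-198' and 'SCIDG 133'";
console.log(firstId(sample));        // "NKA-198"
console.log(lastId(sample));         // "SCIDG 133"
console.log(firstId("no ids here")); // null

// For comparison: replace leaves a non-matching input untouched.
console.log("no ids here".replace(FIRST_RE, "$1")); // "no ids here"
```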

Explanation

  • core ( ([A-Z]+([ -][0-9]+)+) )

    Matches a sequence of capital letters followed by a digit sequence that is preceded by a single hyphen or space character. The 'space or hyphen plus number' part may repeat arbitrarily often but must occur at least once. This specification may be too restrictive or too lax, which is why you have to look up or guess the general rules that the IDs you wish to match obey. Strictly speaking, the regex you asked for is ^(NKA-198|HM-1-0022|SCIDG 133)$ , which is most certainly not what you need.

    The outermost parentheses define the match as the first capture group, so the matched content can be referenced as $1 in the replace method. Using replace this way also requires the regexp to match the whole original string, so that everything except the ID is discarded.

  • additional parts / first regexp

    Matches anything non-greedily, starting at the string's beginning. The non-greedy operator ( .*? ) makes sure that the shortest possible prefix is consumed that still allows the complete pattern to match (see what happens if you drop the question mark). Thus you end up with the first matching ID in first_id .

  • additional parts / second regexp

    Matches greedily (= as much as possible) until an identifier pattern matches. Thus you end up with the last match. The negated character class ( [^A-Z] ) is necessary, since there is no further information about the structure of the IDs in question, specifically which and how many leading capital letters there are. The class makes sure that the last character before the beginning of the matched ID is not a capital letter, so the greedy prefix cannot swallow part of the ID. The ^ in the alternation caters for the special case that orig starts with a matchable ID: in this case the negated character class would not match, because there is no 'last prefix character' before the match.
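The three points above can be checked with a short experiment. This is only a sketch; the variable names and the second sample string are made up for illustration.

```javascript
const text = "some string containing ids like 'NKA-198' and 'SCIDG 133'";

// 1. The core pattern, anchored, accepts all three codes from the question.
const core = /^[A-Z]+([ -][0-9]+)+$/;
const codes = ["NKA-198", "HM-1-0022", "SCIDG 133"];
const allAccepted = codes.every((c) => core.test(c));
console.log(allAccepted); // true

// 2. First regexp without the question mark: the greedy prefix now eats
//    as many leading characters as it can, including most of the last ID.
const greedyFirst = text.replace(/^.*([A-Z]+([ -][0-9]+)+).*/, "$1");
console.log(greedyFirst); // "G 133"

// 3. Second regexp on a string that starts with an ID (made-up sample).
const startsWithId = "NKA-198 is the only id";
const withCaret = startsWithId.replace(
  /(?:.*[^A-Z]|^)([A-Z]+([ -][0-9]+)+).*/, "$1");
const withoutCaret = startsWithId.replace(
  /.*[^A-Z]([A-Z]+([ -][0-9]+)+).*/, "$1");
console.log(withCaret);    // "NKA-198"
console.log(withoutCaret); // input unchanged: nothing matched
```

The "G 133" result in step 2 is also what the [^A-Z] class in the second regexp guards against.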

References

A more detailed (and more competent) explanation of regexp patterns and usage can be found here. MDN provides info on regular expression usage in JavaScript.
