If I have the following regex strings:
String one = "\"/^[^.]+$|\\.(?!(avi|bmp)$)([^.]+$)/i\"";
String two = "\"/^.*\\.(txt)$/i\"";
I just want to parse the file extensions out of these strings. For example, I'd like:
List<String> fileExtensionsOne = getFileExtensionsFromRegex(one); // Returns ("avi", "bmp")
List<String> fileExtensionsTwo = getFileExtensionsFromRegex(two); // Returns ("txt")
What's the best way to implement getFileExtensionsFromRegex? Is it possible to convert the string to a Java regex object and grab the groups out of it, i.e. without applying the pattern to some input text?
Edit: I think I can rely on the regex patterns staying fairly consistent. Each one is built as either this:
'/^.*\\.(' + _map(extensions, 'text').join('|') + ')$/i'
or this:
'/^[^.]+$|\\.(?!(' + _map(extensions, 'text').join('|') + ')$)([^.]+$)/i'
My approach would be to write a regex that analyzes the regex, something like
.*\(([a-z0-9\|]+)\).*
(Disclaimer: I haven't checked it for correct regex syntax.)
This looks for a group inside the regex: an opening paren \(, then any number of letters, digits and pipes [a-z0-9\|]+ (assuming that file extensions consist of exactly these characters), followed by the closing paren \). The content between the parens is returned as group(1); that is what the extra, unescaped parens just inside the \( and \) pair are for.
In the first example this should give avi|bmp, and in the second one txt.
Then do a split("\\|") on the group(1) result, and you get the individual extensions.
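The approach above can be sketched in Java roughly as follows. This is a minimal sketch, not a tested production implementation: the class name RegexExtensionParser is made up for illustration, and it uses Matcher.find() on the core of the pattern instead of wrapping it in .* on both sides, so it picks the first matching group, not the last. It also assumes lowercase extensions, as the character class in the answer does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExtensionParser {
    // A parenthesized group made up only of lowercase letters, digits and pipes,
    // e.g. "(avi|bmp)" or "(txt)". Group 1 is the text between the parens.
    private static final Pattern EXTENSION_GROUP =
            Pattern.compile("\\(([a-z0-9|]+)\\)");

    public static List<String> getFileExtensionsFromRegex(String regex) {
        Matcher matcher = EXTENSION_GROUP.matcher(regex);
        if (!matcher.find()) {
            // No group of the expected shape: return an empty list.
            return new ArrayList<>();
        }
        // Split "avi|bmp" on the literal pipe to get the individual extensions.
        return new ArrayList<>(Arrays.asList(matcher.group(1).split("\\|")));
    }
}
```

Groups such as (?!...) or ([^.]+$) are skipped because they contain characters outside the class, so only the extension alternation is picked up in the two pattern shapes from the question.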
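This might be what you need:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static List<String> getFileExtensionsFromRegex(String s) {
    // Any run of two or more alphanumerics counts as an extension;
    // the single-letter "i" flag is too short to match.
    Pattern pattern = Pattern.compile("[a-zA-Z0-9]{2,}");
    Matcher matcher = pattern.matcher(s);
    List<String> result = new ArrayList<>();
    while (matcher.find()) {
        result.add(matcher.group());
    }
    return result;
}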
Your logic could start by checking, for each character, whether it is a letter according to its ASCII code. Here is a quick ASCII character reference:
{
"31": "", "32": " ", "33": "!", "34": "\"", "35": "#",
"36": "$", "37": "%", "38": "&", "39": "'", "40": "(",
"41": ")", "42": "*", "43": "+", "44": ",", "45": "-",
"46": ".", "47": "/", "48": "0", "49": "1", "50": "2",
"51": "3", "52": "4", "53": "5", "54": "6", "55": "7",
"56": "8", "57": "9", "58": ":", "59": ";", "60": "<",
"61": "=", "62": ">", "63": "?", "64": "@", "65": "A",
"66": "B", "67": "C", "68": "D", "69": "E", "70": "F",
"71": "G", "72": "H", "73": "I", "74": "J", "75": "K",
"76": "L", "77": "M", "78": "N", "79": "O", "80": "P",
"81": "Q", "82": "R", "83": "S", "84": "T", "85": "U",
"86": "V", "87": "W", "88": "X", "89": "Y", "90": "Z",
"91": "[", "92": "\\", "93": "]", "94": "^", "95": "_",
"96": "`", "97": "a", "98": "b", "99": "c", "100": "d",
"101": "e", "102": "f", "103": "g", "104": "h", "105": "i",
"106": "j", "107": "k", "108": "l", "109": "m", "110": "n",
"111": "o", "112": "p", "113": "q", "114": "r", "115": "s",
"116": "t", "117": "u", "118": "v", "119": "w", "120": "x",
"121": "y", "122": "z", "123": "{", "124": "|", "125": "}",
"126": "~", "127": ""
}
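The lowercase letters run from
String.fromCharCode(97) // returns 'a'
to
String.fromCharCode(122) // returns 'z'
(String.fromCharCode is JavaScript; in Java the equivalent is the cast (char) 97.) When you hit a letter, keep comparing the following characters until you reach one that is not a letter; that run of letters is a candidate extension.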
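A rough Java sketch of this character-by-character scan could look like the following. The class name LetterRunScanner and the MIN_LENGTH cutoff are assumptions added for illustration: the answer does not say how to skip the trailing i flag, so this sketch simply drops runs shorter than two letters. It checks letters only, as described, so an extension containing digits (mp3, for instance) would be cut short unless the check is widened.

```java
import java.util.ArrayList;
import java.util.List;

public class LetterRunScanner {
    // Assumption: shorter runs are regex syntax (e.g. the "i" flag), not extensions.
    private static final int MIN_LENGTH = 2;

    // ASCII letters: 'A'..'Z' are codes 65..90, 'a'..'z' are codes 97..122.
    private static boolean isAsciiLetter(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    public static List<String> getFileExtensionsFromRegex(String regex) {
        List<String> result = new ArrayList<>();
        int i = 0;
        while (i < regex.length()) {
            if (!isAsciiLetter(regex.charAt(i))) {
                i++; // not a letter: move on
                continue;
            }
            // Start of a letter run: advance until the first non-letter.
            int start = i;
            while (i < regex.length() && isAsciiLetter(regex.charAt(i))) {
                i++;
            }
            if (i - start >= MIN_LENGTH) {
                result.add(regex.substring(start, i));
            }
        }
        return result;
    }
}
```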