How can I get a specific part of a URL using RegEx?


I am trying to get a part of a file download using RegEx (or other methods). I have pasted below the link that I am trying to parse and put the part I am trying to select in bold.

https://minecraft.azureedge.net/bin-linux/bedrock-server-.zip

I have looked around and thought about trying named capture groups, but I couldn't figure it out. I would like to be able to do this in JavaScript/Node.js, even if it requires a module 👻.

You could use the below regex:

[\d.]+(?=\.\w+$)

This matches a run of digits and dots that is followed by a file extension. You could also make it stricter, so that the match must start with a digit and every dot must be followed by more digits:

\d+(?:\.\d+)*(?=\.\w+$)
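As a rough sketch of how the two patterns behave in Node.js: the variable names `loose` and `strict` and the `archive..zip` test string are made up for illustration, and the sample URL assumes the version segment is 1.7.0.13, as in the other answers.

```javascript
// Both patterns from this answer, written as JavaScript regex literals.
const loose = /[\d.]+(?=\.\w+$)/
const strict = /\d+(?:\.\d+)*(?=\.\w+$)/

// Assumed sample URL (version taken from the other answers).
const url = 'https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip'

console.log(url.match(loose)[0])  // '1.7.0.13'
console.log(url.match(strict)[0]) // '1.7.0.13'

// The loose pattern also accepts a bare dot, the strict one does not.
console.log('archive..zip'.match(loose)[0]) // '.'
console.log('archive..zip'.match(strict))   // null
```

For well-formed file names the two give the same result; the stricter form only matters when the input may contain stray dots.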

Perhaps a regular expression like this is what you need?

var url = 'https://minecraft.azureedge.net/bin-linux9.9.9/bedrock-server-1.7.0.13.zip'
var match = url.match(/(\d+[.\d+]*)(?=\.\w+$)/gi)
console.log(match) // [ '1.7.0.13' ]

The pattern /(\d+[.\d+]*)(?=\.\w+$)/gi asks for a substring that:

  1. starts with one or more digit characters, i.e. \d+
  2. is immediately followed by zero or more characters from the class [.\d+], i.e. dots and digits (inside square brackets the + is a literal plus sign, so plus signs are allowed too)
  3. and finally, because of the lookahead (?=\.\w+$), is immediately followed by a file extension like .zip at the end of the string

For more information on special characters like + and *, see this documentation. Hope that helps!
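One detail of the character class is easy to miss: within square brackets, + is an ordinary character rather than a quantifier. The sketch below compares the pattern from this answer with a grouped alternative; the file name `file-1+2.zip` and the variable names are hypothetical and chosen only to make the difference visible.

```javascript
// Pattern from this answer: [.\d+] is a class of dot, digit, or literal '+'.
const withClass = /(\d+[.\d+]*)(?=\.\w+$)/
// Grouped alternative: each dot must be followed by one or more digits.
const withGroup = /(\d+(?:\.\d+)*)(?=\.\w+$)/

// Hypothetical file name containing a plus sign.
const name = 'file-1+2.zip'

console.log(name.match(withClass)[0]) // '1+2'
console.log(name.match(withGroup)[0]) // '2'
```

For plain dotted version numbers such as 1.7.0.13 both patterns return the same match.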

I'd stick with this:

-(\d+(?:\.\d+)*)(?:\.\w+)$
  • The leading - matches the dash before the version number
  • The parentheses create a capture group
  • Inside it, \d+ matches one or more digits
  • (?: ... ) creates a group without capturing it
  • Inside that group, \.\d+ matches a dot followed by one or more digits
  • Thanks to *, that group repeats zero or more times
  • After that, (?:\.\w+)$ matches the extension at the end of the string without capturing it

So this pattern captures the whole version number between the dash and the extension, be it 1, 1.7, 1.7.0, 1.7.0.13, 1.7.0.13.5, etc. In the match array, index [0] holds the entire regex match and index [1] holds your captured group, the number you're looking for.
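A minimal sketch of reading the match array in Node.js, plus the same pattern with a named capture group, since the question mentions them. The group name `version` is an arbitrary choice, and named groups need a reasonably recent engine (Node.js 10 or later).

```javascript
const url = 'https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip'

// Numbered capture group, exactly as in the pattern above.
const match = url.match(/-(\d+(?:\.\d+)*)(?:\.\w+)$/)
console.log(match[0]) // '-1.7.0.13.zip' (the entire match)
console.log(match[1]) // '1.7.0.13'      (the captured version)

// Same pattern with a named group; 'version' is a name picked for this example.
const named = url.match(/-(?<version>\d+(?:\.\d+)*)(?:\.\w+)$/)
console.log(named.groups.version) // '1.7.0.13'
```

Note that match() returns null when nothing matches, so real code should check the result before indexing into it.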

You can use Node.js built-in modules to simplify the match: URL and path isolate the file name, and a simple regexp then finishes the job.

const { URL } = require('url')
const path = require('path')

const test = new URL(
  'https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip'
)
/*
  test.pathname = '/bin-linux/bedrock-server-1.7.0.13.zip'
  path.parse(test.pathname) = { root: '/',
    dir: '/bin-linux',
    base: 'bedrock-server-1.7.0.13.zip',
    ext: '.zip',
    name: 'bedrock-server-1.7.0.13' }
  match = [ '1.7.0.13', index: 15, input: 'bedrock-server-1.7.0.13' ]
*/
const match = path.parse(test.pathname)
  .name
  .match(/[0-9.]*$/)
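The same idea could be wrapped in a small helper. This is only a sketch: `extractVersion` is a hypothetical name, the example.com URL is made up, and the sketch swaps in a stricter pattern than the one above so that a file name without a version yields null instead of an empty match.

```javascript
const path = require('path')

// Hypothetical helper, not part of the answer above.
function extractVersion(urlString) {
  // URL is available as a global in current Node.js versions.
  const { pathname } = new URL(urlString)
  // path.posix keeps '/' as the separator regardless of the host OS.
  const { name } = path.posix.parse(pathname) // file name without extension
  // Digits separated by single dots, anchored at the end of the name.
  const match = name.match(/\d+(?:\.\d+)*$/)
  return match ? match[0] : null
}

console.log(extractVersion('https://minecraft.azureedge.net/bin-linux/bedrock-server-1.7.0.13.zip')) // '1.7.0.13'
console.log(extractVersion('https://example.com/downloads/readme.txt')) // null
```

Parsing the URL first means a query string or fragment cannot interfere with the match, because the pattern only ever sees the file name.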
