Regular expression for street address regardless of order (either /<house> <street>/ or /<street> <house>/)

I'm trying to parse a street address into the street name and the house number, but I want to allow the house number to appear either before or after the street name. The following code yields a syntax error:

const { groups: { house, street } } = streetAddress.match(/^(?<house>\d+)\s+(?<street>.*)|(?<street>.*)\s+(?<house>\d+)$/);
SyntaxError: Invalid regular expression: /^(?<house>\d+)\s+(?<street>.*)|(?<street>.*)\s+(?<house>\d+)$/: Duplicate capture group name

Is there an elegant way to do this?

I don't think it is possible to do this with named capture groups. Logically, I know, the group names are not duplicated, because the two occurrences of each name sit on opposite sides of an alternation. But recognizing that would require the parser to analyze the semantics of the regex, and parsers really shouldn't go that deep in their initial analysis. In fact, if they did, you could use a regex parser as a SAT solver, which would imply that parsing a regex is NP-hard in the worst case.
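For what it's worth, the rule has since been relaxed: ECMAScript 2025 allows a group name to be reused as long as the duplicates sit in different alternatives, so newer engines accept the pattern from the question while older ones still throw the SyntaxError shown above. A minimal sketch of how one might probe for that at runtime follows; the helper names `compileDuplicateNamePattern` and `parseWithDuplicateNames` are made up for illustration, and the alternation is wrapped in a non-capturing group so that `^` and `$` apply to both branches.

```javascript
// Hypothetical helper: compiles the question's pattern if the engine allows
// the same group name in different alternatives, otherwise returns null.
function compileDuplicateNamePattern() {
  try {
    // Built from a string so that an older engine throws here at runtime
    // instead of rejecting the whole script when it is parsed.
    return new RegExp(
      "^(?:(?<house>\\d+)\\s+(?<street>.*)|(?<street>.*)\\s+(?<house>\\d+))$"
    );
  } catch (e) {
    return null; // "Duplicate capture group name" on older engines
  }
}

const pattern = compileDuplicateNamePattern();

function parseWithDuplicateNames(streetAddress) {
  if (pattern === null) return null; // feature not available in this engine
  const match = streetAddress.match(pattern);
  if (match === null) return null; // address fits neither order
  const { house, street } = match.groups;
  return { house, street };
}

console.log(parseWithDuplicateNames("Neuer Weg 1234"));
```

On an engine without the feature this prints null, which is why a fallback is still needed if older runtimes have to be supported.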

But enough of the digression; this works instead, using plain numbered groups:

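const streetAddress = process.argv[2];
// The non-capturing group makes ^ and $ apply to both branches of the alternation.
const match = streetAddress.match(/^(?:(\d+)\s+(.*)|(.*)\s+(\d+))$/);
// Only one branch participates in a match; the other branch's groups are undefined.
const house = match[1] || match[4];
const street = match[2] || match[3];
console.log({ house, street });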

Examples:

> node x.js "1234 Montgomery"
{ house: '1234', street: 'Montgomery' }

> node x.js "Neuer Weg 1234"
{ house: '1234', street: 'Neuer Weg' }
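If you would still rather read the results by name than by index, one workaround is to give each branch its own group names and merge them afterwards. This is only a sketch: the function name `parseWithDistinctNames` and the group names `house1`, `street1`, `street2` and `house2` are invented for the example, and it returns null when the address matches neither order.

```javascript
// Hypothetical helper; house1/street1/street2/house2 are made-up names,
// chosen so that no group name appears twice in the pattern.
function parseWithDistinctNames(streetAddress) {
  const match = streetAddress.match(
    /^(?:(?<house1>\d+)\s+(?<street1>.*)|(?<street2>.*)\s+(?<house2>\d+))$/
  );
  if (match === null) return null; // neither order matched
  const g = match.groups;
  // Groups in the branch that did not participate are undefined,
  // so ?? falls through to the other branch's value.
  return {
    house: g.house1 ?? g.house2,
    street: g.street1 ?? g.street2,
  };
}

console.log(parseWithDistinctNames("1234 Main Street"));
console.log(parseWithDistinctNames("Neuer Weg 1234"));
console.log(parseWithDistinctNames("no number here"));
```

The `??` operator is used instead of `||` so that an empty but matched street name is not mistaken for a missing one.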

I came up with this solution, which I think is reasonably elegant:

const { groups: { house, street } } =
    streetAddress.match(/^(?<house>\d+)\s+(?<street>.*)$/) || 
    streetAddress.match(/^(?<street>.*)\s+(?<house>\d+)$/);

(If an address matches neither pattern, both calls return null and the destructuring throws a TypeError, so this probably needs to be wrapped in a try block or preceded by a null check.)
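As an alternative to a try block, the two-pattern approach can be wrapped in a small function that checks for null before destructuring. This is a sketch under that assumption; the name `parseOrNull` is hypothetical, and returning null for unparseable input is just one possible convention.

```javascript
// Hypothetical wrapper around the two-pattern approach above.
function parseOrNull(streetAddress) {
  const match =
    streetAddress.match(/^(?<house>\d+)\s+(?<street>.*)$/) ||
    streetAddress.match(/^(?<street>.*)\s+(?<house>\d+)$/);
  // match is null when neither pattern applies, so check before destructuring.
  if (match === null) return null;
  const { house, street } = match.groups;
  return { house, street };
}

console.log(parseOrNull("1234 Main Street"));
console.log(parseOrNull("Neuer Weg 1234"));
console.log(parseOrNull("no number here"));
```

Because each pattern is a separate regex literal, the same group names can be used in both without triggering the duplicate-name error.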
