简体   繁体   中英

Extracting text between Number and Parantheses in a String field in Excel

I have a bunch of values like the one below in a text field in Excel and was wondering if anyone knows a method of extracting the Suburb name (ie Liverpool ) which is usually before the first parantheses and after the last number ie postcode (3860 in this case)

PARK RD INT OF QUEENS DR LONDON 3860 LIVERPOOL (AGA) VIC

Thanks in advance

Put your complex value into A1 and try this in B1:

=IF(ISERROR(FIND(" ";LEFT(A1;FIND(" (";A1)-1)));LEFT(A1;FIND(" (";A1)-1);RIGHT(LEFT(A1;FIND(" (";A1)-1);LEN(LEFT(A1;FIND(" (";A1)-1))-FIND("~";SUBSTITUTE(LEFT(A1;FIND(" (";A1)-1);" ";"~";LEN(LEFT(A1;FIND(" (";A1)-1))-LEN(SUBSTITUTE(LEFT(A1;FIND(" (";A1)-1);" ";""))))))

This will give you the word before ( .
In your example it will return Liverpool. But fyi: this will work for single words.

Try below formula-

=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(SUBSTITUTE(A1,"(",REPT(" ",300)),300))," ",REPT(" ",100)),100))

You may also try-

=FILTERXML("<t><s>"&SUBSTITUTE(FILTERXML("<t><s>"&SUBSTITUTE(A1,"(","</s><s>")&"</s></t>","//s[1]")," ","</s><s>")&"</s></t>","//s[last()]")

在此处输入图像描述

Office 365:

=LET(a,1+MATCH(1,0/ISNUMBER(0+MID(A1,SEQUENCE(LEN(A1)),1))),b,FIND("(",A1),TRIM(MID(A1,a,ba)))

This type of thing is best done in steps, to make it easier to understand and modify. That is:

  • A1: Text you want to extract from
  • B1: Position of first character after the last digit and space =IFERROR(MATCH(10^6, INDEX(--MID(A1, ROW( INDIRECT("1:"& LEN(A1))), 1), )), 0)+2
  • C1: Position of opening parenthesis =FIND("(",A1)
  • D1: Extract text =MID(A1,C1,D1-C1-1)

This works with multi-word text.

Tried this and worked however there is a space at the beginning of every output but I feel that is still manageable

=IFERROR(LEFT(RIGHT(J13,LEN(J13)-MAX(IFERROR(FIND({1,2,3,4,5,6,7,8,9,0},J13,ROW(INDIRECT("1:"&LEN(J13)))),0))),MAX(FIND("(",RIGHT(J13,LEN(J13)-MAX(IFERROR(FIND({1,2,3,4,5,6,7,8,9,0},J13,ROW(INDIRECT("1:"&LEN(J13)))),0))))-1)),RIGHT(J13,LEN(J13)-MAX(IFERROR(FIND({1,2,3,4,5,6,7,8,9,0},J13,ROW(INDIRECT("1:"&LEN(J13)))),0))))

Since you have Excel365, you could use:

在此处输入图像描述

Formula in B1 :

=TEXTJOIN(" ",,FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[position()<count(//s[starts-with(., '(')][1]/preceding::*)+1][position()>count(//s[.*0=0][last()]/preceding::*)+1]"))

The Xpath expression used means:

  • //s[position()<count(//s[starts-with(., '(')][1]/preceding::*)+1] - Get all nodes that are before the very first occurrence of an opening paranthesis.
  • [position()>count(//s[.*0=0][last()]/preceding::*)+1] - Of those returned nodes make sure the position is after the very last numeric node.

This means you'd only return those nodes that are in between the last numeric substring and the first opening paranthesis as per your request.

I'm still looking for a more sound way of writing this Xpath.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM