Extracting text between Number and Parentheses in a String field in Excel

I have a bunch of values like the one below in a text field in Excel and was wondering if anyone knows a method of extracting the suburb name (i.e. Liverpool), which usually sits before the first parenthesis and after the last number, i.e. the postcode (3860 in this case).

PARK RD INT OF QUEENS DR LONDON 3860 LIVERPOOL (AGA) VIC

Thanks in advance.

Put your complex value into A1 and try this in B1:

=IF(ISERROR(FIND(" ";LEFT(A1;FIND(" (";A1)-1)));LEFT(A1;FIND(" (";A1)-1);RIGHT(LEFT(A1;FIND(" (";A1)-1);LEN(LEFT(A1;FIND(" (";A1)-1))-FIND("~";SUBSTITUTE(LEFT(A1;FIND(" (";A1)-1);" ";"~";LEN(LEFT(A1;FIND(" (";A1)-1))-LEN(SUBSTITUTE(LEFT(A1;FIND(" (";A1)-1);" ";""))))))

This will give you the word before the opening parenthesis.
In your example it will return LIVERPOOL. Note that this only works when the suburb is a single word.
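To make the logic of that formula easier to follow, here is a minimal Python sketch of the same idea: cut the text at " (" and keep whatever follows the last space. The function name `last_word_before_paren` is made up for illustration, and this is only an analogue of the formula, not a replacement for it.

```python
def last_word_before_paren(text: str) -> str:
    """Return the last space-separated word that precedes " (" in text."""
    # Counterpart of LEFT(A1, FIND(" (", A1) - 1); raises ValueError if " (" is absent.
    head = text[: text.index(" (")]
    # Keep everything after the last space, or all of head if it contains no space.
    return head.rsplit(" ", 1)[-1]


print(last_word_before_paren("PARK RD INT OF QUEENS DR LONDON 3860 LIVERPOOL (AGA) VIC"))
```

A two-word suburb such as "ST KILDA" would come back as "KILDA" only, which is the single-word limitation mentioned above.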

Try the formula below:

=TRIM(RIGHT(SUBSTITUTE(TRIM(LEFT(SUBSTITUTE(A1,"(",REPT(" ",300)),300))," ",REPT(" ",100)),100))

You may also try:

=FILTERXML("<t><s>"&SUBSTITUTE(FILTERXML("<t><s>"&SUBSTITUTE(A1,"(","</s><s>")&"</s></t>","//s[1]")," ","</s><s>")&"</s></t>","//s[last()]")

Office 365:

=LET(a,1+MATCH(1,0/ISNUMBER(0+MID(A1,SEQUENCE(LEN(A1)),1))),b,FIND("(",A1),TRIM(MID(A1,a,b-a)))

This type of thing is best done in steps, to make it easier to understand and modify. That is:

  • A1: Text you want to extract from
  • B1: Position of the first character after the last digit and the space that follows it: =IFERROR(MATCH(10^6, INDEX(--MID(A1, ROW(INDIRECT("1:"&LEN(A1))), 1), )), 0)+2
  • C1: Position of the opening parenthesis: =FIND("(",A1)
  • D1: Extract the text: =MID(A1,B1,C1-B1-1)

This works with multi-word text.
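The same three steps can be sketched in Python as a sanity check of the logic. This is only an illustration under my own assumptions: the name `extract_between` is hypothetical, and the fallbacks for text with no digit or no parenthesis are my choice rather than what the formulas do.

```python
def extract_between(text: str) -> str:
    """Return the text between the last digit and the first "(" in text."""
    # Step B1: index just past the last digit (0 when the text has no digit).
    start = max((i for i, ch in enumerate(text) if ch.isdigit()), default=-1) + 1
    # Step C1: index of the first opening parenthesis (end of text when absent).
    end = text.find("(")
    if end == -1:
        end = len(text)
    # Step D1: slice between the two positions and drop the surrounding spaces.
    return text[start:end].strip()


print(extract_between("PARK RD INT OF QUEENS DR LONDON 3860 LIVERPOOL (AGA) VIC"))
print(extract_between("1 MAIN ST 3000 ST KILDA (X) VIC"))
```

Keeping the two positions as separate values, as the helper cells do, makes it easy to adjust one boundary without touching the other.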

I tried this and it worked. There is a space at the beginning of every output, but I feel that is still manageable:

=IFERROR(LEFT(RIGHT(J13,LEN(J13)-MAX(IFERROR(FIND({1,2,3,4,5,6,7,8,9,0},J13,ROW(INDIRECT("1:"&LEN(J13)))),0))),MAX(FIND("(",RIGHT(J13,LEN(J13)-MAX(IFERROR(FIND({1,2,3,4,5,6,7,8,9,0},J13,ROW(INDIRECT("1:"&LEN(J13)))),0))))-1)),RIGHT(J13,LEN(J13)-MAX(IFERROR(FIND({1,2,3,4,5,6,7,8,9,0},J13,ROW(INDIRECT("1:"&LEN(J13)))),0))))

Since you have Excel 365, you could use:

Formula in B1:

=TEXTJOIN(" ",,FILTERXML("<t><s>"&SUBSTITUTE(A1," ","</s><s>")&"</s></t>","//s[position()<count(//s[starts-with(., '(')][1]/preceding::*)+1][position()>count(//s[.*0=0][last()]/preceding::*)+1]"))

The XPath expression used means:

  • //s[position()<count(//s[starts-with(., '(')][1]/preceding::*)+1] - Get all nodes that come before the very first node starting with an opening parenthesis.
  • [position()>count(//s[.*0=0][last()]/preceding::*)+1] - Of those returned nodes, keep only the ones positioned after the very last numeric node.

This means you'd only return those nodes that are in between the last numeric substring and the first opening parenthesis, as per your request.

I'm still looking for a more sound way of writing this XPath.
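For readers less familiar with XPath, the token-based filter described above can be approximated in Python. This is a rough sketch, not a translation: the name `tokens_between` is hypothetical, the regular expression is only a stand-in for XPath's numeric test, and the fallbacks when no numeric token or no "(" token exists are my own choice.

```python
import re

# Rough stand-in for the XPath numeric test (.*0=0): an optionally signed decimal.
NUMERIC = re.compile(r"-?(\d+(\.\d*)?|\.\d+)")


def tokens_between(text: str) -> str:
    """Join the tokens after the last numeric token and before the first "(" token."""
    tokens = text.split()
    # Index of the first token that starts with "(" (all tokens if none does).
    end = next((i for i, t in enumerate(tokens) if t.startswith("(")), len(tokens))
    # Index just past the last numeric token (0 if there is none).
    start = max((i for i, t in enumerate(tokens) if NUMERIC.fullmatch(t)), default=-1) + 1
    return " ".join(tokens[start:end])


print(tokens_between("PARK RD INT OF QUEENS DR LONDON 3860 LIVERPOOL (AGA) VIC"))
```

Because it works on whole tokens, a value such as "UNIT 12 3860 PORT MELBOURNE (AGA) VIC" yields both words of the suburb.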
