How to find a substring from a string with certain condition in Excel

I have records like this:

A                          Result
Hello AP#12/22 Welcome     AP#12
Thanks AP#123-21           AP#123
No problem AP#111          AP#111

As you can see, I need the AP code from the string. It must not contain the - or / part.

Note:

The AP code can have any number of digits.

It can appear at the start or the end of the string.

The AP code can be followed by /, -, or any other special symbol such as :.

So I need a generalized formula to get the AP code, rather than one that checks for each special character (/, -, :).

I want to achieve this without using VB.

Probably not the most efficient solution, but here's a way without VBA (line break added for readability):

= "AP#"&MID(MID(A1,FIND("AP#",A1)+3,999),1,
  MAX((ISNUMBER(MID(MID(A1,FIND("AP#",A1)+3,999),{1,2,3},1)+0)+0)*{1,2,3}))

EDIT

Slightly better solution:

= MID(A1,FIND("AP#",A1),
  MAX(ISNUMBER(MID(MID(A1,FIND("AP#",A1)+3,999),{1,2,3},1)+0)*{1,2,3})+3)

EDIT (again)

As pointed out in the comments, this does not take into account something like AP#1-1. Here is the updated formula that handles this case:

= MID(A1,FIND("AP#",A1),IFERROR(MATCH(FALSE,
  ISNUMBER(MID(MID(A1,FIND("AP#",A1)+3,3),{1,2,3},1)+0),0),4)+2)

As requested, here is how this formula works. I'll break it down step by step. This is a fairly long explanation, but if you take it one step at a time, you should be able to follow the entire formula. I'm going to explain what is going on from the inside out.

FIND("AP#",A1) returns the character index in A1 where the first instance of AP# appears.

For simplicity, I will refer to FIND("AP#",A1) as <x1> in the next step.

MID(A1,<x1>+3,3) returns the 3 characters in A1 that appear immediately after AP#. It only returns 3 characters because, going by the original problem, up to 3 digits can appear after AP#.

(Quick note: originally I had this part of the formula as MID(A1,<x1>+3,999), but while writing this explanation I realized that 999 could be reduced to 3. 999 would still work; 3 is just simpler and makes the formula more efficient.)

I will refer to this value MID(A1,<x1>+3,3) as <x2> in the next step.
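To make these first two steps concrete, here is a small Python sketch. Python is used only for illustration and is not part of the original answer; the helpers `find` and `mid` are hypothetical stand-ins for Excel's FIND and MID, and the input is the first sample row from the question.

```python
def find(needle, text):
    """Stand-in for Excel's FIND: 1-based, case-sensitive position of needle."""
    position = text.find(needle)
    if position == -1:
        raise ValueError("#VALUE!")  # FIND yields an error when nothing is found
    return position + 1


def mid(text, start, length):
    """Stand-in for Excel's MID: 1-based start, truncated at the end of text."""
    return text[start - 1:start - 1 + length]


a1 = "Hello AP#12/22 Welcome"
x1 = find("AP#", a1)       # position of the "A" in "AP#"
x2 = mid(a1, x1 + 3, 3)    # the 3 characters right after "AP#"
print(x1, repr(x2))        # 7 '12/'
```

The `+3` skips over the three characters of AP# itself, so `x2` starts at the first character that could be a digit of the code.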

MID(<x2>,{1,2,3},1) essentially converts <x2>, which is a string of 3 characters, to an array of 3 strings, each 1 character long. In other words, if <x2> is (for example) "1-2", then MID(<x2>,{1,2,3},1) is {"1","-","2"}. Converting the 3-character string to a 1x3 array of single characters is necessary in order to analyze each character individually.
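The same splitting step can be sketched in Python (an illustration only; `mid` is a hypothetical stand-in for Excel's MID, and `chars` is just a name chosen here for the resulting array):

```python
def mid(text, start, length):
    """Stand-in for Excel's MID: 1-based start, truncated at the end of text."""
    return text[start - 1:start - 1 + length]


x2 = "1-2"
chars = [mid(x2, k, 1) for k in (1, 2, 3)]   # one MID call per array element
print(chars)                                  # ['1', '-', '2']

short = [mid("7", k, 1) for k in (1, 2, 3)]  # fewer than 3 characters available
print(short)                                  # ['7', '', '']
```

When fewer than 3 characters follow AP#, the missing positions come back as empty strings, which later steps treat as non-numbers.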

I will refer to MID(<x2>,{1,2,3},1) as <x3> in the next step.

<x3>+0 may seem like a simple step, but there is a lot going on here. Keep in mind that <x3> is still an array of strings, not numbers (even if they look like numbers). The +0 converts every string that looks like a number to a number, and converts every string that doesn't look like a number to an error value (in this case, #VALUE!).

Sticking with the same example, {"1","-","2"}+0 equals {1,#VALUE!,2}.

I will refer to <x3>+0 as <x4> in the next step.
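A rough Python model of the +0 coercion follows. It is a sketch under two assumptions that are mine, not the answer's: only the ASCII digits 0-9 count as "looking like a number" for a single character, and the #VALUE! error is represented as a plain string. The function name `plus_zero` is hypothetical.

```python
def plus_zero(ch):
    """Model of <single character> + 0: a digit becomes a number, anything
    else (including an empty string) becomes the #VALUE! error."""
    if ch.isascii() and ch.isdigit():
        return int(ch)
    return "#VALUE!"   # error value modelled as a string


x3 = ["1", "-", "2"]
x4 = [plus_zero(ch) for ch in x3]
print(x4)   # [1, '#VALUE!', 2]
```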

MATCH(FALSE,ISNUMBER(<x4>),0) returns the first index of <x4> at which the value is not a number. The idea here is to find the index of the first non-number, and then include everything up to that index (minus one).

Sticking with the same example, MATCH(FALSE,ISNUMBER({1,#VALUE!,2}),0) returns 2, because the 2nd index in {1,#VALUE!,2} is the first index that is not a number.

I will refer to MATCH(FALSE,ISNUMBER(<x4>),0) as <x5> in the next step.
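As an illustration, the lookup can be modelled in Python like this. The function name `match_false_isnumber` is hypothetical, and `None` is used here as a stand-in for the error that MATCH returns when there is nothing to find.

```python
def match_false_isnumber(values):
    """1-based position of the first entry that is not a number, or None when
    every entry is a number (None models the error MATCH would return)."""
    for position, value in enumerate(values, start=1):
        if not isinstance(value, (int, float)):
            return position
    return None


x4 = [1, "#VALUE!", 2]
x5 = match_false_isnumber(x4)
print(x5)   # 2
```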

It is possible that all values in <x4> are numbers, in which case <x5> returns an error because it can't find a match for a non-number. IFERROR(<x5>,4) fixes this issue: it returns the value 4 if all values in <x4> are numbers. The reason to return 4 is that, in this case, all 3 of the characters following AP# are numbers, so the first index after AP# that we aren't considering is the 4th index.
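The three possible outcomes of this step can be checked with a short Python sketch (illustrative only; `first_excluded_index` is a hypothetical name, and the digit test assumes that only ASCII digits count as numbers):

```python
def first_excluded_index(three_chars):
    """1-based index of the first character after AP# that is not part of the
    code: the first non-digit among 3 characters, or 4 when all are digits."""
    for k in (1, 2, 3):
        ch = three_chars[k - 1:k]               # "" if fewer than k characters
        if not (ch.isascii() and ch.isdigit()):
            return k                             # MATCH found a non-number
    return 4                                     # MATCH failed, IFERROR gives 4


for sample in ("1-1", "12/", "123"):
    print(sample, first_excluded_index(sample))
# 1-1 2
# 12/ 3
# 123 4
```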

I will refer to IFERROR(<x5>,4) as <x6> in the next step.

<x6>+2 may seem like a strange calculation, and it is, so I will write it a different way that makes more sense: (<x6>-1)+3

Remember what <x6> represents here: it is the index of the first non-number that appears in the string of 3 characters after AP#. Therefore, <x6>-1 is the number of characters to include after AP#.

Now, why add 3? (<x6>-1)+3 is necessary to include the 3 characters in AP# itself. This will make sense in the next step.

I will refer to <x6>+2 as <x7> in the next step.

MID(A1,FIND("AP#",A1),<x7>) returns a portion of string A1, starting at the A in AP# and spanning <x7> characters. And how large is <x7>? It is however many digits are in the AP# code, plus 3. (Again, we must add 3 to include the 3 AP# characters themselves in the calculation.)

This is the entire calculation.
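Putting the steps together, the whole formula can be mirrored in Python as a way to check the walkthrough against the sample rows. This is a sketch, not something to paste into Excel: the function name `extract_ap_code` is hypothetical, the variable names follow the <x1> to <x7> labels used above, and the digit test assumes only ASCII digits count as numbers.

```python
def extract_ap_code(a1):
    x1 = a1.find("AP#") + 1                 # FIND("AP#",A1), 1-based; 0 = absent
    if x1 == 0:
        raise ValueError("AP# not found")   # the worksheet formula shows an error
    x2 = a1[x1 + 2:x1 + 5]                  # MID(A1,<x1>+3,3)
    x6 = 4                                  # IFERROR default: all 3 are digits
    for k in (1, 2, 3):
        ch = x2[k - 1:k]                    # k-th character of <x2>, or ""
        if not (ch.isascii() and ch.isdigit()):
            x6 = k                          # first non-digit, as MATCH reports it
            break
    x7 = x6 + 2                             # (<x6>-1) digits + 3 chars of "AP#"
    return a1[x1 - 1:x1 - 1 + x7]           # MID(A1,<x1>,<x7>)


for row in ("Hello AP#12/22 Welcome", "Thanks AP#123-21", "No problem AP#111"):
    print(extract_ap_code(row))
# AP#12
# AP#123
# AP#111
```

Because only 3 characters after AP# are inspected, a longer code such as AP#1234 would come back as AP#123 with this version of the formula.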

Come to think of it, you may want to wrap an IFERROR around the entire calculation to take care of cases where AP# isn't found in the string, e.g. something like:

= IFERROR(MID(A1,FIND("AP#",A1),IFERROR(MATCH(FALSE,
  ISNUMBER(MID(MID(A1,FIND("AP#",A1)+3,3),{1,2,3},1)+0),0),4)+2),"no match")

But really that is your call. I'm not sure if this is necessary.
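The effect of the outer IFERROR can be pictured with a small Python analogy (illustrative only; `iferror` and `find_ap` are hypothetical helpers, and a Python exception plays the role of the worksheet error):

```python
def iferror(compute, fallback):
    """Return compute() unless it raises ValueError, then return fallback."""
    try:
        return compute()
    except ValueError:
        return fallback


def find_ap(text):
    """1-based position of "AP#"; raises ValueError when it is absent."""
    return text.index("AP#") + 1


print(iferror(lambda: find_ap("Hello AP#12/22"), "no match"))   # 7
print(iferror(lambda: find_ap("No code here"), "no match"))     # no match
```

Without the wrapper, a row that lacks AP# shows an error in the cell; with it, the cell shows the fallback text instead.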

Consider the following User Defined Function: 考虑以下用户定义函数:

Public Function FindAPcode(s As String) As String
    Dim L As Long, CH As String, i As Long, j As Long

    FindAPcode = ""
    L = Len(s)
    If L = 0 Then Exit Function
    ' InStr returns 0 when "AP#" is absent, so j = 3 means "not found"
    j = InStr(1, s, "AP#") + 3
    If j = 3 Then Exit Function
    FindAPcode = "AP#"

    ' Append characters until the first non-numeric one
    For i = j To L
        CH = Mid(s, i, 1)
        If IsNumeric(CH) Then
            FindAPcode = FindAPcode & CH
        Else
            Exit Function
        End If
    Next i
End Function
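For readers who want to check the logic of this UDF outside Excel, here is a rough Python equivalent. It is an illustration only: the name `find_ap_code` is hypothetical, and it assumes that IsNumeric, when given a single character, is true only for the digits 0-9.

```python
def find_ap_code(s):
    start = s.find("AP#")
    if start == -1:
        return ""                    # no AP# in the text (also covers "")
    code = "AP#"
    for ch in s[start + 3:]:         # walk the characters after "AP#"
        if ch.isascii() and ch.isdigit():
            code += ch               # keep collecting digits
        else:
            break                    # stop at the first non-digit
    return code


for row in ("Hello AP#12/22 Welcome", "Thanks AP#123-21", "No problem AP#111"):
    print(find_ap_code(row))
# AP#12
# AP#123
# AP#111
```

Unlike the 3-character window in the formula answer, this loop keeps going until the first non-digit, so it is not limited to 3 digits.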


User Defined Functions (UDFs) are very easy to install and use:

  1. ALT-F11 brings up the VBE window
  2. ALT-I ALT-M opens a fresh module
  3. paste the stuff in and close the VBE window

If you save the workbook, the UDF will be saved with it. If you are using a version of Excel later than 2003, you must save the file as .xlsm rather than .xlsx.

To remove the UDF:

  1. bring up the VBE window as above
  2. clear the code out
  3. close the VBE window

To use the UDF from Excel:

=FindAPcode(A1)

To learn more about macros in general, see:

http://www.mvps.org/dmcritchie/excel/getstarted.htm

and

http://msdn.microsoft.com/en-us/library/ee814735(v=office.14).aspx

and for specifics on UDFs, see:

http://www.cpearson.com/excel/WritingFunctionsInVBA.aspx

Macros must be enabled for this to work!
