Excel formula: find part number in file path text string
I have an extract of all the files on a network drive, and some of the file names contain a part number in the format 0000-000000-00. The file has 600,000+ path names, and I'm trying to figure out how to extract the part numbers from them. I think a MID formula might work, but I am at a loss on how to tell it to find anything in the part number format 0000-000000-00 and extract only those 14 characters from the path.
Input looks like this:
c:\users\stuff\folder_name\1234-000001-01_ baskets_1.pdf
c:\users\stuff\folder_name\1234-000001-02_ baskets_2.pdf
c:\users\stuff\folder_name\1234-000001-03_ baskets_3.pdf
c:\users\stuff\folder_name\1234-000030-01_ tree_30.pdf
c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf
c:\users\stuff\folder_name\more random stuff_1234-000030-02_ tree_30.pdf
Output I'm hoping for:
1234-000001-01
1234-000001-02
1234-000001-03
1234-000030-01
You wanted a formula, but a UDF (user-defined function) could also be used to apply a regex to get the pattern (a little overkill in this instance, but worth being aware of):
Option Explicit

' Test harness: prints the part number found in each sample path to the Immediate window.
Public Sub GetCustomString()
    Dim i As Long, tests()
    tests = Array("c:\users\stuff\folder_name\1234-000001-01_ baskets_1.pdf", _
                  "c:\users\stuff\folder_name\1234-000001-02_ baskets_2.pdf", _
                  "c:\users\stuff\folder_name\1234-000001-03_ baskets_3.pdf", _
                  "c:\users\stuff\folder_name\1234-000030-01_ tree_30.pdf", _
                  "c:\users\stuff\folder_name\random text_1234-000030-02_ tree_30.pdf", _
                  "c:\users\stuff\folder_name\more random stuff_1234-000030-02_ tree_30.pdf")
    For i = LBound(tests) To UBound(tests)
        Debug.Print GetString(tests(i))
    Next
End Sub

' UDF: returns the first substring matching 0000-000000-00, or an empty string if there is none.
Public Function GetString(ByVal inputString As String) As String
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")   ' late binding, no library reference required
    With re
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\d{4}-\d{6}-\d{2}"
        If .test(inputString) Then
            GetString = .Execute(inputString)(0)   ' first match only
        Else
            GetString = vbNullString
        End If
    End With
End Function
Using the UDF in a sheet: with a path in cell A1 (an assumed location), enter =GetString(A1) in another cell and fill it down.
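With 600,000+ paths, calling a UDF once per cell may be slow to recalculate. One possible alternative, sketched here as an assumption rather than as part of the original answer, is a macro that reads the paths into an array, applies the same pattern, and writes the results back in one step. The names ExtractPartNumbers and FillPartNumbers are hypothetical, and the layout (paths in column A of the active sheet, results written to column B) is assumed.

```vba
Option Explicit

' Hypothetical helper (not from the original answer).
' Takes a 2-D, 1-based Variant array as returned by Range.Value and returns a
' same-sized String array holding the first part number found in each row,
' or "" where no part number is found.
' Assumes the cells hold text or are empty (an error value would make CStr fail).
Public Function ExtractPartNumbers(ByRef paths As Variant) As Variant
    Dim re As Object, matches As Object
    Dim results() As String
    Dim r As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Pattern = "\d{4}-\d{6}-\d{2}"   ' same pattern as GetString

    ReDim results(LBound(paths, 1) To UBound(paths, 1), 1 To 1)
    For r = LBound(paths, 1) To UBound(paths, 1)
        Set matches = re.Execute(CStr(paths(r, 1)))
        If matches.Count > 0 Then results(r, 1) = matches(0).Value
    Next r

    ExtractPartNumbers = results
End Function

' Hypothetical driver: assumes paths in column A of the active sheet
' and writes the extracted part numbers to column B.
Public Sub FillPartNumbers()
    Dim ws As Worksheet, lastRow As Long

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row

    ' A single-cell Range.Value is a scalar, not an array, so this sketch
    ' only handles two or more rows.
    If lastRow < 2 Then Exit Sub

    ws.Range("B1:B" & lastRow).Value = _
        ExtractPartNumbers(ws.Range("A1:A" & lastRow).Value)
End Sub
```

Running FillPartNumbers once fills column B with static values, so nothing recalculates afterwards. The trade-off is that the results do not update when the paths change.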
Pattern: \d{4}-\d{6}-\d{2}
Explanation:
\d{4} matches a digit (equal to [0-9])
{4} Quantifier: matches exactly 4 times
"-" matches the character - literally
\d{6} matches a digit (equal to [0-9])
{6} Quantifier: matches exactly 6 times
"-" matches the character - literally
\d{2} matches a digit (equal to [0-9])
{2} Quantifier: matches exactly 2 times
Global pattern flags:
g modifier: global. All matches (don't return after first match).
m modifier: multi line. Causes ^ and $ to match the begin/end of each line (not only begin/end of string).
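GetString only reads the first match (index 0), and the pattern contains neither ^ nor $, so the global and multi line flags do not change its result. The global flag matters when a path may contain more than one part number. The sketch below illustrates that case. GetAllStrings and its delimiter parameter are hypothetical names, not part of the original answer.

```vba
Option Explicit

' Hypothetical helper (not from the original answer): returns every part number
' found in inputString, joined by delimiter, or "" if there is none.
Public Function GetAllStrings(ByVal inputString As String, _
                              Optional ByVal delimiter As String = ", ") As String
    Dim re As Object, matches As Object, m As Object
    Dim result As String

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True                   ' without this, Execute stops after the first match
    re.Pattern = "\d{4}-\d{6}-\d{2}"

    Set matches = re.Execute(inputString)
    For Each m In matches
        If Len(result) > 0 Then result = result & delimiter
        result = result & m.Value
    Next m

    GetAllStrings = result
End Function
```

In a sheet it would be called the same way as GetString, for example =GetAllStrings(A1), again assuming the path is in A1.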