How to select specific substring length based on a filter
I have multiple CSV files with different names, each containing today's date, a customer number, and then the extension. For example:
2019-01-23 XYZF-105.csv
2019-01-23 ABCD-205.csv
2019-01-23 Different nonstandard name.csv
2019-01-23 ##ABCD-305(Trial).csv
I would like to get only the part of the name that contains the customer number, like ABCD-305.
I tried using a substring to select the 8 characters just before the dot, but that doesn't work for names that have a suffix like (Trial). Nor does starting 11 characters from the beginning work, as it will include the ##. It also has to skip the nonstandard names.
I used:
$allitems = Get-ChildItem -Path 'C:\Downloads\Customers\*.csv'
$res = @()
foreach ($item in $allitems){
$item = $item.Name.substring($item.Name.Length - 12,8)
$res += $Item
}
This way I get good results for the proper names, but only if the name of the CSV is like 2019-01-23 ABCD-205.csv.
What would be the way to skip the date, skip the .csv extension, and get only the 8-character results that have a dash after the 4th character? Thanks in advance.
Try the following (PSv3+ syntax):
$res = (Get-ChildItem -Path C:\Downloads\Customers\*.csv).Name |
Select-String -CaseSensitive '\b[A-Z]{4}-\d{3}\b' |
ForEach-Object { $_.Matches[0].Value }
(Get-ChildItem -Path C:\Downloads\Customers\*.csv).Name outputs the file names of all CSV files in the directory C:\Downloads\Customers.
Select-String -CaseSensitive '\b[A-Z]{4}-\d{3}\b' uses case-sensitive regex (regular-expression) matching to select only the file names that contain 4 ({4}) uppercase characters ([A-Z]), followed by -, followed by 3 digits (\d{3}), on word boundaries (\b).
The ForEach-Object script block then outputs the part of each matching file name that matched the regex ($_.Matches[0].Value), so that only the relevant portions of matching file names are collected in $res, as an array (when more than one file name matches).
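To see how the pattern behaves without touching the file system, the same pipeline can be fed the sample names from the question. This is only a sketch: the $names array is a stand-in introduced here for illustration and replaces the Get-ChildItem call.

```powershell
# Sample file names copied from the question; $names is a hypothetical
# stand-in for (Get-ChildItem -Path C:\Downloads\Customers\*.csv).Name
$names = @(
    '2019-01-23 XYZF-105.csv'
    '2019-01-23 ABCD-205.csv'
    '2019-01-23 Different nonstandard name.csv'
    '2019-01-23 ##ABCD-305(Trial).csv'
)

# Keep only names containing 4 uppercase letters, a dash and 3 digits,
# then emit just the matched portion of each name.
$res = $names |
    Select-String -CaseSensitive '\b[A-Z]{4}-\d{3}\b' |
    ForEach-Object { $_.Matches[0].Value }

$res
```

With these four names, $res should hold XYZF-105, ABCD-205 and ABCD-305; the nonstandard name produces no match and is skipped.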
This would be a good time to use regex. See https://regex101.com/r/AH00n6/1 and understand the following regex:
.*\s[#]*([A-Z]{4}-[0-9]{3}).*\.csv
This has a little extra in it beyond what is needed to capture just the names, but it gives more insight into how to control the regex.
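One possible way to apply a capture-group pattern like this in PowerShell is the -cmatch operator, which is case-sensitive (plain -match is not) and fills the automatic $Matches variable. The following is a rough sketch, not part of the original answer; the $names array and $pattern variable are assumptions introduced for illustration, using the sample names from the question.

```powershell
# Sample file names copied from the question (hypothetical stand-in
# for the real directory listing).
$names = @(
    '2019-01-23 XYZF-105.csv'
    '2019-01-23 ABCD-205.csv'
    '2019-01-23 Different nonstandard name.csv'
    '2019-01-23 ##ABCD-305(Trial).csv'
)

# Group 1 captures the customer number; \. matches a literal dot before csv.
$pattern = '.*\s[#]*([A-Z]{4}-[0-9]{3}).*\.csv'

$res = foreach ($name in $names) {
    # -cmatch is case-sensitive; on success $Matches[1] holds capture group 1.
    if ($name -cmatch $pattern) { $Matches[1] }
}

$res
```

Names that do not fit the pattern, such as the nonstandard one, fail the -cmatch test and contribute nothing to $res.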