PowerShell - Parse datetime string to datetime object
I am trying to parse a file and convert a date string to a DateTime object using the en-US culture, but I get an ArgumentOutOfRangeException.
I am stuck on two errors:
BARCODE     LOCATION    LIBRARY            STORAGEPOLICY                     RETAIN UNTILL DATE
-------     --------    -------            -------------                     ------------------
L40065L8    IEPort1     DRP_TAPE_DRPLTO    _DRP_GLB_SECOND_COPY_TAPE_WEEK    Wed Mar 31 10:13:07 2021
L40063L8    slot 1      DRP_TAPE_DRPLTO    _DRP_GLB_SECOND_COPY_TAPE_MONTH   Sun Mar  6 22:34:39 2022
L40072L8    slot 5      DRP_TAPE_DRPLTO    _DRP_GLB_SECOND_COPY_TAPE_ANNUAL  now
L40071L8    slot 6      DRP_TAPE_DRPLTO                                      now
L40070L8    slot 7      DRP_TAPE_DRPLTO                                      now
L40064L8    slot 8      DRP_TAPE_DRPLTO    _DRP_GLB_SECOND_COPY_TAPE_MONTH   Sat Mar 19 11:10:37 2022
$lines = [System.IO.File]::ReadAllLines("c:\temp\qmedia.txt")
$lines = $lines | Select-Object -Skip 2
$objects = $lines | % {
    return [PSCustomObject]@{
        BARCODE = $_.Substring(0,8).Trim()
        LOCATION = $_.Substring(12,8).Trim()
        LIBRARY = $_.Substring(24,15).Trim()
        STORAGEPOLICY = $_.Substring(44,33).Trim()
        RETAINUNTIL = [datetime]::ParseExact($_.Substring(78,25).Trim(), "a dd hh:mm:ss yyyy", [Globalization.CultureInfo]::CreateSpecificCulture('en-US'))
    }
}
$objects
Can anyone help me?
As @Matt mentioned in the comments, the first part of your problem is the data format: you're relying on the exact column widths being correct when you use Substring(78, 25), which looks to be incorrect for your data:
PS> $line = "L40065L8    IEPort1     DRP_TAPE_DRPLTO    _DRP_GLB_SECOND_COPY_TAPE_WEEK    Wed Mar 31 10:13:07 2021 "
PS> $line.Substring(78)
ed Mar 31 10:13:07 2021
This gives ed Mar 31 10:13:07 2021 instead of what you're probably expecting, which is Wed Mar 31 10:13:07 2021.
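The ArgumentOutOfRangeException from the question can be reproduced in isolation. The following is a minimal sketch, assuming the columns start at offsets 0, 12, 24, 43 and 77; the `$line` value is rebuilt with `PadRight` from those offsets rather than read from the real file:

```powershell
# Rebuild the first data line with explicit padding (an assumption based on
# the column offsets 0, 12, 24, 43 and 77, plus one trailing space).
$line = 'L40065L8'.PadRight(12) +
        'IEPort1'.PadRight(12) +
        'DRP_TAPE_DRPLTO'.PadRight(19) +
        '_DRP_GLB_SECOND_COPY_TAPE_WEEK'.PadRight(34) +
        'Wed Mar 31 10:13:07 2021 '

$line.Length   # 102

# Substring(start, length) throws when start + length exceeds the string length.
$threw = $false
try {
    $null = $line.Substring(78, 25)   # 78 + 25 = 103, one more than 102
} catch {
    $threw = $true
}
$threw         # True

# Omitting the length takes everything up to the end of the line.
$fromEnd = $line.Substring(77).Trim()
$fromEnd       # Wed Mar 31 10:13:07 2021
```

Passing only the start index is the safer choice for the last column, because its width can vary from line to line.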
If you can, it would be better to change your data format to e.g. CSV or JSON so you can extract the fields more easily, but if you can't do that you could try to calculate the column widths dynamically, e.g.:
$columns = [regex]::Matches($lines[1], "-+").Index;
# 0
# 12
# 24
# 43
# 77
This finds the start position of each of the "------" heading underlines, and then you can do something like:
$objects = $lines | % {
    return [PSCustomObject] @{
        BARCODE = $_.Substring($columns[0], $columns[1] - $columns[0]).Trim()
        LOCATION = $_.Substring($columns[1], $columns[2] - $columns[1]).Trim()
        LIBRARY = $_.Substring($columns[2], $columns[3] - $columns[2]).Trim()
        STORAGEPOLICY = $_.Substring($columns[3], $columns[4] - $columns[3]).Trim()
        RETAINUNTIL = [datetime]::ParseExact(
            $_.Substring($columns[4]).Trim(),
            "a dd hh:mm:ss yyyy",
            [Globalization.CultureInfo]::CreateSpecificCulture("en-US")
        )
    }
}
Except now we're getting this error:
Exception calling "ParseExact" with "3" argument(s): "String 'Wed Mar 31 10:13:07 2021' was not recognized as a valid DateTime."
which we can fix by using a format string that matches the input (abbreviated day name, abbreviated month name, 24-hour clock):
[datetime]::ParseExact(
    "Wed Mar 31 10:13:07 2021",
    "ddd MMM dd HH:mm:ss yyyy",
    [Globalization.CultureInfo]::CreateSpecificCulture("en-US")
)
# 31 March 2021 10:13:07
but you've also got this date format:
Sun Mar  6 22:34:39 2022
(two spaces when the day part is a single digit)
so we need to use the overload of ParseExact that takes an array of format strings instead, to allow both formats:
[datetime]::ParseExact(
    "Sun Mar  6 22:34:39 2022",
    [string[]] @( "ddd MMM dd HH:mm:ss yyyy", "ddd MMM  d HH:mm:ss yyyy"),
    [Globalization.CultureInfo]::CreateSpecificCulture("en-US"),
    "None"
)
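As an alternative sketch (not part of the original answer), the `AllowInnerWhite` flag of `System.Globalization.DateTimeStyles` tells the parser to ignore extra white space inside the string, so a single format using `d` should cover both one-digit and two-digit days:

```powershell
$culture = [Globalization.CultureInfo]::GetCultureInfo('en-US')
$style   = [Globalization.DateTimeStyles]::AllowInnerWhite

# One sample with a two-digit day, one with a space-padded single-digit day.
$samples = 'Wed Mar 31 10:13:07 2021', 'Sun Mar  6 22:34:39 2022'

$results = foreach ($text in $samples) {
    # The "d" specifier accepts one or two digits; AllowInnerWhite skips the extra space.
    [datetime]::ParseExact($text, 'ddd MMM d HH:mm:ss yyyy', $culture, $style)
}

$results | ForEach-Object { $_.ToString('yyyy-MM-dd HH:mm:ss') }
```

The two-format version is stricter, since it rejects any spacing other than the two layouts listed, so which one fits better depends on how much you trust the input.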
and then we need to allow for the literal string now, so your final code becomes:
$lines = [System.IO.File]::ReadAllLines("c:\temp\qmedia.txt")
$columns = [regex]::Matches($lines[1], "-+").Index;
$lines = $lines | Select-Object -Skip 2
$objects = $lines | % {
    return [PSCustomObject] @{
        BARCODE = $_.Substring($columns[0], $columns[1] - $columns[0]).Trim()
        LOCATION = $_.Substring($columns[1], $columns[2] - $columns[1]).Trim()
        LIBRARY = $_.Substring($columns[2], $columns[3] - $columns[2]).Trim()
        STORAGEPOLICY = $_.Substring($columns[3], $columns[4] - $columns[3]).Trim()
        RETAINUNTIL = if( $_.Substring($columns[4]).Trim() -eq "now" ) {
            " "
        } else {
            [datetime]::ParseExact(
                $_.Substring($columns[4]).Trim(),
                [string[]] @( "ddd MMM dd HH:mm:ss yyyy", "ddd MMM  d HH:mm:ss yyyy"),
                [Globalization.CultureInfo]::CreateSpecificCulture("en-US"),
                "None"
            )
        }
    }
}
$objects | ft
#BARCODE  LOCATION LIBRARY         STORAGEPOLICY                    RETAINUNTIL
#-------  -------- -------         -------------                    -----------
#L40065L8 IEPort1  DRP_TAPE_DRPLTO _DRP_GLB_SECOND_COPY_TAPE_WEEK   31/03/2021 10:13:07
#L40063L8 slot 1   DRP_TAPE_DRPLTO _DRP_GLB_SECOND_COPY_TAPE_MONTH  06/03/2022 22:34:39
#L40072L8 slot 5   DRP_TAPE_DRPLTO _DRP_GLB_SECOND_COPY_TAPE_ANNUAL
#L40071L8 slot 6   DRP_TAPE_DRPLTO
#L40070L8 slot 7   DRP_TAPE_DRPLTO
#L40064L8 slot 8   DRP_TAPE_DRPLTO _DRP_GLB_SECOND_COPY_TAPE_MONTH  19/03/2022 11:10:37
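If the date column might contain other unexpected values besides now, `TryParseExact` avoids exceptions altogether. This is a minimal sketch: the helper name `ConvertTo-RetainDate` is hypothetical, and returning `$null` for anything that does not parse is an assumption about what is convenient downstream:

```powershell
# Hypothetical helper: returns a [datetime] when the text matches one of the
# two known layouts, and $null otherwise (e.g. for the literal "now").
function ConvertTo-RetainDate {
    param([string] $Text)

    $formats = [string[]] @('ddd MMM dd HH:mm:ss yyyy', 'ddd MMM  d HH:mm:ss yyyy')
    $culture = [Globalization.CultureInfo]::GetCultureInfo('en-US')
    $parsed  = [datetime]::MinValue

    # TryParseExact reports failure through its return value instead of throwing.
    $ok = [datetime]::TryParseExact(
        $Text.Trim(), $formats, $culture,
        [Globalization.DateTimeStyles]::None, [ref] $parsed)

    if ($ok) { $parsed } else { $null }
}

ConvertTo-RetainDate 'Sun Mar  6 22:34:39 2022'   # a [datetime] value
ConvertTo-RetainDate 'now'                        # nothing ($null)
```

Using `$null` rather than " " for missing dates keeps the RETAINUNTIL property either a date or empty, which tends to make later sorting and filtering simpler.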
Update
Inspired by mklement0's answer, it might be useful to have a generalised parser for your file format. This returns a set of pscustomobjects with properties that match the file headers:
function ConvertFrom-MyFormat
{
    param
    (
        [Parameter(Mandatory=$true)]
        [string[]] $Lines
    )
    # find the runs of dashes on the second line so we can access each one's index and length
    # (named $underlines to avoid overwriting PowerShell's automatic $Matches variable)
    $underlines = [regex]::Matches($Lines[1], "-+");
    # extract the header names from the first line using the
    # positions of the dashes in the second line as a cutting guide
    $headers = $underlines | foreach-object {
        $Lines[0].Substring($_.Index, $_.Length);
    }
    # process the data lines and return a custom object for each one.
    # (the property names will match the headers)
    $Lines | select-object -Skip 2 | foreach-object {
        $line = $_;
        $values = [ordered] @{};
        0..($underlines.Count-2) | foreach-object {
            $values.Add($headers[$_], $line.Substring($underlines[$_].Index, $underlines[$_+1].Index - $underlines[$_].Index));
        }
        $values.Add($headers[-1], $line.Substring($underlines[-1].Index));
        new-object PSCustomObject -Property $values;
    }
}
and your main code then just becomes a case of cleaning up and restructuring the result of this function:
$lines = [System.IO.File]::ReadAllLines("c:\temp\qmedia.txt")
$objects = ConvertFrom-MyFormat -Lines $lines | foreach-object {
    return new-object PSCustomObject -Property ([ordered] @{
        BARCODE = $_.BARCODE.Trim()
        LOCATION = $_.LOCATION.Trim()
        LIBRARY = $_.LIBRARY.Trim()
        STORAGEPOLICY = $_.STORAGEPOLICY.Trim()
        RETAINUNTIL = if( $_."RETAIN UNTILL DATE".Trim() -eq "now" ) {
            " "
        } else {
            [datetime]::ParseExact(
                $_."RETAIN UNTILL DATE".Trim(),
                [string[]] @( "ddd MMM dd HH:mm:ss yyyy", "ddd MMM  d HH:mm:ss yyyy"),
                [Globalization.CultureInfo]::CreateSpecificCulture("en-US"),
                "None"
            )
        }
    })
}
$objects | ft;
mclayton's helpful answer provides good explanations and an effective solution.
Let me complement it with an approach that assumes the column widths can be reliably inferred from the separator line (the second line), such that each substring between adjacent column separators (e.g. -------) indicates a column.
Note: The code requires PowerShell (Core) v6.2.1+, but could be adapted to work in Windows PowerShell too.
$sepChar = '-' # The char. used on the separator line to indicate column spans.
Get-Content c:\temp\qmedia.txt | ForEach-Object {
  $line = $_
  switch ($_.ReadCount) {
    1 {
      # Header line: save for later analysis
      $headerLine = $line
      break
    }
    2 {
      # Separator line: it is the only reliable indicator of column width.
      # Construct a regex that captures the column values.
      # With the sample input's separator line, the resulting regex is:
      # (.{12})(.{12})(.{19})(.{34})(.{25})
      # Note: Syntax requires PowerShell (Core) v6.2.1+
      $reCaptureColumns =
        $line -replace ('{0}+[^{0}]+' -f [regex]::Escape($sepChar)),
                       { "(.{$($_.Value.Length)})" }
      # Break the header line into column names.
      if ($headerLine -notmatch $reCaptureColumns) { Throw "Unexpected header line format: $headerLine" }
      # Save the array of column names.
      $columnNames = $Matches[1..($Matches.Count - 1)].TrimEnd()
      break
    }
    default {
      # Data line:
      if ($line -notmatch $reCaptureColumns) { Throw "Unexpected line format: $line" }
      # Construct an ordered hashtable from the column values.
      $oht = [ordered] @{ }
      foreach ($ndx in 1..$columnNames.Count) {
        $oht[$columnNames[$ndx-1]] = $Matches[$ndx].TrimEnd()
      }
      [pscustomobject] $oht # Convert to [pscustomobject] and output.
    }
  }
}
The above outputs a stream of [pscustomobject] instances, which allow for robust, convenient further processing, such as the date parsing you require, as shown in mclayton's answer (of course, you could integrate this processing directly into the code above, but I wanted to show the fixed-width parsing solution alone).
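Regarding the note about Windows PowerShell: the script-block form of -replace is the part that needs v6.2.1+. One possible adaptation, shown here only as a sketch, is to call `[regex]::Replace` with a `MatchEvaluator` delegate instead. The `$line` value below is a separator line rebuilt from the sample's column widths (12, 12, 19, 34, 25), which is an assumption rather than text read from the real file:

```powershell
$sepChar = '-'

# Separator line rebuilt from the assumed column widths: each run of dashes is
# followed by padding, including trailing spaces after the last run.
$line = ('-' * 7)  + (' ' * 5)  +
        ('-' * 8)  + (' ' * 4)  +
        ('-' * 7)  + (' ' * 12) +
        ('-' * 13) + (' ' * 21) +
        ('-' * 18) + (' ' * 7)

$pattern = '{0}+[^{0}]+' -f [regex]::Escape($sepChar)

# Cast the script block to a MatchEvaluator; the match arrives as the first argument.
$evaluator = [System.Text.RegularExpressions.MatchEvaluator] {
    param($m)
    "(.{$($m.Value.Length)})"
}

$reCaptureColumns = [regex]::Replace($line, $pattern, $evaluator)
$reCaptureColumns   # (.{12})(.{12})(.{19})(.{34})(.{25})
```

The rest of the code (the -notmatch tests and the $Matches lookups) does not rely on v6.2.1+ syntax, so this replacement should be the main change needed.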