
Read a text file using regular expressions and format it in Excel

I have a txt file that contains some comment lines and many data lines, as shown below:

XYZ3-CCAV::[2] mcb XYZ3 hpy diag ce56 dsc
[UT000029118.494] XYZ3:mcb >> LN (CDRxN  , UC_CFG,XTP_RST,STP) SD LCK XRMPP CLK90 CLKP1 PF(M,L) VGA DCO P1kII M1kII  EPD(1,2,3,4,5,6)       XTMPP  AMAP(n1,m,p1,2,3,rpara)   Head(L,R,U,D)  LINK_TIME
[UT000029118.495] XYZ3:mcb >>  0 (OSx1:x1, 0x0c,     0,0,    0)  1*  1*   0    44     2     0,1   17   4 205    0  30,  2,  2, -2,  1,  1      0   22, 90, 0, 0, 0, 0     296,464,153,155    57.6
[UT000029118.495] XYZ3:mcb >>  1 (OSx1:x1, 0x0c,     0,0,    0)  1*  1*   0    44     0     0,1   17   2 202    0  31,  2, -1,  5, -1,  1      0   22, 90, 0, 0, 0, 0     296,464,155,155    58.5
[UT000029118.496] XYZ3:mcb >>  2 (OSx1:x1, 0x0c,     0,0,    0)  1*  1*   0    43     0     0,1   17   0 209    0  33,  1,  0,  1,  3, -3      0   22, 90, 0, 0, 0, 0     312,449,159,159    60.1
[UT000029118.497] XYZ3:mcb >>  3 (OSx1:x1, 0x0c,     0,0,    0)  1*  1*   1    45     0     0,1   17   6 202    0  33,  2,  0, -1,  3,  0      0   22, 90, 0, 0, 0, 0     328,449,153,159    60.3
[UT000029118.497] XYZ3:mcb >> 

XYZ3-CCAV::[2] Headscan 51 0 0xf 0
Headscan: min_dwell_bits 100000
Headscan: max_dwell_bits 100000000

I can extract the data lines using the regular expression support available in Excel VBA (VBScript.RegExp):

[UT000029118.495] XYZ3:mcb >>  0 (OSx1:x1, 0x0c,     0,0,    0)  1*  1*   0    44     2     0,1   17   4 205    0  30,  2,  2, -2,  1,  1      0   22, 90, 0, 0, 0, 0     296,464,153,155    57.6
[UT000029118.495] XYZ3:mcb >>  1 (OSx1:x1, 0x0c,     0,0,    0)  1*  1*   0    44     0     0,1   17   2 202    0  31,  2, -1,  5, -1,  1      0   22, 90, 0, 0, 0, 0     296,464,155,155    58.5
[UT000029118.496] XYZ3:mcb >>  2 (OSx1:x1, 0x0c,     0,0,    0)  1*  1*   0    43     0     0,1   17   0 209    0  33,  1,  0,  1,  3, -3      0   22, 90, 0, 0, 0, 0     312,449,159,159    60.1
[UT000029118.497] XYZ3:mcb >>  3 (OSx1:x1, 0x0c,     0,0,    0)  1*  1*   1    45     0     0,1   17   6 202    0  33,  2,  0, -1,  3,  0      0   22, 90, 0, 0, 0, 0     328,449,153,159    60.3

I tried the following code to write the data lines to an Excel file (a worksheet named "EyeInfo" already exists in the workbook):

Sub open_log_file()
    Dim Full_Name As String, text As String, textline As String
    Dim ws As Worksheet 'Used to Store file path and file name

    'Set up worksheet
    Set ws = Worksheets("EyeInfo")
    ws.UsedRange.Clear

    'Call the Window to open the file
    Full_Name = Application.GetOpenFilename("Diag Log File(*.log;*.txt;*.*),*.log;*.txt;*.*")

    'read the file
    Open Full_Name For Input As #1
    Do Until EOF(1)
        Line Input #1, textline
        text = text & textline
    Loop
    Close #1

    ' define regular expression
    Dim regEx_CE As Object
    Set regEx_CE = CreateObject("VBScript.RegExp")

    With regEx_CE
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\w*\[\d+\]\s+mcb\s+XYZ3\s+hpy\s+diag\s+(ce\d+)\s+dsc"
    End With

    Dim regEx_LN As Object
    Set regEx_LN = CreateObject("VBScript.RegExp")

    With regEx_LN
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\[\w*\.\w*\]\s*\w*:\w*\s*>>\s*\d+.*"
    End With

    ' Execute the match process line by line and put the data in Excel/EyeInfo
    Set CE_match = regEx_CE.Execute(text)
    Set LN_match = regEx_LN.Execute(text)
    ws.Cells(1, 1) = Full_Name
    ws.Cells(2, 1) = "Number of Ports to Be Extracted"
    ws.Cells(2, 2) = CE_match.Count
    For i = 0 To CE_match.Count - 1
        ws.Cells(i * 4 + 3, 1) = CE_match(i).Value
        ws.Cells(i * 4 + 3, 2) = LN_match(i * 4 + 0).Value
        ws.Cells(i * 4 + 4, 2) = LN_match(i * 4 + 1).Value
        ws.Cells(i * 4 + 5, 2) = LN_match(i * 4 + 2).Value
        ws.Cells(i * 4 + 6, 2) = LN_match(i * 4 + 3).Value
    Next
End Sub

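What I want is to split each data line on spaces or commas so that every value in the line lands in its own cell of the row. Instead, this code puts the entire data line into a single cell.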

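Your code and data were definitely needed to solve this problem. Other things could be changed, but the basic problem is the routine that reads the text file: it removes all the EOL tokens.

When the Line Input statement is used, carriage return-linefeed sequences are skipped rather than appended to the character string.

As a result, your regEx_LN pattern matches only once. The .* at the end of the pattern consumes everything up to the next EOL or the end of the string, and because text holds the whole file as a single line, the first match runs from its starting point to the end of the file.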
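The effect can be reproduced in isolation. The sketch below is only an illustration: the function name CountDataLines and the two shortened sample lines are made up for the demo, while the pattern is the regEx_LN pattern from the question.

```vba
' Counts how many data lines the question's regEx_LN pattern finds in a string.
Function CountDataLines(ByVal text As String) As Long
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.MultiLine = True
    re.Pattern = "\[\w*\.\w*\]\s*\w*:\w*\s*>>\s*\d+.*"
    CountDataLines = re.Execute(text).Count
End Function

Sub DemoLineBreaksMatter()
    ' Shortened, made-up data lines in the same shape as the log
    Const LINE_A As String = "[UT1.495] XYZ3:mcb >>  0 (a) 57.6"
    Const LINE_B As String = "[UT1.495] XYZ3:mcb >>  1 (b) 58.5"

    ' No line break: .* runs to the end of the string, so only one match
    Debug.Print CountDataLines(LINE_A & LINE_B)           ' 1

    ' With a line break: .* stops at the linefeed, so each line matches
    Debug.Print CountDataLines(LINE_A & vbCrLf & LINE_B)  ' 2
End Sub
```

Note that in VBScript.RegExp the dot stops at a linefeed but not at a carriage return, so with CRLF line endings each match value may end in a trailing vbCr.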

With the following change, your routine will work on your data:

'read the file
Open Full_Name For Input As #1
Do Until EOF(1)
    Line Input #1, textline
    text = text & vbCrLf & textline
Loop
Close #1

text = Mid(text, Len(vbCrLf) + 1) 'remove the leading CRLF (two characters)

After making this change and running the code, the result looks like this:

[image: resulting worksheet]

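In your original question, you indicated that you also wanted to split the data lines into columns based on a delimiter (space or comma).

Also, as @AnsgarWiechers emphasized in a comment below, it is easier to read the entire file in one step than to read it one line at a time and concatenate the lines.

In his comment, he shows a one-liner that does this using the Line Input method.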
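The comment itself is not reproduced here. As a rough illustration of reading a file in one step with the built-in file statements, the sketch below opens the file in Binary mode and fills a pre-sized string with Get. This is a different idiom from the one-liner in the comment, the function name ReadWholeFile is hypothetical, and it assumes an ANSI/ASCII log file.

```vba
' Reads an entire file into a string in one step, keeping line breaks intact.
' Assumption: the file is ANSI/ASCII text (not UTF-16 or UTF-8 with
' multi-byte characters).
Function ReadWholeFile(ByVal path As String) As String
    Dim fileNum As Integer
    Dim buffer As String

    fileNum = FreeFile                      ' avoid hard-coding #1
    Open path For Binary Access Read As #fileNum
    If LOF(fileNum) > 0 Then
        buffer = Space$(LOF(fileNum))       ' size the buffer to the file length
        Get #fileNum, , buffer              ' read the whole file at once
    End If
    Close #fileNum

    ReadWholeFile = buffer
End Function
```

Because nothing is stripped, the returned string still contains the original CRLF sequences, so a pattern that ends in .* matches one line at a time.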

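In general, I prefer to use the FileSystemObject for reading text files. In some cases the data format and the reading requirements can cause problems with the Line Input method.

Below is code that

  • uses FSO to read the entire file in one step
  • also parses the data lines into individual cells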

=======================================

Sub open_log_file()

Dim Full_Name As String, text As String, textline As String
Dim ws  As Worksheet 'Used to Store file path and file name

'Set up worksheet
Set ws = Worksheets("EyeInfo")
ws.UsedRange.Clear

'Call the Window to open the file
Full_Name = Application.GetOpenFilename("Diag Log File(*.log;*.txt;*.*),*.log;*.txt;*.*")

'read the file
'Open Full_Name For Input As #1
'Do Until EOF(1)
'    Line Input #1, textline
'    text = text & vbCrLf & textline
'Loop
'Close #1

'text = Mid(text, Len(vbCrLf) + 1)

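'Using FSO to read the file
'ForReading is not defined under late binding, so declare it here
Const ForReading As Long = 1
Dim FSO As Object
Dim TS As Object

Set FSO = CreateObject("Scripting.FileSystemObject")
Set TS = FSO.OpenTextFile(Full_Name, ForReading)
text = TS.ReadAll
TS.Close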


' define regular expression
Dim regEx_CE As Object
Set regEx_CE = CreateObject("VBScript.RegExp")

With regEx_CE
    .Global = True
    .MultiLine = True
    .IgnoreCase = False
    .Pattern = "\w*\[\d+\]\s+mcb\s+XYZ3\s+hpy\s+diag\s+(ce\d+)\s+dsc"
End With

Dim regEx_LN As Object
Set regEx_LN = CreateObject("VBScript.RegExp")

With regEx_LN
    .Global = True
    .MultiLine = True
    .IgnoreCase = False
    .Pattern = "\[\w*\.\w*\]\s*\w*:\w*\s*>>\s*\d+.*"
End With

' Run both patterns over the whole text and put the data in Excel/EyeInfo
Dim CE_match As Object, LN_match As Object, i As Long
Set CE_match = regEx_CE.Execute(text)
Set LN_match = regEx_LN.Execute(text)
ws.Cells(1, 1) = Full_Name
ws.Cells(2, 1) = "Number of Ports to Be Extracted"
ws.Cells(2, 2) = CE_match.Count
For i = 0 To CE_match.Count - 1
    ws.Cells(i * 4 + 3, 1) = CE_match(i).Value
    ws.Cells(i * 4 + 3, 2) = LN_match(i * 4 + 0).Value
    ws.Cells(i * 4 + 4, 2) = LN_match(i * 4 + 1).Value
    ws.Cells(i * 4 + 5, 2) = LN_match(i * 4 + 2).Value
    ws.Cells(i * 4 + 6, 2) = LN_match(i * 4 + 3).Value

    ws.Range(ws.Cells(i * 4 + 3, 2), ws.Cells(i * 4 + 6, 2)).TextToColumns _
        DataType:=xlDelimited, _
        textqualifier:=xlTextQualifierNone, _
        consecutivedelimiter:=True, _
        Tab:=False, _
        semicolon:=False, _
        comma:=True, _
        Space:=True, _
        other:=False

Next

End Sub

=======================================

Here is the result with your data:

[image: resulting worksheet]
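As an alternative to TextToColumns, a data line can also be split in VBA before it is written to the sheet. The sketch below is only one possible approach and is not part of the code above: SplitDataLine and WriteDataLine are hypothetical helper names, and the sketch assumes that any run of whitespace and commas counts as a single delimiter, mirroring the consecutivedelimiter:=True setting.

```vba
' Splits one data line on runs of whitespace and/or commas.
Function SplitDataLine(ByVal dataLine As String) As Variant
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "[\s,]+"                   ' one or more spaces, tabs, CR/LF, commas

    ' Collapse every delimiter run to a single space, trim the ends, then split
    SplitDataLine = Split(Trim$(re.Replace(dataLine, " ")), " ")
End Function

' Writes the pieces of one data line into consecutive cells,
' starting in column B of the given row.
Sub WriteDataLine(ByVal ws As Worksheet, ByVal rowNum As Long, ByVal dataLine As String)
    Dim parts As Variant
    parts = SplitDataLine(dataLine)
    If UBound(parts) >= 0 Then              ' skip empty lines
        ws.Cells(rowNum, 2).Resize(1, UBound(parts) + 1).Value = parts
    End If
End Sub
```

Inside the loop, a call such as WriteDataLine ws, i * 4 + 3, LN_match(i * 4 + 0).Value would then replace both the single-cell assignment and the TextToColumns call for that row.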
