Read a text file using regular expressions and format it in Excel
I have a txt file that contains some comment lines and many data lines, as shown below:
XYZ3-CCAV::[2] mcb XYZ3 hpy diag ce56 dsc
[UT000029118.494] XYZ3:mcb >> LN (CDRxN , UC_CFG,XTP_RST,STP) SD LCK XRMPP CLK90 CLKP1 PF(M,L) VGA DCO P1kII M1kII EPD(1,2,3,4,5,6) XTMPP AMAP(n1,m,p1,2,3,rpara) Head(L,R,U,D) LINK_TIME
[UT000029118.495] XYZ3:mcb >> 0 (OSx1:x1, 0x0c, 0,0, 0) 1* 1* 0 44 2 0,1 17 4 205 0 30, 2, 2, -2, 1, 1 0 22, 90, 0, 0, 0, 0 296,464,153,155 57.6
[UT000029118.495] XYZ3:mcb >> 1 (OSx1:x1, 0x0c, 0,0, 0) 1* 1* 0 44 0 0,1 17 2 202 0 31, 2, -1, 5, -1, 1 0 22, 90, 0, 0, 0, 0 296,464,155,155 58.5
[UT000029118.496] XYZ3:mcb >> 2 (OSx1:x1, 0x0c, 0,0, 0) 1* 1* 0 43 0 0,1 17 0 209 0 33, 1, 0, 1, 3, -3 0 22, 90, 0, 0, 0, 0 312,449,159,159 60.1
[UT000029118.497] XYZ3:mcb >> 3 (OSx1:x1, 0x0c, 0,0, 0) 1* 1* 1 45 0 0,1 17 6 202 0 33, 2, 0, -1, 3, 0 0 22, 90, 0, 0, 0, 0 328,449,153,159 60.3
[UT000029118.497] XYZ3:mcb >>
XYZ3-CCAV::[2] Headscan 51 0 0xf 0
Headscan: min_dwell_bits 100000
Headscan: max_dwell_bits 100000000
Using the regular expression engine available from Excel VBA (VBScript.RegExp), I can extract the data lines:
[UT000029118.495] XYZ3:mcb >> 0 (OSx1:x1, 0x0c, 0,0, 0) 1* 1* 0 44 2 0,1 17 4 205 0 30, 2, 2, -2, 1, 1 0 22, 90, 0, 0, 0, 0 296,464,153,155 57.6
[UT000029118.495] XYZ3:mcb >> 1 (OSx1:x1, 0x0c, 0,0, 0) 1* 1* 0 44 0 0,1 17 2 202 0 31, 2, -1, 5, -1, 1 0 22, 90, 0, 0, 0, 0 296,464,155,155 58.5
[UT000029118.496] XYZ3:mcb >> 2 (OSx1:x1, 0x0c, 0,0, 0) 1* 1* 0 43 0 0,1 17 0 209 0 33, 1, 0, 1, 3, -3 0 22, 90, 0, 0, 0, 0 312,449,159,159 60.1
[UT000029118.497] XYZ3:mcb >> 3 (OSx1:x1, 0x0c, 0,0, 0) 1* 1* 1 45 0 0,1 17 6 202 0 33, 2, 0, -1, 3, 0 0 22, 90, 0, 0, 0, 0 328,449,153,159 60.3
I tried to write the data lines to an Excel workbook with the following code (the workbook contains a worksheet named "EyeInfo"):
Sub open_log_file()
    Dim Full_Name As String, text As String, textline As String
    Dim ws As Worksheet 'Used to Store file path and file name

    'Set up worksheet
    Set ws = Worksheets("EyeInfo")
    ws.UsedRange.Clear

    'Call the Window to open the file
    Full_Name = Application.GetOpenFilename("Diag Log File(*.log;*.txt;*.*),*.log;*.txt;*.*")

    'read the file
    Open Full_Name For Input As #1
    Do Until EOF(1)
        Line Input #1, textline
        text = text & textline
    Loop
    Close #1

    ' define regular expression
    Dim regEx_CE As Object
    Set regEx_CE = CreateObject("VBScript.RegExp")
    With regEx_CE
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\w*\[\d+\]\s+mcb\s+XYZ3\s+hpy\s+diag\s+(ce\d+)\s+dsc"
    End With

    Dim regEx_LN As Object
    Set regEx_LN = CreateObject("VBScript.RegExp")
    With regEx_LN
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\[\w*\.\w*\]\s*\w*:\w*\s*>>\s*\d+.*"
    End With

    ' Execute the match process line by line and put the data in Excel/EyeInfo
    Set CE_match = regEx_CE.Execute(text)
    Set LN_match = regEx_LN.Execute(text)

    ws.Cells(1, 1) = Full_Name
    ws.Cells(2, 1) = "Number of Ports to Be Extracted"
    ws.Cells(2, 2) = CE_match.Count

    For i = 0 To CE_match.Count - 1
        ws.Cells(i * 4 + 3, 1) = CE_match(i).Value
        ws.Cells(i * 4 + 3, 2) = LN_match(i * 4 + 0).Value
        ws.Cells(i * 4 + 4, 2) = LN_match(i * 4 + 1).Value
        ws.Cells(i * 4 + 5, 2) = LN_match(i * 4 + 2).Value
        ws.Cells(i * 4 + 6, 2) = LN_match(i * 4 + 3).Value
    Next
End Sub
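What I want is to split each data line on spaces or commas so that every value lands in its own cell of the row. Instead, this code puts the whole data line into a single cell in Excel.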
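I definitely needed your code and data to figure this out. Although there are other things that could be changed, the basic problem is the routine that reads the text file: it removes all of the EOL tokens.
When the Line Input statement is used, the carriage return-linefeed sequence is skipped rather than appended to the string.
As a result, your regEx_LN pattern matches only once. The .* at the end of the pattern matches everything up to the EOL or the end of the string, and since text now holds one long line, the first match runs from the first data line to the end of the file.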
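The effect can be reproduced without any file. The following is a minimal sketch, not part of the original routine: the procedure name and the two shortened sample lines are made up for illustration, while the pattern is the regEx_LN pattern from the question.

```vba
Sub demo_line_input_effect()
    ' Two data lines from the sample log, shortened for illustration.
    Const LINE_A As String = "[UT000029118.495] XYZ3:mcb >> 0 (OSx1:x1, 0x0c, 0,0, 0) 57.6"
    Const LINE_B As String = "[UT000029118.495] XYZ3:mcb >> 1 (OSx1:x1, 0x0c, 0,0, 0) 58.5"

    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    With re
        .Global = True
        .MultiLine = True
        .Pattern = "\[\w*\.\w*\]\s*\w*:\w*\s*>>\s*\d+.*"
    End With

    ' Joined the way the Line Input loop builds text: no line break survives,
    ' so .* runs to the end of the string and the count is 1.
    Debug.Print re.Execute(LINE_A & LINE_B).Count

    ' Joined with vbCrLf: .* stops at the line break, so the count is 2.
    Debug.Print re.Execute(LINE_A & vbCrLf & LINE_B).Count
End Sub
```

Run it from the VBA editor and read the two counts in the Immediate window.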
With the following change, your routine works on your data:
'read the file
Open Full_Name For Input As #1
Do Until EOF(1)
    Line Input #1, textline
    text = text & vbCrLf & textline
Loop
Close #1
text = Mid(text, Len(vbCrLf) + 1) 'remove the leading CRLF (two characters)
After making the change and running the code, the result looks like this:
[Screenshot of the resulting worksheet]
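In your original question, you said you also wanted to split the data lines into columns based on delimiters (space or comma).
Also, as @AnsgarWiechers emphasized in the comments below, reading the whole file in one step is easier than reading it line by line and concatenating the pieces.
In his comment, he shows a single line that does this using the Line Input method.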
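The comment itself is not reproduced here. As an illustration only, and not necessarily what the comment showed, one common way to read a file in a single step with VBA's native file I/O is to open it in Binary mode and fill a pre-sized buffer. The helper name read_whole_file is hypothetical, and the sketch assumes an ANSI-encoded text file.

```vba
' Hypothetical helper: returns the full contents of an ANSI text file,
' line breaks included.
Function read_whole_file(ByVal file_path As String) As String
    Dim file_num As Integer
    Dim buffer As String

    file_num = FreeFile                      ' next unused file number
    Open file_path For Binary Access Read As #file_num
    buffer = Space$(LOF(file_num))           ' size the buffer to the file length
    Get #file_num, , buffer                  ' fill the buffer in one read
    Close #file_num

    read_whole_file = buffer
End Function
```

With such a helper, the read loop would reduce to text = read_whole_file(Full_Name), and the EOL tokens stay in place for the regular expression.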
In general, I prefer to use the FileSystemObject to read text files. In some cases, the data format and reading requirements can cause problems with the Line Input method.
Here is the code:
=======================================
Sub open_log_file()
    'Value of the FSO ForReading constant. It is defined here because late
    'binding (CreateObject) does not expose the Scripting library's constants.
    Const ForReading As Long = 1

    Dim picked As Variant
    Dim Full_Name As String, text As String
    Dim ws As Worksheet
    Dim CE_match As Object, LN_match As Object
    Dim i As Long

    'Set up worksheet
    Set ws = Worksheets("EyeInfo")
    ws.UsedRange.Clear

    'Call the Window to open the file
    picked = Application.GetOpenFilename("Diag Log File(*.log;*.txt;*.*),*.log;*.txt;*.*")
    If VarType(picked) = vbBoolean Then Exit Sub 'user cancelled the dialog
    Full_Name = picked

    'Using FSO to read the whole file in one step (replaces the Line Input loop)
    Dim FSO As Object
    Dim TS As Object
    Set FSO = CreateObject("Scripting.FileSystemObject")
    Set TS = FSO.OpenTextFile(Full_Name, ForReading)
    text = TS.ReadAll
    TS.Close

    ' define regular expression
    Dim regEx_CE As Object
    Set regEx_CE = CreateObject("VBScript.RegExp")
    With regEx_CE
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\w*\[\d+\]\s+mcb\s+XYZ3\s+hpy\s+diag\s+(ce\d+)\s+dsc"
    End With

    Dim regEx_LN As Object
    Set regEx_LN = CreateObject("VBScript.RegExp")
    With regEx_LN
        .Global = True
        .MultiLine = True
        .IgnoreCase = False
        .Pattern = "\[\w*\.\w*\]\s*\w*:\w*\s*>>\s*\d+.*"
    End With

    ' Execute the match process and put the data in Excel/EyeInfo
    Set CE_match = regEx_CE.Execute(text)
    Set LN_match = regEx_LN.Execute(text)

    ws.Cells(1, 1) = Full_Name
    ws.Cells(2, 1) = "Number of Ports to Be Extracted"
    ws.Cells(2, 2) = CE_match.Count

    'Each port (ce line) is assumed to be followed by four data lines
    For i = 0 To CE_match.Count - 1
        ws.Cells(i * 4 + 3, 1) = CE_match(i).Value
        ws.Cells(i * 4 + 3, 2) = LN_match(i * 4 + 0).Value
        ws.Cells(i * 4 + 4, 2) = LN_match(i * 4 + 1).Value
        ws.Cells(i * 4 + 5, 2) = LN_match(i * 4 + 2).Value
        ws.Cells(i * 4 + 6, 2) = LN_match(i * 4 + 3).Value

        'Split the four data lines into columns on spaces and commas
        ws.Range(ws.Cells(i * 4 + 3, 2), ws.Cells(i * 4 + 6, 2)).TextToColumns _
            DataType:=xlDelimited, _
            TextQualifier:=xlTextQualifierNone, _
            ConsecutiveDelimiter:=True, _
            Tab:=False, _
            Semicolon:=False, _
            Comma:=True, _
            Space:=True, _
            Other:=False
    Next i
End Sub
=======================================
Here is the result for your data:
[Screenshot of the resulting worksheet]
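If you would rather not rely on TextToColumns, the split can also be done in code. This is a rough sketch of an alternative and not part of the routine above: the helper name write_fields is hypothetical, and it assumes a field is any run of characters that contains no whitespace and no comma, which gives the same grouping as treating consecutive delimiters as one.

```vba
' Hypothetical helper: writes the space/comma-separated fields of data_line
' into consecutive cells, starting at first_cell and moving right.
Sub write_fields(ByVal data_line As String, ByVal first_cell As Range)
    Dim re As Object, tokens As Object
    Dim k As Long

    Set re = CreateObject("VBScript.RegExp")
    re.Global = True
    re.Pattern = "[^\s,]+"     ' one field = a run of non-space, non-comma characters

    Set tokens = re.Execute(data_line)
    For k = 0 To tokens.Count - 1
        first_cell.Offset(0, k).Value = tokens(k).Value
    Next k
End Sub
```

Inside the loop it would be called once per data line, for example write_fields LN_match(i * 4 + 0).Value, ws.Cells(i * 4 + 3, 2). Because \s also covers carriage returns, any stray CR at the end of a matched line is dropped along with the other delimiters.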