How to optimize the performance of pulling data from a very large text file in Excel via VBA
I want to look up a value in one column based on a key value in another column of the same row. The problem is that the file is really big: I have a .txt file with around 54,000 rows and 14 columns, about 20 MB in total, and I need to get the value in column D that corresponds to a given value in column F. The values in column F are unique.
So far I have tried the direct approach: pull the data from the .txt file, copy it to the sheet, and then run a loop to get the matching value.
But the code has not finished pulling the data from the .txt file even after 15 minutes.
Do While bContinue = True
    outRow = 1
    sInputFile = Application.GetOpenFilename("Text Files (*.txt), *.txt")
    If sInputFile = "False" Then
        bContinue = False
        Reset 'close any opened text file
        Exit Sub
    Else
        outCol = outCol + 2
        'process text file
        fNum = FreeFile
        Open sInputFile For Input As #fNum
        Do While Not EOF(fNum)
            outRow = outRow + 1
            Line Input #fNum, sInputRecord
            Sheets("Sheet1").Cells(outRow, outCol).Value = sInputRecord
        Loop
        Close #fNum
    End If
Loop
errHandler:
    Reset
End Sub
I expected it to take some time, but this code takes forever to run, which defeats the purpose of using a macro. Does anyone have a better way to solve this?
The first part of the code is missing, but I assume you declared your variables. If not, declaring them might help performance a little.
You can also try switching off calculation at the beginning of the process and switching it back on at the end.
Application.Calculation = xlCalculationManual
'...
Application.Calculation = xlCalculationAutomatic
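One thing to watch with this pattern is that an error in the middle leaves calculation switched off. Below is a minimal sketch of one way to guard against that. The Sub name is hypothetical, and the ScreenUpdating and EnableEvents switches are extras that are commonly toggled alongside calculation, not something the code above requires.

```vba
Sub ImportWithSettingsOff()   ' hypothetical name, for illustration only
    Dim prevCalc As XlCalculation
    Dim errMsg As String

    ' Remember the user's current setting so it can be restored afterwards
    prevCalc = Application.Calculation

    On Error GoTo CleanUp
    Application.ScreenUpdating = False              ' extra, optional
    Application.EnableEvents = False                ' extra, optional
    Application.Calculation = xlCalculationManual

    ' ... read the text file and write to the sheet here ...

CleanUp:
    ' Runs on both the normal path and the error path
    If Err.Number <> 0 Then errMsg = Err.Description
    Application.Calculation = prevCalc
    Application.EnableEvents = True
    Application.ScreenUpdating = True
    If Len(errMsg) > 0 Then MsgBox errMsg
End Sub
```

Restoring the previous value rather than forcing xlCalculationAutomatic keeps the workbook as it was for users who normally work in manual mode.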
You say that you only need the 4th and 6th columns from the text, but you put the whole line into a single cell.
If you really want to put only those two parts of each line into the sheet, you might want to do something like this:
With Sheets("Sheet1")
    Do While Not EOF(fNum)
        outRow = outRow + 1
        Line Input #fNum, sInputRecord
        .Cells(outRow, outCol).Value = Split(sInputRecord, ";")(3)
        .Cells(outRow, outCol + 1).Value = Split(sInputRecord, ";")(5)
    Loop
End With
Change the semicolon to whatever separator character the .txt file uses.
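If the goal is only to find the column D value for a given column F value, it may not be necessary to write anything to the sheet at all. Here is a rough sketch, under a few assumptions: the file fits in memory (20 MB should), the separator is a semicolon, and F is the 6th field and D the 4th. The names LoadLookup and Demo, the file path, and the key "someKey" are all made up for illustration. It reads the file in one go and builds a Scripting.Dictionary keyed on column F, which works because the F values are unique.

```vba
' Builds a dictionary mapping column F (6th field) to column D (4th field).
Function LoadLookup(ByVal filePath As String, _
                    Optional ByVal delim As String = ";") As Object
    Dim dict As Object
    Dim fNum As Integer
    Dim content As String
    Dim lines() As String
    Dim parts() As String
    Dim i As Long

    Set dict = CreateObject("Scripting.Dictionary")   ' late bound, no reference needed

    ' Read the whole file with a single Get instead of one Line Input per row
    fNum = FreeFile
    Open filePath For Binary Access Read As #fNum
    content = Space$(LOF(fNum))
    Get #fNum, , content
    Close #fNum

    ' Normalize line endings, then split into rows
    lines = Split(Replace(content, vbCrLf, vbLf), vbLf)

    For i = LBound(lines) To UBound(lines)
        If Len(lines(i)) > 0 Then
            parts = Split(lines(i), delim)
            ' Skip rows that do not have at least 6 fields
            If UBound(parts) >= 5 Then dict(parts(5)) = parts(3)
        End If
    Next i

    Set LoadLookup = dict
End Function

Sub Demo()
    Dim lookup As Object
    Set lookup = LoadLookup("C:\temp\data.txt")   ' hypothetical path
    If lookup.Exists("someKey") Then              ' hypothetical key
        Debug.Print lookup("someKey")
    End If
End Sub
```

If the file has a header row, it simply ends up as one more dictionary entry. If you do need the values on the sheet, collecting them in a 2-D Variant array and assigning it to a Range in one statement is usually much faster than writing cell by cell.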
Please try this and report back.
Sub TryMe()
    ' Requires a reference to the Microsoft ActiveX Data Objects library (Tools > References)
    Dim cN As ADODB.Connection '* Connection
    Dim RS As ADODB.Recordset '* Record Set
    Dim sQuery As String '* Query String

    On Error GoTo ADO_ERROR

    ' Object variables must be assigned with Set in VBA
    Set cN = New ADODB.Connection
    cN.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\temp\;Extended Properties=""text;HDR=Yes;FMT=Delimited(,)"";Persist Security Info=False"
    cN.Open

    Set RS = New ADODB.Recordset
    sQuery = "Select * From VBA.csv ORDER BY ID"
    Set RS.ActiveConnection = cN
    RS.Source = sQuery
    RS.Open

    If Not RS.EOF Then
        ' Open the output file once, not once per record
        Open "c:\temp\vba_sorted.csv" For Append As #1
        Do While Not RS.EOF
            Print #1, RS.Fields(0) & "," & RS.Fields(1)
            RS.MoveNext
        Loop
        Close #1
    End If

    If Not RS Is Nothing Then Set RS = Nothing
    If Not cN Is Nothing Then Set cN = Nothing

ADO_ERROR:
    If Err <> 0 Then
        Debug.Assert (Err = 0)
        MsgBox (Err.Description)
        Resume Next
    End If
End Sub