[英]Delete Blank Cells - 146,459 rows
I hope you can help me with this issue. 希望您能帮我解决这个问题。
I have an Excel file with 146,459 rows and I need to delete blank cells to unify my data. 我有一个包含146,459行的Excel文件,我需要删除空白单元格以统一我的数据。 Here is an image of what I mean:
这是我的意思的图片:
When I select all blanks, my laptop takes around 2 minutes, but then when I try to delete the cells from one or more columns and shift up, Excel freezes and nothing happen. 当我选择所有空格时,我的笔记本电脑大约需要2分钟,但是当我尝试从一个或多个列中删除单元格并向上移动时,Excel冻结,并且什么也没有发生。 I already left my laptop for over 1 hours like that and I didn't have any results.
像这样,我已经离开笔记本电脑超过1个小时,但没有任何结果。
Do you know if there's a way to do it or if any alternatives can be implemented? 您知道是否有办法做到或是否可以实施任何替代方案?
Thanks in advance! 提前致谢!
Looping through cells takes a very long time, even with the Union optimisation. 即使在联盟进行了优化的情况下,遍历各个单元也需要很长时间。 The code below was tested on an imitated data set, 5 columns x 200,000 records, and finished in 5.5 seconds.
下面的代码在模仿的数据集上进行了测试,该数据集包含5列x 200,000条记录,并在5.5秒内完成。
Setup: Let's say your source data is in a range "A1:E200000" on a sheet named "Source", and you want clean data in a similar range on a sheet named "Target". 设置:假设您的源数据在名为“源”的工作表中的范围为“ A1:E200000”,并且您希望在名为“目标”的工作表中的相似范围中的干净数据。
Code: 码:
Option Explicit
Sub Remove_Empty_Cells()
Dim Source As Range
Dim Target As Range
Dim i As Integer
Set Source = ThisWorkbook.Sheets("Source").Range("A1:E200000")
Set Target = ThisWorkbook.Sheets("Target").Range("A1:E200000")
For i = 1 To Source.Columns.Count
Clean_Column Source.Columns(i), Target.Columns(i)
Next i
End Sub
Sub Clean_Column(Source As Range, Target As Range)
Dim rs As Object
Dim XML As Object
Set XML = CreateObject("MSXML2.DOMDocument")
XML.LoadXML Source.Value(xlRangeValueMSPersistXML)
Set rs = CreateObject("ADODB.Recordset")
rs.Open XML
rs.Filter = rs.Fields(0).Name & "<>null"
Target.CopyFromRecordset rs
End Sub
How it works: Sub Remove_Empty_Cells loops though the source range by columns, and calls sub "Clean_Column" that removes empty cells from the provided column. 工作原理:Sub Remove_Empty_Cells按列循环遍历源范围,并调用sub“ Clean_Column”从提供的列中删除空单元格。
Clean_Column loads all column cells into an ADO recordset using MSXML2.DOMDocument object. Clean_Column使用MSXML2.DOMDocument对象将所有列单元格加载到ADO记录集中。 The recordset is then filtered for non-empty rows, and the result is copied to the target column.
然后,对记录集进行过滤以查找非空行,并将结果复制到目标列。 All these operations are very fast in VBA.
所有这些操作在VBA中都非常快。
Ideally, I would love to load the entire range into a recordset at once, but unfortunately VBA function CopyFromRecordset does not alow to paste recordset field by field. 理想情况下,我希望立即将整个范围加载到一个记录集中,但是不幸的是,VBA函数CopyFromRecordset不允许逐字段粘贴记录集。 So we have to load the data column by column (if somebody knows a more optimal way, I'd love to see it).
因此,我们必须逐列加载数据(如果有人知道更好的方法,我很乐意看到它)。
A couple of caveats: 注意事项:
[EDIT]: An alternative solution, implemented using arrays. [编辑]:使用数组实现的替代解决方案。 The same data set 5x 200,000 with 40,000 valid records is cleaned in less than 1 second.
同一数据集5x 200,000和40,000个有效记录的清除时间不到1秒。 It can be further optimized, I just prototyped a quick demo.
可以进一步优化它,我只是制作了一个快速演示的原型。
Sub Remove_Empty_Cells()
Dim Source_Data() As Variant
Dim Clean_Data() As Variant
Dim Source_Range As Range
Dim Target_Range As Range
Dim Column_Count As Long
Dim Row_Count As Long
Dim i As Long
Dim j As Long
Dim k As Long
Set Source_Range = ThisWorkbook.Sheets("Source").Range("A1:E200000")
Column_Count = Source_Range.Columns.Count
Row_Count = Source_Range.Rows.Count
ReDim Source_Data (1 To Row_Count, 1 To Column_Count)
ReDim Clean_Data (1 To Row_Count, 1 To Column_Count)
Source_Data = Source_Range
For j = 1 To Column_Count
k = 1
For i = 1 To Row_Count
If Source_Data(i, j) <> "" Then
Clean_Data(k, j) = Source_Data(i, j)
k = k + 1
End If
Next i
Next j
Set Target_Range = ThisWorkbook.Sheets("Target").Range("A1").Resize(Row_Count, Column_Count)
Target_Range = Clean_Data
End Sub
Working with arrays is either one of the fastest or the fastest method of dealing with large ranges of cells. 处理数组是处理大范围单元格的最快方法或最快方法之一。
Start with: 从...开始:
Run code: 运行代码:
Option Explicit
Sub delBlanks()
Dim i As Long, j As Long, k As Long, arr As Variant, vals As Variant
Dim s As Double, e As Double, c As Long
s = Timer
With Worksheets("sheet6")
If .AutoFilterMode Then .AutoFilterMode = False
'data validity check
c = Application.CountA(.Columns(1))
For j = 2 To 5
If c <> Application.CountA(.Columns(j)) Then Exit For
Next j
If j <= 5 Then
Debug.Print "GIGO, waste of time to continue"
Exit Sub
End If
'collect offset values
vals = .Range(.Cells(2, "A"), .Cells(.Rows.Count, "E").End(xlUp)).Value2
ReDim arr(LBound(vals, 1) To UBound(vals, 1), _
LBound(vals, 2) To UBound(vals, 2))
'loop through array coolating A"E to a single row
i = LBound(vals, 1)
k = LBound(arr, 1)
Do
For j = LBound(vals, 2) To UBound(vals, 2)
Do While vals(i, j) = vbNullString: i = i + 1: Loop
arr(k, j) = vals(i, j)
Next j
i = i + 1: k = k + 1
Loop Until i > UBound(vals, 1)
'put data back on worksheet
.Cells(2, "A").Resize(UBound(arr, 1), UBound(arr, 2)) = arr
.Cells(2, "C").Resize(UBound(arr, 1), 1).NumberFormat = "dd/mm/yyyy"
End With
e = Timer
Debug.Print c - 1 & " records in " & UBound(vals, 1) & _
" rows collated in " & Format((e - s), "0.000") & " seconds"
End Sub
Results: 结果:
30000 records in 157500 rows collated in 0.984 seconds
Seeded data: 种子数据:
The following was used to replicate the OP 'sample-data-in-an-image'. 以下内容用于复制OP“图像中的样本数据”。
Sub fillBlanks()
Dim i As Long, j As Long, k As Long, arr As Variant, vals As Variant
vals = Array("to: ""someone"" <someone@null.com", "from: ""no one"" <no_one@null.com", _
Date, "\i\m\p\o\r\t\a\n\c\e\: 0", "subject: something nothing")
ReDim arr(1 To 6, 1 To 5)
With Worksheets("sheet6")
.Cells(1, 1).CurrentRegion.Offset(1, 0).Clear
For k = 1 To 30000
j = 0
For i = LBound(arr, 2) To UBound(arr, 2)
If i = 2 And Not CBool(k Mod 4) Then j = j + 1
If i = 4 Then
arr(i + j, i) = Format(k, vals(i - 1))
Else
arr(i + j, i) = vals(i - 1)
End If
Next i
.Cells(.Rows.Count, 5).End(xlUp).Offset(1, -4).Resize(UBound(arr, 1), UBound(arr, 2)) = arr
ReDim arr(1 To 6, 1 To 5)
Next k
End With
End Sub
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.