How to read quoted field from CSV using VBScript
In a sample.csv file, which has a fixed number of columns, I have to extract a particular field value and store it in a variable using VBScript.
sample.csv
100,SN,100.SN,"100|SN| 435623| serkasg| 15.32|
100|SN| 435624| serkasg| 15.353|
100|SN| 437825| serkasg| 15.353|"," 0 2345"
101,SN,100.SN,"100|SN| 435623| serkasg| 15.32|
100|SN| 435624| serkasg| 15.353|
100|SN| 437825| serkasg| 15.353|"," 0 2346"
I want to parse the last two fields, which are within double quotes, and store them in two different array variables for each row.
You could try using an ADO connection:
Option Explicit
dim ado: set ado = CreateObject("ADODB.Connection")
ado.ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=c:\txtFilesFolder\;Extended Properties=""text;HDR=No;FMT=Delimited"";"
ado.open
dim recordSet: set recordSet = ado.Execute("SELECT * FROM [sample.csv]")
dim field3, field4
do until recordSet.EOF
    field3 = recordSet.Fields(3).Value
    field4 = recordSet.Fields(4).Value
    ' use your fields here
    recordSet.MoveNext
loop
recordSet.close
ado.close
You may have an issue if those fields are longer than 255 characters: if they are, they may come back truncated. You may also have better luck with ODBC or ACE connection strings instead of the Jet one I've used here.
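As a rough illustration of the ACE suggestion, the sketch below builds the same text-driver connection string with a different provider name. `TextConnectionString` is a hypothetical helper, not part of ADO, and `Microsoft.ACE.OLEDB.12.0` is an assumption: that provider only works if the Access Database Engine is installed on the machine, with a bitness that matches the script host.

```vbscript
Option Explicit

' Hypothetical helper: build a text-driver connection string for a folder.
' providerName is e.g. "Microsoft.Jet.OLEDB.4.0" or "Microsoft.ACE.OLEDB.12.0".
Function TextConnectionString(providerName, folderPath)
    TextConnectionString = "Provider=" & providerName & ";" & _
        "Data Source=" & folderPath & ";" & _
        "Extended Properties=""text;HDR=No;FMT=Delimited"";"
End Function

Dim aceConnection
aceConnection = TextConnectionString("Microsoft.ACE.OLEDB.12.0", "c:\txtFilesFolder\")
' Assign aceConnection to ado.ConnectionString before calling ado.Open.
```

Only the provider name changes; the folder path and extended properties are the ones from the Jet example.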
Since CSVs are comma-separated, you can use the Split() function to separate the fields into an array:
' Read a line from the CSV...
strLine = myCSV.ReadLine()
' Split by comma into an array...
a = Split(strLine, ",")
Since you have a static number of columns (5), the last field will always be a(4) and the second-to-last field will be a(3).
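One caveat with a plain Split(): it treats every comma as a separator, including any that appear inside a quoted field, and it leaves the surrounding quotes in place. The following is a minimal sketch of a quote-aware splitter. `SplitCsvRecord` is a hypothetical helper name, and it assumes the complete record (including any embedded line breaks) has already been read into one string.

```vbscript
Option Explicit

' Hypothetical helper: split one complete CSV record on commas that are
' outside double quotes. Surrounding quotes are dropped, and a doubled
' quote ("") inside a quoted field becomes a single literal quote.
Function SplitCsvRecord(record)
    Dim fields(), count, i, ch, current, inQuotes
    ReDim fields(0)
    count = 0
    current = ""
    inQuotes = False
    i = 1
    Do While i <= Len(record)
        ch = Mid(record, i, 1)
        If ch = """" Then
            If inQuotes And Mid(record, i + 1, 1) = """" Then
                current = current & """"   ' escaped quote inside a field
                i = i + 1
            Else
                inQuotes = Not inQuotes    ' opening or closing quote
            End If
        ElseIf ch = "," And Not inQuotes Then
            ReDim Preserve fields(count)   ' field separator: store the field
            fields(count) = current
            count = count + 1
            current = ""
        Else
            current = current & ch         ' ordinary character (or line break)
        End If
        i = i + 1
    Loop
    ReDim Preserve fields(count)           ' store the final field
    fields(count) = current
    SplitCsvRecord = fields
End Function
```

With a five-column record, the last two fields are still elements 3 and 4 of the returned array.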
Your CSV data seems to contain two embedded hard returns (CR, LF) per record, so the first line that ReadLine returns is:
100,SN,100.SN,"100|SN| 435623| serkasg| 15.32|
The solution below unwraps these lines before extracting the required fields.
Option Explicit
Const ForReading = 1
Const ForAppending = 8
Const TristateUseDefault = -2 ' Opens the file using the system default.
Const TristateTrue = -1 ' Opens the file as Unicode.
Const TristateFalse = 0 ' Opens the file as ASCII.
Dim FSO, TextStream, Line, LineNo, Fields, Field4, Field5
ExtractFields "sample.csv"
Sub ExtractFields(FileName)
    Set FSO = CreateObject("Scripting.FileSystemObject")
    If FSO.FileExists(FileName) Then
        Line = ""
        LineNo = 0
        Set TextStream = FSO.OpenTextFile(FileName, ForReading, False, TristateFalse)
        Do While Not TextStream.AtEndOfStream
            Line = Line & TextStream.ReadLine()
            LineNo = LineNo + 1
            If LineNo mod 3 = 0 Then
                Fields = Split(Line, ",")
                Field4 = Fields(3)
                Field5 = Fields(4)
                MsgBox "Line " & LineNo / 3 & ": " & vbNewLine & vbNewLine _
                    & "Field4: " & Field4 & vbNewLine & vbNewLine _
                    & "Field5: " & Field5
                Line = ""
            End If
        Loop
        TextStream.Close()
    Else
        MsgBox "File " & FileName & " ... Not found"
    End If
End Sub
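The solution above relies on every record spanning exactly three physical lines. If that cannot be guaranteed, one possible variation is to keep reading until the double quotes balance. This is only a sketch under that assumption: `HasBalancedQuotes` and `ReadCsvRecord` are hypothetical helper names, and unlike the code above this version keeps the embedded line breaks instead of dropping them.

```vbscript
Option Explicit

' Hypothetical helper: True when text contains an even number of double
' quotes, i.e. no quoted field is still open.
Function HasBalancedQuotes(text)
    Dim quoteCount
    quoteCount = Len(text) - Len(Replace(text, """", ""))
    HasBalancedQuotes = (quoteCount Mod 2 = 0)
End Function

' Hypothetical helper: read one logical CSV record from an open TextStream,
' joining physical lines (with vbCrLf) until the quotes balance or the
' stream ends.
Function ReadCsvRecord(stream)
    Dim record
    record = stream.ReadLine()
    Do While Not HasBalancedQuotes(record) And Not stream.AtEndOfStream
        record = record & vbCrLf & stream.ReadLine()
    Loop
    ReadCsvRecord = record
End Function
```

Calling ReadCsvRecord(TextStream) inside the Do While loop in place of ReadLine would yield one whole record per iteration, so the mod 3 check would no longer be needed.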
Here is an alternative solution that allows for single or multiline CSV records. It uses a regular expression, which simplifies the logic for handling multiline records. This solution does not remove CRLF characters embedded in a record; I've left that as an exercise for you :)
Option Explicit
Const ForReading = 1
Const ForAppending = 8
Const TristateUseDefault = -2 ' Opens the file using the system default.
Const TristateTrue = -1 ' Opens the file as Unicode.
Const TristateFalse = 0 ' Opens the file as ASCII.
Dim FSO, TextStream, Text, MyRegExp, MyMatches, MyMatch, Field4, Field5
ExtractFields "sample.csv"
Sub ExtractFields(FileName)
    Set FSO = CreateObject("Scripting.FileSystemObject")
    If FSO.FileExists(FileName) Then
        Set MyRegExp = New RegExp
        MyRegExp.Multiline = True
        MyRegExp.Global = True
        ' Two adjacent quoted fields separated by a comma; [^""] also matches line breaks.
        MyRegExp.Pattern = """([^""]+)"",""([^""]+)"""
        Set TextStream = FSO.OpenTextFile(FileName, ForReading, False, TristateFalse)
        Text = TextStream.ReadAll
        Set MyMatches = MyRegExp.Execute(Text)
        For Each MyMatch in MyMatches
            ' SubMatches belongs to the Match object, so it must be qualified.
            Field4 = MyMatch.SubMatches(0)
            Field5 = MyMatch.SubMatches(1)
            MsgBox "Field4: " & vbNewLine & Field4 & vbNewLine & vbNewLine _
                & "Field5: " & vbNewLine & Field5, 0, "Found Match"
        Next
        Set MyMatches = Nothing
        TextStream.Close()
    Else
        MsgBox "File " & FileName & " ... Not found"
    End If
End Sub
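For the exercise of removing the embedded CRLF characters, one possible approach is a small wrapper around Replace(). `RemoveLineBreaks` is a hypothetical helper name. The sketch also strips lone CR or LF characters, on the assumption that the file might not use Windows line endings throughout.

```vbscript
Option Explicit

' Hypothetical helper: remove CRLF pairs, then any stray CR or LF characters.
Function RemoveLineBreaks(text)
    Dim result
    result = Replace(text, vbCrLf, "")
    result = Replace(result, vbCr, "")
    result = Replace(result, vbLf, "")
    RemoveLineBreaks = result
End Function
```

Since the question asks for arrays, the cleaned value could then be split on the pipe character, for example Split(RemoveLineBreaks(Field4), "|"). Because each line of the sample data ends with a pipe, the resulting array ends with an empty element.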