Excel VBA: split CSV lines in a certain order

This code takes a CSV file such as:

"Penn National Gaming, Inc.",16.28
"iShares 20 Year Treasury Bond E",118.88
"iShares MSCI Emerging Index Fun",42.40

Step 1

Line 0: "Penn National Gaming, Inc.",16.28

Line 1: "iShares 20 Year Treasury Bond E",118.88

Line 2: "iShares MSCI Emerging Index Fun",42.40

Step 2

It takes Line 0 and splits it into:

Value 0: Penn National Gaming

Value 1: , Inc.

Value 2: 16.28

My question is: How can I make it as:

Value 0: Penn National Gaming Inc.

Value 1: 16.28

Essentially, I want to keep the full name (which may contain more than one comma) together in Value 0 and keep Value 1 as it is, while still splitting the comma-separated CSV data. I was thinking of imposing some sort of order (splitting on just one comma per line, searching from the end of the line toward the beginning), but I couldn't find a way to do it.

Thanks!

Dim Resp As String: Resp = Http.ResponseText
Dim Lines As Variant: Lines = Split(Resp, vbLf)
Dim sLine As String
Dim Values As Variant
Dim i As Long

For i = 0 To UBound(Lines)
    sLine = Lines(i)
    If InStr(sLine, ",") > 0 Then
        'Splits on every comma, including the one inside the quoted name
        Values = Split(sLine, ",")
    End If
Next i

This was an interesting problem. I came up with a general function that works for any number of unquoted and quoted values in a CSV line, where the quoted values may or may not contain commas.

Test Line: "Penn National Gaming, Inc.",16.28
Output:

    Value[0] = Penn National Gaming, Inc.  
    Value[1] = 16.28  

Test Line: a,b,c,"some, commas, here",16.28,"some,commas,there",17.123
Output:

    Value[0] = a  
    Value[1] = b  
    Value[2] = c  
    Value[3] = some, commas, here  
    Value[4] = 16.28  
    Value[5] = some,commas,there  
    Value[6] = 17.123

  1. I first searched the line for pairs of quotes "...".

  2. Within each pair of quotes, I searched for commas and replaced them with a character that I assume will never be present in the data, replacementCharacter = "¯" (you can choose a different character if you need to).

  3. Once the quoted commas were replaced, I split the line by commas using the Split() function.

  4. Then I iterated through the resulting array and replaced all replacementCharacters with commas.

I tested my code using the specific example given and a more general example that mixes quoted values containing commas with plain unquoted values:

Code:

Function parseLine(ByVal sLine As String) As Variant
    Dim Values As Variant
    Dim i As Long
    Dim quote As String, delimiter As String, replacementCharacter As String
    Dim currentQuoteIndex As Long, nextQuoteIndex As Long
    Dim subString As String

    quote = """"
    delimiter = ","
    replacementCharacter = "¯" 'assumed never to occur in the data

    'get first pair of quotes
    currentQuoteIndex = InStr(1, sLine, quote) 'get first quote
    If currentQuoteIndex = 0 Then
        nextQuoteIndex = 0
    Else
        nextQuoteIndex = InStr(currentQuoteIndex + 1, sLine, quote) 'get next quote
    End If

    'get pairs of quotes and replace commas with replacementCharacter
    Do While nextQuoteIndex <> 0 And currentQuoteIndex <> 0

        'text between the quotes, with its commas masked
        subString = Mid(sLine, currentQuoteIndex + 1, nextQuoteIndex - currentQuoteIndex - 1)
        subString = Replace(subString, delimiter, replacementCharacter)
        'rebuild the line without the two quote characters
        sLine = Left(sLine, currentQuoteIndex - 1) & subString & Mid(sLine, nextQuoteIndex + 1)

        'get next pair of quotes; the line is now two characters shorter,
        'so the text after the removed closing quote starts at nextQuoteIndex - 1
        currentQuoteIndex = InStr(nextQuoteIndex - 1, sLine, quote) 'get first quote
        If currentQuoteIndex = 0 Then
            nextQuoteIndex = 0
        Else
            nextQuoteIndex = InStr(currentQuoteIndex + 1, sLine, quote)  'get next quote
        End If
    Loop

    'split string by commas
    Values = Split(sLine, delimiter)

    'replace replacementCharacter with commas
    For i = 0 To UBound(Values)
        Values(i) = Replace(Values(i), replacementCharacter, delimiter)
    Next
    parseLine = Values
End Function

This function works for any number of comma-containing quoted strings, with columns in any order.
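As a rough usage sketch, the function could be called once per line of the response. The Sub name DemoParseLine and the hard-coded sample text (standing in for Http.ResponseText) are assumptions for illustration only; the carriage-return cleanup is an extra precaution for data with CRLF line endings, not part of the answer above.

```vba
Sub DemoParseLine()
    'Hypothetical sample data standing in for Http.ResponseText
    Dim Resp As String
    Resp = """Penn National Gaming, Inc."",16.28" & vbCrLf & _
           """iShares 20 Year Treasury Bond E"",118.88" & vbCrLf

    Dim Lines As Variant
    Lines = Split(Resp, vbLf)

    Dim i As Long, j As Long
    Dim sLine As String
    Dim Values As Variant

    For i = 0 To UBound(Lines)
        'Drop the carriage return left behind when the data uses CRLF line endings
        sLine = Replace(Lines(i), vbCr, "")

        'Skip empty lines, such as the one after the final line break
        If Len(sLine) > 0 Then
            Values = parseLine(sLine)
            For j = 0 To UBound(Values)
                Debug.Print "Line " & i & ", Value[" & j & "] = " & Values(j)
            Next j
        End If
    Next i
End Sub
```

The results are written to the Immediate window of the VBA editor via Debug.Print.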

The following simple solution identifies the location of the last comma. This information is used to determine the location of the full name and price within the line. The end result is an array containing 2 values.

Note: Any additional commas inside the full name are left untouched, because the line is never split on every comma; only the position of the last comma is used.

Dim Resp As String: Resp = Http.ResponseText
Dim Lines As Variant: Lines = Split(Resp, vbLf)
Dim sLine As String
Dim Values(1) As Variant
Dim i As Long

For i = 0 To UBound(Lines)
    sLine = Lines(i)

    'Skip lines without a comma (e.g. a trailing empty line), where InStrRev returns 0
    If InStrRev(sLine, ",") > 0 Then
        'Reduced complexity by avoiding the need to split on commas ","
        Values(0) = Replace(Left(sLine, InStrRev(sLine, ",") - 1), """", "")  'Full Name, quotes removed
        Values(1) = Mid(sLine, InStrRev(sLine, ",") + 1)   'Price value
    End If
Next i
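To make the last-comma idea concrete, here is a small sketch that works through the first sample line by hand. The Sub name DemoLastComma and the variable names are hypothetical; only InStrRev, Left$, Mid$ and Replace are standard VBA functions.

```vba
Sub DemoLastComma()
    'The first sample line, written as a VBA string literal
    Dim sLine As String
    sLine = """Penn National Gaming, Inc."",16.28"

    'InStrRev searches from the end of the string, so it finds the comma
    'that separates the name from the price (character 29 in this line),
    'not the comma inside the quoted name (character 22)
    Dim pos As Long
    pos = InStrRev(sLine, ",")

    Dim fullName As String, price As String
    fullName = Left$(sLine, pos - 1)   'everything before the last comma, quotes included
    price = Mid$(sLine, pos + 1)       'everything after the last comma

    Debug.Print pos
    Debug.Print fullName                     'still wrapped in double quotes
    Debug.Print Replace(fullName, """", "")  'quotes stripped
    Debug.Print price
End Sub
```

This only works when the value after the last comma never contains a comma itself, which holds for the name and price layout in the question.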

Using a Function

Function extractData(sLine As String) As Variant
  Dim tmpArray(1) As Variant

  'Reduced complexity by avoiding the need to split on commas ","
  tmpArray(0) = Replace(Left(sLine, InStrRev(sLine, ",") - 1), """", "")  'Full Name, quotes removed
  tmpArray(1) = Mid(sLine, InStrRev(sLine, ",") + 1)   'Price value

  extractData = tmpArray

End Function

Dim Resp As String: Resp = Http.ResponseText
Dim Lines As Variant: Lines = Split(Resp, vbLf)
Dim sLine As String
Dim Values As Variant   'plain Variant: a fixed-size array cannot receive the returned array
Dim i As Long

For i = 0 To UBound(Lines)
    sLine = Lines(i)
    'Skip lines without a comma (e.g. a trailing empty line)
    If InStrRev(sLine, ",") > 0 Then
        Values = extractData(sLine)
    End If
Next i

Output:

Value 0: Penn National Gaming, Inc.

Value 1: 16.28
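Neither approach above tries to preserve a literal double quote inside a quoted field, which many CSV dialects write as two consecutive quotes. If that case matters, one alternative (not taken from either answer) is to scan the line character by character and track whether the current position is inside quotes. The following is a minimal sketch under that assumption; the names SplitCsvLine and DemoSplitCsvLine are hypothetical.

```vba
'Splits one CSV line into fields, honouring quoted fields and doubled quotes.
'Returns a Collection of strings (1-based).
Function SplitCsvLine(ByVal sLine As String) As Collection
    Dim fields As New Collection
    Dim current As String
    Dim inQuotes As Boolean
    Dim i As Long
    Dim ch As String

    i = 1
    Do While i <= Len(sLine)
        ch = Mid$(sLine, i, 1)
        If ch = """" Then
            'Mid$ returns "" past the end of the string, so i + 1 is safe here
            If inQuotes And Mid$(sLine, i + 1, 1) = """" Then
                current = current & """"   'doubled quote inside a quoted field
                i = i + 1                  'skip the second quote of the pair
            Else
                inQuotes = Not inQuotes    'opening or closing quote
            End If
        ElseIf ch = "," And Not inQuotes Then
            fields.Add current             'unquoted comma ends the field
            current = ""
        Else
            current = current & ch
        End If
        i = i + 1
    Loop
    fields.Add current                     'last field has no trailing comma

    Set SplitCsvLine = fields
End Function

Sub DemoSplitCsvLine()
    Dim q As String
    q = Chr$(34)   'a double quote, to keep the sample line readable

    'Builds the line: a,"some, commas","say ""hi""",16.28
    Dim sLine As String
    sLine = "a," & q & "some, commas" & q & "," & _
            q & "say " & q & q & "hi" & q & q & q & ",16.28"

    Dim fld As Variant
    For Each fld In SplitCsvLine(sLine)
        Debug.Print fld
    Next fld
End Sub
```

Because it keeps no placeholder character, this variant does not depend on any particular character being absent from the data. It still assumes that a field never spans more than one line.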
