
Efficient way of finding a specific row in a .CSV file that contains two specific values

My goal is to find a specific row in a .CSV file that contains two specific values. For example, given a .csv file like this:

0,0,0,0
1,2,3,4
5,6,7,8
9,10,11,12

I want to find the row that contains two specific numbers in columns 3 and 4. For example, I want the number of the row that has the value 7 in the third column and the value 8 in the fourth column. The application should return row 3 as the answer.

An important requirement is efficiency: the CSV file has roughly 25,000 rows, and I don't want the search to take a long time.

How can I do this?

Try this

Imports System.IO

Public Class Form1
    Dim csvPath As String = "C:\Book1.csv"

    Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        ' Using closes the reader even if an exception is thrown.
        Using sr As New StreamReader(csvPath, True)
            ' Look for 7 in column 3 and 8 in column 4 (columns are 1-based).
            MessageBox.Show(getRow(sr, 7, 3, 8, 4).ToString())
        End Using
    End Sub

    ' Returns the 1-based number of the first matching row, or 0 if no row matches.
    Function getRow(csv As StreamReader, v1 As Integer, Col1 As Integer, v2 As Integer, Col2 As Integer) As Long
        Dim r As Long = 1
        Dim t1 As String = v1.ToString()
        Dim t2 As String = v2.ToString()

        Do While Not csv.EndOfStream
            Dim s As String() = csv.ReadLine().Split(","c)
            ' Check the length first so short rows cannot cause an index error.
            If s.Length >= Col1 AndAlso s.Length >= Col2 Then
                If s(Col1 - 1) = t1 AndAlso s(Col2 - 1) = t2 Then Return r
            End If
            r += 1
        Loop
        Return 0 ' not found
    End Function
End Class

If that truly is a representation of your CSV file, then this will do it:

Dim match = File.ReadLines(path).FirstOrDefault(Function(line) line.EndsWith(",7,8"))

ReadLines reads incrementally, so it will stop as soon as it finds a result. There isn't any need to get into splitting the line and so on, if things are truly as you say and that's not just a contrived example. If the real file is more complex, then you might need to parse it; there are CSV reading libraries aplenty, so you don't need to reinvent the wheel. 25,000 lines is fairly small, all things considered, so it won't matter if it isn't "hand optimized in assembler"; only dig into optimizing if tests prove there is a serious performance problem.
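
Since the question asks for the row number rather than the row text, here is a minimal sketch of both ideas from this answer, under some assumptions. The module name CsvRowFinder and the functions FindRowNumber and FindRecordNumber are hypothetical names made up for illustration. The first function assumes plain comma-separated values with no quoted fields; the second uses TextFieldParser from Microsoft.VisualBasic.FileIO as one possible CSV parser for files that may contain quoted fields.

```vbnet
Imports System
Imports System.IO
Imports Microsoft.VisualBasic.FileIO

Module CsvRowFinder
    ' Hypothetical helper: returns the 1-based number of the first line whose
    ' 1-based columns col1 and col2 hold value1 and value2, or 0 if none does.
    ' Assumes simple CSV data without quoted or escaped commas.
    Function FindRowNumber(csvPath As String,
                           value1 As String, col1 As Integer,
                           value2 As String, col2 As Integer) As Integer
        Dim rowNumber As Integer = 0
        ' File.ReadLines yields one line at a time, so returning from inside
        ' the loop stops reading the rest of the file.
        For Each line As String In File.ReadLines(csvPath)
            rowNumber += 1
            Dim fields As String() = line.Split(","c)
            If fields.Length >= col1 AndAlso fields.Length >= col2 AndAlso
               fields(col1 - 1).Trim() = value1 AndAlso
               fields(col2 - 1).Trim() = value2 Then
                Return rowNumber
            End If
        Next
        Return 0 ' not found
    End Function

    ' Hypothetical variant for files with quoted fields. TextFieldParser skips
    ' blank lines, so this counts data records rather than physical lines.
    Function FindRecordNumber(csvPath As String,
                              value1 As String, col1 As Integer,
                              value2 As String, col2 As Integer) As Integer
        Dim recordNumber As Integer = 0
        Using parser As New TextFieldParser(csvPath)
            parser.TextFieldType = FieldType.Delimited
            parser.SetDelimiters(",")
            parser.HasFieldsEnclosedInQuotes = True
            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                recordNumber += 1
                If fields.Length >= col1 AndAlso fields.Length >= col2 AndAlso
                   fields(col1 - 1) = value1 AndAlso
                   fields(col2 - 1) = value2 Then
                    Return recordNumber
                End If
            End While
        End Using
        Return 0 ' not found
    End Function

    Sub Main()
        ' Write the sample data from the question to a temporary file.
        Dim samplePath As String = Path.GetTempFileName()
        File.WriteAllLines(samplePath, {"0,0,0,0", "1,2,3,4", "5,6,7,8", "9,10,11,12"})
        Console.WriteLine(FindRowNumber(samplePath, "7", 3, "8", 4))
        Console.WriteLine(FindRecordNumber(samplePath, "7", 3, "8", 4))
        File.Delete(samplePath)
    End Sub
End Module
```

With the sample data from the question, both calls should report row 3. Either version makes a single pass over the file and stops at the first match, which should be more than fast enough for about 25,000 rows.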

