How can I reference multiple columns in a single row within a DataTable as a two-dimensional array?

The basic question is how can I reference multiple contiguous columns in a single row within a DataTable as a two-dimensional array which can be processed with For-Next structures? Here's the background:

The program in question loads data from a .csv file where each line/row contains basic identity information about a person, followed by their numeric answers to two dozen questions. The program cycles through each line of the .csv file and identifies the five other lines which have the highest number of exact answer matches to the current line.

A DataTable seems to be the best structure to read the .csv file into, but I am not sure how to reference the last x columns of each row as an array of the form answer(person, question).

In case this seems either really easy or totally impractical, I should make the following disclaimer: The program code is already written and working, but I'm in the process of re-coding it from QuickBASIC 4 (yes, I did say QB4...) to VB.NET. The program is basically a dating program and I've been running it once a year for the last 20 years or so with the local school selling the matches as a fundraiser. It's gotten to the point where neither Windows 7 nor the latest patched version of Windows XP will run QB4, so I downloaded VS Express for Desktop and am using this as an opportunity to learn VB.NET. I've done a lot of (non-windowed) VBScript application scripting, but some really light dabbling into VB6 is my only experience with traditional VB. As everyone here is already well aware, the file I/O in .NET is very different than VB6 or prior. That's what I'm fighting now…

.....

To answer Zohar's question/comment:
Below is a sample of the .csv file format. The actual file is several hundred lines long, but all identical in form. Names and phone numbers have been changed for privacy. The fields are, in order:

LastName
FirstName
Phone# (if given, placeholders if not)
Sex (1=M;2=F)
Answer to Question 1 (1-4)
Answer to Question 2 (1-4)
....
Answer to Question 24 (1-4)

Mouse,Mickey,xxx-xxxx,1,2,3,3,2,3,1,3,4,2,1,4,3,1,1,2,1,2,1,1,1,2,1,1,4
Mouse,Minnie,555-9931,2,1,3,1,2,1,2,3,3,3,4,4,2,4,1,2,3,4,4,4,1,2,1,1,4
Duck,Donald,555-7024,1,2,3,4,2,4,3,4,2,2,1,4,2,4,1,2,1,1,2,1,3,2,1,1,1
McDuck,Scrooge,555-4824,1,2,3,3,2,1,2,4,3,2,4,4,2,4,1,4,2,2,4,4,3,2,1,1,4
GoodWitch,Wendy,xxx-xxxx,2,2,2,4,2,1,2,4,4,3,4,2,2,1,1,2,1,1,4,4,4,4,1,3,1

The reason for the two-dimensional array is to create a single-variable database of answers by user and question number. See below for the sorting portion of the actual existing QB4 code. The two-dimensional array below that I'm trying to bring to VB.NET standards is StudentAnswer(matchFrom, question).

    For matchFrom = 1 To numberSheets
        '
        'The following section of code finds the top maximumToMatch groups of n
        'matching questions per sheet
        '
        For x = 1 To maximumToMatch
            topMatches(x) = 0
            sheetsMatched(x) = 0
        Next x
        For matchTo = 1 To numberSheets
            If StudentSex(matchFrom) <> StudentSex(matchTo) Then
                numberMatched(matchTo) = 0
                highMatch = 0
                For question = 1 To numberQuestions
                    If StudentAnswer(matchFrom, question) = StudentAnswer(matchTo, question) Then
                        numberMatched(matchTo) = numberMatched(matchTo) + 1
                    End If
                Next question
                If numberMatched(matchTo) = topMatches(1) Then
                    sheetsMatched(1) = sheetsMatched(1) + 1
                End If
                If numberMatched(matchTo) > topMatches(1) Then
                    match = maximumToMatch
                    done = False
                    Do
                        If numberMatched(matchTo) = topMatches(match) Then
                            sheetsMatched(match) = sheetsMatched(match) + 1
                            done = True
                        End If
                        If numberMatched(matchTo) > topMatches(match) Then
                            For x = 1 To match - 1
                                topMatches(x) = topMatches(x + 1)
                                sheetsMatched(x) = sheetsMatched(x + 1)
                            Next x
                            topMatches(match) = numberMatched(matchTo)
                            sheetsMatched(match) = 1
                            done = True
                        Else
                            match = match - 1
                        End If
                    Loop Until done
                End If
            Else
                numberMatched(matchTo) = 0
            End If
        Next matchTo
    ...
    <additional code to narrow it down to a fixed number of sheet matches>
    Next matchFrom

And to anticipate two other likely questions:

The existing code is written to match M to F and vice versa. I'd like to make that more flexible during the re-write, but it's a rural area and I'm not really sure they're ready for that yet...

The reason for the data file being in .csv format is that a formal data entry front-end was never written for the program. That's been perpetually on the To-Do list, but in the meantime, and since it's only run once a year, Excel has been my friend... If all goes well I'll design a data entry screen during the VB.NET re-write.

Thanks in advance to everyone who takes the time to read this.

LINQ has been available for querying lists of objects since .NET 3.5, so a DataTable may not be your best option here: your data is coming from a CSV file and not a database (although LINQ can also be used for databases).

So, this isn't really an answer to your original question, but if you have input something like this:

Joe User,1,2,3,4,5
Jane User,2,2,3,4,6
Jack User,3,4,5,2,8
Jill User,5,3,1,8,6

You could define a class to store the data:

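Public Class UserInfo
    Public Property Name As String
    Public Property Answers As List(Of Integer) = New List(Of Integer)()

    ' Counts the questions on which both users gave the same answer.
    Public Function MatchRating(other As UserInfo) As Integer
        Dim rating As Integer = 0
        ' Compare only the questions both users have, so rows of uneven length cannot throw.
        For i = 0 To System.Math.Min(Me.Answers.Count, other.Answers.Count) - 1
            If Me.Answers(i) = other.Answers(i) Then
                rating += 1
            End If
        Next
        Return rating
    End Function
End Class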

You could then read the CSV data into a list of UserInfo objects:

    Dim users = File.ReadLines("Data.csv").Select(
        Function(line)
            Dim parts = line.Split(","c)
            Dim user = New UserInfo() With {.Name = parts(0)}
            user.Answers.AddRange(parts.Skip(1).Select(Function(str) CInt(str)))
            Return user
        End Function
    ).ToList()

You could then find the best matches with something like this, which loops through the users and uses a LINQ query to find the top 5 matches based on the number of answers matched (the UserInfo.MatchRating function), skipping any with no matches (rating > 0):

    For Each user In users
        Console.WriteLine("{0}:{1}", user.Name, String.Join(",", user.Answers))
        Dim bestMatches = From u In users
                          Where u IsNot user
                          Let rating = u.MatchRating(user)
                          Where rating > 0
                          Order By rating Descending
                          Take 5
                          Select New With {.Name = u.Name, .Rating = rating}
        For Each match In bestMatches
            Console.WriteLine("  Match: {0}, rating: {1}", match.Name, match.Rating)
        Next
    Next

You will need to add properties to the UserInfo class for your actual identity information and adjust the code to match.
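
As a rough sketch of that adjustment, assuming the field order given in the question (LastName, FirstName, Phone, Sex, then the answers), the class might be extended along these lines. The property names, the Parse helper, the IdentityFieldCount constant, and the Demo module are all hypothetical names chosen for illustration, and this version would replace the simpler UserInfo above rather than sit alongside it:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical extension of UserInfo for the question's layout:
' LastName,FirstName,Phone,Sex,Answer1,...,AnswerN
Public Class UserInfo
    ' Assumption: four identity fields precede the answers, as in the sample rows.
    Private Const IdentityFieldCount As Integer = 4

    Public Property LastName As String
    Public Property FirstName As String
    Public Property Phone As String
    Public Property Sex As Integer   ' 1 = M, 2 = F
    Public Property Answers As List(Of Integer) = New List(Of Integer)()

    ' Builds a UserInfo from one CSV line; malformed lines will throw.
    Public Shared Function Parse(line As String) As UserInfo
        Dim parts = line.Split(","c)
        Dim user = New UserInfo() With {
            .LastName = parts(0).Trim(),
            .FirstName = parts(1).Trim(),
            .Phone = parts(2).Trim(),
            .Sex = Integer.Parse(parts(3))
        }
        user.Answers.AddRange(parts.Skip(IdentityFieldCount).Select(Function(s) Integer.Parse(s)))
        Return user
    End Function

    ' Counts the questions on which both users gave the same answer.
    Public Function MatchRating(other As UserInfo) As Integer
        Dim rating As Integer = 0
        For i As Integer = 0 To Math.Min(Answers.Count, other.Answers.Count) - 1
            If Answers(i) = other.Answers(i) Then rating += 1
        Next
        Return rating
    End Function
End Class

Module Demo
    Sub Main()
        ' Two rows from the question, shortened to three answers each for illustration only.
        Dim mickey = UserInfo.Parse("Mouse,Mickey,xxx-xxxx,1,2,3,3")
        Dim minnie = UserInfo.Parse("Mouse,Minnie,555-9931,2,1,3,1")
        Console.WriteLine("{0} {1}: {2}", minnie.FirstName, minnie.LastName, mickey.MatchRating(minnie))
    End Sub
End Module
```

With a helper like this, the loading step would reduce to something like File.ReadLines("Data.csv").Select(AddressOf UserInfo.Parse).ToList(), and the query would print u.FirstName and u.LastName instead of u.Name.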

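You will also need to make sure that your project options are set appropriately, and you will need the appropriate imports/references (the LINQ extension methods come from System.Linq, which Visual Studio's VB project templates normally import at the project level), e.g.: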

Option Explicit On
Option Infer On
Option Strict On

Imports System.IO

For reference, the output of my test is shown below. Poor Jack. I suppose you may also need to adjust for sexual preference, e.g. Where u IsNot user AndAlso u.Sex <> user.Sex, which assumes a Sex property has been added to UserInfo.

Joe User:1,2,3,4,5
  Match: Jane User, rating: 3
Jane User:2,2,3,4,6
  Match: Joe User, rating: 3
  Match: Jill User, rating: 1
Jack User:3,4,5,2,8
Jill User:5,3,1,8,6
  Match: Jane User, rating: 1
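
For completeness, if you would rather stay with a DataTable and keep the answer(person, question) shape from the QB4 code, one possible approach is to copy the trailing columns into a two-dimensional Integer array once, after loading. This is only a sketch under assumptions: ToAnswerArray and firstAnswerColumn are hypothetical names, the table is assumed to hold the CSV fields in file order, and how the table gets loaded is left out.

```vbnet
Imports System
Imports System.Data

Module AnswerArrayDemo
    ' Copies the trailing answer columns of every row into answers(person, question).
    ' firstAnswerColumn is the zero-based index of the first answer column
    ' (4 for the question's layout: LastName, FirstName, Phone, Sex, answers...).
    Function ToAnswerArray(table As DataTable, firstAnswerColumn As Integer) As Integer(,)
        Dim questionCount As Integer = table.Columns.Count - firstAnswerColumn
        Dim answers(table.Rows.Count - 1, questionCount - 1) As Integer
        For person As Integer = 0 To table.Rows.Count - 1
            For question As Integer = 0 To questionCount - 1
                ' Convert.ToInt32 accepts both numeric and string cell values.
                answers(person, question) = Convert.ToInt32(table.Rows(person)(firstAnswerColumn + question))
            Next
        Next
        Return answers
    End Function

    Sub Main()
        ' Small stand-in table: string columns, as when raw CSV text is loaded.
        Dim table As New DataTable()
        For Each columnName As String In {"LastName", "FirstName", "Phone", "Sex", "Q1", "Q2", "Q3"}
            table.Columns.Add(columnName)
        Next
        table.Rows.Add("Mouse", "Mickey", "xxx-xxxx", "1", "2", "3", "3")
        table.Rows.Add("Mouse", "Minnie", "555-9931", "2", "1", "3", "1")

        Dim answers = ToAnswerArray(table, 4)
        ' Second person, third question (both indexes are zero-based).
        Console.WriteLine(answers(1, 2))
    End Sub
End Module
```

Note that VB.NET arrays are zero-based, so the QB4 loops that run For question = 1 To numberQuestions would become For question = 0 To numberQuestions - 1 against an array built this way.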
