
Accessing entries in a CSV file for computation in F#

How can I access the entries in a csv file in order to perform calculations on them in F#?

I can read the csv file into memory in the usual way, but once there I am stuck.

Ideally I would just create arrays from the columns and then use Array.map2 to perform calculations.

So if column 1 is some website usage metric, and column 2 is the number of users that reached the value in column 1 (say, made 6 visits to a website), we could calculate the mean number of visits by multiplying each entry in an array made from column 1 by the corresponding entry in an array made from column 2, summing the products, and dividing by the Array.sum of column 2.

I have tried the CSV-to-array code on F# Snippets, http://fssnip.net/3T, but it produces an array for me that is a series of string tuples.

Can anyone suggest a better approach?

EDIT: Some sample input would be similar to this:

     Visits Count
     1  8
     2  9
     3  5
     4  3
     5  2
     6  1
     7  1
    10  1

And the output would be to return the mean of the data, in this case 2.87 (to 2 decimal places).

EDIT 2: The current output from the CSV to array code I found is this

     val it : seq<BookWindow> =
            seq [{Visits = 1;
                  Count = 8;}; {Visits = 2;
                           Count = 9;}; {Visits = 3;
                                  Count = 5;}; {Visits = 4;
                                              Count = 3;}; ...]

which is not so useful for calculations...

What I do is create a record type so I can use strongly typed operations later on, and then read the text file into a seq<myRecord> with the code below. If I intend to reuse this later, I usually move the function onto the record as a static member, fromFile. The seq is very useful if you work with large text files, as I usually do, because File.ReadLines reads lines lazily and so uses very little memory.

Edit: this is cleaner:

open System.IO

type myRecord = { 
    Visits: int
    Count: int 
} with
    static member fromFile (file: string) = // annotation lets the File.ReadLines overload resolve
        file
        |> File.ReadLines       // expose as seq<string>
        |> Seq.skip 1           // skip headers
        |> Seq.map (fun s-> s.Split '\t') // split each line into array
        |> Seq.map (fun a -> {Visits=int a.[0]; Count=int a.[1]}) // and create record

let mean =
    myRecord.fromFile @"D:\data.csv"
    // accumulate (sum of Visits*Count, sum of Count) in one pass
    |> Seq.fold (fun (tv, tc) r -> (tv + r.Visits * r.Count, tc + r.Count)) (0, 0)
    |> (fun (tv, tc) -> float tv / float tc)
// val mean : float = 2.866666667
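
The sample in the question looks space-aligned rather than tab-delimited, so a minimal sketch of a more forgiving parser, assuming the columns are separated by any run of spaces or tabs, might look like this. The names VisitRow, parseLine and sampleLines are made up for illustration, and the lines are hard-coded so the snippet runs without a file:

```fsharp
open System

type VisitRow = { Visits: int; Count: int }

// Hypothetical helper: split on spaces or tabs and drop the empty pieces
// produced by leading or repeated whitespace.
let parseLine (line: string) =
    let parts = line.Split([| ' '; '\t' |], StringSplitOptions.RemoveEmptyEntries)
    { Visits = int parts.[0]; Count = int parts.[1] }

// Sample input from the question, hard-coded for illustration.
let sampleLines =
    [ "     Visits Count"
      "     1  8"
      "     2  9"
      "     3  5"
      "     4  3"
      "     5  2"
      "     6  1"
      "     7  1"
      "    10  1" ]

let rows =
    sampleLines
    |> Seq.skip 1          // skip the header line
    |> Seq.map parseLine
    |> Seq.toList

let totalVisits = rows |> List.sumBy (fun r -> r.Visits * r.Count)
let totalUsers = rows |> List.sumBy (fun r -> r.Count)
let weightedMean = float totalVisits / float totalUsers

printfn "%.2f" weightedMean   // prints 2.87
```

With a real file, sampleLines could presumably be replaced by File.ReadLines on the path, keeping the rest unchanged.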

It is worth adding that with F# 3.0 type providers, accessing CSV files is getting a lot simpler. A type provider can take a look at the CSV data statically during compilation and generate the type to represent columns (like BookWindow ) and then it infers data types of individual columns.

For example, take a look at the "Using the Yahoo Finance Type Provider" article under "Financial Modeling" on the new version of the Try F# web site. You can write something like:

#r "Samples.Csv.dll"

// Type provider that generates schema based on CSV file located online
[<Literal>]
let url = "http://ichart.finance.yahoo.com/table.csv?s=MSFT"
let msft = new Samples.FSharp.CsvProvider.MiniCsv<url>()

// The provider automatically infers the structure and we
// can access columns as properties of the 'row' object
for row in msft.Data do
  printfn "%A %f" row.Date row.Close

As far as I know, the most recent publicly available version of the CSV provider is in the F# 3.0 Sample Pack. I have a possibly better version, which also handles type inference, in my GitHub repo.

Once you have the data in memory, you can do calculations using standard F# functions. For example, to calculate the average closing stock price (you can try that on Try F#), you can write:

 Seq.average [ for row in msft.Data -> row.Close ]

This generates a list containing just the closing prices and then calls the standard Seq.average function on those numbers.
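
For the data in the question, a comparable sketch is shown below. It uses the CsvProvider from the FSharp.Data NuGet package, which is an assumption here and is a different provider from the Samples.Csv.dll one shown earlier. It also assumes comma-delimited input with a Visits,Count header and a dotnet fsi recent enough to support #r "nuget: ...". The names VisitsCsv, Sample and csvText are illustrative only:

```fsharp
#r "nuget: FSharp.Data"
open FSharp.Data

// A small inline sample is enough for the provider to infer two int columns.
[<Literal>]
let Sample = "Visits,Count\n1,8\n2,9"

type VisitsCsv = CsvProvider<Sample>

// Data from the question, written as comma-separated text (an assumption;
// a real file could be loaded with VisitsCsv.Load instead of Parse).
let csvText = "Visits,Count\n1,8\n2,9\n3,5\n4,3\n5,2\n6,1\n7,1\n10,1"
let data = VisitsCsv.Parse(csvText)

// Weighted mean: sum of Visits*Count divided by sum of Count.
let totalVisits = data.Rows |> Seq.sumBy (fun r -> r.Visits * r.Count)
let totalUsers = data.Rows |> Seq.sumBy (fun r -> r.Count)
let meanVisits = float totalVisits / float totalUsers

printfn "%.2f" meanVisits
```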

You are probably overcomplicating things, and this is not the cleanest solution, but you can still work with what you have. Map that BookWindow type into separate arrays if that gives you a convenient way to do your calculations.

 type BookWindow = { Visits: int
                     Count: int }
 // Sample data
 let list = [|{Visits = 1; Count = 8;}; {Visits = 2; Count = 9;}; {Visits = 3; Count = 5;}|]

 let visitcol = list |> Array.map (fun r -> r.Visits)
 let countcol = list |> Array.map (fun r -> r.Count)
 Array.map2( fun v c -> v * c) visitcol countcol
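
To get from those two columns to the mean the question asks for, one possible continuation is sketched below. It rebuilds the arrays from the full sample in the question so that it runs on its own; the names rows, totalVisits and totalUsers are illustrative:

```fsharp
type BookWindow = { Visits: int; Count: int }

// Full sample from the question.
let rows =
    [| { Visits = 1; Count = 8 }; { Visits = 2; Count = 9 }
       { Visits = 3; Count = 5 }; { Visits = 4; Count = 3 }
       { Visits = 5; Count = 2 }; { Visits = 6; Count = 1 }
       { Visits = 7; Count = 1 }; { Visits = 10; Count = 1 } |]

let visitcol = rows |> Array.map (fun r -> r.Visits)
let countcol = rows |> Array.map (fun r -> r.Count)

// Multiply the columns pairwise, then sum the products.
let totalVisits = Array.map2 (fun v c -> v * c) visitcol countcol |> Array.sum
let totalUsers = Array.sum countcol

// Convert to float before dividing to avoid integer division.
let mean = float totalVisits / float totalUsers

printfn "%.2f" mean   // prints 2.87
```

Array.map2 throws if the two arrays differ in length, which is a useful check that the columns stayed aligned.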


 