Reading from a CSV file
I have a CSV file containing many rows and columns, two of which look like this:
Horizontal-1 Acc. Filename      Horizontal-2 Acc. Filename
RSN88_SFERN_FSD172.AT2          RSN88_SFERN_FSD262.AT2
RSN164_IMPVALL.H_H-CPE147.AT2   RSN164_IMPVALL.H_H-CPE237.AT2
RSN755_LOMAP_CYC195.AT2         RSN755_LOMAP_CYC285.AT2
RSN1083_NORTHR_GLE170.AT2       RSN1083_NORTHR_GLE260.AT2
RSN1614_DUZCE_1061-N.AT2        RSN1614_DUZCE_1061-E.AT2
RSN1633_MANJIL_ABBAR--L.AT2     RSN1633_MANJIL_ABBAR--T.AT2
RSN3750_CAPEMEND_LFS270.AT2     RSN3750_CAPEMEND_LFS360.AT2
RSN3757_LANDERS_NPF090.AT2      RSN3757_LANDERS_NPF180.AT2
RSN3759_LANDERS_WWT180.AT2      RSN3759_LANDERS_WWT270.AT2
RSN4013_SANSIMEO_36258021.AT2   RSN4013_SANSIMEO_36258111.AT2
RSN4841_CHUETSU_65004NS.AT2     RSN4841_CHUETSU_65004EW.AT2
RSN4843_CHUETSU_65006NS.AT2     RSN4843_CHUETSU_65006EW.AT2
RSN4844_CHUETSU_65007NS.AT2     RSN4844_CHUETSU_65007EW.AT2
RSN4848_CHUETSU_65011NS.AT2     RSN4848_CHUETSU_65011EW.AT2
In the CSV file I want to find the headers "Horizontal-1 Acc. Filename" and "Horizontal-2 Acc. Filename" and then, line by line, get the names in each row under these headers, one at a time.
Any suggestions?
Thanks, RG.
package require csv
package require struct::matrix
::struct::matrix m
m add columns 2
set chan [open data.csv]
::csv::read2matrix $chan m
close $chan
lassign [m get row 0] header1 header2
for {set r 1} {$r < [m rows]} {incr r} {
    puts -nonewline [format {%s = %-30s } $header1 [m get cell 0 $r]]
    puts [format {%s = %s} $header2 [m get cell 1 $r]]
}
m destroy
I find that the easiest way to deal with CSV data sets is by using a matrix. A matrix is sort of a two-dimensional vector with built-in commands for searching, sorting and rearranging columns and rows.
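To get a feel for the matrix commands before involving a file, here is a minimal sketch that builds a small matrix by hand. The matrix name demo and the cell values are made up for illustration; only the struct::matrix commands themselves come from tcllib.

```tcl
package require struct::matrix

# Hypothetical stand-in data; "demo" is an arbitrary matrix name.
::struct::matrix demo
demo add columns 2
demo add row {first second}
demo add row {b.AT2 y.AT2}
demo add row {a.AT2 x.AT2}

set size [list [demo columns] [demo rows]]
set col0 [demo get column 0]
# search returns a list of {column row} pairs, one per matching cell
set hits [demo search all x.AT2]

puts "size (columns rows): $size"
puts "column 0: $col0"
puts "x.AT2 found at: $hits"

demo destroy
```

The {column row} pairs returned by search are what the later scripts pick apart with lindex.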
First, create a matrix and call it m. It will have two columns from the beginning, but no rows yet.
::struct::matrix m
m add columns 2
Open a channel to read the data file. Pass the channel and the matrix name to the ::csv::read2matrix command. This command reads the CSV data and creates a matrix row for each data row. The data fields are stored in the columns.
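To see how a single CSV line is broken into fields, the csv package also offers ::csv::split. The sketch below uses a made-up line (the quoted third field is purely illustrative) and assumes read2matrix applies the same splitting rules to each line it reads.

```tcl
package require csv

# Made-up sample line; the quoted field contains a comma on purpose.
set line {RSN88_SFERN_FSD172.AT2,RSN88_SFERN_FSD262.AT2,"quoted, field"}
set fields [::csv::split $line]

puts "number of fields: [llength $fields]"
foreach f $fields {
    puts "<$f>"
}
```

A plain [split $line ,] would cut the quoted field in two, which is one reason to let the csv package do the parsing.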
set chan [open data.csv]
::csv::read2matrix $chan m
close $chan
To get the header strings, retrieve row 0.
lassign [m get row 0] header1 header2
To iterate over the data rows, go from 1 (or from 0 if there were no header row) to just under m rows, which is the number of rows in the matrix.
There is a handy report facility that works well with matrices, but I'll just use a for loop here. I'm guessing at how you want the data presented:
for {set r 1} {$r < [m rows]} {incr r} {
    puts -nonewline [format {%s = %-30s } $header1 [m get cell 0 $r]]
    puts [format {%s = %s} $header2 [m get cell 1 $r]]
}
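As an alternative to indexing cell by cell, the data rows can be fetched in one go with get rect and walked with foreach. This is only a sketch: the matrix m2 is built by hand from two rows of the question's data so that the example runs without a file.

```tcl
package require struct::matrix

# Hand-built stand-in for the matrix read from the file ("m2" is an arbitrary name).
::struct::matrix m2
m2 add columns 2
m2 add row {{Horizontal-1 Acc. Filename} {Horizontal-2 Acc. Filename}}
m2 add row {RSN88_SFERN_FSD172.AT2 RSN88_SFERN_FSD262.AT2}
m2 add row {RSN164_IMPVALL.H_H-CPE147.AT2 RSN164_IMPVALL.H_H-CPE237.AT2}

# Columns 0..1, rows 1..last: everything below the header row.
set dataRows [m2 get rect 0 1 1 [expr {[m2 rows] - 1}]]
foreach row $dataRows {
    lassign $row h1 h2
    puts [format {%-30s %s} $h1 $h2]
}

m2 destroy
```

get rect returns a list of rows, each of which is itself a list of cell values, so lassign can unpack one row at a time.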
If you're done with the matrix, you might as well destroy it.
m destroy
Solution for the specific problem described in the comments:
package require csv
package require struct::matrix
::struct::matrix m
set chan [open foo.csv]
::csv::read2matrix $chan m , auto
close $chan
set f1 [m search column 0 "Result ID"]
set headerRow [lindex $f1 0 1]
set f2 [m search rect 0 $headerRow 0 [expr {[m rows] - 1}] ""]
set f3 [m search row $headerRow "Horizontal-1 Acc. Filename"]
set f4 [m search row $headerRow "Horizontal-2 Acc. Filename"]
set top [expr {$headerRow + 1}]
set bottom [expr {[lindex $f2 0 1] - 1}]
set left [lindex $f3 0 0]
set right [lindex $f4 0 0]
puts [format {Vector=[ %s ]} [concat {*}[m get rect $left $top $right $bottom]]]
m destroy
Obviously, you need to change the filename to the correct name. There is no error handling: in such a simple script it's better to just let the script fail and then correct whatever went wrong.
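The script above relies on the shape of the search results: each hit is a {column row} pair, so [lindex $f1 0 1] is the row of the first hit and [lindex $f3 0 0] is the column of the first hit. A minimal sketch of that logic, using a hand-built matrix s whose layout (a preamble row, a header row starting with "Result ID", data rows, then a blank row) is an assumption about what the real file looks like:

```tcl
package require struct::matrix

# Hypothetical layout; "s" and the preamble text are made up.
::struct::matrix s
s add columns 3
s add row {{Some preamble} {} {}}
s add row {{Result ID} {Horizontal-1 Acc. Filename} {Horizontal-2 Acc. Filename}}
s add row {1 RSN88_SFERN_FSD172.AT2 RSN88_SFERN_FSD262.AT2}
s add row {2 RSN755_LOMAP_CYC195.AT2 RSN755_LOMAP_CYC285.AT2}
s add row {{} {} {}}

# Row index of the header: second element of the first {column row} pair.
set headerRow [lindex [s search column 0 "Result ID"] 0 1]
# First empty cell in column 0 at or below the header marks the end of the data.
set blank [s search rect 0 $headerRow 0 [expr {[s rows] - 1}] ""]
set bottom [expr {[lindex $blank 0 1] - 1}]
# Column index of a header: first element of the first pair.
set left [lindex [s search row $headerRow "Horizontal-1 Acc. Filename"] 0 0]

puts "header row: $headerRow, last data row: $bottom, first filename column: $left"
s destroy
```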
Solution to the second problem (see the comments below):
package require csv
package require struct::matrix
::struct::matrix m
set chan [open _SearchResults.csv]
::csv::read2matrix $chan m , auto
close $chan
set f1 [m search column 0 {Result ID}]
set headerRow [lindex $f1 0 1]
set f2 [m search -glob rect 0 $headerRow 0 [expr {[m rows] - 1}] { These*}]
set numofRow [lindex $f2 0 1]
set headercol1 [m search row $headerRow { Horizontal-1 Acc. Filename}]
set headercol2 [m search row $headerRow { Horizontal-2 Acc. Filename}]
set indexheaderH1col [lindex $headercol1 0 0]
set indexheaderH2col [lindex $headercol2 0 0]
set rows [m get rect $indexheaderH1col [expr {$headerRow+1}] $indexheaderH2col [expr {$numofRow-1}]]
set rows [lmap row $rows {
    lassign $row a b
    list [string trim $a] [string trim $b]
}]
foreach row $rows {
    puts [format {%-30s %s} {*}$row]
}
puts [format {Vector=[ %s ]} [concat {*}$rows]]
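The trimming step can be tried in isolation with plain Tcl. The sketch below assumes Tcl 8.6 (for lmap) and uses a made-up rows list shaped like the result of m get rect, with stray whitespace around some values.

```tcl
# Made-up sample shaped like [m get rect ...] output: a list of two-element rows.
set rows [list \
    [list " RSN88_SFERN_FSD172.AT2" " RSN88_SFERN_FSD262.AT2"] \
    [list " RSN755_LOMAP_CYC195.AT2 " "RSN755_LOMAP_CYC285.AT2 "]]

# Trim every cell of every row; the nested lmap works for any number of columns.
set trimmed [lmap row $rows {
    lmap cell $row {string trim $cell}
}]

# Flatten the rows into one list, as the script above does for the Vector line.
set vector [concat {*}$trimmed]
puts [format {Vector=[ %s ]} $vector]
```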
Comments:
You don't need to set the number of columns if you use read2matrix with auto.
The part with the lmap command trims the whitespace around each cell value.
Documentation: + (operator), - (operator), < (operator), chan, close, concat, csv (package), expr, for, format, incr, lassign, lindex, lmap (for Tcl 8.5), lmap, open, package, puts, set, struct::matrix (package), {*} (syntax)
wipe all
package require csv
package require struct::matrix
::struct::matrix m
m add columns 2
set chan [open _SearchResults.csv]
::csv::read2matrix $chan m , auto
close $chan
set f1 [m search column 0 {Result ID}]
set headerRow [lindex $f1 0 1]
set f2 [m search rect 0 $headerRow 0 [expr {[m rows] - 1}] {}]
set numofRow [lindex $f2 0 1]
set headercol1 [m search row $headerRow { Horizontal-1 Acc. Filename}]
set headercol2 [m search row $headerRow { Horizontal-2 Acc. Filename}]
set indexheaderH1col [lindex $headercol1 0 0]
set indexheaderH2col [lindex $headercol2 0 0]
set header1 [m get cell $indexheaderH1col $headerRow]
set header2 [m get cell $indexheaderH2col $headerRow]
# Loop up to and including row numofRow-1, matching the rect used for the vector below
for {set r [expr {$headerRow + 1}]} {$r < $numofRow} {incr r} {
    puts [format {%-30s %s} [m get cell $indexheaderH1col $r] [m get cell $indexheaderH2col $r]]
}
set vector [concat {*}[m get rect $indexheaderH1col [expr {$headerRow + 1}] $indexheaderH2col [expr {$numofRow - 1}]]]
puts [format {Vector=[ %s ]} $vector]