Reading from a CSV file

I have a CSV file containing many rows and columns, two of which look like this:

Horizontal-1 Acc. Filename      Horizontal-2 Acc. Filename
 RSN88_SFERN_FSD172.AT2          RSN88_SFERN_FSD262.AT2 
 RSN164_IMPVALL.H_H-CPE147.AT2   RSN164_IMPVALL.H_H-CPE237.AT2 
 RSN755_LOMAP_CYC195.AT2         RSN755_LOMAP_CYC285.AT2 
 RSN1083_NORTHR_GLE170.AT2       RSN1083_NORTHR_GLE260.AT2 
 RSN1614_DUZCE_1061-N.AT2        RSN1614_DUZCE_1061-E.AT2 
 RSN1633_MANJIL_ABBAR--L.AT2     RSN1633_MANJIL_ABBAR--T.AT2 
 RSN3750_CAPEMEND_LFS270.AT2     RSN3750_CAPEMEND_LFS360.AT2 
 RSN3757_LANDERS_NPF090.AT2      RSN3757_LANDERS_NPF180.AT2
 RSN3759_LANDERS_WWT180.AT2      RSN3759_LANDERS_WWT270.AT2 
 RSN4013_SANSIMEO_36258021.AT2   RSN4013_SANSIMEO_36258111.AT2 
 RSN4841_CHUETSU_65004NS.AT2     RSN4841_CHUETSU_65004EW.AT2 
 RSN4843_CHUETSU_65006NS.AT2     RSN4843_CHUETSU_65006EW.AT2 
 RSN4844_CHUETSU_65007NS.AT2     RSN4844_CHUETSU_65007EW.AT2 
 RSN4848_CHUETSU_65011NS.AT2     RSN4848_CHUETSU_65011EW.AT2

In the CSV file I want to find the headers "Horizontal-1 Acc. Filename" and "Horizontal-2 Acc. Filename", and then, line by line, get the file names in each row under these headers, one at a time.

Any suggestions?

Thanks, RG.

package require csv
package require struct::matrix

::struct::matrix m
m add columns 2

set chan [open data.csv]
::csv::read2matrix $chan m
close $chan

lassign [m get row 0] header1 header2

for {set r 1} {$r < [m rows]} {incr r} {
    puts -nonewline [format {%s = %-30s  } $header1 [m get cell 0 $r]]
    puts [format {%s = %s} $header2 [m get cell 1 $r]]
}

m destroy

I find that the easiest way to deal with CSV data sets is to use a matrix. A matrix is a kind of two-dimensional vector with built-in commands for searching, sorting, and rearranging columns and rows.

First, create a matrix and call it m. It has two columns from the start, but no rows yet.

::struct::matrix m
m add columns 2

Open a channel to read the data file. Pass the channel and the matrix name to the ::csv::read2matrix command. This command reads the CSV data and creates a matrix row for each data row. The data fields are stored in the columns.

set chan [open data.csv]
::csv::read2matrix $chan m
close $chan

To get the header strings, retrieve row 0.

lassign [m get row 0] header1 header2

To iterate over the data rows, go from 1 (or from 0 if there were no header row) to just under [m rows], which is the number of rows in the matrix.

There is a handy report facility that works well with matrices, but I'll just use a for loop here. I'm guessing how you want the data presented:

for {set r 1} {$r < [m rows]} {incr r} {
    puts -nonewline [format {%s = %-30s  } $header1 [m get cell 0 $r]]
    puts [format {%s = %s} $header2 [m get cell 1 $r]]
}
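
As a point of comparison with the for loop, here is a minimal sketch of the matrix's built-in formatting. It assumes that calling format 2string without a report object is acceptable, in which case struct::matrix falls back to its own default layout. The matrix name demo and the rows added by hand are illustrative only and stand in for data read from the file.

```tcl
package require struct::matrix

# Hypothetical stand-in for the matrix filled by ::csv::read2matrix
::struct::matrix demo
demo add columns 2
demo add row {{Horizontal-1 Acc. Filename} {Horizontal-2 Acc. Filename}}
demo add row {RSN88_SFERN_FSD172.AT2 RSN88_SFERN_FSD262.AT2}
demo add row {RSN755_LOMAP_CYC195.AT2 RSN755_LOMAP_CYC285.AT2}

# With no report object given, the matrix uses its internal default layout
set text [demo format 2string]
puts $text

demo destroy
```

A report object from the report package can be passed as an extra argument to format 2string when more control over borders and padding is wanted.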

If you're done with the matrix, you might as well destroy it.

m destroy
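
If a matrix feels like too much machinery, the same lookup can be sketched line by line with ::csv::split, which is a different technique from the one above. This is only a sketch under assumptions: the variable lines holds hypothetical sample data in place of lines read from data.csv with gets, and the header line is assumed to be the first line.

```tcl
package require csv

# Hypothetical sample lines standing in for the contents of data.csv
set lines {
    {Horizontal-1 Acc. Filename,Horizontal-2 Acc. Filename}
    {RSN88_SFERN_FSD172.AT2,RSN88_SFERN_FSD262.AT2}
    {RSN755_LOMAP_CYC195.AT2,RSN755_LOMAP_CYC285.AT2}
}

# Locate the two columns by header name in the first line
set headers [::csv::split [lindex $lines 0]]
set c1 [lsearch -exact $headers {Horizontal-1 Acc. Filename}]
set c2 [lsearch -exact $headers {Horizontal-2 Acc. Filename}]

# Walk the remaining lines one at a time, collecting and printing each pair
set pairs {}
foreach line [lrange $lines 1 end] {
    set fields [::csv::split $line]
    set name1 [lindex $fields $c1]
    set name2 [lindex $fields $c2]
    lappend pairs [list $name1 $name2]
    puts [format {%-30s %s} $name1 $name2]
}
```

The matrix approach pays off once you need to search for the header row or cut out a rectangle of cells; for a plain two-column walk, a loop like this may be enough.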

Solution for the specific problem in the comments:

package require csv
package require struct::matrix

::struct::matrix m

set chan [open foo.csv]
::csv::read2matrix $chan m , auto
close $chan

set f1 [m search column 0 "Result ID"]
set headerRow [lindex $f1 0 1]
set f2 [m search rect 0 $headerRow 0 [expr {[m rows] - 1}] ""]
set f3 [m search row $headerRow "Horizontal-1 Acc. Filename"]
set f4 [m search row $headerRow "Horizontal-2 Acc. Filename"]

set top [expr {$headerRow + 1}]
set bottom [expr {[lindex $f2 0 1] - 1}]
set left [lindex $f3 0 0]
set right [lindex $f4 0 0]

puts [format {Vector=[ %s ]} [concat {*}[m get rect $left $top $right $bottom]]]
m destroy

Obviously, you need to change the filename to the correct name. There is no error handling: in such a simple script it's better to just let the script fail and then correct whatever went wrong.


Solution to the second problem (see the comments below):

package require csv
package require struct::matrix

::struct::matrix m

set chan [open _SearchResults.csv]
::csv::read2matrix $chan m , auto
close $chan

set f1 [m search column 0 {Result ID}]
set headerRow [lindex $f1 0 1]

set f2 [m search -glob rect 0 $headerRow 0 [expr {[m rows] - 1}] { These*}]
set numofRow [lindex $f2 0 1]

set headercol1 [m search row $headerRow { Horizontal-1 Acc. Filename}]
set headercol2 [m search row $headerRow { Horizontal-2 Acc. Filename}]  

set indexheaderH1col [lindex $headercol1  0 0]
set indexheaderH2col [lindex $headercol2  0 0]

set rows [m get rect $indexheaderH1col [expr {$headerRow+1}] $indexheaderH2col [expr {$numofRow-1}]]

set rows [lmap row $rows {
    lassign $row a b
    list [string trim $a] [string trim $b]
}]

foreach row $rows {
    puts [format {%-30s   %s} {*}$row]
}

puts [format {Vector=[ %s ]} [concat {*}$rows]]

Comments:

  • You don't need to set the number of columns if you use read2matrix with auto.
  • In this file, there is no empty cell after the table. Instead, we need to search for a string beginning with " These".
  • Since each cell holds a space character followed by the value, we need to trim the whitespace around the value, otherwise the concatenation will go wrong. The part with the lmap command takes care of that.
  • Always brace your expressions.
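
To illustrate the last point, here is a small sketch of what changes when an expression is not braced. The variables a, b, and headerRow hold made-up values. Without braces, Tcl substitutes the arguments first and expr then substitutes the result a second time. With braces, expr does the only substitution, and the expression can also be compiled more efficiently.

```tcl
set a 1
set b {$a}

# Unbraced: Tcl replaces $b with the text "$a", then expr evaluates that text
set unbraced [expr $b]

# Braced: expr itself reads the variable b, and nothing is substituted twice
set braced [expr {$b}]

puts "unbraced: $unbraced"
puts "braced:   $braced"

# The braced form used in the scripts above
set headerRow 3
set top [expr {$headerRow + 1}]
puts "top: $top"
```

With data read from a file, the double substitution of the unbraced form can turn cell contents into code, which is one more reason to keep the braces.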

Documentation: + (operator), - (operator), < (operator), chan, close, concat, csv (package), expr, for, format, incr, lassign, lindex, lmap (for Tcl 8.5), lmap, open, package, puts, set, struct::matrix (package), {*} (syntax)

wipe all

package require csv
package require struct::matrix

::struct::matrix m

set chan [open _SearchResults.csv]
# "auto" lets read2matrix add columns as needed, so none are added up front
::csv::read2matrix $chan m , auto
close $chan

# the header row is the one whose first cell is "Result ID"
set f1 [m search column 0 {Result ID}]
set headerRow [lindex $f1 0 1]

# an empty cell in column 0 below the header marks the end of the table
set f2 [m search rect 0 $headerRow 0 [expr {[m rows] - 1}] {}]
set numofRow [lindex $f2 0 1]

set headercol1 [m search row $headerRow { Horizontal-1 Acc. Filename}]
set headercol2 [m search row $headerRow { Horizontal-2 Acc. Filename}]

set indexheaderH1col [lindex $headercol1 0 0]
set indexheaderH2col [lindex $headercol2 0 0]

set header1 [m get cell $indexheaderH1col $headerRow]
set header2 [m get cell $indexheaderH2col $headerRow]

# print every data row between the header row and the end marker
for {set r [expr {$headerRow + 1}]} {$r < $numofRow} {incr r} {
    puts [format {%-30s   %s} [m get cell $indexheaderH1col $r] [m get cell $indexheaderH2col $r]]
}

# flatten the same rectangle of cells into a single list
set vector [concat {*}[m get rect $indexheaderH1col [expr {$headerRow + 1}] $indexheaderH2col [expr {$numofRow - 1}]]]

puts [format {Vector=[ %s ]} $vector]
