Sorting on a numerical value using csvfix for Linux - turns numbers to strings

I'm using csvfix to sort a CSV file on an integer (counter) value in the second column. However, csvfix seems to put double quotes around every field in the file, turning them into strings, before it performs the sort. The result is that the rows are sorted by string value, so that "1000" comes before "2".

There is a command-line option, -smq, that is supposed to apply "smart quoting", but that's not helping me. If I use the command csvfix echo -smq file.csv, the output has no quotes around numerical fields, but when I pipe that into csvfix sort -f 2 file.csv, the file is written without quotes but is still sorted in "string order". It makes no difference whether I include the -smq flag in the sort command or not.

Additionally, I would like csvfix to ignore the first row of string headers. The csvfix issue tracker claims this is already implemented, but I can only find the -ifn flag, which seems to cut the header row out entirely.

These seem like pretty basic pieces of functionality for this tool, so I'm probably missing something very simple. Hoping someone here has used csvfix and figured this out.

According to the online documentation for csvfix, sort has an N option for numeric sorts:

csvfix sort -f 2:N file.csv 

Having said this, CSV isn't a particularly good format for text manipulation. If possible, you're much better off choosing DSV (delimiter-separated values), such as tab- or pipe-separated, so that you can simply pipe the output to sort, which has ample capability to sort by field, using whatever collation method you need.
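As a rough illustration of the DSV approach, here is a minimal sketch that uses the standard sort utility on a pipe-separated file. The sample data, the temporary file, and the variable names are assumptions made up for this example, not taken from the question. It shows the string-order problem on field 2, then a numeric sort that keeps the header row in place by splitting it off with head and tail.

```shell
# Build a small pipe-separated sample file (hypothetical data, header in row 1).
tmp=$(mktemp)
printf '%s\n' 'name|count' 'b|1000' 'a|2' 'c|30' > "$tmp"

# String comparison on field 2 only: "1000" sorts before "2".
# LC_ALL=C forces plain byte-order collation so the result is predictable.
string_sorted=$(tail -n +2 "$tmp" | LC_ALL=C sort -t'|' -k2,2)

# Numeric comparison on field 2 (the trailing n on the key).
# The header is printed first and never passes through sort.
numeric_sorted=$( { head -n 1 "$tmp"; tail -n +2 "$tmp" | sort -t'|' -k2,2n; } )

printf '%s\n' "$numeric_sorted"
rm -f "$tmp"
```

The same head/tail split should work for any delimiter passed to -t, as long as the delimiter never appears inside a field; quoted CSV fields that contain the delimiter would still need a CSV-aware tool.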
