Summing multiple rows in a CSV when a field matches

Question

I have a file that I've trimmed down to look like the following:

"Reno","40.00"
"Reno","40.00"
"Reno","80.00"
"Reno","60.00"
"Lakewood","150.00"
"Altamonte Springs","50.25"
"Altamonte Springs","25.00"
"Altamonte Springs","25.00"
"Sandpoint","50.00"
"Lenoir City","987.00"

etc.

What I want to end up with is the total amount per city. That is:

"Reno","220.00"
"Lakewood","150.00"
"Altamonte Springs","100.25"

Etc.

Fair warning: the data set is not necessarily contiguous. That is, a city may appear once here, once a thousand lines down, and 3 more times at the end.

I've been trying to use the following awk script:

awk -F "," '{array[$1]+=$2} END { for (i in array) {print i"," array[i]}}' test1.csv > test6.csv

The results I'm getting look like this:

"Matawan",0
"Bay Side",0
"Pataskala",0
"Dorothy",0
"Haymarket",0
"Myrtle Point",0

Etc. The second column is all zeros, with no quotes.

I'm obviously missing something, but I don't know what, or where else to look. What am I missing?

Thanks.

Answer 1

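Your script fails because of the double quotes. With -F "," the second field is "40.00" including its quotes, and awk treats a string that does not begin with a number as 0 in arithmetic, so every sum stays at 0.

Strip the quotes first and add them back when printing, something like this: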

sed 's/"//g' file.csv | awk -F "," '{array[$1]+=$2}END{for(i in array) {print "\""  i "\""  ","  "\"" array[i] "\"" }}' 

"Lenoir City","987"
"Reno","220"
"Lakewood","150"
"Sandpoint","50"
"Altamonte Springs","100.25"

Answer 2

This awk one-liner gives exactly what you want, including the formatting:

awk -F'","' '{a[$1]+=$2*1}END{for (x in a)printf "%s\",\"%.2f\"\n", x,a[x]}' file

Test with your data:

kent$  cat f
"Reno","40.00"
"Reno","40.00"
"Reno","80.00"
"Reno","60.00"
"Lakewood","150.00"
"Altamonte Springs","50.25"
"Altamonte Springs","25.00"
"Altamonte Springs","25.00"
"Sandpoint","50.00"
"Lenoir City","987.00"

kent$  awk -F'","' '{a[$1]+=$2*1}END{for (x in a)printf "%s\",\"%.2f\"\n", x,a[x]}' f
"Lakewood","150.00"
"Reno","220.00"
"Lenoir City","987.00"
"Sandpoint","50.00"
"Altamonte Springs","100.25"

Answer 3

" is causing problem in your input. First remove them using sed and print it back using printf inside awk "导致输入问题。首先使用sed删除它们，然后使用awk printf将其打印回来

Try following: 试试以下：

sed 's/"//g' input.csv | awk -F "," '{array[$1]+=$2} END { for (i in array) {printf "\"%s\",\"%.2f\"\n", i, array[i]}}' > output.csv

Jumbled input

"Reno","40.00"
"Reno","60.00"
"Lakewood","150.00"
"Altamonte Springs","50.25"
"Altamonte Springs","25.00"
"Reno","80.00"
"Sandpoint","50.00"
"Reno","40.00"
"Lenoir City","987.00"
"Altamonte Springs","25.00"

Output

"Reno","220.00"
"Altamonte Springs","100.25"
"Lakewood","150.00"
"Lenoir City","987.00"
"Sandpoint","50.00"

Answer 4

You don't need pre-processing or nasty escaping:

$ awk -F'"' '{a[$2]+=$4}END{for(k in a)printf "%s,%s\n",FS k FS,FS a[k] FS}' file
"Lenoir City","987"
"Reno","220"
"Lakewood","150"
"Sandpoint","50"
"Altamonte Springs","100.25"

4 answers

Answer 1
3 2013-10-03 18:27:46

Answer 2
2 2013-10-03 18:31:56

Answer 3
1 2013-10-03 18:30:34

Answer 4
1 2013-10-03 18:48:07
