Python: CSV write by column rather than row
I have a Python script that generates a bunch of data in a while loop. I need to write this data to a CSV file so that it is written by column rather than by row.
For example, in loop 1 of my script I generate:
(1, 2, 3, 4)
I need this to be reflected in my CSV file like so:
Result_1 1
Result_2 2
Result_3 3
Result_4 4
On my second loop I generate:
(5, 6, 7, 8)
I need this to look like so in my CSV file:
Result_1 1 5
Result_2 2 6
Result_3 3 7
Result_4 4 8
and so forth until the while loop finishes. Can anybody help me?
EDIT
The while loop can last over 100,000 loops.
The reason csv doesn't support this is that variable-length lines are not really supported on most filesystems: an existing line cannot be extended in place without rewriting everything after it. What you should do instead is collect all the data in lists, then call zip() on them to transpose them afterwards.
>>> l = [('Result_1', 'Result_2', 'Result_3', 'Result_4'), (1, 2, 3, 4), (5, 6, 7, 8)]
>>> list(zip(*l))
[('Result_1', 1, 5), ('Result_2', 2, 6), ('Result_3', 3, 7), ('Result_4', 4, 8)]
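A minimal sketch of how the transposed rows could then be written out with the csv module, assuming the whole data set fits in memory. The names `headings`, `results`, and `write_columns` are illustrative and not from the question, and an in-memory buffer stands in for the real file.

```python
import csv
import io

def write_columns(out, headings, results):
    """Write each loop result as a column, with the headings in the first column."""
    writer = csv.writer(out)
    # zip(...) transposes: one tuple per heading, holding that heading's values
    writer.writerows(zip(headings, *results))

headings = ['Result_1', 'Result_2', 'Result_3', 'Result_4']
results = [(1, 2, 3, 4), (5, 6, 7, 8)]  # one tuple per loop iteration

buf = io.StringIO()  # stand-in for open('out.csv', 'w', newline='')
write_columns(buf, headings, results)
print(buf.getvalue())
```

With 100,000 iterations of four values each, this keeps 400,000 values in memory before anything is written, which is usually acceptable but worth checking against the real data size.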
wr.writerow(item)   # writes a single row; call it once per item to build a column
wr.writerows(item)  # writes many rows at once; item must be an iterable of rows
This is quite simple if your goal is just to write the output column by column.
If your item is a list:
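import csv

yourList = []  # fill this with the values for the column
with open('yourNewFileName.csv', 'w', newline='') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    for word in yourList:
        wr.writerow([word])  # one value per row, i.e. a single column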
Updating lines in place in a file is not supported on most file systems (a line in a file is just some data that ends with a newline; the next line starts right after it).
As I see it you have two options:
Small example for the first method:
from itertools import islice, count
# zip() is lazy in Python 3, so only the first 10 tuples are ever produced
print(list(islice(zip(count(1), count(2), count(3)), 10)))
This will print
[(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5, 6), (5, 6, 7), (6, 7, 8), (7, 8, 9), (8, 9, 10), (9, 10, 11), (10, 11, 12)]
even though count generates an infinite sequence of numbers.
Read it in by row and then transpose it on the command line. If you're using Unix, install csvtool and follow the directions in: https://unix.stackexchange.com/a/314482/186237
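If csvtool is not available, the same transposition can be sketched in Python. This is a substitute for the command-line step, not the linked answer's method, and it assumes the row-wise file fits in memory when it is read back. `transpose_csv` is a hypothetical helper name.

```python
import csv
import io

def transpose_csv(src, dst):
    """Read all rows from src and write them to dst as columns."""
    rows = list(csv.reader(src))
    csv.writer(dst).writerows(zip(*rows))

# In-memory stand-ins for files opened with newline=''
row_wise = io.StringIO("1,2,3,4\r\n5,6,7,8\r\n")
col_wise = io.StringIO()
transpose_csv(row_wise, col_wise)
print(col_wise.getvalue())
```

The loop itself can then append one ordinary row per iteration, which is cheap, and the transposition happens once at the end.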
What about Result_*? Are those also generated in the loop? (I ask because I don't think it's possible to add to the CSV file.)
I would go like this: generate all the data at once, rotate the matrix, then write it to the file:
A = []
A.append(range(1, 5))  # an example of your first loop
A.append(range(5, 9))  # an example of your second loop
data_to_write = zip(*A)  # transpose: each loop's result becomes a column
# now you can write data_to_write row by row
Let's assume that (1) you don't have a lot of memory, (2) you have the row headings in a list, and (3) all the data values are floats; if they're all integers that fit in 32 or 64 bits, that's even better.
On a 32-bit Python, storing a float in a list takes 16 bytes for the float object and 4 bytes for a pointer in the list, 20 bytes in total. Storing a float in an array.array('d') takes only 8 bytes. Increasingly spectacular savings are available if all your data are ints (any negatives?) that will fit in 8, 4, 2 or 1 byte(s), especially on a recent Python where all ints are longs.
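The per-value cost can be checked on whatever Python build is in use. This is only a rough sketch: it ignores list over-allocation and the fixed overhead of the container objects, and the variable names are illustrative.

```python
import array
import struct
import sys

float_object = sys.getsizeof(1.0)   # size of one float object on this build
pointer = struct.calcsize('P')      # size of one list slot (a pointer)
packed = array.array('d').itemsize  # bytes per value in array.array('d')

print('list of floats:', float_object + pointer, 'bytes per value')
print("array.array('d'):", packed, 'bytes per value')
```

On a 64-bit build the list figure will be larger than the 20 bytes quoted for 32-bit Python, so the relative saving from array.array('d') is at least as large.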
The following pseudocode assumes floats stored in array.array('d'). In case you don't really have a memory problem, you can still use this method; I've put in comments to indicate the changes needed if you want to use a list.
# Preliminary:
import array  # list: delete
hlist = []
dlist = []
for each row:
    hlist.append(some_heading_string)
    dlist.append(array.array('d'))  # list: dlist.append([])
# generate data
col_index = -1
for each column:
    col_index += 1
    for row_index in range(len(hlist)):
        v = calculated_data_value(row_index, col_index)
        dlist[row_index].append(v)
# write to csv file
for row_index in range(len(hlist)):
    row = [hlist[row_index]]
    row.extend(dlist[row_index])
    csv_writer.writerow(row)
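A runnable Python 3 sketch of the pseudocode above, under some assumptions: four rows, two columns, and a made-up `calculated_data_value` that stands in for whatever the real loop computes. An in-memory buffer replaces the output file.

```python
import array
import csv
import io

def calculated_data_value(row_index, col_index):
    # Hypothetical stand-in for the real computation in the loop
    return float(col_index * 4 + row_index + 1)

hlist = ['Result_%d' % (i + 1) for i in range(4)]
dlist = [array.array('d') for _ in hlist]  # one compact array per output row

for col_index in range(2):  # each pass of the while loop adds one column
    for row_index in range(len(hlist)):
        dlist[row_index].append(calculated_data_value(row_index, col_index))

buf = io.StringIO()  # stand-in for open('out.csv', 'w', newline='')
csv_writer = csv.writer(buf)
for heading, values in zip(hlist, dlist):
    csv_writer.writerow([heading] + list(values))
print(buf.getvalue())
```

Only one row at a time is expanded into a Python list when writing, so peak memory stays close to 8 bytes per stored value.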
As an alternate streaming approach:
Both steps should handle streaming just fine.
Pitfalls:
After tinkering for a while I was able to come up with an easier way of achieving the same goal. Assuming you have the code as below:
import csv

fruitList = ["Mango", "Apple", "Guava", "Grape", "Orange"]
vegList = ["Onion", "Garlic", "Shallot", "Pumpkin", "Potato"]
with open("NEWFILE.csv", "w", newline="") as csvfile:
    writer = csv.writer(csvfile)
    # write one row per index, so each list ends up as a column
    for value in range(len(fruitList)):
        writer.writerow([fruitList[value], vegList[value]])
zip will only take a number of elements equal to the length of the shortest list. If your columns are not of equal length, you need to use zip_longest:
import csv
from itertools import zip_longest

data = [[1, 2, 3, 4], [5, 6]]
columns_data = zip_longest(*data)  # missing values are filled with None
with open("file.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(columns_data)
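To make the difference concrete, here is a small comparison on the same data. The `fillvalue=''` argument is an optional choice made for this sketch; without it zip_longest pads with None, which the csv module also writes as an empty field.

```python
from itertools import zip_longest

data = [[1, 2, 3, 4], [5, 6]]

truncated = list(zip(*data))                     # stops at the shortest list
padded = list(zip_longest(*data, fillvalue=''))  # pads short columns with ''

print(truncated)  # [(1, 5), (2, 6)]
print(padded)     # [(1, 5), (2, 6), (3, ''), (4, '')]
```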