How to import data from a CSV file and store it in a variable?

I am extremely new to Python 3 and I am learning as I go. I figured someone could help me with a basic question: how to store text from a CSV file in a variable to be used later in the code. So the idea here would be to import a CSV file into the Python interpreter:

import csv
with open('some.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        ...

and then extract the text from that file and store it in a variable (i.e. w = ["csv file text"]) to be used later in the code to create permutations:

print (list(itertools.permutations(["w"], 2)))

If someone could please help and explain the process, it would be very much appreciated, as I am really trying to learn. Please let me know if any more explanation is needed!

itertools.permutations() wants an iterable (e.g. a list) and a length as its arguments, so your data structure needs to reflect that, but you also need to define what you are trying to achieve here. For example, if you wanted to read a CSV file and produce permutations of every individual CSV field, you could try this:

import csv
import itertools
with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    w = []
    for row in reader:
        w.extend(row)

print(list(itertools.permutations(w, 2)))

The key thing here is to create a flat list that can be passed to itertools.permutations(). This is done by initialising w to an empty list and then extending it with the fields from each row of the CSV file.
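To make the flattening step concrete, here is a minimal sketch that runs the same loop on a small in-memory CSV. The two-row sample text is a hypothetical stand-in for some.csv, and io.StringIO is used only so the example runs without a file on disk.

```python
import csv
import io
import itertools

# Hypothetical two-row CSV held in memory instead of some.csv
sample = io.StringIO("a,b\nc,d\n", newline="")

w = []
for row in csv.reader(sample):
    w.extend(row)  # add each field of the row, not the row list itself

pairs = list(itertools.permutations(w, 2))

print(w)           # ['a', 'b', 'c', 'd']
print(len(pairs))  # 12, i.e. 4 * 3 ordered pairs
print(pairs[:3])   # [('a', 'b'), ('a', 'c'), ('a', 'd')]
```

Using w.append(row) instead would produce a list of rows, and the permutations would then be pairs of whole rows rather than pairs of fields.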

Note: As pointed out by @martineau, for the reasons explained here, the file should be opened with newline='' when used with the Python 3 csv module.

If you want to use Python 3 (as you state in the question) and process the CSV file using the standard csv module, you should be careful about how you open the file. So far, your code and the answers use the Python 2 way of opening the CSV file. Things have changed in Python 3.

As shengy wrote, the CSV file is just a text file, and the csv module gets the elements as strings. Strings in Python 3 are Unicode strings. Because of that, you should open the file in text mode, and you should supply the encoding. Because of the nature of CSV file processing, you should also use newline='' when opening the file.
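One way to see why newline='' matters is a field that itself contains a line break. The sketch below writes such a row to a temporary file and reads it back. The file name demo.csv and the sample rows are hypothetical, chosen only for illustration.

```python
import csv
import os
import tempfile

# Hypothetical data: the second field of the last row contains a line break
rows_in = [["id", "comment"], ["1", "first line\nsecond line"]]

path = os.path.join(tempfile.mkdtemp(), "demo.csv")

# Text mode, explicit encoding, and newline='' so that the csv module
# controls the line endings itself
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(rows_in)

with open(path, newline="", encoding="utf-8") as f:
    rows_out = list(csv.reader(f))

print(rows_out == rows_in)  # True
```

With newline='' on both sides, the quoted field survives the round trip unchanged, and no extra carriage returns are added on platforms that use \r\n line endings.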

Now extending the explanation of Burhan Khalid... When reading the CSV file, you get the rows as lists of strings. If you want to read all the content of the CSV file into memory and store it in a variable, you probably want a list of rows (i.e. a list of lists, where the nested lists are the rows). The for loop iterates through the rows. In the same way, the list() function iterates through the sequence (here, the sequence of rows) and builds a list of the items. To combine that with the wish to store everything in the content variable, you can write:

import csv

with open('some.csv', newline='', encoding='utf_8') as f:
    reader = csv.reader(f)
    content = list(reader)

Now you can do your permutations as you wish. The itertools module is the correct way to do the permutations.
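Since content is a list of rows, there are at least two readings of "permutations" here. The sketch below shows both on a hypothetical two-row sample held in memory, and it is an assumption about what the asker wants rather than the only option.

```python
import csv
import io
import itertools

# Hypothetical stand-in for the content read from some.csv
content = list(csv.reader(io.StringIO("a,b\nc,d\n", newline="")))

# Option 1: ordered pairs of whole rows
row_pairs = list(itertools.permutations(content, 2))

# Option 2: flatten the rows into one list of fields, then pair the fields
fields = list(itertools.chain.from_iterable(content))
field_pairs = list(itertools.permutations(fields, 2))

print(len(row_pairs))    # 2
print(fields)            # ['a', 'b', 'c', 'd']
print(len(field_pairs))  # 12
```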

import csv

output = []
with open('FileName.csv', newline='') as f:
    data = csv.DictReader(f)
    print(data.fieldnames)
    for each_row in data:
        row = {}
        try:
            # Strip whitespace from the column names and drop 'null' values
            p = {k.strip(): v for k, v in each_row.items() if v.lower() != 'null'}
        except AttributeError as e:
            # Raised when a row has missing or extra fields
            print(e)
            print(each_row)
            raise
        # Copy the columns you need; adjust to the columns in your file
        if p.get('col1'):
            row['col1'] = p['col1']
        if p.get('col2'):
            row['col2'] = p['col2']
        output.append(row)

Finally, all the data is stored in the output variable.

Is this what you need?

import csv
with open('some.csv', newline='') as f:
    reader = csv.reader(f, delimiter=',')
    rows = list(reader)

print('The csv file had {} rows'.format(len(rows)))

for row in rows:
   do_stuff(row)

do_stuff_to_all_rows(rows)

The interesting line is rows = list(reader), which collects every row from the csv file (each row is itself a list) into another list, rows, in effect giving you a list of lists.

If you had a csv file with three rows, rows would be a list with three elements, each element being a list that represents one line of the original csv file.
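As a small illustration of that three-row case, the sketch below parses an in-memory sample. The sample text is hypothetical, and io.StringIO is used only to avoid needing a file on disk.

```python
import csv
import io

# Hypothetical three-line CSV held in memory
sample = io.StringIO(
    "1,david,enterprise\n2,jeff,personal\n3,ann,trial\n", newline=""
)

rows = list(csv.reader(sample))

print(len(rows))   # 3
print(rows[0])     # ['1', 'david', 'enterprise']
print(rows[1][1])  # jeff
```

Note that every field comes back as a string, so the serial numbers would need int() if they are to be used as numbers.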

If all you care about is reading the raw text in the file (csv or not), then:

with open('some.csv') as f:
    w = f.read()

will be a simple solution to having w = "csv, file, text\nwithout, caring, about columns\n".

First, a csv file is a text file too, so everything you can do with a file, you can do with a csv file. That means f.read(), f.readline() and f.readlines() can all be used. See detailed information about these functions here.
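The difference between the three methods can be sketched as follows. The temporary file and its two lines of sample content are hypothetical and are created here only so that the example is self-contained.

```python
import os
import tempfile

# Create a small hypothetical input file to read from
path = os.path.join(tempfile.mkdtemp(), "input.csv")
with open(path, "w", encoding="utf-8") as f:
    f.write("1,david,enterprise\n2,jeff,personal\n")

with open(path, encoding="utf-8") as f:
    whole = f.read()       # the entire file as one string

with open(path, encoding="utf-8") as f:
    first = f.readline()   # the first line, including its trailing newline
    rest = f.readlines()   # the remaining lines as a list of strings

print(repr(whole))  # '1,david,enterprise\n2,jeff,personal\n'
print(repr(first))  # '1,david,enterprise\n'
print(rest)         # ['2,jeff,personal\n']
```

None of these methods split a line into fields. That is the part the csv module adds.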

But, as your file is a csv file, you can use the csv module.

# input.csv
# 1,david,enterprise
# 2,jeff,personal

import csv

with open('input.csv', newline='') as f:
    reader = csv.reader(f)
    for serial, name, version in reader:
        # The csv module already extracts the information for you
        print(serial, name, version)

More details about the csv module are here.

You should try pandas, which works with both Python 2.7 and Python 3.2+:

import pandas as pd
csv = pd.read_csv("your_file.csv")

Then you can handle your data easily.

More fun here.
