Python Pandas CSV - Reading

Question

I have a problem: I need to read a CSV file and take the values from each row.

For example:

Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202

The content above is my CSV file. I want to read this kind of file and process each column on its own. For example:

dfin = pd.read_csv(var_an.csv)
for index1 in dfin.iterrows():

Name = 
Surname = 
Sex = 
Date =

How would you extract those values? I tried str(dfin["Name"]), but I got an error saying that tuple indices must be integers. I then replaced "Name" with 0, 1, 2, but at the first column it says the index is out of range. What am I doing wrong? I had no trouble doing this with an xlsx file.

def analytics(var_an):
    from termcolor import colored, cprint
    import pandas as pd
    dfin = pd.read_csv(var_an)
    for index1 in dfin.iterrows():
        print(index1)
        cprint(f'Found on file : {var_an}', 'red')
       # cprint(f'Obd = {obd} | pallet = {pallet} | loggerid = {loggerid} | system_date = {system_date} | system_time = {system_time} | house = {house} | hub = {hub}', 'on_green')

When I ran the code above, it extracted the entire row, but I can't handle each field on its own, like:

Name = 
Surname = 
Sex =

Answer 1

I'm not sure what you want to accomplish, but to take the values from the rows you can simply do this:

for idx, row in dfin.iterrows():
    # Column labels are case-sensitive and must match the file header.
    name = row['Name']
    surname = row['Surname']
    sex = row['Sex']

    ...
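
As a side note on why unpacking matters here: iterrows() yields (index, row) tuples, so a loop with a single variable receives a tuple, which would explain the "integer index" error in the question. The sketch below is only an illustration under assumptions: it uses a hypothetical comma-separated version of the sample data (the real file is whitespace-separated), and it shows itertuples() and whole-column access as two alternatives to iterrows().

```python
import io

import pandas as pd

# Hypothetical comma-separated version of the sample data from the question.
csv_text = """Name,Surname,Sex,Date
Franco,Puppi,Male,01/01/2022
Max,Pezzali,Male,03/4/2022
"""

dfin = pd.read_csv(io.StringIO(csv_text))

# iterrows() yields (index, row) tuples; a single loop variable gets the tuple.
first = next(dfin.iterrows())
assert isinstance(first, tuple)

# itertuples() yields namedtuples whose attributes are named after the columns.
full_names = [f"{rec.Name} {rec.Surname}" for rec in dfin.itertuples(index=False)]

# A whole column can also be processed at once, without an explicit loop.
upper_surnames = dfin["Surname"].str.upper().tolist()

print(full_names)
print(upper_surnames)
```

Attribute access through itertuples() only works when the column names are valid Python identifiers, which is the case for this header.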

Answer 2

That's not a CSV file: the format expects comma-separated values on each line. When you used read_csv, you got a table with a single column named "Name Surname Sex Date". Turning your fragments into a running script:

import pandas as pd
import io

the_file = io.StringIO("""Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202""")

dfin = pd.read_csv(the_file)
print(dfin.columns)

outputs

Index(['Name Surname Sex Date'], dtype='object')

So the file didn't parse correctly. You can change the separator from a comma to a regular expression that treats any run of whitespace as a column separator, and you'll get the right values for this sample data:

import pandas as pd
import io

the_file = io.StringIO("""Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202""")

dfin = pd.read_csv(the_file, sep=r"\s+")
print(dfin.columns)

for i, row in dfin.iterrows():
    print(f"====\nRow {i}:\n{row}")
    Name = row["Name"]
    Surname = row["Surname"]
    Sex = row["Sex"]
    Date = row["Date"]
    print("Extracted:", Name, Surname, Sex, Date)

This gets the right values:

Index(['Name', 'Surname', 'Sex', 'Date'], dtype='object')
====
Row 0:
Name           Franco
Surname         Puppi
Sex              Male
Date       01/01/2022
Name: 0, dtype: object
Extracted: Franco Puppi Male 01/01/2022
====
Row 1:
Name             Max
Surname      Pezzali
Sex             Male
Date       03/4/2022
Name: 1, dtype: object
Extracted: Max Pezzali Male 03/4/2022
====
Row 2:
Name          Fuffi
Surname       Fuffi
Sex          female
Date       03/8/202
Name: 2, dtype: object
Extracted: Fuffi Fuffi female 03/8/202

That's fairly good, but there is still a big problem. What if one of these people has a space in their name? Pandas would split each part of the name into a separate column and the parsing would fail. You need a better file format than the one you've been given.
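
To illustrate what a "better file format" could look like, here is a minimal sketch that assumes the data can be exported as a real comma-separated file. The rows are hypothetical: one first name contains a space and one surname is quoted because it contains a comma. With the default separator, read_csv keeps both values intact.

```python
import io

import pandas as pd

# Hypothetical data: a name with a space and a quoted field containing a comma.
csv_text = '''Name,Surname,Sex,Date
Gian Franco,Puppi,Male,01/01/2022
Max,"Pezzali, Jr.",Male,03/4/2022
'''

# The default separator is a comma, and double quotes protect embedded commas.
df = pd.read_csv(io.StringIO(csv_text))

for _, row in df.iterrows():
    print(row["Name"], "|", row["Surname"], "|", row["Sex"], "|", row["Date"])
```

Spaces inside a field are no longer ambiguous because only commas outside quotes separate columns.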

2 answers

Answer 1
0 2022-12-13 20:33:04

Answer 2
0 Accepted 2022-12-13 21:34:47