
Python Pandas CSV - reading

I have a problem: I have to read a CSV file and take the values from each row.

For example:

Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202

The content above is what my CSV file looks like. I want to read this kind of CSV file and process each column on its own. For example:

dfin = pd.read_csv("var_an.csv")
for index1 in dfin.iterrows():
    Name = ...     # value of the "Name" column for this row
    Surname = ...
    Sex = ...
    Date = ...

How would you extract each of those? I tried str(dfin["Name"]), but I got an error saying the value inside the tuple should be an integer. I then replaced "Name" with 0, 1, 2, but at the first column it says the index is out of range. What am I doing wrong? I had easy success with an xlsx file.

def analytics(var_an):
    from termcolor import colored, cprint
    import pandas as pd
    dfin = pd.read_csv(var_an)
    for index1 in dfin.iterrows():
        print(index1)
        cprint(f'Found on file : {var_an}', 'red')
        # cprint(f'Obd = {obd} | pallet = {pallet} | loggerid = {loggerid} | system_date = {system_date} | system_time = {system_time} | house = {house} | hub = {hub}', 'on_green')

When I did the above, it extracted the entire row, but I can't handle each field on its own, like:

Name = 
Surname = 
Sex =

Not sure what you want to accomplish, but to take the values from the rows you can simply do:

for idx, row in dfin.iterrows():
    name = row['Name']        # column labels are case-sensitive
    surname = row['Surname']
    sex = row['Sex']

    ...
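The integer-index error described in the question most likely comes from the fact that iterrows() yields (index, row) tuples, so a single loop variable holds a tuple rather than the row itself. A minimal sketch of the difference, assuming a small in-memory frame that is made up here purely for illustration:

```python
import pandas as pd

# Hypothetical in-memory data standing in for the real file.
dfin = pd.DataFrame(
    {"Name": ["Franco", "Max"], "Surname": ["Puppi", "Pezzali"]}
)

# With a single loop variable, each item is an (index, Series) tuple,
# so indexing it with a column label such as "Name" raises TypeError.
first = next(dfin.iterrows())
print(type(first))

# Unpacking the tuple gives the row as a Series that accepts column labels.
names = []
for idx, row in dfin.iterrows():
    names.append(row["Name"])

print(names)
```

Unpacking into two names, as in the loop above, is what makes row["Name"] work.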

That isn't really a CSV: read_csv expects comma-separated values on each line by default, and your file is separated by whitespace. When you used read_csv, you got a table with a single column named "Name Surname Sex Date". Turning your fragments into a running script:

import pandas as pd
import io

the_file = io.StringIO("""Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202""")

dfin = pd.read_csv(the_file)
print(dfin.columns)

outputs

Index(['Name Surname Sex Date'], dtype='object')

So the file didn't parse correctly. You can change the separator from a comma to a regular expression that treats any run of whitespace as a column separator, and you'll get the right values for this sample data:

import pandas as pd
import io

the_file = io.StringIO("""Name Surname Sex Date
Franco Puppi Male 01/01/2022
Max   Pezzali Male 03/4/2022
Fuffi Fuffi  female 03/8/202""")

dfin = pd.read_csv(the_file, sep=r"\s+")
print(dfin.columns)

for i, row in dfin.iterrows():
    print(f"====\nRow {i}:\n{row}")
    Name = row["Name"]
    Surname = row["Surname"]
    Sex = row["Sex"]
    Date = row["Date"]
    print("Extracted:", Name, Surname, Sex, Date)

This gets the right stuff:

Index(['Name', 'Surname', 'Sex', 'Date'], dtype='object')
====
Row 0:
Name           Franco
Surname         Puppi
Sex              Male
Date       01/01/2022
Name: 0, dtype: object
Extracted: Franco Puppi Male 01/01/2022
====
Row 1:
Name             Max
Surname      Pezzali
Sex             Male
Date       03/4/2022
Name: 1, dtype: object
Extracted: Max Pezzali Male 03/4/2022
====
Row 2:
Name          Fuffi
Surname       Fuffi
Sex          female
Date       03/8/202
Name: 2, dtype: object
Extracted: Fuffi Fuffi female 03/8/202

Kinda good. But there is still a huge problem. What if one of these people has a space in their name? Pandas would split each part of the name into a separate column and the parsing would fail. You need a better file format than what you've been given.
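To illustrate the point about spaces, here is a hedged sketch with made-up data. The name "Anna Maria" and the comma-separated layout are assumptions about what a "better format" could look like, not something taken from the original file. The whitespace-separated version is expected to fail because one row ends up with five fields, while the comma-separated version parses with the default separator.

```python
import io
import pandas as pd

# Made-up sample: "Anna Maria" contains a space, so that row has 5 fields.
spaced = io.StringIO("""Name Surname Sex Date
Franco Puppi Male 01/01/2022
Anna Maria Rossi female 03/04/2022""")

try:
    pd.read_csv(spaced, sep=r"\s+")
except pd.errors.ParserError as exc:
    # Expected outcome: the row with the extra field cannot be aligned.
    print("Whitespace parsing failed:", exc)

# The same data as real CSV: commas separate fields, so spaces are harmless.
proper = io.StringIO("""Name,Surname,Sex,Date
Franco,Puppi,Male,01/01/2022
Anna Maria,Rossi,female,03/04/2022""")

dfin = pd.read_csv(proper)
for idx, row in dfin.iterrows():
    print(row["Name"], "|", row["Surname"], "|", row["Sex"], "|", row["Date"])
```

If the file really has to stay whitespace-separated, another option would be to ask whoever produces it to quote fields that contain spaces.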
