Value error while writing from one csv file to another using pandas

I am writing code that goes through many csv files in a folder (using a for loop) and removes bad data from each file (rows that have more values than there are columns, or sometimes fewer). After removing the bad rows, I rearrange the columns and write the useful data into a new csv file.
In the code below, the outer for loop cycles through the files in a folder. You can treat the df=pd.read_csv line as the beginning and assume correct indentation.

import pandas as pd
import os

for filename in os.listdir("csv files copy"):
    filenames=os.path.join("csv files copy",filename)
    print(filename)
   
    df=pd.read_csv(filenames, error_bad_lines=False)

    for row in df:

        df.columns=["id","FirstName","LastName","UserName","Phone","IsContact","RestrictionReason","Status","IsScam","Date"]
        df = df.drop(labels="Status", axis=1)
        df = df.reindex(columns=['id', 'Phone', 'FirstName', 'LastName', 'UserName',"IsContact","IsScam","Date","RestrictionReason"])
        df.to_csv(filenames,index=False)

While doing so, this is the error I receive:
ValueError: Length mismatch: Expected axis has 9 elements, new values have 10 elements

These are the header and the first 4 rows of the dataframe that I am using:

id                      Phone   FirstName   LastName   UserName     IsContact  IsScam Date                       RestrictionReason        Status             
Forex Pips Fire Free    NaN     Goldenboy      NaN     Goldenboyys      False   False  5/7/2022 8:34:07 AM                NaN             NaN
Forex Pips Fire Free    NaN     Abu 3odeh      NaN         oudah12      False   False  5/7/2022 8:38:03 AM                NaN             NaN
Forex Pips Fire Free    NaN        Rahman     Azar     Rahman_Azar      False   False  5/7/2022 8:41:22 AM                NaN             NaN
Forex Pips Fire Free    NaN         HUDLE      NaN       Hudle1051      False   False  5/7/2022 8:41:11 AM                NaN             NaN

And given below is the header of the destination csv file that the above data needs to be written into:

id Phone FirstName LastName UserName IsContact IsScam Date RestrictionReason

You need to remove the inner for row in df: loop, as follows:

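import pandas as pd
import os

for filename in os.listdir("csv files copy"):
    filenames = os.path.join("csv files copy", filename)
    print(filename)

    # error_bad_lines=False skips malformed rows. It was deprecated in pandas 1.3
    # and removed in 2.0; on newer versions use on_bad_lines="skip" instead.
    df = pd.read_csv(filenames, error_bad_lines=False)
    # Name the 10 columns once per file, then drop Status and reorder the rest.
    df.columns = ["id", "FirstName", "LastName", "UserName", "Phone", "IsContact", "RestrictionReason", "Status", "IsScam", "Date"]
    df = df.drop(labels="Status", axis=1)
    df = df.reindex(columns=["id", "Phone", "FirstName", "LastName", "UserName", "IsContact", "IsScam", "Date", "RestrictionReason"])
    # Note: this overwrites the file that was just read.
    df.to_csv(filenames, index=False)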

The inner loop was causing the error and is not needed. The first time through the loop, the code correctly removes the Status column and saves the CSV file. The second time through the loop (on the same dataframe), it attempts to assign df.columns again, but the Status column is gone, so the dataframe now has 9 columns while the list still supplies 10 names, which raises the length mismatch.
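
The failure described above can be reproduced without any files. This is a minimal sketch, assuming a made-up one-row frame with 10 columns in place of a freshly read CSV; the helper name rename_and_drop is hypothetical and only wraps one pass of the loop body.

```python
import pandas as pd

NAMES = ["id", "FirstName", "LastName", "UserName", "Phone",
         "IsContact", "RestrictionReason", "Status", "IsScam", "Date"]

# Hypothetical one-row frame with 10 columns, standing in for a freshly read CSV.
df = pd.DataFrame([list(range(10))])


def rename_and_drop(frame):
    """One pass of the loop body: assign the 10 names, then drop Status."""
    frame = frame.copy()
    frame.columns = NAMES
    return frame.drop(labels="Status", axis=1)


first_pass = rename_and_drop(df)   # works: 10 names for 10 columns

try:
    rename_and_drop(first_pass)    # fails: 10 names for only 9 columns
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(first_pass.shape)
print(error_message)
```

The second call fails at the frame.columns assignment, which is the same place the original loop fails on its second iteration.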

The code for row in df: actually iterates over the column names in the dataframe, e.g. id, then FirstName, and so on, not over its rows.
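
A small illustration of that behaviour, using a made-up two-row frame (the data is hypothetical, only the column names echo the question). It also shows itertuples() as one possible way to visit actual rows, although the fix here does not need row iteration at all.

```python
import pandas as pd

# Hypothetical two-row frame; the values are placeholders.
df = pd.DataFrame({"id": [1, 2], "FirstName": ["A", "B"], "LastName": ["C", "D"]})

# Iterating the frame directly yields one column label per column.
labels = [item for item in df]

# itertuples() yields one tuple per row instead.
rows = [tuple(r) for r in df.itertuples(index=False)]

print(labels)
print(rows)
```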

Because you give only 9 columns in this line, you missed the 'Status' column:

df = df.reindex(columns=['id', 'Phone', 'FirstName', 'LastName', 'UserName', 'IsContact', 'IsScam', 'Date', 'RestrictionReason'])
df.to_csv(filenames, index=False)
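
As a side note on that line, reindex(columns=...) keeps only the labels it is given, so a column left out of the list is dropped by that call alone. A minimal sketch, assuming a hypothetical one-row frame that already carries the question's 10 column names:

```python
import pandas as pd

names = ["id", "FirstName", "LastName", "UserName", "Phone",
         "IsContact", "RestrictionReason", "Status", "IsScam", "Date"]

# Hypothetical one-row frame with dummy values under the 10 column names.
df = pd.DataFrame([list(range(10))], columns=names)

wanted = ["id", "Phone", "FirstName", "LastName", "UserName",
          "IsContact", "IsScam", "Date", "RestrictionReason"]

# Only the 9 listed columns survive, in the listed order; Status is not among them.
reordered = df.reindex(columns=wanted)

print(list(reordered.columns))
```

Under this assumption the separate drop of Status is optional, though keeping it makes the intent explicit.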
