ValueError when writing from one CSV file to another using pandas

Question

I am writing code that goes through many CSV files in a folder (using a for loop) and removes bad data from each file (rows with more values than there are columns, or sometimes fewer). After removing those rows, I rearrange the columns and write the useful data into a new CSV file.
In the code below, the outer for loop cycles through the files in the folder. You can treat the df=pd.read_csv line as the starting point and assume the indentation is correct.

import pandas as pd
import os

for filename in os.listdir("csv files copy"):
    filenames=os.path.join("csv files copy",filename)
    print(filename)
   
    df=pd.read_csv(filenames, error_bad_lines=False)

    for row in df:

        df.columns=["id","FirstName","LastName","UserName","Phone","IsContact","RestrictionReason","Status","IsScam","Date"]
        df = df.drop(labels="Status", axis=1)
        df = df.reindex(columns=['id', 'Phone', 'FirstName', 'LastName', 'UserName',"IsContact","IsScam","Date","RestrictionReason"])
        df.to_csv(filenames,index=False)

While doing so, this is the error I receive:
ValueError: Length mismatch: Expected axis has 9 elements, new values have 10 elements

These are the header and the first 4 rows of the dataframe that I am using:

id                      Phone   FirstName   LastName   UserName     IsContact  IsScam Date                       RestrictionReason        Status             
Forex Pips Fire Free    NaN     Goldenboy      NaN     Goldenboyys      False   False  5/7/2022 8:34:07 AM                NaN             NaN
Forex Pips Fire Free    NaN     Abu 3odeh      NaN         oudah12      False   False  5/7/2022 8:38:03 AM                NaN             NaN
Forex Pips Fire Free    NaN        Rahman     Azar     Rahman_Azar      False   False  5/7/2022 8:41:22 AM                NaN             NaN
Forex Pips Fire Free    NaN         HUDLE      NaN       Hudle1051      False   False  5/7/2022 8:41:11 AM                NaN             NaN

And below is the header of the destination CSV file that the above data needs to be written into:

id Phone FirstName LastName UserName IsContact IsScam Date RestrictionReason

Answer 1

You need to remove the inner for loop (for row in df:), as follows:

import pandas as pd
import os

for filename in os.listdir("csv files copy"):
    filenames = os.path.join("csv files copy", filename)
    print(filename)
   
    df = pd.read_csv(filenames, error_bad_lines=False)
    df.columns = ["id", "FirstName", "LastName", "UserName", "Phone", "IsContact", "RestrictionReason", "Status", "IsScam", "Date"]
    df = df.drop(labels="Status", axis=1)
    df = df.reindex(columns=["id", "Phone", "FirstName", "LastName", "UserName","IsContact","IsScam","Date","RestrictionReason"])
    df.to_csv(filenames, index=False)

The inner for loop was causing the error and is not needed. The first time through the loop, the code correctly removes the Status column and saves the CSV file. The second time through the loop (on the same dataframe), it tries to assign df.columns again, but the Status column is now gone, so the 10 names no longer match the 9 remaining columns.
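A note on pandas versions: the error_bad_lines argument used above was deprecated in pandas 1.3 and removed in pandas 2.0, where on_bad_lines="skip" takes its place. Below is a minimal sketch of the same per-file steps on a newer pandas, assuming each file really has the ten columns in the order listed above. The helper name clean_csv and the sample file contents are made up for illustration.

```python
import os
import tempfile

import pandas as pd

NAMES = ["id", "FirstName", "LastName", "UserName", "Phone",
         "IsContact", "RestrictionReason", "Status", "IsScam", "Date"]
ORDER = ["id", "Phone", "FirstName", "LastName", "UserName",
         "IsContact", "IsScam", "Date", "RestrictionReason"]


def clean_csv(path):
    """Hypothetical helper: the same steps as above, applied once per file."""
    # on_bad_lines="skip" (pandas >= 1.3) drops rows with too many fields.
    df = pd.read_csv(path, on_bad_lines="skip")
    df.columns = NAMES
    df = df.drop(labels="Status", axis=1)
    df = df.reindex(columns=ORDER)
    df.to_csv(path, index=False)
    return df


# Made-up sample: one well-formed row and one row with an extra field.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "sample.csv")
    with open(sample, "w", newline="") as handle:
        handle.write(",".join(NAMES) + "\n")
        handle.write("1,Ann,Lee,annlee,555,False,,,False,5/7/2022\n")
        handle.write("2,Bob,Ray,bobray,556,False,,,False,5/7/2022,extra\n")
    cleaned = clean_csv(sample)
    written = pd.read_csv(sample)
```

Note that rows with too few fields are not skipped by this option; pandas pads them with NaN, so they would need a separate check.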

The code for row in df: actually iterates over the column names of the dataframe, e.g. id, then FirstName, and so on.
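Both points can be checked on a tiny frame. This is only a sketch with made-up data (three illustrative columns, not the asker's files): iterating a DataFrame yields its column labels, and re-assigning the full list of names after a column has been dropped fails with a length mismatch.

```python
import pandas as pd

# Hypothetical miniature frame; not the asker's data.
df = pd.DataFrame({"id": [1, 2], "Status": ["a", "b"], "Date": ["x", "y"]})

# Iterating a DataFrame yields column labels, not rows.
labels = [item for item in df]
print(labels)

# After dropping a column, the frame has 2 columns left ...
df = df.drop(labels="Status", axis=1)

# ... so assigning 3 names again fails with a ValueError.
try:
    df.columns = ["id", "Status", "Date"]
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```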

Answer 2

Because you give only 9 columns in this line, the 'Status' column is missing:

df = df.reindex(columns=['id', 'Phone', 'FirstName', 'LastName', 'UserName', 'IsContact', 'IsScam', 'Date', 'RestrictionReason'])
df.to_csv(filenames, index=False)

Answer 1: score 1, accepted, 2022-05-09 12:02:36
Answer 2: score 0, 2022-05-07 18:05:22
