简体   繁体   中英

Pandas skipping certain columns

I'm trying to format an Amazon Vendor CSV using Pandas but I'm running into an issue. The issue stems from the fact that Amazon inserts a row with report information before the headers.

When trying to skip over that row when assigning headers to the dataframe, not all columns are captured. Below is my attempt at explicitly stating which row to pull columns from but it doesn't appear to be correct.

df = pd.read_csv(path + 'Amazon Search Terms_Search Terms_US.csv', sep=',', error_bad_lines=False, index_col=False, encoding='utf-8')

headers = df.loc[0]

new_df = pd.DataFrame(df.values[1:], columns=headers)
print('Copying data into new data frame....')

Before it looks like this(I want row 2 to be all the columns in the new df:前代码

After the fact it looks like this(it only selects 5):后码

I've also tried having it skiprows when opening the CSV, it doesn't treat the report row as data so it just ends up skipping actual data. Not really sure what is going wrong here, any help would be appreciated.

正如@suvayu 在评论中发布的那样,将 header=1 添加到读取的 csv 中完成了这项工作。

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM