
Pandas skipping certain columns

I'm trying to format an Amazon Vendor CSV using Pandas, but I'm running into an issue. The issue stems from the fact that Amazon inserts a row with report information before the headers.

When I try to skip over that row while assigning headers to the dataframe, not all columns are captured. Below is my attempt at explicitly stating which row to pull the columns from, but it doesn't appear to be correct.

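import pandas as pd

# `path` is assumed to be a directory prefix defined earlier in the script.
# error_bad_lines was deprecated in pandas 1.3 and removed in 2.0 (on_bad_lines='skip' replaces it).
df = pd.read_csv(path + 'Amazon Search Terms_Search Terms_US.csv', sep=',', error_bad_lines=False, index_col=False, encoding='utf-8')

# Use the first parsed row as the intended header row.
headers = df.loc[0]

new_df = pd.DataFrame(df.values[1:], columns=headers)
print('Copying data into new data frame....')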

Before, it looks like this (I want row 2 to supply all the columns in the new df): [image: before code]

Afterwards, it looks like this (it only selects 5 columns): [image: after code]

I've also tried using skiprows when opening the CSV, but it doesn't treat the report row as data, so it just ends up skipping actual data. I'm not really sure what is going wrong here; any help would be appreciated.

As @suvayu posted in the comments, adding header=1 to the read_csv call did the job.
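As a minimal sketch of that fix, the snippet below reads a small in-memory CSV whose first line is a report-information row and whose second line holds the real headers. The sample text and column names are made up for illustration and are not the actual Amazon report layout; only the header=1 argument comes from the answer.

```python
import io

import pandas as pd

# Hypothetical stand-in for the vendor report: one report-information row,
# then the real header row, then the data rows.
raw = (
    "Report=Search Terms,Country=US,Period=Weekly\n"
    "Department,Search Term,Search Frequency Rank,Clicked ASIN,Click Share\n"
    "Amazon.com,usb cable,1,B000000001,0.12\n"
    "Amazon.com,phone case,2,B000000002,0.08\n"
)

# header=1 tells read_csv that the column names are on the second line
# (0-indexed row 1); the line before it is discarded.
df = pd.read_csv(io.StringIO(raw), header=1)

print(list(df.columns))
print(len(df))
```

With a real file, the same argument would be passed alongside the file path, for example pd.read_csv(path + 'Amazon Search Terms_Search Terms_US.csv', header=1, encoding='utf-8').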
