
Python Pandas 'Unnamed' column keeps appearing

I am running into an issue where each time I run my program (which reads the dataframe from a .csv file) a new column called 'Unnamed' shows up.

Sample output columns after running 3 times:

  Unnamed: 0  Unnamed: 0.1            Subreddit  Appearances

Here is my code. For each row, the values in the 'Unnamed' columns simply increase by 1.

df = pd.read_csv(Location)
while counter < 50:
    #gets just the subreddit name
    e = str(elem[counter].get_attribute("href"))
    e = e.replace("https://www.reddit.com/r/", "")
    e = e[:-1]
    if e in df['Subreddit'].values:
        #adds 1 to Appearances if the subreddit is already in the DF
        df.loc[df['Subreddit'] == e, 'Appearances'] += 1
    else:
        #adds new row with the subreddit name and sets the amount of appearances to 1.
        df = df.append({'Subreddit': e, 'Appearances': 1}, ignore_index=True)
    df.reset_index(inplace=True, drop=True)
    print(e)
    counter = counter + 2
#(doesn't work) df.drop(df.columns[df.columns.str.contains('Unnamed', case=False)], axis=1)

The first time I run it, with a clean .csv file, it works perfectly, but each time after that another 'Unnamed' column shows up. I just want the 'Subreddit' and 'Appearances' columns to show each time.

Another solution is to read the csv with index_col=0, so that the index column is not treated as data:

df = pd.read_csv(Location, index_col=0)
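To see what index_col=0 changes, here is a minimal sketch that uses an in-memory buffer instead of a real file. The sample rows are made up for illustration, and it assumes the file was saved with the default to_csv settings.

```python
import io
import pandas as pd

# Hypothetical data standing in for the question's CSV file.
df = pd.DataFrame({"Subreddit": ["python", "pandas"], "Appearances": [3, 1]})

# Saving with the default index=True writes the index as a first column
# whose header is empty.
csv_text = df.to_csv()

# A plain read gives that header-less column the name 'Unnamed: 0'.
plain = pd.read_csv(io.StringIO(csv_text))

# index_col=0 uses the first column as the index instead of as data.
fixed = pd.read_csv(io.StringIO(csv_text), index_col=0)

print(list(plain.columns))
print(list(fixed.columns))
```

Note that index_col=0 only absorbs the first column, so a file that has already collected several 'Unnamed' columns still needs a one-time cleanup.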

each time I run my program (...) a new column shows up called 'Unnamed'.

I suppose that's due to reset_index, or maybe you have a to_csv somewhere in your code, as @jpp suggested. By default to_csv writes the index as an extra column with an empty header, and read_csv names such a column 'Unnamed: 0' on the next run. To fix the to_csv, be sure to use index=False:

df.to_csv(path, index=False)
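The effect can be reproduced without touching the disk. The sketch below is only an illustration: round_trip is a hypothetical helper, the starting table is made up, and an in-memory buffer replaces the real file. It saves and reloads the frame twice, with and without the index, and then cleans a frame that has already picked up the extra columns.

```python
import io
import pandas as pd

def round_trip(df, runs, index):
    """Write df to CSV text and read it back, `runs` times in a row."""
    for _ in range(runs):
        buffer = io.StringIO()
        df.to_csv(buffer, index=index)
        buffer.seek(0)
        df = pd.read_csv(buffer)
    return df

# Hypothetical starting table with the question's two columns.
start = pd.DataFrame({"Subreddit": ["python"], "Appearances": [1]})

with_index = round_trip(start, runs=2, index=True)      # default to_csv behaviour
without_index = round_trip(start, runs=2, index=False)  # the fix

print(list(with_index.columns))     # two extra 'Unnamed...' columns
print(list(without_index.columns))  # only Subreddit and Appearances

# drop() returns a new frame, so the result has to be assigned back.
unnamed = with_index.columns[with_index.columns.str.contains("Unnamed", case=False)]
cleaned = with_index.drop(columns=unnamed)
print(list(cleaned.columns))
```

The commented-out drop in the question has no visible effect for the same reason: its return value is discarded, so df keeps the old columns.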

In general, here's how I would approach your task. This counts all appearances first (keyed by e), and from these counts creates a new dataframe to merge with the one you already have (how='outer' adds rows that don't exist yet). This avoids resetting the index for each element, which should avoid the problem and also performs better.

Here's the code with these thoughts included:

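from collections import Counter

base_df = pd.read_csv(location)
appearances = Counter()
while counter < 50:
    # gets just the subreddit name
    e = str(elem[counter].get_attribute("href"))
    e = e.replace("https://www.reddit.com/r/", "")
    e = e[:-1]
    appearances[e] += 1
    counter = counter + 2
# one row per subreddit seen in this run; 'New' is a temporary column
appearances_df = pd.DataFrame(
    [{'Subreddit': e, 'New': c} for e, c in appearances.items()],
    columns=['Subreddit', 'New'])
# how='outer' keeps the stored rows and adds subreddits that are not stored yet
df = base_df.merge(appearances_df, how='outer', on='Subreddit')
# add this run's counts to the stored totals, treating missing values as 0
df['Appearances'] = (df['Appearances'].fillna(0) + df['New'].fillna(0)).astype(int)
df = df.drop(columns='New')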
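For trying the count-then-merge idea without a browser, here is a self-contained sketch. The hard-coded hrefs and the starting table are made up, the key column is named 'Subreddit' to match the question's CSV, and 'New' is a hypothetical temporary column for this run's counts.

```python
from collections import Counter
import pandas as pd

# Hypothetical stand-ins: the stored table and the hrefs scraped in this run.
base_df = pd.DataFrame({"Subreddit": ["python", "pandas"], "Appearances": [3, 1]})
hrefs = [
    "https://www.reddit.com/r/python/",
    "https://www.reddit.com/r/learnpython/",
    "https://www.reddit.com/r/python/",
]

# Count every subreddit first, keyed by its name.
appearances = Counter(
    href.replace("https://www.reddit.com/r/", "").rstrip("/") for href in hrefs
)

# One row per subreddit seen in this run.
new_df = pd.DataFrame(
    {"Subreddit": list(appearances.keys()), "New": list(appearances.values())}
)

# The outer merge keeps stored rows and adds subreddits that are not stored yet.
merged = base_df.merge(new_df, how="outer", on="Subreddit")

# Missing values on either side count as 0 before the two columns are added.
merged["Appearances"] = (
    merged["Appearances"].fillna(0) + merged["New"].fillna(0)
).astype(int)
merged = merged.drop(columns="New")

totals = dict(zip(merged["Subreddit"], merged["Appearances"]))
print(totals)
```

Saving the result with merged.to_csv(path, index=False) then keeps the file at exactly these two columns between runs.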
