
Python Pandas 'Unnamed' column keeps appearing

I am running into an issue where each time I run my program (which reads the dataframe from a .csv file), a new column called 'Unnamed' shows up.

Sample output columns after running it 3 times:

  Unnamed: 0  Unnamed: 0.1            Subreddit  Appearances

Here is my code. With each run, the 'Unnamed' columns simply increase by one.

df = pd.read_csv(Location)
while counter < 50:
    #gets just the subreddit name
    e = str(elem[counter].get_attribute("href"))
    e = e.replace("https://www.reddit.com/r/", "")
    e = e[:-1]
    if e in df['Subreddit'].values:
        #adds 1 to Appearances if the subreddit is already in the DF
        df.loc[df['Subreddit'] == e, 'Appearances'] += 1
    else:
        #adds new row with the subreddit name and sets the amount of appearances to 1.
        df = df.append({'Subreddit': e, 'Appearances': 1}, ignore_index=True)
    df.reset_index(inplace=True, drop=True)
    print(e)
    counter = counter + 2
#(doesn't work) df.drop(df.columns[df.columns.str.contains('Unnamed', case=False)], axis=1)

The first time I run it, with a clean .csv file, it works perfectly, but each time after that, another 'Unnamed' column shows up. I just want the 'Subreddit' and 'Appearances' columns to show each time.

Another solution is to read the csv with index_col=0, so the index column is not read back in as data:

df = pd.read_csv(Location, index_col=0)
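To see why that helps, here is a minimal, self-contained round trip (the file name 'reddit.csv' and the sample data are placeholders, not taken from the original code):

import pandas as pd

df = pd.DataFrame({'Subreddit': ['python', 'learnpython'], 'Appearances': [2, 1]})
df.to_csv('reddit.csv')  # default index=True writes the index as a first column with an empty header

print(pd.read_csv('reddit.csv').columns.tolist())
# ['Unnamed: 0', 'Subreddit', 'Appearances'] -- the old index comes back as a data column

df = pd.read_csv('reddit.csv', index_col=0)
print(df.columns.tolist())
# ['Subreddit', 'Appearances'] -- the first column is used as the index instead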

each time I run my program (...) a new column shows up called 'Unnamed'.

I suppose that's due to reset_index, or maybe you have a to_csv somewhere in your code, as @jpp suggested. To fix the to_csv, be sure to use index=False:

df.to_csv(path, index=False)
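As a quick sketch of the effect (again with a placeholder file name), writing with index=False means no index column is saved, so nothing 'Unnamed' can reappear on the next read:

import pandas as pd

df = pd.DataFrame({'Subreddit': ['python'], 'Appearances': [1]})
df.to_csv('reddit.csv', index=False)   # the index is not written to the file at all

print(pd.read_csv('reddit.csv').columns.tolist())
# ['Subreddit', 'Appearances'] -- no 'Unnamed: 0' column on subsequent runs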

In general, here's how I would approach your task. What this does is count all appearances first (keyed by e), and from these counts create a new dataframe to merge with the one you already have (how='outer' adds rows that don't exist yet). This avoids resetting the index for each element, which should avoid the problem and is also more performant.

Here's the code with these thoughts included:

from collections import Counter

base_df = pd.read_csv(Location)
appearances = Counter()
counter = 0
while counter < 50:
    # gets just the subreddit name
    e = str(elem[counter].get_attribute("href"))
    e = e.replace("https://www.reddit.com/r/", "")
    e = e[:-1]
    appearances[e] += 1
    counter = counter + 2
# build a dataframe from the counts, using the same column names as the csv,
# then merge it with the existing dataframe
appearances_df = pd.DataFrame({'Subreddit': e, 'Appearances': c}
                              for e, c in appearances.items())
df = base_df.merge(appearances_df, how='outer', on='Subreddit')
