Create a new column as a count of the Pandas DataFrame

Question

I have a pandas DataFrame whose index is already a datetime. How do I create a new column that acts as a row count?

For example, the following code reproduces my DataFrame:

import datetime

import numpy as np
import pandas as pd

# Index values: dates in no particular order
dates = [
    datetime.date(2019, 1, 13),
    datetime.date(2020, 5, 11),
    datetime.date(2018, 7, 24),
    datetime.date(2019, 3, 23),
    datetime.date(2020, 2, 16)
]

data = {
    "a": [13.3, 12.3, np.nan, 10.3, np.nan],
    "b": [1, 0, 0, 1, 1],
    "c": ["no", "yes", "no", "", "yes"]
}

# Keep a reference to the frame so a column can be added to it
df = pd.DataFrame(index=dates, data=data)

Now I would like to add a new column holding a running count: 1, 2, 3, 4, 5, and so on until the end of the data.

Answer 1

df['count'] = range(1, len(df) + 1)

len(df) returns the number of rows in the DataFrame, so the built-in range function can produce the integers from 1 up to that number, and the result can be assigned to a new column. When a range is assigned to a column, pandas converts it to a Series automatically.
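As a minimal sketch of the above, here is the assignment applied to a cut-down version of the frame from the question (only column "b" is kept, which is a simplification for brevity). Note that the counter follows the current row order, not the chronological order of the index.

```python
import datetime

import pandas as pd

dates = [
    datetime.date(2019, 1, 13),
    datetime.date(2020, 5, 11),
    datetime.date(2018, 7, 24),
    datetime.date(2019, 3, 23),
    datetime.date(2020, 2, 16),
]

# Simplified frame: only column "b" from the question is kept
df = pd.DataFrame(index=dates, data={"b": [1, 0, 0, 1, 1]})

# One integer per row, starting at 1
df["count"] = range(1, len(df) + 1)

print(df["count"].tolist())  # [1, 2, 3, 4, 5]
```

If the count should follow date order instead, one option is to call df.sort_index() before adding the column.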

Answer 2

You can build a Series using df.index and apply some processing to it before assigning it to a column of the dataframe.

Here, we could use:

df['count'] = pd.Series(1, index=df.index).cumsum()

Here it would be far less efficient (by more than an order of magnitude) than df['count'] = np.arange(1, 1 + len(df)), which directly builds a NumPy array with the expected values, but it can be useful in more complex use cases.
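The sketch below puts both variants side by side on a cut-down version of the question's frame. The last column, b_count, is a hypothetical example of a "more complex" case that is not part of the original question: a running count of only the rows where b equals 1.

```python
import datetime

import numpy as np
import pandas as pd

dates = [
    datetime.date(2019, 1, 13),
    datetime.date(2020, 5, 11),
    datetime.date(2018, 7, 24),
    datetime.date(2019, 3, 23),
    datetime.date(2020, 2, 16),
]

# Simplified frame: only column "b" from the question is kept
df = pd.DataFrame(index=dates, data={"b": [1, 0, 0, 1, 1]})

# Series of ones on the frame's index, then a cumulative sum.
# df.index is an attribute, so it is used without parentheses.
df["count"] = pd.Series(1, index=df.index).cumsum()

# Direct NumPy equivalent for the plain 1..n counter
df["count_np"] = np.arange(1, 1 + len(df))

# Hypothetical extension: count only the rows where b == 1
df["b_count"] = (df["b"] == 1).cumsum()

print(df)
```

The cumsum form generalizes because any boolean or numeric Series can take the place of the constant 1, whereas np.arange can only produce the plain counter.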

2 answers

solution1
1 2020-03-04 16:37:44

solution2
1 2020-03-05 08:38:10
