Add 'name' properties to row and column names when creating a pandas DataFrame

I am creating a pandas DataFrame to learn about handling missing data. I want to set the row and column axis names when I create the DataFrame, instead of assigning them afterwards with 'df.index.name =' and 'df.columns.name ='. How can I do this?

# Program to generate an m x n DataFrame with random NaN values scattered in:
import random
import numpy as np
import pandas as pd

def df_maker(m, n):
    # Random integers in [1, 100), labelled 'Row 1'..'Row m' and 'Col 1'..'Col n'
    df = pd.DataFrame(np.random.randint(1, 100, (m*n)).reshape(m, n), index = [f'Row {i+1}' for i in range(m)], columns = [f'Col {j+1}' for j in range(n)] )
    # Replace one randomly chosen cell in each row with NaN
    for i in range(m):
        df.iloc[[i],[random.randrange(n)]] = np.nan
    return df

df = df_maker(10, 10)
df.index.name = 'Rows'
df.columns.name = 'Columns'
df

I looked through the documentation for pandas.DataFrame, pandas.DataFrame.rename_axis and some other methods, but couldn't find what I am looking for. So how can I create the above DataFrame in one line of code, without using df.index.name = 'Rows' and df.columns.name = 'Columns'? Thanks.

Create the Index objects for the rows and columns separately, giving each one a name, and pass them to the DataFrame constructor:

def df_maker(m, n):
    index = pd.Index([f'Row {i + 1}' for i in range(m)], name='Rows')
    columns = pd.Index([f'Col {i + 1}' for i in range(n)], name='Columns')
    df = pd.DataFrame(np.random.randint(1, 100, size=(m, n)), index=index, columns=columns)
    # rest of your code here
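
For reference, here is a minimal, self-contained sketch of how the complete function might look with the named Index objects in place. The `seed` parameter and the use of `np.random.default_rng` are assumptions added here for reproducibility and are not part of the original code. The last line shows `rename_axis` as a possible alternative if you would rather chain the names onto the constructor call.

```python
import numpy as np
import pandas as pd

def df_maker(m, n, seed=None):
    # seed is a hypothetical extra parameter, added only to make runs repeatable
    rng = np.random.default_rng(seed)
    # Named Index objects carry the axis names into the DataFrame
    index = pd.Index([f'Row {i + 1}' for i in range(m)], name='Rows')
    columns = pd.Index([f'Col {j + 1}' for j in range(n)], name='Columns')
    # Integers in [1, 100), cast to float so the array can hold NaN
    data = rng.integers(1, 100, size=(m, n)).astype(float)
    # Set one randomly chosen cell in each row to NaN
    data[np.arange(m), rng.integers(0, n, size=m)] = np.nan
    return pd.DataFrame(data, index=index, columns=columns)

df = df_maker(10, 10, seed=0)
print(df.index.name, df.columns.name)  # Rows Columns

# Alternative: name both axes by chaining rename_axis onto the constructor
df2 = pd.DataFrame(np.zeros((2, 3))).rename_axis(index='Rows', columns='Columns')
```

Either way the names are attached at creation time, so no separate df.index.name or df.columns.name assignment is needed.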
