How to get this single column data into data frame with appropriate columns

I am a beginner learning pandas and data science. I have data in a single column, like the following:

Rahul
1
2
5
Suresh
4
2
1
Dharm
1
3
4

I would like it in my dataframe as:

Rahul   1
        2
        5
Suresh  4
        2
        1
Dharm   1
        3
        4

How can I achieve this without iterating over every row? I have hundreds of thousands of rows. I have searched a lot but have not found anything other than iteration so far. Is there a better way?

Thank you for your kindness and patience

The best format depends on what you plan to do with the data, but a good starting point is this:

Given:

Rahul
1
2
5
Suresh
4
2
1
Dharm
1
3
4

Doing:

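import numpy as np
import pandas as pd

# Read in the file and call the column 'values'
# (filepath is the path to your data file):
df = pd.read_table(filepath, header=None, names=['values'])

# Create a new column with names filled in. Rows that are entirely digits
# become NaN, then each name is forward-filled over the numbers below it:
df['names'] = df['values'].replace(r'^\d+$', np.nan, regex=True).ffill()

# Drop the extra rows (the ones that held the names):
df = df[df['values'].str.isnumeric()].reset_index(drop=True)

print(df[['names', 'values']])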

Output:

    names values
0   Rahul      1
1   Rahul      2
2   Rahul      5
3  Suresh      4
4  Suresh      2
5  Suresh      1
6   Dharm      1
7   Dharm      3
8   Dharm      4
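
For anyone who wants to try this without a file on disk, here is a minimal self-contained sketch of the same idea. It is not the exact code above: it swaps the regex replace for a boolean mask with `where`, reads the sample from an inline string (the `raw` variable is just a stand-in for the real file), and assumes every number row is a non-negative integer with no surrounding whitespace. The names `is_number`, `result`, and `grouped` are only illustrative.

```python
import io

import pandas as pd

# Inline copy of the sample data, standing in for the real file (assumption).
raw = "Rahul\n1\n2\n5\nSuresh\n4\n2\n1\nDharm\n1\n3\n4\n"

# dtype=str keeps every line as text, so the .str accessor always works.
df = pd.read_csv(io.StringIO(raw), header=None, names=["values"], dtype=str)

# True on rows that hold a number, False on rows that hold a name.
is_number = df["values"].str.isdigit()

# Keep the text of the name rows only, then carry each name down
# over the number rows that follow it.
df["names"] = df["values"].where(~is_number).ffill()

# Drop the name rows and convert the remaining values to integers.
result = df.loc[is_number, ["names", "values"]].reset_index(drop=True)
result["values"] = result["values"].astype(int)

# Optional: a two-level index (name, position within that name), which
# pandas displays with each name shown once, close to the layout asked for.
grouped = result.set_index(["names", result.groupby("names").cumcount()])["values"]

print(result)
print(grouped)
```

Everything here is vectorized, so there is no Python-level loop over rows. If the numbers can be negative or decimal, `str.isdigit()` will not recognize them and the mask would need a different test.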
