Below is an example DataFrame.
0 1 2 3 4
0 0.0 13.00 4.50 30.0 0.0,13.0
1 0.0 13.00 4.75 30.0 0.0,13.0
2 0.0 13.00 5.00 30.0 0.0,13.0
3 0.0 13.00 5.25 30.0 0.0,13.0
4 0.0 13.00 5.50 30.0 0.0,13.0
5 0.0 13.00 5.75 0.0 0.0,13.0
6 0.0 13.00 6.00 30.0 0.0,13.0
7 1.0 13.25 0.00 30.0 0.0,13.25
8 1.0 13.25 0.25 0.0 0.0,13.25
9 1.0 13.25 0.50 30.0 0.0,13.25
10 1.0 13.25 0.75 30.0 0.0,13.25
11 2.0 13.25 1.00 30.0 0.0,13.25
12 2.0 13.25 1.25 30.0 0.0,13.25
13 2.0 13.25 1.50 30.0 0.0,13.25
14 2.0 13.25 1.75 30.0 0.0,13.25
15 2.0 13.25 2.00 30.0 0.0,13.25
16 2.0 13.25 2.25 30.0 0.0,13.25
I want to split this into new dataframes when the row in column 0 changes.
0 1 2 3 4
0 0.0 13.00 4.50 30.0 0.0,13.0
1 0.0 13.00 4.75 30.0 0.0,13.0
2 0.0 13.00 5.00 30.0 0.0,13.0
3 0.0 13.00 5.25 30.0 0.0,13.0
4 0.0 13.00 5.50 30.0 0.0,13.0
5 0.0 13.00 5.75 0.0 0.0,13.0
6 0.0 13.00 6.00 30.0 0.0,13.0

7 1.0 13.25 0.00 30.0 0.0,13.25
8 1.0 13.25 0.25 0.0 0.0,13.25
9 1.0 13.25 0.50 30.0 0.0,13.25
10 1.0 13.25 0.75 30.0 0.0,13.25

11 2.0 13.25 1.00 30.0 0.0,13.25
12 2.0 13.25 1.25 30.0 0.0,13.25
13 2.0 13.25 1.50 30.0 0.0,13.25
14 2.0 13.25 1.75 30.0 0.0,13.25
15 2.0 13.25 2.00 30.0 0.0,13.25
16 2.0 13.25 2.25 30.0 0.0,13.25
I've tried adapting the following solutions, without any luck so far: "Split array at value in numpy" and "Split a large pandas dataframe".
It looks like you want to group by the first column. You could create a dictionary from the groupby object, using the groupby keys as the dictionary keys:
out = dict(tuple(df.groupby(0)))
Alternatively, we could build a list from the groupby object. This is more useful when we only want positional indexing rather than lookup by grouping key:
out = [sub_df for _, sub_df in df.groupby(0)]
We can then index the dict by grouping key, or the list by the group's position:
print(out[0])
0 1 2 3 4
0 0.0 13.0 4.50 30.0 0.0,13.0
1 0.0 13.0 4.75 30.0 0.0,13.0
2 0.0 13.0 5.00 30.0 0.0,13.0
3 0.0 13.0 5.25 30.0 0.0,13.0
4 0.0 13.0 5.50 30.0 0.0,13.0
5 0.0 13.0 5.75 0.0 0.0,13.0
6 0.0 13.0 6.00 30.0 0.0,13.0
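For anyone who wants to try both variants end to end, here is a minimal sketch. The small DataFrame is a made-up stand-in for the question's data, and integer column labels are assumed; the names `by_key` and `by_position` are only illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the question's data (integer column labels assumed).
df = pd.DataFrame({
    0: [0.0, 0.0, 1.0, 1.0, 2.0],
    1: [13.00, 13.00, 13.25, 13.25, 13.25],
    2: [4.50, 4.75, 0.00, 0.25, 1.00],
})

# Dictionary keyed by the distinct values found in column 0.
by_key = {key: sub_df for key, sub_df in df.groupby(0)}

# List of sub-frames, in the (sorted) order groupby yields the groups.
by_position = [sub_df for _, sub_df in df.groupby(0)]

print(by_key[1.0])       # rows whose column 0 equals 1.0
print(by_position[-1])   # last group by position
```

The dictionary is convenient when the group value is meaningful to you; the list is convenient when you only care about "first group, second group, ...".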
Based on this statement from the question:
"I want to split this into new dataframes when the row in column 0 changes."
if you only want to start a new group when the value in column 0 changes, you can try:
# Assumes the column label is the string '0'; use df[0] for an integer label.
d = dict([*df.groupby(df['0'].ne(df['0'].shift()).cumsum())])
print(d[1])
print(d[2])
0 1 2 3 4
0 0.0 13.0 4.50 30.0 0.0,13.0
1 0.0 13.0 4.75 30.0 0.0,13.0
2 0.0 13.0 5.00 30.0 0.0,13.0
3 0.0 13.0 5.25 30.0 0.0,13.0
4 0.0 13.0 5.50 30.0 0.0,13.0
5 0.0 13.0 5.75 0.0 0.0,13.0
6 0.0 13.0 6.00 30.0 0.0,13.0
0 1 2 3 4
7 1.0 13.25 0.00 30.0 0.0,13.25
8 1.0 13.25 0.25 0.0 0.0,13.25
9 1.0 13.25 0.50 30.0 0.0,13.25
10 1.0 13.25 0.75 30.0 0.0,13.25
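It may help to look at the intermediate grouping key step by step. The sketch below uses a made-up DataFrame with integer column labels (an assumption), and the names `changed`, `run_id`, and `runs` are only illustrative.

```python
import pandas as pd

# Hypothetical stand-in data; column labels are integers here.
df = pd.DataFrame({
    0: [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 2.0],
    2: [4.50, 4.75, 0.00, 0.25, 1.00, 1.25, 1.50],
})

# True wherever a row's value differs from the row above it.
# The first row is compared with NaN, so it is always True.
changed = df[0].ne(df[0].shift())

# A running count of the changes gives every consecutive run its own id.
run_id = changed.cumsum()

# Group on that id instead of on the column values themselves.
runs = {rid: sub_df for rid, sub_df in df.groupby(run_id)}

print(run_id.tolist())
print(runs[2])
```

Because the key is a running count, the dictionary keys start at 1 rather than at the values found in column 0.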
I would use GroupBy.__iter__:
d = dict(df.groupby(df['0'].diff().ne(0).cumsum()).__iter__())
# If the column label is the integer 0 rather than the string '0':
#d = dict(df.groupby(df[0].diff().ne(0).cumsum()).__iter__())
Note that with this approach, repeated non-consecutive values end up in different groups. If you only use groupby(0), all rows sharing a value are placed in the same group, whether or not they are adjacent.
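The difference is easiest to see on data where a value comes back after an interruption. This is a minimal sketch on a made-up single-column DataFrame with an integer column label (an assumption); `by_value` and `by_run` are illustrative names.

```python
import pandas as pd

# Hypothetical data: 0.0 appears, is interrupted by 1.0, then appears again.
df = pd.DataFrame({0: [0.0, 0.0, 1.0, 0.0, 0.0]})

# Grouping on the values merges both 0.0 stretches into one group.
by_value = dict(iter(df.groupby(0)))

# Grouping on a change counter keeps each consecutive stretch separate.
run_id = df[0].ne(df[0].shift()).cumsum()
by_run = dict(iter(df.groupby(run_id)))

print(len(by_value))                   # 2
print(len(by_run))                     # 3
print(by_value[0.0].index.tolist())    # [0, 1, 3, 4]
print(by_run[3].index.tolist())        # [3, 4]
```

Which behaviour is right depends on whether "changes" in the question means a new value or simply a break in the current run.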