I have a pandas DataFrame A that looks like this:
2007-12-31 50230.62
2008-01-02 48646.84
2008-01-03 48748.04
2008-01-04 46992.22
2008-01-07 46491.28
2008-01-08 45347.72
2008-01-09 45681.68
2008-01-10 46430.5
The date column is the index. I also have a NumPy array B of the same length whose elements are -1, 0 and 1. What is the cleanest way to split the DataFrame A into 3 DataFrames such that rows with equal corresponding B elements are grouped together? For example, if B = numpy.array([0, 0, 0, 1, 1, -1, -1, 0]), then the DataFrame should be split into:
X
2007-12-31 50230.62
2008-01-02 48646.84
2008-01-03 48748.04
2008-01-10 46430.5
Y
2008-01-04 46992.22
2008-01-07 46491.28
Z
2008-01-08 45347.72
2008-01-09 45681.68
This is easy to do with groupby from pandas. You can keep the rows grouped, so you are not duplicating your data, but you can always assign each group to its own variable:
import numpy as np
import pandas as pd
import io
data = """ 2007-12-31 50230.62
2008-01-02 48646.84
2008-01-03 48748.04
2008-01-04 46992.22
2008-01-07 46491.28
2008-01-08 45347.72
2008-01-09 45681.68
2008-01-10 46430.5"""
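# Read the whitespace-separated text and use the date column as the index
# (the column names 'date' and 'value' are chosen here for illustration)
df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None, names=['date', 'value'], index_col='date', parse_dates=True)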
B = np.array([0, 0, 0, 1, 1, -1, -1, 0])
df['B'] = B
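# Group on the label column; a single column name gives scalar group keys
df_groups = df.groupby('B')
x = df_groups.get_group(0)   # rows where B == 0
y = df_groups.get_group(1)   # rows where B == 1
z = df_groups.get_group(-1)  # rows where B == -1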
The keys 0, -1 and 1 are the group names, taken from the values in column B.
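If you would rather not add B as a column, the split can also be sketched directly from the array. The example below is a minimal sketch, not the only way to do it: it rebuilds A by hand with a hypothetical column name `value`, then shows two options, boolean masks and passing the array straight to groupby.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame with the dates as the index.
# The column name "value" is an assumption; the question does not name it.
A = pd.DataFrame(
    {"value": [50230.62, 48646.84, 48748.04, 46992.22,
               46491.28, 45347.72, 45681.68, 46430.5]},
    index=pd.to_datetime(["2007-12-31", "2008-01-02", "2008-01-03", "2008-01-04",
                          "2008-01-07", "2008-01-08", "2008-01-09", "2008-01-10"]),
)
B = np.array([0, 0, 0, 1, 1, -1, -1, 0])

# Option 1: boolean masks. B must have one element per row of A.
X = A[B == 0]
Y = A[B == 1]
Z = A[B == -1]

# Option 2: group by the array itself and collect the pieces in a dict
# keyed by label, without adding a column to A.
groups = {int(key): grp for key, grp in A.groupby(B)}
```

With the dict, `groups[0]`, `groups[1]` and `groups[-1]` hold the same rows as X, Y and Z, which is handy if the set of labels might change later.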