Splitting Dataframe based on corresponding numpy array values

Question

I have a pandas DataFrame A that looks like this:

    2007-12-31    50230.62
    2008-01-02    48646.84
    2008-01-03    48748.04
    2008-01-04    46992.22
    2008-01-07    46491.28
    2008-01-08    45347.72
    2008-01-09    45681.68
    2008-01-10    46430.5

The date column is the index. I also have a NumPy array B of the same length whose elements are -1, 0 and 1. What is the cleanest way to split DataFrame A into 3 DataFrames so that rows with equal corresponding B elements are grouped together? For example, if B = numpy.array([0, 0, 0, 1, 1, -1, -1, 0]), then the DataFrame should be split into:

    X
    2007-12-31    50230.62
    2008-01-02    48646.84
    2008-01-03    48748.04
    2008-01-10    46430.5

    Y
    2008-01-04    46992.22
    2008-01-07    46491.28

    Z
    2008-01-08    45347.72
    2008-01-09    45681.68

Answer 1

This is easy with groupby from pandas. You have the option to keep the rows grouped, so you are not duplicating your data, but you can always assign each group to its own variable:

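    import io

    import numpy as np
    import pandas as pd

    data = """    2007-12-31    50230.62
        2008-01-02    48646.84
        2008-01-03    48748.04
        2008-01-04    46992.22
        2008-01-07    46491.28
        2008-01-08    45347.72
        2008-01-09    45681.68
        2008-01-10    46430.5"""

    # Raw string so that \s is a regex class, not an invalid string escape
    df = pd.read_csv(io.StringIO(data), sep=r'\s+', header=None)
    B = np.array([0, 0, 0, 1, 1, -1, -1, 0])

    # Attach B as a column so it can be used as the grouping key
    df['B'] = B

    # Group by the column name (a scalar key, so group names are scalars too)
    df_groups = df.groupby('B')

    # Match the question's naming: X is B == 0, Y is B == 1, Z is B == -1
    x = df_groups.get_group(0)
    y = df_groups.get_group(1)
    z = df_groups.get_group(-1)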

The keys 0, -1 and 1 are the group names, which come from the values in B.
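If you would rather not add B as a column, a minimal sketch of the same idea is to pass the array straight to groupby, assuming B has exactly one entry per row of A. The frame construction and the names `parts`, `value` and `X_mask` below are illustrative, not part of the original question.

```python
import numpy as np
import pandas as pd

# Illustrative reconstruction of the question's frame A, with dates as the index.
dates = pd.to_datetime([
    "2007-12-31", "2008-01-02", "2008-01-03", "2008-01-04",
    "2008-01-07", "2008-01-08", "2008-01-09", "2008-01-10",
])
A = pd.DataFrame(
    {"value": [50230.62, 48646.84, 48748.04, 46992.22,
               46491.28, 45347.72, 45681.68, 46430.5]},
    index=dates,
)
B = np.array([0, 0, 0, 1, 1, -1, -1, 0])

# Group by the array itself; A is left unchanged (no extra column).
parts = {key: group for key, group in A.groupby(B)}

# Name the pieces as in the question: X is B == 0, Y is B == 1, Z is B == -1.
X, Y, Z = parts[0], parts[1], parts[-1]

# A boolean mask selects the same rows when only one label is needed.
X_mask = A[B == 0]
```

The dict keeps every group addressable by its B value, which may be handier than three separate variables if the set of labels can change.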

solution1
1 ACCEPTED 2015-10-31 04:38:43
