
How to concat columns

I have downloaded two data frames that I want to concatenate (I am not a programmer). I am using Python 2.7.

My code looks like this:

from pandas_datareader.data import Options
import datetime
import pandas as pd

pd.set_option('display.width',280) # default is 80

aapl = Options('aapl', 'yahoo')
expiry = datetime.date(2017, 1, 1)

data_CALL = aapl.get_call_data(expiry=expiry)
data_PUT = aapl.get_put_data(expiry=expiry)

#The first DataFrame
Q1=data_CALL.iloc[0:5, 0:8]
print(Q1.head())

#The Second DataFrame
Q2=data_PUT.iloc[0:5, 0:8]
print(Q2.head())


#I got this DataFrame
concat=pd.concat([Q1,Q2])
print(concat)

The result looks like this: [image]

I want my final data frame to look like this: [image]

Your desired result looks like the output of a groupby object, i.e. your data grouped with strike as the grouping attribute.

To do it this way, you need to remove the composite index from each dataframe; from the look of it, you have an index of (Strike, Expiry, Type, Symbol).

I can only assume this index is set up in your get_call_data and get_put_data functions, as they return dataframes directly. I would suggest importing the raw dataframe as is, without setting any index for now; this way strike remains a column that you can easily use as your grouping attribute, as shown below.
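If the frames already arrive with the composite index, one way to get strike back as a column is `reset_index()`. The following is only a minimal sketch: the rows, prices, and symbols are made up for illustration, and the index level names are assumed to match the ones guessed above.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded call data: two rows with a
# (Strike, Expiry, Type, Symbol) MultiIndex and made-up prices.
index = pd.MultiIndex.from_tuples(
    [
        (115.0, "2017-01-20", "call", "SYM1"),
        (120.0, "2017-01-20", "call", "SYM2"),
    ],
    names=["Strike", "Expiry", "Type", "Symbol"],
)
calls = pd.DataFrame({"Last": [2.5, 1.1]}, index=index)

# reset_index() moves every index level back into ordinary columns
# and leaves a plain integer index behind.
flat_calls = calls.reset_index()
print(flat_calls.columns.tolist())
```

After this step the strike is an ordinary column, so it can be passed to `groupby` by name.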

The grouping itself is very easy:

combined = pd.concat([Q1, Q2])
grouped = combined.groupby('strike')  # single grouping attribute

To validate the result:

for name, group in grouped:
    print(name)   # the strike value used as the group key
    print(group)  # the rows that share that strike

To manipulate the data (just a few examples here):

# 1. using groupby builtin functions

first_group = grouped.first()  # first row of each group; last() also works
first_strike = grouped.get_group('115')  # DataFrame for strike 115; the key must match the column dtype (e.g. 115.0 if strike is numeric)

# 2. using list

group_list = list(grouped)
first_group = group_list[0]  # a tuple with 2 elements: name and DataFrame
first_strike_name = group_list[0][0]
first_strike_data = group_list[0][1]
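The grouping steps above can be tried end to end on a small example. This is a sketch with invented numbers, assuming the strike is stored as a float column named `Strike`; the real column name and dtype depend on what the download returns.

```python
import pandas as pd

# Made-up rows standing in for the flattened call and put data.
calls = pd.DataFrame(
    {"Strike": [115.0, 120.0], "Type": ["call", "call"], "Last": [2.5, 1.1]}
)
puts = pd.DataFrame(
    {"Strike": [115.0, 120.0], "Type": ["put", "put"], "Last": [0.9, 3.2]}
)

# Stack the two frames on top of each other, then group rows by strike.
combined = pd.concat([calls, puts], ignore_index=True)
grouped = combined.groupby("Strike")

# Group keys keep the dtype of the column, so a float column needs a float key.
strike_115 = grouped.get_group(115.0)
print(strike_115)
```

Each group then holds the call row and the put row for one strike.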

You can read more about Python GroupBy here.

As per your description, you have two dataframes, Q1 and Q2, and you want to concatenate them on the "Strike" index. The code for that would be something like this:

Concetenateddf=pd.concat([Q1.set_index('Strike'),Q2.set_index('Strike')], axis=1, join='inner')

Now if you run Concetenateddf.head(), you will get results similar to what you are expecting. [image: concatenated data frame]
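Note that `set_index('Strike')` only works when Strike is a column; if it is still a level of a composite index, as the other answer suspects, the index has to be reset first. Below is a hedged sketch of that variant on invented data. The helper `make_side`, the prices, and the symbols are hypothetical, and the `keys` argument is an optional addition to keep the call and put columns apart.

```python
import pandas as pd

def make_side(option_type, prices):
    """Build a small made-up frame indexed like the question's data."""
    index = pd.MultiIndex.from_tuples(
        [
            (115.0, "2017-01-20", option_type, "SYM1"),
            (120.0, "2017-01-20", option_type, "SYM2"),
        ],
        names=["Strike", "Expiry", "Type", "Symbol"],
    )
    return pd.DataFrame({"Last": prices}, index=index)

calls = make_side("call", [2.5, 1.1])
puts = make_side("put", [0.9, 3.2])

# Move the index levels into columns, then index both frames by Strike alone.
calls_by_strike = calls.reset_index().set_index("Strike")
puts_by_strike = puts.reset_index().set_index("Strike")

# axis=1 places the frames side by side, join="inner" keeps only strikes
# present in both, and keys adds an outer column level per side.
side_by_side = pd.concat(
    [calls_by_strike, puts_by_strike], axis=1, join="inner", keys=["call", "put"]
)
print(side_by_side)
```

With `keys`, a single value can be looked up as `side_by_side.loc[115.0, ("put", "Last")]`.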
