Combining corresponding columns between two separate dataframes into a new dataframe
I have two dataframes that look like this:
df1
Column 1 Column 2 Column 3
0.2 0.4 0.5
0.25 0.44 0.45
0.26 0.32 0.33
df2
Column 1 Column 2 Column 3
340 350 360
410 400 350
234 324 450
You could try a fancier way of ordering, described here: Pandas concatenate alternating columns
But the code is much easier to read if the dataframes are combined explicitly in the desired order.
First declare the new column names:
dataCols = ['c1', 'c2', 'c3', 'c4', 'c5', 'c6']
Then alternate the Series:
dataSeries = [df1.Column1, df2.Column1, df1.Column2, df2.Column2, df1.Column3, df2.Column3]
(Use df1['Column 1'] if there are spaces in your current column names.)
Then combine them into a dictionary and create a dataframe:
dataDict = dict(zip(dataCols, dataSeries))
newDf = pd.DataFrame(dataDict)
This will create a dataframe with alternating columns.
To alternate the columns of any two dataframes (whatever their column names, which need not be identical), first combine the two dataframes and then reorder the result by passing a list of column names in the order that you want.
For alternating order, first get the column names of the two dataframes:
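Putting the explicit steps above together, here is a minimal runnable sketch. It assumes the sample data from the question and that the column names contain spaces, so it uses bracket access rather than attribute access; the names c1 to c6 are the ones declared above.

```python
import pandas as pd

# Sample data from the question (column names contain spaces).
df1 = pd.DataFrame({
    'Column 1': [0.2, 0.25, 0.26],
    'Column 2': [0.4, 0.44, 0.32],
    'Column 3': [0.5, 0.45, 0.33],
})
df2 = pd.DataFrame({
    'Column 1': [340, 410, 234],
    'Column 2': [350, 400, 324],
    'Column 3': [360, 350, 450],
})

dataCols = ['c1', 'c2', 'c3', 'c4', 'c5', 'c6']

# Alternate the Series by hand: df1 first, then df2, column by column.
dataSeries = [
    df1['Column 1'], df2['Column 1'],
    df1['Column 2'], df2['Column 2'],
    df1['Column 3'], df2['Column 3'],
]

# Dict insertion order is preserved, so the columns come out as c1..c6.
newDf = pd.DataFrame(dict(zip(dataCols, dataSeries)))
print(newDf)
```

Note that the Series are aligned on their index when the dataframe is built, so this relies on df1 and df2 sharing the same row index.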
l1 = df1.columns
l2 = df2.columns
Then create pairs of column names by zipping the two lists (this yields ('col1', 'col1'), etc.):
colNames = zip(l1, l2)
Then flatten the pairs into a single alternating list with a list comprehension:
combinedNames = [name for pair in colNames for name in pair]
This will create a list of column names in paired, alternating order.
Apply this list to your combined dataframe to reorder it:
combinedDf = combinedDf[combinedNames]
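The steps above never show how combinedDf is built, so here is one possible end-to-end sketch. It assumes pd.concat along axis=1 for the combining step, and it adds the suffixes _df1 and _df2 (hypothetical names chosen for illustration) so that the column names are unique; with identical names in both dataframes, selecting by name would match two columns at once.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Column 1': [0.2, 0.25, 0.26],
    'Column 2': [0.4, 0.44, 0.32],
    'Column 3': [0.5, 0.45, 0.33],
})
df2 = pd.DataFrame({
    'Column 1': [340, 410, 234],
    'Column 2': [350, 400, 324],
    'Column 3': [360, 350, 450],
})

# Assumption: make the names unique before combining, otherwise
# combinedDf['Column 1'] would select two columns instead of one.
left = df1.add_suffix('_df1')
right = df2.add_suffix('_df2')

# Combine side by side: all of left's columns, then all of right's.
combinedDf = pd.concat([left, right], axis=1)

# Pair up the names and flatten the pairs into alternating order.
colNames = zip(left.columns, right.columns)
combinedNames = [name for pair in colNames for name in pair]

# Reorder the combined dataframe.
combinedDf = combinedDf[combinedNames]
print(combinedDf.columns.tolist())
```

If the two dataframes already have distinct column names, the suffix step can be dropped and l1 and l2 used directly.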
A simpler way is to loop over the column positions and alternate between the two DataFrames. This used to be done by calling pandas.DataFrame.append() in a for loop and then transposing the result, but that method was deprecated in pandas 1.4 and removed in pandas 2.0. Collecting the columns in a list and joining them with pd.concat gives the same alternating layout without the transpose:
import pandas as pd
NumColumn = df1.shape[1]
columns = []
for i in range(NumColumn):
    # take column i from each DataFrame in turn
    columns.append(df1.iloc[:, i])
    columns.append(df2.iloc[:, i])
NewDF = pd.concat(columns, axis=1)
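The loop above can also be wrapped in a small reusable helper. This is only a sketch: interleave_columns is a hypothetical name, not a pandas function, and it uses pd.concat so that it runs on current pandas versions. It keeps the original column names, which means the result has duplicate names when both inputs use the same ones.

```python
import pandas as pd


def interleave_columns(a, b):
    """Return a new DataFrame whose columns alternate between a and b."""
    if a.shape[1] != b.shape[1]:
        raise ValueError("both DataFrames need the same number of columns")
    pieces = []
    for i in range(a.shape[1]):
        # positional access, so duplicate or differing names do not matter
        pieces.append(a.iloc[:, i])
        pieces.append(b.iloc[:, i])
    return pd.concat(pieces, axis=1)


df1 = pd.DataFrame({
    'Column 1': [0.2, 0.25, 0.26],
    'Column 2': [0.4, 0.44, 0.32],
    'Column 3': [0.5, 0.45, 0.33],
})
df2 = pd.DataFrame({
    'Column 1': [340, 410, 234],
    'Column 2': [350, 400, 324],
    'Column 3': [360, 350, 450],
})

result = interleave_columns(df1, df2)
print(result)
```

Because the names repeat, select columns from the result by position (result.iloc[:, 1]) or assign fresh names to result.columns afterwards.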