Pandas combine dataframes

Question

I have the following two dataframes. The first column is the index, and the last column is derived from the index by appending '.txt' to it.

A
1  0.2   0.3   1.txt
2  0.4   0.6   2.txt

B
1  0.1   0.8   1.txt
2  3.0   4.5   2.txt

I would like to combine them like so:

1  0.2   0.3   1.txt
2  0.4   0.6   2.txt
3  0.1   0.8   3.txt
4  3.0   4.5   4.txt

I tried using pandas merge, but I'm not sure how to go about it without explicitly iterating with a for loop. Any suggestions?

Answer 1

Just concat them as a list and pass ignore_index=True, then assign the index values to the 3rd column, convert to str dtype, and append '.txt':

In [93]:

merged = pd.concat([A,B], ignore_index=True)
merged[3] = pd.Series(merged.index).astype(str) + '.txt'
merged
Out[93]:
     1    2      3
0  0.2  0.3  0.txt
1  0.4  0.6  1.txt
2  0.1  0.8  2.txt
3  3.0  4.5  3.txt

If you insist on the index being 1-based, you can reassign it and then build the column from the new index:

In [100]:

merged = pd.concat([A,B], ignore_index=True)
merged.index = np.arange(1, len(merged) + 1)
merged[3] = pd.Series(index=merged.index, data=merged.index.values).astype(str) + '.txt'
merged
Out[100]:
     1    2      3
1  0.2  0.3  1.txt
2  0.4  0.6  2.txt
3  0.1  0.8  3.txt
4  3.0  4.5  4.txt

As a side note, I find it a little weird that I have to specify the index values in the Series constructor in order for the alignment to be correct.
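
A minimal sketch of why the explicit index is needed: assigning a Series to a column aligns on index labels, not positions, so a Series built with the default 0-based index no longer lines up once the frame's index starts at 1. The constructors for A and B below are reconstructed from the question's data and are an assumption about how the frames were built. The list-comprehension variant is one alternative that sidesteps alignment, not the answer's own code.

```python
import numpy as np
import pandas as pd

# Frames reconstructed from the question (assumed layout: index 1..2, columns 1..3).
A = pd.DataFrame({1: [0.2, 0.4], 2: [0.3, 0.6], 3: ["1.txt", "2.txt"]}, index=[1, 2])
B = pd.DataFrame({1: [0.1, 3.0], 2: [0.8, 4.5], 3: ["1.txt", "2.txt"]}, index=[1, 2])

merged = pd.concat([A, B], ignore_index=True)
merged.index = np.arange(1, len(merged) + 1)

# Default 0-based index: labels 0..3 do not match the frame's labels 1..4,
# so values shift by one row and label 4 gets NaN.
misaligned = merged.copy()
misaligned[3] = pd.Series(merged.index).astype(str) + ".txt"

# A plain list carries no index, so it is assigned by position.
merged[3] = [f"{i}.txt" for i in merged.index]

print(misaligned)
print(merged)
```

In `misaligned`, row 1 ends up with '2.txt' and row 4 with NaN, which is the mismatch the explicit `index=merged.index` argument avoids.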

Answer 2

Here's one way to go about it:

In [207]: df1
Out[207]:
   col1  col2    txt
0   0.2   0.3  1.txt
1   0.4   0.6  2.txt

In [208]: df2
Out[208]:
   col1  col2    txt
0   0.1   0.8  1.txt
1   3.0   4.5  2.txt

In [209]: df1.append(df2, ignore_index=True)
Out[209]:
   col1  col2    txt
0   0.2   0.3  1.txt
1   0.4   0.6  2.txt
2   0.1   0.8  1.txt
3   3.0   4.5  2.txt

In [217]: dff = df1.append(df2, ignore_index=True)

In [218]: dff['txt'] = dff.index.map(lambda x: '%d.txt' % (x+1))

In [219]: dff
Out[219]:
   col1  col2    txt
0   0.2   0.3  1.txt
1   0.4   0.6  2.txt
2   0.1   0.8  3.txt
3   3.0   4.5  4.txt
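
Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in 2.0, so the session above fails on current versions. A sketch of the same steps with `pd.concat` follows; the constructors for df1 and df2 are reconstructed from the printed output and are an assumption.

```python
import pandas as pd

# Frames reconstructed from the printed output above (assumed construction).
df1 = pd.DataFrame({"col1": [0.2, 0.4], "col2": [0.3, 0.6], "txt": ["1.txt", "2.txt"]})
df2 = pd.DataFrame({"col1": [0.1, 3.0], "col2": [0.8, 4.5], "txt": ["1.txt", "2.txt"]})

# pd.concat with ignore_index=True replaces df1.append(df2, ignore_index=True).
dff = pd.concat([df1, df2], ignore_index=True)

# Rebuild the txt column from the new 0-based index, shifted to start at 1.
dff["txt"] = dff.index.map(lambda x: "%d.txt" % (x + 1))

print(dff)
```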

Answer 1: score 3, accepted, 2015-04-26 21:47:42

Answer 2: score 1, 2015-04-26 21:57:31
