Align dates on alternate columns in a Pandas DataFrame
I have a Pandas DataFrame in which columns 1, 3, 5, 7, ... contain dates and columns 2, 4, 6, 8, ... contain data values. The dates in the different date columns do not line up.
I want a single column containing all the dates, with the remaining columns containing just the values. Example:
input
date val1 date val2 date val3
2007-12-01 35.6 2007-12-05 101.1 2007-12-05 89.1
2007-12-02 36.7 2007-12-06 102.3 2007-12-07 89.3
2007-12-05 36.7 2007-12-07 108.3 2007-12-08 89.5
2007-12-06 36.9 2007-12-08 110.0 2007-12-09 89.3
2007-12-07 36.9 2007-12-09 102.3 2007-12-12 89.9
output
date val1 val2 val3
2007-12-01 35.6 na na
2007-12-02 36.7 na na
2007-12-05 36.7 101.1 89.1
2007-12-06 36.9 102.3 na
2007-12-07 36.9 108.3 89.3
2007-12-08 na 110.0 89.5
2007-12-09 na 102.3 89.3
2007-12-12 na na 89.9
You can iteratively merge each (date, value) pair of columns into a new, initially empty DataFrame.
import pandas as pd

dft = pd.DataFrame({"date": []})
N = len(df.columns)
for n in range(N // 2):
    # take the n-th (date, value) pair and merge it on the shared 'date' key
    dft = dft.merge(df.iloc[:, 2*n:2*(n+1)], on='date', how='outer')
Notice that we start with an empty date column so that there is a key to merge on in the first iteration. The how='outer' argument says that all rows coming from both the left (accumulated) and right (newly merged) DataFrame are kept, with NaNs added where a value is missing. Hope this helps.
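To make the effect of how='outer' concrete, here is a minimal sketch on two tiny frames. The names left, right and merged are made up for this illustration, and the final sort is added only so the row order does not depend on the pandas version.

```python
import pandas as pd

# Two hypothetical (date, value) pairs with partly overlapping dates
left = pd.DataFrame({"date": ["2007-12-01", "2007-12-05"], "val1": [35.6, 36.7]})
right = pd.DataFrame({"date": ["2007-12-05", "2007-12-06"], "val2": [101.1, 102.3]})

# how='outer' keeps the union of the 'date' keys from both frames;
# dates present on only one side get NaN in the other side's column
merged = left.merge(right, on="date", how="outer")
merged = merged.sort_values("date").reset_index(drop=True)
print(merged)
```

With how='inner' only 2007-12-05, the one date present in both frames, would survive.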
You can try the following (note that columns with the same name may get renamed when the data is loaded):
df:
date val1 date.1 val2 date.2 val3
0 2007-12-01 35.6 2007-12-05 101.1 2007-12-05 89.1
1 2007-12-02 36.7 2007-12-06 102.3 2007-12-07 89.3
2 2007-12-05 36.7 2007-12-07 108.3 2007-12-08 89.5
3 2007-12-06 36.9 2007-12-08 110.0 2007-12-09 89.3
4 2007-12-07 36.9 2007-12-09 102.3 2007-12-12 89.9
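The date.1 and date.2 headers are what pandas.read_csv produces by default when a file repeats a column name. A small sketch of this, assuming the data comes from CSV text (the raw string here is a made-up two-row excerpt of the example):

```python
import io
import pandas as pd

# Hypothetical CSV excerpt with the header 'date' repeated three times
raw = """date,val1,date,val2,date,val3
2007-12-01,35.6,2007-12-05,101.1,2007-12-05,89.1
2007-12-02,36.7,2007-12-06,102.3,2007-12-07,89.3
"""

# read_csv de-duplicates repeated headers by appending .1, .2, ...
df = pd.read_csv(io.StringIO(raw))
print(list(df.columns))
```

This is why selecting the pairs by position with iloc and then renaming them is safer than relying on every date column being called 'date'.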
for index, i in enumerate(range(0, len(df.columns), 2)):
    # slice out one (date, value) pair and give it uniform column names
    pair = df.iloc[:, i:i+2].copy()
    pair.columns = ['date', 'value' + str(index + 1)]
    if index == 0:
        dft = pair
    else:
        dft = dft.merge(pair, on='date', how='outer')
print(dft)
Output:
date value1 value2 value3
0 2007-12-01 35.6 NaN NaN
1 2007-12-02 36.7 NaN NaN
2 2007-12-05 36.7 101.1 89.1
3 2007-12-06 36.9 102.3 NaN
4 2007-12-07 36.9 108.3 89.3
5 2007-12-08 NaN 110.0 89.5
6 2007-12-09 NaN 102.3 89.3
7 2007-12-12 NaN NaN 89.9
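As an alternative to repeated merging, one could index each value column by its own dates and let pd.concat align them. This is only a sketch, not taken from either answer above. It assumes the dates within each date column are unique and that the duplicate headers were renamed to date.1 and date.2. The names series and aligned are illustrative.

```python
import pandas as pd

# Example input from the question, with renamed duplicate date headers (assumed)
df = pd.DataFrame({
    "date":   ["2007-12-01", "2007-12-02", "2007-12-05", "2007-12-06", "2007-12-07"],
    "val1":   [35.6, 36.7, 36.7, 36.9, 36.9],
    "date.1": ["2007-12-05", "2007-12-06", "2007-12-07", "2007-12-08", "2007-12-09"],
    "val2":   [101.1, 102.3, 108.3, 110.0, 102.3],
    "date.2": ["2007-12-05", "2007-12-07", "2007-12-08", "2007-12-09", "2007-12-12"],
    "val3":   [89.1, 89.3, 89.5, 89.3, 89.9],
})

# Turn each (date, value) pair into a Series indexed by its own dates
series = []
for i in range(0, df.shape[1], 2):
    s = df.iloc[:, i + 1].copy()
    s.index = pd.to_datetime(df.iloc[:, i])
    series.append(s)

# Concatenating along axis=1 aligns the Series on the union of their indexes
aligned = pd.concat(series, axis=1).sort_index().rename_axis("date").reset_index()
print(aligned)
```

Unlike the merge loop, this converts the dates to datetime values, so the final sort is chronological rather than lexicographic.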