Python: how to split a column into multiple columns in a dataframe with dynamic column naming
How to split a column in a dataframe into multiple columns in Python?
Here is my solution using Pandas and NumPy:
# Process the dataframe
import pandas as pd
import numpy as np

# Define the row labels and the output column names
rowName = np.arange(0, df.shape[0])
colName = ['Rate', 'Indicator', 'Date 1', 'Date 2', 'Date 3']

# Create new empty df
df2 = pd.DataFrame(index=rowName, columns=colName)

# Process each concatenated string (assumes df has the default 0..n-1 index)
for i in range(df.shape[0]):
    # Get concatenated string
    rowText = df.ConcatenatedColumn[i]
    # Find rate and indicator: the indicator is the first letter in the string
    for j in range(len(rowText)):
        if rowText[j].isalpha():  # isalpha() is True for alphabetic characters
            df2.at[i, 'Rate'] = rowText[0:j]
            df2.at[i, 'Indicator'] = rowText[j]
            remStr = rowText[j+1:]
            # Find the 3 dates
            lenDate = 10  # Assuming dates are in YYYY-MM-DD format
            df2.at[i, 'Date 1'] = remStr[0:lenDate]
            df2.at[i, 'Date 2'] = remStr[lenDate:(2*lenDate)]
            df2.at[i, 'Date 3'] = remStr[(2*lenDate):]
            break  # stop at the first letter
where df is your concatenated column data:
ConcatenatedColumn
0 0.9147V2020-08-042020-06-092019-11-09
1 3.2F2020-09-112019-05-052020-10-12
and df2 is your split-column output:
     Rate  Indicator      Date 1      Date 2      Date 3
0  0.9147          V  2020-08-04  2020-06-09  2019-11-09
1     3.2          F  2020-09-11  2019-05-05  2020-10-12
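Note, however, that I assumed the concatenated column always contains 3 dates. If the number of dates varies, my date code can be replaced with a while loop that keeps consuming the remaining string.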
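The while-loop variant could look roughly like the sketch below. It is only an illustration under a few assumptions: every date is exactly 10 characters (YYYY-MM-DD), every string contains one indicator letter, and the helper name split_row and the constant DATE_LEN are made up for this example. The second sample row has been shortened to two dates to show the varying case. The date columns are named dynamically as "Date 1", "Date 2", and so on.

```python
import pandas as pd

DATE_LEN = 10  # assumption: every date is in YYYY-MM-DD format


def split_row(text):
    """Split one concatenated string into a dict of named fields (hypothetical helper)."""
    # Position of the first alphabetic character, i.e. the indicator.
    # Raises StopIteration if the string contains no letter.
    pos = next(k for k, ch in enumerate(text) if ch.isalpha())
    fields = {'Rate': text[:pos], 'Indicator': text[pos]}
    rest = text[pos + 1:]
    n = 1
    # Keep consuming one date at a time until the string is empty
    while rest:
        fields[f'Date {n}'] = rest[:DATE_LEN]  # dynamic column name
        rest = rest[DATE_LEN:]
        n += 1
    return fields


# Sample data: the second row is shortened to two dates for illustration
df = pd.DataFrame({'ConcatenatedColumn': [
    '0.9147V2020-08-042020-06-092019-11-09',
    '3.2F2020-09-112019-05-05',
]})

# Rows with fewer dates get NaN in the date columns they do not fill
df2 = pd.DataFrame([split_row(s) for s in df['ConcatenatedColumn']])
print(df2)
```

Building df2 from a list of dicts lets pandas create as many date columns as the longest row needs, so the column list does not have to be fixed in advance.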