Column split into two separate columns

I am trying to split my Period column into two separate columns. The first should hold the quarter and the second its corresponding year.

import pandas as pd
df = pd.DataFrame({'Period': ["Q3'16", "Q1'17", "Q2'17","Q3'17"]})

The result should look like this:

df = pd.DataFrame({'Quarter': ['Q3', 'Q1', 'Q2', 'Q3'],
                   'Year': ['2016', '2017', '2017', '2017']})

Since these values don't match any standard timestamp format, I am having some difficulty figuring this out.

For reference, the Period column in my original df is of object type.

Use Series.str.extract for the quarter, and convert the periods to datetimes to extract the year:

dates = pd.to_datetime(df['Period'].replace(r"(Q\d)'(\d+)", r'\2-\1', regex=True))

df['Quarter'] = df['Period'].str.extract(r"(Q\d)")
df['Year']  = dates.dt.strftime('%Y')
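
As a variation on the same idea, the strings can be parsed as quarterly periods instead of datetimes. This is only a sketch, not the method used above: it swaps in pd.PeriodIndex, assumes every year is 2000 or later, and the names labels, periods, quarter and year are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Period': ["Q3'16", "Q1'17", "Q2'17", "Q3'17"]})

# Rearrange "Q3'16" into "2016Q3" (assumption: all years are 2000 or later).
labels = df['Period'].str.replace(r"(Q\d)'(\d{2})", r"20\2\1", regex=True)

# Parse the rearranged strings as quarterly periods.
periods = pd.PeriodIndex(labels, freq='Q')

# Pull the quarter and year back out as strings, aligned to the original index.
quarter = 'Q' + pd.Series(periods.quarter, index=df.index).astype(str)
year = pd.Series(periods.year, index=df.index).astype(str)

df['Quarter'] = quarter
df['Year'] = year
print(df)
```

Keeping the intermediate periods object around can be handy if the data later needs to be sorted or grouped chronologically.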

Or, if all years are 2000 or later, use str.extract and prepend '20':

df['Quarter'] = df['Period'].str.extract(r"(Q\d)")
df['Year']  = '20' + df['Period'].str.extract(r"'(\d+)")
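
The two str.extract calls can also be folded into a single pattern. The sketch below uses named capture groups, which str.extract turns into column names. The names parts and result are illustrative, and it again assumes every year is 2000 or later.

```python
import pandas as pd

df = pd.DataFrame({'Period': ["Q3'16", "Q1'17", "Q2'17", "Q3'17"]})

# Named groups become the column names of the extracted DataFrame.
parts = df['Period'].str.extract(r"(?P<Quarter>Q\d)'(?P<Year>\d{2})")

# Prepend the century (assumption: all years are 2000 or later).
parts['Year'] = '20' + parts['Year']

# Attach the new columns to the original frame by index.
result = df.join(parts)
print(result)
```

Rows that do not match the pattern come back as NaN in both columns, which makes malformed values easy to spot.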

Or a solution with Series.str.split:

s = df['Period'].str.split("'")
df['Quarter'] = s.str[0]
df['Year']  = '20' + s.str[1]

Alternative, using expand=True to get both columns at once:

df[['Quarter','Year']] = df['Period'].str.split("'", expand=True)
df['Year']  = '20' + df['Year']

print(df)
  Period Quarter  Year
0  Q3'16      Q3  2016
1  Q1'17      Q1  2017
2  Q2'17      Q2  2017
3  Q3'17      Q3  2017
    
import pandas as pd
df = pd.DataFrame({'Period': ["Q3'16", "Q1'17", "Q2'17","Q3'17"]})
# Split each value once at the apostrophe; recent pandas versions require n as a keyword
df = pd.DataFrame(df.Period.str.split("'", n=1).tolist(),
                  columns=['Quarter', 'Year'])
df["Year"] = "20" + df["Year"]
print(df)
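
If the same split is needed in several places, it could be wrapped in a small function. The following is a minimal sketch: split_period and its century parameter are hypothetical names, not part of pandas, and it assumes every value contains exactly one apostrophe.

```python
import pandas as pd

def split_period(periods: pd.Series, century: str = "20") -> pd.DataFrame:
    """Split strings like "Q3'16" into Quarter and Year columns (hypothetical helper)."""
    # Split once at the apostrophe and spread the pieces over two columns.
    parts = periods.str.split("'", n=1, expand=True)
    parts.columns = ['Quarter', 'Year']
    # Prepend the century to the two-digit year (assumption: one century for all rows).
    parts['Year'] = century + parts['Year']
    return parts

df = pd.DataFrame({'Period': ["Q3'16", "Q1'17", "Q2'17", "Q3'17"]})
out = df.join(split_period(df['Period']))
print(out)
```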
