Convert months and years into days in pandas

I have a dataframe whose details column contains durations in months and years, and I want to convert them into days.

Name      details
prem      6 months probation included
shaves    3 years 6 months  suspended
geroge    48 hours work time
julvie    4 years 20 days terms included
tiz       80 days work
lamp      44 days work

Here I want to convert 3 years to 1095 days and 6 months to 186 days (leap years can also be taken into account), remove all other words such as "probation included" and "suspended", and put the results in a new column.

Expected result:

Name      details                            Time
prem      6 months probation included        186 days
shaves    3 years 6 months suspended         1281 days
geroge    48 hours work time                 48 hours
julvie    4 years 20 days terms included     1480 days
tiz       80 days work                       80 days
lamp      44 days work                       44 days
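To try the answers below, the sample data can be rebuilt as a DataFrame. This is a minimal sketch that assumes the two column names shown in the table (Name and details) and copies the strings as posted:

```python
import pandas as pd

# Sample data copied from the question; column names taken from the table header.
df = pd.DataFrame({
    "Name": ["prem", "shaves", "geroge", "julvie", "tiz", "lamp"],
    "details": [
        "6 months probation included",
        "3 years 6 months  suspended",
        "48 hours work time",
        "4 years 20 days terms included",
        "80 days work",
        "44 days work",
    ],
})
print(df)
```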

Use Series.str.extract to get the number and its unit (years, months, hours or days), then multiply the number by a per-unit scalar looked up with Series.map. Fixed scalars are used because no start date is specified (they could be more precise, e.g. year = 365.2564 days). Finally, add the unit conditionally with numpy.where:

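import numpy as np

# days per unit; hours are left unconverted (factor 1)
d = {'months': 31, 'years': 365, 'hours': 1, 'days': 1}
# first "<number> <unit>" pair found in each row
df1 = df['details'].str.extract(r'(\d+)\s+(years|months|hours|days)', expand=True)
df['Time'] = df1[0].astype(float).mul(df1[1].map(d)).astype('Int64').astype(str)

# years, months and days are all reported in days; hours keep their unit
df['Unit'] = np.where(df1[1].isin(['years', 'months', 'days']), ' days', ' ' + df1[1])

df['Time'] += df.pop('Unit')
print(df)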
     Name                      details       Time
0    prem  6 months probation included   186 days
1  shaves            3 years suspended  1095 days
2  geroge           48 hours work time   48 hours
3  julvie       4 years terms included  1460 days
4     tiz                 80 days work    80 days
5    lamp                 44 days work    44 days    
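If leap years and real month lengths matter, the fixed factors can be swapped for calendar arithmetic. The sketch below is not part of the answer above: it assumes a start date (the question gives none, so `ASSUMED_START` is hypothetical) and uses pd.DateOffset to step forward by the years, months and days found in the text. The helper name `span_in_days` is also made up for illustration.

```python
import re
import pandas as pd

# Assumption: every duration starts on this date; the question does not give one.
ASSUMED_START = pd.Timestamp("2020-01-01")

def span_in_days(text, start=ASSUMED_START):
    """Calendar-aware day count for the years/months/days mentioned in text."""
    found = re.findall(r"(\d+)\s+(years|months|days)", text)
    if not found:
        return 0  # e.g. rows that only mention hours
    # If a unit appears twice, the last occurrence wins.
    parts = {unit: int(num) for num, unit in found}
    end = start + pd.DateOffset(**parts)
    return (end - start).days

print(span_in_days("3 years 6 months suspended"))
```

It can be applied to the whole column with `df['details'].map(span_in_days)`. The result depends on the chosen start date: from 2020-01-01, "3 years 6 months" spans 1277 days rather than the 1281 given by the fixed factors.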

EDIT: If a row can contain multiple units, you can use:

import pandas as pd

# days per unit
d = {'months': 31, 'years': 365, 'days': 1}

# extract each unit's number and multiply it by the unit's factor
out = {k: df['details'].str.extract(rf'(\d+)\s+{k}', expand=False).astype(float).mul(v)
       for k, v in d.items()}
# join the columns, sum per row, format as days and blank out '0 days'
days = pd.concat(out, axis=1).sum(axis=1).astype(int).astype(str).add(' days').replace('0 days', '')

# extract hours, prefixed with a space so they can follow the days
hours = df['details'].str.extract(r'(\d+\s+hours)', expand=False).radd(' ').fillna('')

# join together; strip the leading space left when a row has hours but no days
df['Time'] = (days + hours).str.strip()
print(df)
     Name                                      details               Time
0    john  2 years 1 months 10 days 15 hours work time  771 days 15 hours
1    prem                  6 months probation included           186 days
2  shaves                  3 years 6 months  suspended          1281 days
3  geroge                           48 hours work time           48 hours
4  julvie               4 years 20 days terms included          1480 days
5     tiz                                 80 days work            80 days
6    lamp                                 44 days work            44 days
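The same per-unit sum can also be sketched in a single pass with Series.str.extractall, which returns one row per "number unit" match. This is a variation on the approach above, not the answerer's code; the names `factors`, `pairs` and `total_days` are illustrative, and the sample frame is rebuilt from the question so the block runs on its own.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Name": ["prem", "shaves", "geroge", "julvie", "tiz", "lamp"],
    "details": [
        "6 months probation included",
        "3 years 6 months  suspended",
        "48 hours work time",
        "4 years 20 days terms included",
        "80 days work",
        "44 days work",
    ],
})

# Same fixed factors as above (365 days per year, 31 days per month).
factors = {"years": 365, "months": 31, "days": 1}

# One row per match, indexed by (original row, match number).
pairs = df["details"].str.extractall(r"(?P<num>\d+)\s+(?P<unit>years|months|days)")

# Convert each match to days, then sum the matches belonging to each original row.
match_days = pairs["num"].astype(int) * pairs["unit"].map(factors)
total_days = match_days.groupby(level=0).sum().reindex(df.index, fill_value=0)

df["days_total"] = total_days
print(df)
```

Rows without any year, month or day match (such as the hours-only row) get 0 through the reindex step, and keeping the total numeric makes it easy to sort or filter before formatting it as text.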
    
# extract the number for each unit into its own column ('0' when the unit is absent)
date_cols = ['years', 'months', 'days', 'hours']
for col in date_cols:
    df[col] = df['details'].str.extract(rf'(\d+)\s+{col}', expand=False).fillna('0')

# convert to int
df[date_cols] = df[date_cols].astype(int)
days = df['years'] * 365 + df['months'] * 31 + df['days']
hours = df['hours']

# convert to string, blanking out zero values
days = days.astype(str) + ' days'
days[days == '0 days'] = ''
hours = hours.astype(str) + ' hours'
hours[hours == '0 hours'] = ''
df['tag'] = days + ' ' + hours
print(df)

0    Name                          details  years  months  days  hours  \
1    prem      6 months probation included      0       6     0      0   
2  shaves      3 years 6 months  suspended      3       6     0      0   
3  geroge               48 hours work time      0       0     0     48   
4  julvie  4 years 20 days terms included       4       0    20      0   
5     tiz                     80 days work      0       0    80      0   
6    lamp                     44 days work      0       0    44      0   

0         tag  
1   186 days   
2  1281 days   
3    48 hours  
4  1480 days   
5    80 days   
6    44 days
