
Convert months and years into days in pandas

I have a dataframe whose details column contains durations in months and years, and I want to convert them into days.

Name      details

prem      6 months probation included
shaves    3 years 6 months  suspended
geroge    48 hours work time
julvie    4 years 20 days terms included
tiz       80 days work
lamp      44 days work

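Here I want to convert 3 years to 1095 days and 6 months to 186 days (leap years can also be taken into account). I also want to drop all the other words, such as "probation included" and "suspended", and put all the results in a new column.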

Expected result:

Name      details                            Time

prem      6 months probation included        186 days
shaves    3 years 6 months suspended         1281 days
geroge    48 hours work time                 48 hours
julvie    4 years 20 days terms included     1480 days
tiz       80 days work                       80 days
lamp      44 days work                       44 days
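The answers assume a DataFrame named df already exists. As a convenience for running them, the sample data can be rebuilt roughly as follows; the construction is not part of the original post, and the column names are simply those shown in the table.

```python
import pandas as pd

# Sample data copied from the question; `df` is the name the answers use.
df = pd.DataFrame({
    "Name": ["prem", "shaves", "geroge", "julvie", "tiz", "lamp"],
    "details": [
        "6 months probation included",
        "3 years 6 months  suspended",
        "48 hours work time",
        "4 years 20 days terms included",
        "80 days work",
        "44 days work",
    ],
})

print(df)
```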

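Use Series.str.extract to get the number and its unit (years, months, and so on), then multiply by a per-unit scalar looked up with Series.map. Fixed scalars are used because no start date is specified (this could be more precise, e.g. year = 365.2564 days). Finally, add the unit with a condition in numpy.where: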

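import numpy as np

# factor applied to each unit; hours stay in hours, so their factor is 1
d = {'months': 31, 'years': 365, 'hours': 1, 'days': 1}
# extract the first number and the unit that follows it
df1 = df['details'].str.extract(r'(\d+)\s+(years|months|hours|days)', expand=True)
df['Time'] = df1[0].astype(float).mul(df1[1].map(d)).astype('Int64').astype(str)

# years, months and days are reported in days; hours keep their own unit
df['Unit'] = np.where(df1[1].isin(['years', 'months', 'days']), ' days', ' ' + df1[1])

df['Time'] += df.pop('Unit')
print(df)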
     Name                      details       Time
0    prem  6 months probation included   186 days
1  shaves            3 years suspended  1095 days
2  geroge           48 hours work time   48 hours
3  julvie       4 years terms included  1460 days
4     tiz                 80 days work    80 days
5    lamp                 44 days work    44 days    
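Since the question mentions leap years, one alternative is to anchor each duration to a start date and let calendar arithmetic count the days. The sketch below is not from the answer above: the helper name span_in_days and the start date 2020-01-01 are assumptions chosen for illustration, and the result changes with the start date.

```python
import pandas as pd

def span_in_days(years=0, months=0, days=0, start="2020-01-01"):
    """Count calendar days covered by a duration beginning at `start`.

    `start` is an assumed anchor date; the question does not provide one.
    """
    start = pd.Timestamp(start)
    # DateOffset applies calendar-aware years/months, so leap days are counted.
    end = start + pd.DateOffset(years=years, months=months, days=days)
    return (end - start).days

print(span_in_days(years=3, months=6))  # 1277 (2020 is a leap year)
print(span_in_days(months=6))           # 182
```

With fixed factors of 365 and 31 the same inputs give 1281 and 186, so the two approaches agree only approximately.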

Edit: If the text can contain multiple units, you can use:

import pandas as pd

# days per unit, used to convert each extracted number to days
d = {'months': 31, 'years': 365, 'days': 1}

# extract the number in front of each unit and multiply by its factor
out = {k: df['details'].str.extract(rf'(\d+)\s+{k}', expand=False).astype(float).mul(v)
          for k, v in d.items()}
# join together, sum per row and convert to 'N days', blanking out '0 days'
days = pd.concat(out, axis=1).sum(axis=1).astype(int).astype('str').add(' days').replace('0 days', '')

# extract hours
hours = df['details'].str.extract(r'(\d+\s+hours)', expand=False).radd(' ').fillna('')

# join together; strip the leading space left when there is no days part
df['Time'] = (days + hours).str.strip()
print(df)
     Name                                      details               Time
0    john  2 years 1 months 10 days 15 hours work time  771 days 15 hours
1    prem                  6 months probation included           186 days
2  shaves                  3 years 6 months  suspended          1281 days
3  geroge                           48 hours work time           48 hours
4  julvie               4 years 20 days terms included          1480 days
5     tiz                                 80 days work            80 days
6    lamp                                 44 days work            44 days
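The per-unit extraction can also be done in a single pass with Series.str.extractall, which returns one row per match. This is a sketch of that alternative rather than the answerer's method; the names DAYS_PER_UNIT and total_days are made up for the example, and the factors 365 and 31 are the same fixed approximations used above.

```python
import pandas as pd

# Same fixed conversion factors as in the answer (approximate by design).
DAYS_PER_UNIT = {"years": 365, "months": 31, "days": 1}

def total_days(details: pd.Series) -> pd.Series:
    # One row per "<number> <unit>" match, indexed by (original row, match no.).
    parts = details.str.extractall(r"(?P<num>\d+)\s+(?P<unit>years|months|days)")
    weighted = parts["num"].astype(int) * parts["unit"].map(DAYS_PER_UNIT)
    # Sum the matches of each original row; rows with no match get 0.
    return weighted.groupby(level=0).sum().reindex(details.index, fill_value=0)

sample = pd.Series([
    "3 years 6 months  suspended",
    "48 hours work time",
    "4 years 20 days terms included",
])
print(total_days(sample).tolist())  # [1281, 0, 1480]
```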
    
# extract the number in front of each unit (missing units become '0')
date_cols = ['years', 'months', 'days', 'hours']
for col in date_cols:
    df[col] = df['details'].str.extract(rf'(\d+)\s+{col}', expand=False).fillna('0')

# convert to int
df[date_cols] = df[date_cols].astype(int)
days = df['years'] * 365 + df['months'] * 31 + df['days']
hours = df['hours'] 

# convert to string
days = days.astype('str') + ' days'
days[days == '0 days'] = ''
hours = hours.astype('str') + ' hours'
hours[hours == '0 hours'] = ''
df['tag'] = days + ' ' + hours
print(df)

0    Name                          details  years  months  days  hours  \
1    prem      6 months probation included      0       6     0      0   
2  shaves      3 years 6 months  suspended      3       6     0      0   
3  geroge               48 hours work time      0       0     0     48   
4  julvie  4 years 20 days terms included       4       0    20      0   
5     tiz                     80 days work      0       0    80      0   
6    lamp                     44 days work      0       0    44      0   

0         tag  
1   186 days   
2  1281 days   
3    48 hours  
4  1480 days   
5    80 days   
6    44 days
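Joining the two strings with a fixed ' ' leaves a stray leading or trailing space whenever the days or hours part is empty. A minimal sketch of one way to avoid that is shown here; the helper name format_duration is hypothetical, and it assumes days and hours are integer Series like the ones computed above.

```python
import pandas as pd

def format_duration(days: pd.Series, hours: pd.Series) -> pd.Series:
    # Build each part, replacing zero amounts with an empty string.
    day_txt = days.astype(str).add(" days").where(days > 0, "")
    hour_txt = hours.astype(str).add(" hours").where(hours > 0, "")
    # Join with a space, then strip the space left next to an empty part.
    return day_txt.str.cat(hour_txt, sep=" ").str.strip()

demo = format_duration(pd.Series([771, 0, 186]), pd.Series([15, 48, 0]))
print(demo.tolist())  # ['771 days 15 hours', '48 hours', '186 days']
```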

