
Split dataframe column into two based on first occurrence of an item in column value

I have the following dataframe with 4 columns:

    IP  Time    URL Staus
0   10.128.2.1  [29/Nov/2017:06:58:55   GET /login.php HTTP/1.1 200
1   10.128.2.1  [29/Nov/2017:06:59:02   POST /process.php HTTP/1.1  302
2   10.128.2.1  [29/Nov/2017:06:59:03   GET /home.php HTTP/1.1  200
3   10.131.2.1  [29/Nov/2017:06:59:04   GET /js/vendor/moment.min.js HTTP/1.1   200
4   10.130.2.1  [29/Nov/2017:06:59:06   GET /bootstrap-3.3.7/js/bootstrap.js HTTP/1.1   200
5   10.130.2.1  [29/Nov/2017:06:59:19   GET /profile.php?user=bala HTTP/1.1 200

I need to split the Time column into two new columns titled 'date' and 'time'. Each value in the Time column should be split at the first occurrence of ':'.

I tried using the split function on the first occurrence of ':' as follows:

df['date','time']=df.Time.str.split(":", 1)

But this is what I end up getting:

    IP  Time    URL Staus   (date, time)
0   10.128.2.1  [29/Nov/2017:06:58:55   GET /login.php HTTP/1.1 200 [[29/Nov/2017, 06:58:55]
1   10.128.2.1  [29/Nov/2017:06:59:02   POST /process.php HTTP/1.1  302 [[29/Nov/2017, 06:59:02]
2   10.128.2.1  [29/Nov/2017:06:59:03   GET /home.php HTTP/1.1  200 [[29/Nov/2017, 06:59:03]
3   10.131.2.1  [29/Nov/2017:06:59:04   GET /js/vendor/moment.min.js HTTP/1.1   200 [[29/Nov/2017, 06:59:04]

How do I properly split this into two columns? What am I doing wrong? Help :(

Pass expand=True to str.split so that it returns a DataFrame instead of a Series of lists, and use double brackets [[...]] to assign the result to the two new columns:

# n is passed by keyword: pandas 2.0+ no longer accepts it positionally
df[['date','time']] = df.Time.str.split(":", n=1, expand=True)
print(df)
           IP                   Time                        URL  Staus  \
0  10.128.2.1  [29/Nov/2017:06:58:55     GET/login.php HTTP/1.1    200   
1  10.128.2.1  [29/Nov/2017:06:59:02  POST/process.php HTTP/1.1    302   

           date      time  
0  [29/Nov/2017  06:58:55  
1  [29/Nov/2017  06:59:02  
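To see why the original attempt produced a single "(date, time)" column, it can help to compare the two return types side by side. The following is a minimal sketch, assuming a two-row frame rebuilt by hand from the question's sample data; the names as_lists and as_frame are only illustrative.

```python
import pandas as pd

# Two sample rows mirroring the Time column from the question.
df = pd.DataFrame({"Time": ["[29/Nov/2017:06:58:55", "[29/Nov/2017:06:59:02"]})

# Without expand=True, the result is a Series whose values are Python lists.
as_lists = df["Time"].str.split(":", n=1)

# With expand=True, the result is a DataFrame with one column per piece.
as_frame = df["Time"].str.split(":", n=1, expand=True)

# A list of names targets two columns; df['date','time'] would instead
# be read as one column keyed by the tuple ('date', 'time').
df[["date", "time"]] = as_frame
print(df)
```

Assigning the Series of lists to df['date','time'] stores each list whole under one tuple-named column, which matches the output shown in the question.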

Alternatively, call Series.str.strip first to remove the leading and trailing square brackets:

# strip('[]') removes any '[' or ']' characters from both ends of each value
df[['date','time']] = df.Time.str.strip('[]').str.split(":", n=1, expand=True)
print(df)
           IP                   Time                        URL  Staus  \
0  10.128.2.1  [29/Nov/2017:06:58:55     GET/login.php HTTP/1.1    200   
1  10.128.2.1  [29/Nov/2017:06:59:02  POST/process.php HTTP/1.1    302   

          date      time  
0  29/Nov/2017  06:58:55  
1  29/Nov/2017  06:59:02  
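For reuse, the strip-then-split steps can be wrapped in a small helper. This is only a sketch: split_timestamp is a hypothetical name, not part of pandas, and the sample frame is a hand-built subset of the question's data.

```python
import pandas as pd

def split_timestamp(frame, source="Time"):
    """Return a copy of frame with 'date' and 'time' columns split from source.

    Hypothetical helper for illustration; it is not a pandas function.
    """
    out = frame.copy()
    # Drop any '[' or ']' characters at either end of each value.
    cleaned = out[source].str.strip("[]")
    # Split only at the first ':' so the time part keeps its own colons.
    out[["date", "time"]] = cleaned.str.split(":", n=1, expand=True)
    return out

df = pd.DataFrame({
    "IP": ["10.128.2.1", "10.128.2.1"],
    "Time": ["[29/Nov/2017:06:58:55", "[29/Nov/2017:06:59:02"],
})
result = split_timestamp(df)
print(result[["date", "time"]])
```

Working on a copy leaves the original frame untouched, which may be preferable when the raw Time column is still needed elsewhere.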
