I have some raw data with issues in the date and time information: for example, there is no colon separating hours from minutes, and some time values are 2400. I'm converting the individual columns to strings and modifying them as required, with the goal of creating a single column of strings that can be parsed. I have about 20 data sets with about 35,000 rows each.
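To make the cleanup step concrete, here is a minimal sketch of how the time strings might be normalized. The `raw` values are hypothetical examples of the problems described (they are not taken from the real data), and treating 2400 as midnight of the following day is an assumption about what the data logger means.

```python
import pandas as pd

# Hypothetical raw time values: no colon, variable width, and "2400".
raw = pd.Series(["15", "1230", "2400", "0045"])

# Left-pad to four digits so hours and minutes sit at fixed positions.
padded = raw.str.zfill(4)
is_2400 = padded == "2400"

# Insert the colon between hours and minutes.
hhmm = padded.str[:2] + ":" + padded.str[2:]

# Assumption: "24:00" means midnight at the start of the next day.
# Map it to "00:00" and keep a one-day offset to add after parsing.
hhmm = hhmm.mask(is_2400, "00:00")
day_offset = pd.to_timedelta(is_2400.astype(int), unit="D")
```

The `day_offset` column would be added to the parsed datetime afterwards so that the 2400 rows land on the correct calendar day.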
import pandas as pd

a = ["2000"] * 100000
b = ["176"] * 100000
c = ["00:15","00:30","00:45","01:00"] * 25000
d = {"year":a,"DOY":b,"time":c}
df = pd.DataFrame(d)
df.head()
DOY time year
0 176 00:15 2000
1 176 00:30 2000
2 176 00:45 2000
3 176 01:00 2000
4 176 00:15 2000
I have created the following line to complete the task, but it is quite slow:
df["date"] = [df["year"][i]+"-"+df["DOY"][i]+" "+df["time"][i] for i in range(0,len(df),1)]
df.head()
DOY time year date
0 176 00:15 2000 2000-176 00:15
1 176 00:30 2000 2000-176 00:30
2 176 00:45 2000 2000-176 00:45
3 176 01:00 2000 2000-176 01:00
4 176 00:15 2000 2000-176 00:15
What is a faster way to concatenate the year, DOY, and time columns while inserting the appropriate hyphens and spaces, so that the result can be parsed into datetime format? Or is this the wrong approach altogether?
As always, thanks for advice.
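For reference, one possible direction is to let pandas do the concatenation column-wise instead of looping over row indices. The following is a minimal sketch, assuming all three columns already hold strings as in the sample data; the `datetime` column name is introduced here only for illustration.

```python
import pandas as pd

# Small stand-in for the sample frame from the question.
df = pd.DataFrame({
    "year": ["2000"] * 4,
    "DOY": ["176"] * 4,
    "time": ["00:15", "00:30", "00:45", "01:00"],
})

# Column-wise string concatenation: no Python-level loop over rows.
df["date"] = df["year"] + "-" + df["DOY"] + " " + df["time"]

# %j is the day of the year, so the combined string can be parsed directly.
df["datetime"] = pd.to_datetime(df["date"], format="%Y-%j %H:%M")
```

Passing an explicit `format` avoids per-row format guessing, which may also help with speed on larger frames.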