
Subtracting value in a column and change another one

I have a data frame which looks like this:

structure(list(V1 = c(1174060957322141696, 1174107739209043968, 
1175456617980149760, 1175463444805558272, 1175475052307013632, 
1175916108697808896, 1177035962104369152, 1177959867077791744, 
1180512511436709888, 1179879113844236288), V2 = structure(c(573L, 
595L, 87L, 88L, 91L, 67L, 561L, 100L, 77L, 1L), .Label = c("Fri Oct 04 00:01:16 CEST 2019", 
"Sat Oct 05 13:55:30 CEST 2019", "Sat Oct 05 13:55:56 CEST 2019", 
"Wed Oct 02 10:25:36 CEST 2019", "Wed Oct 02 11:47:16 CEST 2019", 
"Wed Oct 02 23:43:18 CEST 2019", "Wed Oct 02 23:46:07 CEST 2019", 
"Wed Oct 02 23:52:27 CEST 2019", "Wed Oct 02 23:54:42 CEST 2019", 
"Wed Oct 02 23:55:50 CEST 2019", "Wed Oct 02 23:56:11 CEST 2019", 
"Wed Oct 02 23:56:41 CEST 2019", "Wed Oct 02 23:57:12 CEST 2019", 
"Wed Oct 02 23:58:02 CEST 2019", "Wed Oct 02 23:58:53 CEST 2019", 
"Wed Oct 02 23:59:05 CEST 2019", "Wed Oct 02 23:59:16 CEST 2019", 
"Wed Oct 02 23:59:42 CEST 2019", "Wed Sep 18 01:47:53 CEST 2019", 
"Wed Sep 25 00:50:36 CEST 2019", "Wed Sep 25 01:06:26 CEST 2019"
), class = "factor")), row.names = c(NA, 10L), class = "data.frame")

I want to change the time in column V4 by subtracting 07:00:00. If the time in column V4 is earlier than 07:00:00, the day in column V3 should change as well, and if the day rolls back into the previous month, the month in column V2 should change too. The final aim is to count how many rows there are for each day, for which I can use: count(entertainment_one, c("V2", "V3")) but first I need to reorganise my data frame. I am new to R and do not know where to start. Any help would be really appreciated, thank you very much!

The first thing to notice is that your V2 is a factor, and factors do not behave the way you might expect. Convert it back to a character vector first:

df$V2 <- as.character(df$V2)
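
To see why the conversion matters, here is a small sketch using a made-up factor (the vector `f` is hypothetical, not taken from the question's data). A factor stores integer codes that point into its levels, so numeric operations see the codes rather than the text:

```r
# Hypothetical factor whose levels happen to look like numbers
f <- factor(c("10", "20", "10"))

# Converting directly returns the internal level codes
codes <- as.numeric(f)

# Going through character first recovers the text that was stored
values <- as.numeric(as.character(f))

print(codes)
print(values)
```

The same applies to date strings: convert to character before handing the column to a parser.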

Now, let's turn the dates into an actual datetime vector. But first, set the locale to English, as your dates appear to be in English; otherwise, parsing day and month names written in a different language from your computer's locale might fail:

Sys.getlocale('LC_TIME') # take note of this value if you want to reset it.
Sys.setlocale('LC_TIME', 'english')  # works on windows

df$dates <- strptime(df$V2, '%a %b %d %T CEST %Y', tz='XXX')

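Note the 'XXX': it is a placeholder, because I am not sure which time zone name to use for CEST (Central European Summer Time, UTC+2); a region name such as 'Europe/Berlin' is one candidate. If all your dates are in the same time zone, the choice probably won't affect the result.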
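
As a hedged sketch of the parsing step with a concrete time zone: the example below assumes the data was recorded somewhere that uses 'Europe/Berlin' (an assumption, the question does not say), and it uses the "C" locale, which provides English day and month names on any platform. The two input strings are taken from the factor levels in the question.

```r
# Remember the current locale so it can be restored afterwards
old_locale <- Sys.getlocale("LC_TIME")
invisible(Sys.setlocale("LC_TIME", "C"))  # "C" gives English names everywhere

x <- c("Wed Oct 02 23:43:18 CEST 2019",
       "Fri Oct 04 00:01:16 CEST 2019")

# "CEST" is matched as literal text; the tz argument sets the actual zone.
# 'Europe/Berlin' is an assumed zone, adjust it to where the data comes from.
parsed <- as.POSIXct(strptime(x, "%a %b %d %T CEST %Y", tz = "Europe/Berlin"))

parsed_text <- format(parsed, "%Y-%m-%d %H:%M:%S")
print(parsed_text)

# Restore the original locale
invisible(Sys.setlocale("LC_TIME", old_locale))
```

Wrapping the result in as.POSIXct makes it easier to store as a data frame column than a POSIXlt object.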

At this point, df$dates is a POSIXlt-class object. Try adding 1 (or any small integer):

df$dates + 1
 [1] "2019-10-04 00:01:17 EDT" "2019-10-05 13:55:31 EDT" "2019-10-05 13:55:57 EDT" ...

Ahh, it's counting seconds. So to subtract 7 hours, subtract 7 hours worth of seconds:

df$offset <- df$dates - 7 * 60 * 60
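
A minimal sketch of the rollover on a made-up timestamp (the value of `start` and the 'Europe/Berlin' zone are assumptions for illustration): subtracting seven hours from an early-morning time on the first of a month moves both the day and the month back.

```r
# Hypothetical timestamp: 03:00 on 1 October, before the 07:00 cutoff
start <- as.POSIXct("2019-10-01 03:00:00", tz = "Europe/Berlin")

# Subtract 7 hours expressed in seconds
shifted <- start - 7 * 60 * 60

# Equivalent, with the unit spelled out
shifted_alt <- start - as.difftime(7, units = "hours")

shifted_text <- format(shifted, "%Y-%m-%d %H:%M:%S")
print(shifted_text)  # "2019-09-30 20:00:00"
```

The as.difftime form does the same arithmetic but states the unit explicitly.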

See, both days and months change accordingly. Now use the lubridate package to extract the month and day components:

library(lubridate)
month(df$offset)
day(df$offset)
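
For the final goal of counting rows per day, one possible base R approach (a sketch, not necessarily what the asker's count() call would produce) is to format the shifted timestamps as dates and tabulate them. The vector `offset` below is made-up sample data standing in for df$offset.

```r
# Hypothetical shifted timestamps standing in for df$offset
offset <- as.POSIXct(c("2019-10-02 16:43:18",
                       "2019-10-02 16:46:07",
                       "2019-10-03 17:01:16"),
                     tz = "Europe/Berlin")

# Base R alternatives to lubridate's month() and day()
months <- as.integer(format(offset, "%m"))
days   <- as.integer(format(offset, "%d"))

# Count rows per calendar day
day_key <- format(offset, "%Y-%m-%d")
counts  <- table(day_key)
print(counts)
```

Keying on the full date rather than on month and day separately keeps days from different years apart.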
