Subtracting value in a column and change another one

I have a data frame which looks like this:

structure(list(V1 = c(1174060957322141696, 1174107739209043968, 
1175456617980149760, 1175463444805558272, 1175475052307013632, 
1175916108697808896, 1177035962104369152, 1177959867077791744, 
1180512511436709888, 1179879113844236288), V2 = structure(c(573L, 
595L, 87L, 88L, 91L, 67L, 561L, 100L, 77L, 1L), .Label = c("Fri Oct 04 00:01:16 CEST 2019", 
"Sat Oct 05 13:55:30 CEST 2019", "Sat Oct 05 13:55:56 CEST 2019", 
"Wed Oct 02 10:25:36 CEST 2019", "Wed Oct 02 11:47:16 CEST 2019", 
"Wed Oct 02 23:43:18 CEST 2019", "Wed Oct 02 23:46:07 CEST 2019", 
"Wed Oct 02 23:52:27 CEST 2019", "Wed Oct 02 23:54:42 CEST 2019", 
"Wed Oct 02 23:55:50 CEST 2019", "Wed Oct 02 23:56:11 CEST 2019", 
"Wed Oct 02 23:56:41 CEST 2019", "Wed Oct 02 23:57:12 CEST 2019", 
"Wed Oct 02 23:58:02 CEST 2019", "Wed Oct 02 23:58:53 CEST 2019", 
"Wed Oct 02 23:59:05 CEST 2019", "Wed Oct 02 23:59:16 CEST 2019", 
"Wed Oct 02 23:59:42 CEST 2019", "Wed Sep 18 01:47:53 CEST 2019", 
"Wed Sep 25 00:50:36 CEST 2019", "Wed Sep 25 01:06:26 CEST 2019"
), class = "factor")), row.names = c(NA, 10L), class = "data.frame")

I want to change the hours in column V4 by subtracting 07:00:00. If the hour in column V4 is smaller than 07:00:00, the day in column V3 should change as well, and if the day rolls back into the previous month, the month in column V2 should change too. The final aim is to count how many rows there are for each day, for which I can use count(entertainment_one, c("V2", "V3")), but first I need to reorganise my data frame. I am new to R and do not know where to start. Any help would be really appreciated, thank you very much!

The first thing to notice is that your V2 is a factor; factors do not behave the way you might expect. Convert it to a character vector first:

df$V2 <- as.character(df$V2)
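To see why the factor matters, here is a small sketch with made-up values (not your data). Converting a factor to a number gives its internal level codes, which have nothing to do with the dates in the text, while as.character() gives the strings back:

```r
# Toy factor, made up for illustration (not the asker's data)
f <- factor(c("Wed Oct 02 10:25:36 CEST 2019",
              "Fri Oct 04 00:01:16 CEST 2019",
              "Wed Oct 02 10:25:36 CEST 2019"))

codes <- as.integer(f)    # internal level codes; levels are sorted alphabetically
chars <- as.character(f)  # the original strings

print(codes)
print(chars)
```

Those codes are the same kind of numbers as the 573L, 595L, ... in your dput output.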

Now, let's turn the date into an actual datetime vector. But first, set the locale to English, as your dates appear to be in English; otherwise, parsing dates written in a different language from your computer's locale might not work:

Sys.getlocale('LC_TIME') # take note of this value if you want to reset it.
Sys.setlocale('LC_TIME', 'english')  # works on Windows; elsewhere try 'C' or 'en_US.UTF-8'

df$dates <- strptime(df$V2, '%a %b %d %T CEST %Y', tz='Europe/Berlin')

I used 'Europe/Berlin' for tz because CEST is Central European Summer Time (UTC+2) and Berlin is one of the zones that observe it; I don't know which zone your data actually comes from, so swap in the right one. If all your dates are in the same timezone, you probably wouldn't notice the difference...
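A self-contained sketch of the parsing step, using two strings copied from the factor labels in the question. The "C" locale and the "Europe/Berlin" zone are my assumptions, not something the question confirms:

```r
# Assumption: the timestamps are Central European time; "Europe/Berlin" is
# one IANA zone that uses CEST, swap in whichever fits your data.
old_locale <- Sys.getlocale("LC_TIME")
Sys.setlocale("LC_TIME", "C")   # "C" gives English day and month names

x <- c("Fri Oct 04 00:01:16 CEST 2019", "Wed Sep 18 01:47:53 CEST 2019")

# "CEST" in the format is matched as literal text, so rows stamped with a
# different abbreviation would come back as NA.
parsed <- strptime(x, "%a %b %d %T CEST %Y", tz = "Europe/Berlin")

print(class(parsed))
print(format(parsed, "%Y-%m-%d %H:%M:%S"))

Sys.setlocale("LC_TIME", old_locale)  # restore the previous setting
```

If any element of the result is NA, the format string or the locale does not match that row.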

At this point, df$dates is a POSIXlt-class object. Try adding 1 (or any small integer) and look at the result; the timezone abbreviation in the printout depends on the tz in use:

df$dates + 1
 [1] "2019-10-04 00:01:17 EDT" "2019-10-05 13:55:31 EDT" "2019-10-05 13:55:57 EDT" ...

Ahh, it's counting seconds. So to subtract 7 hours, subtract 7 hours' worth of seconds:

df$offset <- df$dates - 7 * 60 * 60
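As a quick check that the rollover really happens, here is a sketch with a made-up timestamp early on the first day of a month (the value and the "Europe/Berlin" zone are assumptions for illustration):

```r
# Made-up timestamp: 03:00 on 1 October, i.e. earlier than 07:00
t0 <- as.POSIXct("2019-10-01 03:00:00", tz = "Europe/Berlin")

# Arithmetic on date-times is in seconds, so 7 hours is 7 * 60 * 60
shifted <- t0 - 7 * 60 * 60

print(format(shifted, "%Y-%m-%d %H:%M:%S"))
```

The result lands on 30 September at 20:00, so the day and the month both roll back without any manual handling.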

See, both days and months change accordingly. Now use the lubridate package to extract the day and month components:

library(lubridate)
month(df$offset)
day(df$offset)
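For the final aim of counting rows per day, one possible sketch in base R is to truncate the shifted times to dates and tabulate them. This uses table() rather than the count() call from the question, and the timestamps below are made up to stand in for df$offset:

```r
# Made-up shifted timestamps standing in for df$offset
offset <- as.POSIXct(c("2019-10-02 16:43:18", "2019-10-02 16:46:07",
                       "2019-10-03 17:01:16", "2019-09-30 20:00:00"),
                     tz = "Europe/Berlin")

# Pass the same tz to as.Date(); its default is UTC, which could move
# times near midnight onto the wrong day.
days <- as.Date(offset, tz = "Europe/Berlin")

per_day <- table(days)  # number of rows for each calendar day
print(per_day)
```

Because the date already carries year, month and day together, there is no need to group by separate month and day columns.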

