How to extract month from a date column in R?
I need to create a dataframe column in R that contains the month and year of each observation (in this case, publications from the Web of Science database). I have tried concatenating the existing columns "PD" (publication date) and "PY" (publication year). However, the "PD" column uses two formats: the abbreviated month alone (e.g. "MAR") and day plus abbreviated month (e.g. "12-Mar"). I would like the new "date" column to have a uniform "abbreviated-month year" format (e.g. "MAR 2020") so that I can analyze it statistically.
How do I extract the month from the "PD" column (i.e. "MAR" instead of "12-Mar")?
We can use sub to remove the leading day and hyphen, then toupper to upper-case the result.
toupper(sub("[0-9 -]+", "", df1$PD))
#[1] "MAR" "MAR" "JUNE" "JUNE"
data
df1 <- data.frame(PD = c("MAR", "12-Mar", "JUNE", "24-June"),
                  stringsAsFactors = FALSE)
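The question asks for a combined "abbreviated-month year" label, which the snippet above stops short of. A minimal sketch of that last step follows, assuming a numeric PY column; the PY values here are made up for illustration, and truncating to three letters with substr is one possible way to turn "JUNE" into "JUN", not something the original answer does.

```r
# Hypothetical sample data: the PY values are invented for illustration
df1 <- data.frame(PD = c("MAR", "12-Mar", "JUNE", "24-June"),
                  PY = c(2020, 2020, 2019, 2021),
                  stringsAsFactors = FALSE)

# Strip leading digits, spaces and hyphens, then upper-case
month <- toupper(sub("[0-9 -]+", "", df1$PD))

# Keep only the first three letters so "JUNE" becomes "JUN"
month <- substr(month, 1, 3)

# Build the "abbreviated-month year" label
df1$date <- paste(month, df1$PY)
df1$date
#[1] "MAR 2020" "MAR 2020" "JUN 2019" "JUN 2021"
```

If the full month name should be kept whenever the source has it, the substr step can simply be dropped.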
We can extract only the letters from the PD column.
toupper(stringr::str_extract(df$PD, '[A-Za-z]+'))
#[1] "MAR" "MAY" "APRIL" "JUNE"
data
df <- data.frame(PD = c("MAR", "13-May", "April", "24-June"),
                 stringsAsFactors = FALSE)
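For statistical analysis it may help to have the month as a number so that rows sort chronologically. The sketch below is one way to do that in base R, not part of the original answer: it swaps stringr::str_extract for regmatches with regexpr, and looks the first three letters up in the built-in month.abb constant. It assumes every PD value contains at least one letter, because regmatches silently drops elements with no match.

```r
df <- data.frame(PD = c("MAR", "13-May", "April", "24-June"),
                 stringsAsFactors = FALSE)

# Base R stand-in for str_extract: first run of letters in each element
# (assumes every PD value has letters; non-matching elements would be dropped)
mon <- toupper(regmatches(df$PD, regexpr("[A-Za-z]+", df$PD)))

# Match the first three letters against the built-in month abbreviations
df$month_num <- match(substr(mon, 1, 3), toupper(month.abb))
df$month_num
#[1] 3 5 4 6
```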