R: removing selected characters from a string
Sorry if this is a duplicate, but the solutions I have seen do not solve my issue.
I have a data frame (df). One of its variables (df$Year) contains a list of years, such as:
> df$Year
Year
2001–
2013–
2016–
2003–
2012–2013
2013–
1993–2007, 2010–
When an entry contains multiple years, I just want to keep the last one (i.e. '2010' rather than '1993–2007, 2010–') and get rid of the '–'. I have tried:
unlist(str_extract_all(df$Year, "[[:digit:]]4$"))
but this does not seem to work.
Any hints?
We can use sub() for a one-liner:
df$Year <- sub(".*(\\d{4})\\–?", "\\1", df$Year)
df$Year
[1] "2001" "2013" "2016" "2003" "2013" "2013" "2010"
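For reference, the pattern in the question, "[[:digit:]]4$", matches a single digit followed by a literal 4 at the end of the string; a run of four digits needs the quantifier {4}, and the trailing dash also stops $ from matching right after the year. Below is a minimal, self-contained sketch in base R. The sample data is rebuilt from the question (with the dash written as "\u2013"), and the names last_year and last_year_alt are illustrative only. It shows a slightly more general variant of the sub() call that tolerates any trailing non-digit characters, plus an alternative that extracts all four-digit runs and keeps the last one.

```r
# Sample data rebuilt from the question; "\u2013" is the dash used there.
df <- data.frame(
  Year = c("2001\u2013", "2013\u2013", "2016\u2013", "2003\u2013",
           "2012\u20132013", "2013\u2013", "1993\u20132007, 2010\u2013"),
  stringsAsFactors = FALSE
)

# Greedy ".*" consumes everything up to the last run of four digits,
# which is captured; "\\D*$" then consumes any trailing non-digits.
last_year <- sub(".*(\\d{4})\\D*$", "\\1", df$Year)
last_year
# [1] "2001" "2013" "2016" "2003" "2013" "2013" "2010"

# Alternative: pull out every four-digit run, then keep the last per entry.
# Assumes every entry contains at least one four-digit year.
all_years <- regmatches(df$Year, gregexpr("[[:digit:]]{4}", df$Year))
last_year_alt <- vapply(all_years, function(y) y[length(y)], character(1))
```

Both approaches give the same result here; the second may be easier to adapt if you later need the first year or all years instead.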
Note that the dashes in your year ranges are en dashes (U+2013), not the regular ASCII hyphen (-), so a pattern written with a plain hyphen will not match them.
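If it is unclear which dash a column actually contains, one way to check is to look at the character's code point. The following is a small sketch with a hypothetical sample value x; in practice you would use an element of df$Year.

```r
# Hypothetical sample value; "\u2013" stands for the dash copied from the data.
x <- "2010\u2013"

# Code point of the last character in the string.
dash_code <- tail(utf8ToInt(x), 1)
dash_code                          # 8211, i.e. U+2013
utf8ToInt("-")                     # 45, the ASCII hyphen

# A plain hyphen does not match this character; the escaped dash does.
grepl("-", x, fixed = TRUE)        # FALSE
grepl("\u2013", x, fixed = TRUE)   # TRUE
```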