Extract last 2 chars from a column in a data.frame
I am new to R programming and have searched SO for many hours. I would appreciate your help.
I have a data frame with 3 columns (Date, Description, Debit):
Date Description Debit
2014-01-01 "abcdef VA" 15
2014-01-01 "ghijkl NY" 56
I am trying to extract the last 2 characters of the second (Description) column (i.e. the 2-letter state abbreviation). I am not very comfortable with apply-type functions.
I have tried using
l <- lapply(a$Description, function(x) {substr(x, nchar(x)-2+1, nchar(x))})
but get the following error message:
Error in nchar(x) : invalid multibyte string, element 1
I have tried multiple other approaches, but I get the same error.
I am quite sure that I am missing something very basic, so I would appreciate your help.
Thanks
library(stringr)
str_sub(a$Description,-2,-1)
df <- data.frame(date = c("2015-01-01", "2015-02-01", "2015-01-15"),
jumble = c("12345 VA", "123 FL", "12354567732 GA"),
debit = c(15, 36, 20))
df$jumble <- as.character(df$jumble)
df$state <- substr(df$jumble, nchar(df$jumble)-1, nchar(df$jumble))
df
date jumble debit state
1 2015-01-01 12345 VA 15 VA
2 2015-02-01 123 FL 36 FL
3 2015-01-15 12354567732 GA 20 GA
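A note on the error in the question: `invalid multibyte string` is raised by nchar() when a string contains bytes that are not valid in the session's encoding, so the substr()/nchar() approach can fail on the real data in the same way. Below is a minimal sketch of one possible workaround. It assumes the Description column actually holds Latin-1 text; the sample values and the `from = "latin1"` setting are hypothetical and not taken from the question.

```r
# Hypothetical data: "\xe9" is a Latin-1 byte (e-acute) that is not valid UTF-8
a <- data.frame(Description = c("caf\xe9 VA", "ghijkl NY"),
                stringsAsFactors = FALSE)

# Assumption: the source text is Latin-1; change `from` to the real encoding
desc <- iconv(a$Description, from = "latin1", to = "UTF-8")

# With a valid encoding, nchar() and substr() work on the whole column at once
n <- nchar(desc)
a$State <- substr(desc, n - 1, n)
a$State
```

If the real encoding is unknown, it is probably easier to declare it when reading the file than to guess it afterwards.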
Here's a regex version, using Brandon S's sample data. The regex captures everything after the last whitespace character to the end of the string.
df <- data.frame(date = c("2015-01-01", "2015-02-01", "2015-01-15"),
jumble = c("12345 VA", "123 FL", "12354567732 GA"),
debit = c(15, 36, 20))
df$state <- gsub(".+\\s(.+)$", "\\1", df$jumble)
df
date jumble debit state
1 2015-01-01 12345 VA 15 VA
2 2015-02-01 123 FL 36 FL
3 2015-01-15 12354567732 GA 20 GA
We can use sub:
df$State <- sub(".*\\s+", "", df[,2])
df$State
#[1] "VA" "FL" "GA"
A more elegant way, using pandas in Python rather than R:
df['Description'].str[-2:]
I assume that your description column is of String type (or Object type).