Remove everything after a string in a data frame column with missing values
I have a data frame resembling the extract below:
Observation  Identifier   Value
Obs001       ABC_2001     54
Obs002       ABC_2002     -2
Obs003                    1
Obs004                    1
Obs005       Def_2001/05
I would like to transform this data frame into one where the portion of the string from the "_" sign onward is removed, as illustrated below:
Observation  Identifier_NoTime  Value
Obs001       ABC                54
Obs002       ABC                -2
Obs003                          1
Obs004                          1
Obs005       Def
I tried experimenting with strsplit, gsub and sub as discussed here but cannot get those commands to work.
I have to account for the fact that:
You could try the sub command below to remove the _ symbol and all the non-space characters that follow it.
sub("_\\S*", "", string)
Explanation:
_ matches a literal _ symbol.
\\S* matches zero or more non-space characters.
OR
This would remove everything from the _ symbol to the end of the string:
sub("_.*", "", string)
Explanation:
_ matches a literal _ symbol.
.* matches any character zero or more times.
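As a rough sketch, either pattern could be applied to the whole column like this. The data frame below is rebuilt by hand from the extract in the question, and it assumes the blank cells are stored as NA (if they are empty strings instead, sub simply returns them unchanged). The name df and the string "ABC_2001 extra" are made up for illustration.

```r
# Rebuild the example from the question; blank cells are ASSUMED to be NA.
df <- data.frame(
  Observation = c("Obs001", "Obs002", "Obs003", "Obs004", "Obs005"),
  Identifier  = c("ABC_2001", "ABC_2002", NA, NA, "Def_2001/05"),
  Value       = c(54, -2, 1, 1, NA),
  stringsAsFactors = FALSE
)

# sub() is vectorised and passes NA through, so missing identifiers
# need no special handling.
df$Identifier_NoTime <- sub("_.*", "", df$Identifier)

# The two patterns differ only when a space follows the part to remove
# (hypothetical input, not taken from the question):
with_space <- "ABC_2001 extra"
stop_at_space <- sub("_\\S*", "", with_space)  # "ABC extra"
to_end        <- sub("_.*", "", with_space)    # "ABC"
```

For the identifiers shown in the question there are no spaces, so both patterns should give the same result; the choice only matters if text after a space needs to be kept.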