I have two columns and I want to create a binary column indicating whether there is a partial match between them.
For example:
X Y Match
hello hello 1
hi hello hi 1
NA bye NA
bye hi bye 1
good bad 0
I used the following code,
df['Match'] <- ifelse(with(df, str_detect(x, y)|str_detect(y, x)), 1, 0)
which worked for the first few rows, but when I ran it on the whole dataset (n = 14000), I kept getting this error:
Error in stri_detect_regex(string, pattern, opts_regex = opts(pattern)) :
Incorrectly nested parentheses in regexp pattern. (U_REGEX_MISMATCHED_PAREN)
How should I go about solving this problem?
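You probably have parentheses or other special characters in your data. str_detect() interprets its pattern argument as a regular expression, so a value containing an unbalanced "(" or ")" triggers this error.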
Try a loop like so:
# Check one row at a time; the loop stops at the first row whose y is not a valid regex
for (i in seq_len(nrow(df))) {
  print(i)
  str_detect(df$x[i], df$y[i])
}
The last i printed will tell you which row the problem is in.
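Once the offending rows are found, one possible way to avoid the error altogether is to match the values as literal text rather than as regular expressions. The sketch below assumes stringr's fixed() modifier is acceptable for this task; the sample data frame is hypothetical and only mirrors the example in the question, with one extra row containing an unbalanced parenthesis.

```r
library(stringr)

# Hypothetical sample data modelled on the question's example,
# plus a final row with an unbalanced "(" that would break a regex match
df <- data.frame(
  x = c("hello", "hi", NA, "bye hi", "good", "a(b"),
  y = c("hello", "hello hi", "bye", "bye", "bad", "xa(by"),
  stringsAsFactors = FALSE
)

# fixed() tells stringr to compare the pattern literally,
# so "(" is treated as an ordinary character
df$Match <- ifelse(
  str_detect(df$x, fixed(df$y)) | str_detect(df$y, fixed(df$x)),
  1, 0
)

print(df)
```

Rows where either value is NA stay NA in Match, as in the desired output. Note that recent stringr versions raise an error for an empty pattern (""), so empty strings in either column may need to be handled separately.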