Regex/grep strings containing US currency

Question

I have a list of strings, some of which contain dollar figures. For example:

'$34232 foo    \n  bar'

Is there an R command that returns only the strings that contain dollar amounts?

Thank you!

Answer 1

Use \\$ to protect the $, which otherwise means "end of string":

   grep("\\$[0-9]+",c("123","$567","abc $57","$abc"),value=TRUE)

This selects strings that contain a dollar sign followed by one or more digits (so not, e.g., $abc). grep with value=FALSE (the default) returns the indices of the matching elements; grepl returns a logical vector. One R-specific point: you need to write \\$ in the string, not just \$ (i.e. an additional backslash is required to protect the backslash itself), because \$ on its own gives an "unrecognized escape" error.
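To make the three return types concrete, here is a minimal sketch using the same input vector as above. The variable names (x, pattern, and the three result names) are only illustrative.

```r
x <- c("123", "$567", "abc $57", "$abc")

# The regex is \$[0-9]+ ; inside an R string literal the backslash is doubled.
pattern <- "\\$[0-9]+"

matched_values  <- grep(pattern, x, value = TRUE)  # the matching strings
matched_indices <- grep(pattern, x)                # their positions in x
matched_flags   <- grepl(pattern, x)               # one TRUE/FALSE per element

print(matched_values)   # "$567"    "abc $57"
print(matched_indices)  # 2 3
print(matched_flags)    # FALSE  TRUE  TRUE FALSE
```

The logical vector from grepl is handy for subsetting, e.g. x[matched_flags].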

@Cerbrus's answer, '\\$[0-9,.]+' , matches slightly more broadly (e.g. it matches all of $456.89 or $367,245,100 ). It also matches some implausible currency strings, e.g. $45.13.89 or $467.43,2,1 (commas should be allowed only between groups of 3 digits in the dollars part, and there should be only one decimal point, separating dollars from cents). Both of our answers will (incorrectly?) match $45abc . If you're lucky, your data don't contain any of these tricky cases. Getting this right in general is hard; the question referred to in the comments ( What is "The Best" US Currency RegEx? ) tries to do so, and as a result has significantly more complex answers, but they could be useful if you adapt them to R by protecting $ appropriately.
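As a rough illustration of what a stricter pattern might look like in R, here is a sketch that rejects the tricky cases mentioned above. The pattern and the names strict, amounts and is_currency are assumptions made up for this example, not taken from the linked question, and it is not a complete currency validator (it ignores negative amounts, a space after the $, and so on).

```r
# Hypothetical stricter pattern, assembled piece by piece for readability.
strict <- paste0(
  "\\$",                                  # a literal dollar sign
  "(?:[0-9]{1,3}(?:,[0-9]{3})+|[0-9]+)",  # comma-grouped digits (1,234,567) or plain digits
  "(?:\\.[0-9]{2})?",                     # optionally a decimal point and exactly two digits
  "(?![0-9A-Za-z]|[.,][0-9])"             # not followed by a digit, a letter, or [.,] plus a digit
)

amounts <- c("$456.89", "$367,245,100", "$45.13.89",
             "$467.43,2,1", "$45abc", "$34232 foo    \n  bar")

# perl = TRUE switches grepl to PCRE, which supports the (?! ) lookahead.
is_currency <- grepl(strict, amounts, perl = TRUE)

print(is_currency)  # TRUE  TRUE FALSE FALSE FALSE  TRUE
```

The trailing lookahead is what rules out $45abc and $45.13.89. Whether that is desirable depends on the data, so treat it as a starting point.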

Answer 2

Sure there is:

'\\$[0-9,.]+'

\\$ // Dollar sign (the backslash is doubled because this is an R string)
[0-9,.]+ // One or more digits, dots, or commas
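For filtering whole strings, this pattern and the digits-only pattern \\$[0-9]+ often select the same elements. The difference shows up mainly in how much text each one matches. A small sketch using base R's regexpr and regmatches (the sample vector and variable names are made up for illustration):

```r
x <- c("$456.89", "$367,245,100", "$.50", "no money")

narrow <- "\\$[0-9]+"     # dollar sign, then digits only
broad  <- "\\$[0-9,.]+"   # dollar sign, then digits, commas, or dots

narrow_flags <- grepl(narrow, x)
broad_flags  <- grepl(broad, x)

# regmatches() returns the matched substring for each element that matched.
narrow_text <- regmatches(x, regexpr(narrow, x))
broad_text  <- regmatches(x, regexpr(broad, x))

print(narrow_flags)  # TRUE  TRUE FALSE FALSE
print(broad_flags)   # TRUE  TRUE  TRUE FALSE
print(narrow_text)   # "$456" "$367"
print(broad_text)    # "$456.89"      "$367,245,100" "$.50"
```

So with grep or grepl the two patterns only disagree when the $ is followed directly by a dot or comma, as in $.50.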

2 answers

solution1
4 ACCEPTED 2013-01-04 15:12:57

solution2
2 2013-01-04 15:15:34
