
Capturing arbitrary numbers of trailing digits

I have some spreadsheets where people have written 13.14 (for example), where the point is a delimiter between two numbers rather than a decimal point, i.e. it would have been better to write 13,14 or 13-14. Between Excel and read_excel this can get converted to something like 13.140000000000001, 9.1699999999999999 or 13.279999999999999. I need to chop off the trailing 9s (and, annoyingly, round the number up) or the trailing 0...01s. I thought a regex like:

^(.*)0{3,}[12]$

might work, but the greedy (.*) swallows most of the zeros, so the 0{3,}[12] part only matches the last three 0s and the 1. Similarly

^(.*)9{3,}$

does not match all of the 9s, for the same reason. I could probably specify the 0 pattern exactly (13 x 0 + 1), but the 9s are trickier because there might be 13 or 14 of them.
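To illustrate what is happening with the two patterns above, here is a minimal sketch in R comparing the greedy (.*) with a lazy (.*?). The lazy variant is not something proposed in the thread; it is shown only to make the greedy behaviour visible, and the variable names are made up for the example.

```r
# Build "13.14" followed by twelve 0s and a 1, so the zero count is explicit
x <- paste0("13.14", strrep("0", 12), "1")

# Greedy: (.*) keeps as much as it can and leaves only three 0s for 0{3,}
greedy <- sub("^(.*)0{3,}[12]$", "\\1", x, perl = TRUE)

# Lazy: (.*?) keeps as little as it can, so 0{3,} absorbs the whole run of 0s
lazy <- sub("^(.*?)0{3,}[12]$", "\\1", x, perl = TRUE)

print(greedy)  # "13.14" followed by nine 0s
print(lazy)    # "13.14"
```

The lazy form still says nothing about rounding up the 9s, and it will also strip zeros that are a genuine part of the number.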

Regular expressions are not the right tool here. You are going to struggle with rounding up, and your patterns currently mishandle something like "129999". You need to interpret each of these as a number, not as a sequence of characters.
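One way to read this suggestion, as a rough sketch only: let R format each value to at most 15 significant digits, which drops the floating-point noise and does the rounding, then split on the point. The names x, clean, first and second are invented for the example, and it assumes the values arrive from read_excel as numerics.

```r
# Values as they might come out of read_excel (assumed numeric)
x <- c(13.140000000000001, 9.1699999999999999, 13.279999999999999)

# Format one value at a time so each keeps only the digits it needs
clean <- vapply(x, function(v) format(v, digits = 15), character(1))

# Split the cleaned strings on the literal point into the two numbers
parts  <- strsplit(clean, ".", fixed = TRUE)
first  <- as.integer(vapply(parts, `[`, character(1), 1))
second <- as.integer(vapply(parts, `[`, character(1), 2))

print(clean)
print(first)
print(second)
```

A caveat with any numeric approach: once Excel has stored the cell as a number, 13.1 and 13.10 are the same value, so a trailing zero in the second number cannot be recovered this way.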

The trick turned out to be being more specific about the numbers that I wanted to keep. I used negative lookahead to make sure I only dealt with "possible" numbers, i.e. I want 1, 2, 3, ..., 10, 11 etc., but I don't want 09, for example. The regex for the zeros is

^(([1-9](?!0)|[12][0-9])\\.([1-9](?!0)|[12][0-9]))0{3,}[12]$

and the regex for the nines is

^(([1-9](?!0)|[12][0-9])\\.([1-9](?!0)|[12][0-9]))9{3,}$

Note this is written as an R string, so the backslash that escapes the . is doubled; in a language with regex literals or raw strings a single backslash would do.
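As a sketch of how the zeros pattern might be applied in base R: lookahead is not supported by the default regex engine, so perl = TRUE is needed. The vector x and the names zeros_pattern and cleaned are made up for the example.

```r
# The zeros pattern from above, written as an R string (note the doubled backslash)
zeros_pattern <- "^(([1-9](?!0)|[12][0-9])\\.([1-9](?!0)|[12][0-9]))0{3,}[12]$"

x <- c("13.140000000000001", "9.17", "13.279999999999999")

# Replace a matching string with capture group 1, the cleaned "a.b" part;
# strings that do not match are returned unchanged
cleaned <- sub(zeros_pattern, "\\1", x, perl = TRUE)

print(cleaned)
```

Only the first element matches here; the value ending in 9s is left alone for the second pattern to deal with.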

The rounding was handled by capturing the cleaned numbers, incrementing the second number by 1, and concatenating them.
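The rounding step could look roughly like the following. This is a guess at the shape of it rather than the code actually used; round_up_nines is a hypothetical helper name, and it relies on regexec accepting perl = TRUE (available in reasonably recent versions of R).

```r
# The nines pattern from above, as an R string
nines_pattern <- "^(([1-9](?!0)|[12][0-9])\\.([1-9](?!0)|[12][0-9]))9{3,}$"

# Hypothetical helper: strip the trailing 9s and add 1 to the second number
round_up_nines <- function(x) {
  m <- regmatches(x, regexec(nines_pattern, x, perl = TRUE))
  vapply(seq_along(x), function(i) {
    g <- m[[i]]                        # full match, then groups 1 to 3
    if (length(g) == 0) return(x[i])   # no match: leave the string untouched
    paste0(g[3], ".", as.integer(g[4]) + 1L)
  }, character(1))
}

fixed <- round_up_nines(c("9.1699999999999999", "13.279999999999999", "13.14"))
print(fixed)
```

Here g[3] is the first number and g[4] the second, so incrementing g[4] and pasting the two back together with a point gives the rounded-up value.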


 