
How do I format a string with Swift in iOS?

I am working on an iOS Swift project that takes OCR data and then searches the text for key phrases. The OCR output looks like this:

INGREDIENTS WATER, BROWN SUGAR, RED RIPE

TOMATO CONCENTRATE, APPLE CIDERVINEGAR

W01CESTERSHlWSMJCE(WATERW4EGAR CORN

SYRUP, SALT, MOLASSE, SPICE, NATURAL FLAVOR

GARLIC POWDER, CARAMEL COLOR, ANCHOVIES

CFlSril,TAMARiN0), MOLASSES, LEMON JUICE,

ONION, HONEY, MODIFIED TAVIOCA STARCH,

When I search the string for "corn syrup", nothing is found. Searching for "corn" and "syrup" does produce positive results.

I have also tried

tesseract.recognizedText.stringByTrimmingCharactersInSet(NSCharacterSet.whitespaceAndNewlineCharacterSet())

to no avail.

Any thoughts on how to format this text for searching that would allow "corn syrup" to be identified? The qualifier is that only the exact phrase is useful - after all there are corn, corn starch, maple syrup, etc. as potential ingredients.

Thanks.

OK, here is the solution that worked:

textView.text = tesseract.recognizedText.stringByReplacingOccurrencesOfString("\n", withString: " ", options: NSStringCompareOptions.LiteralSearch, range: nil)

I thought the initial code was accomplishing the same task.

If you want to search for "corn syrup", you most likely need to replace all newlines with spaces (and then, ideally, check for double spaces and replace them with a single space).
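The two steps above (newlines to spaces, then collapsing repeated spaces) can be done in one pass. This is a minimal sketch using current Swift and Foundation names, which differ from the older `stringByReplacingOccurrencesOfString` spelling used in the question. The helper name `normalizeWhitespace` and the sample string are assumptions for illustration only.

```swift
import Foundation

/// Collapses every run of whitespace (spaces, tabs, newlines) into a single
/// space and drops leading/trailing whitespace.
/// `normalizeWhitespace` is a hypothetical helper, not part of Tesseract.
func normalizeWhitespace(_ text: String) -> String {
    return text
        .components(separatedBy: .whitespacesAndNewlines) // split on any whitespace
        .filter { !$0.isEmpty }                           // drop empty pieces left by runs
        .joined(separator: " ")
}

// Sample taken from the OCR output in the question; in the real project this
// would be `tesseract.recognizedText`.
let ocrText = "W01CESTERSHlWSMJCE(WATERW4EGAR CORN\n\nSYRUP, SALT, MOLASSE"
let normalized = normalizeWhitespace(ocrText)

// Case-insensitive search, since the OCR text is upper case.
let found = normalized.range(of: "corn syrup", options: .caseInsensitive) != nil
print(found)
```

Because the helper splits on all whitespace rather than replacing only "\n", it also covers the case where a newline is followed by a space.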

The quality of the character recognition is not very good, and I think the text needs more cleanup before it is used for searching. You might, for example, split the phrases into an array of individual strings and trim spaces etc. from the beginning and the end. Perhaps you could also use UITextChecker to help identify misspelled terms and fix them...
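As a rough sketch of the split-and-trim idea, the snippet below breaks the text at commas and tidies each piece. The function name `ingredientCandidates` and the choice of a comma as the delimiter are assumptions; the UITextChecker step is left out because it needs UIKit and a dictionary that knows ingredient names.

```swift
import Foundation

/// Splits OCR text at commas and cleans up each piece.
/// `ingredientCandidates` is a hypothetical helper for illustration.
func ingredientCandidates(from text: String) -> [String] {
    return text
        .components(separatedBy: ",")
        .map { piece in
            // Collapse internal whitespace (including newlines); this also
            // trims the beginning and the end of each piece.
            piece
                .components(separatedBy: .whitespacesAndNewlines)
                .filter { !$0.isEmpty }
                .joined(separator: " ")
                .lowercased()
        }
        .filter { !$0.isEmpty } // drop empty entries from trailing commas
}

let sample = "WATER, BROWN SUGAR, RED RIPE\n\nTOMATO CONCENTRATE, APPLE CIDERVINEGAR,"
let candidates = ingredientCandidates(from: sample)
print(candidates)
// ["water", "brown sugar", "red ripe tomato concentrate", "apple cidervinegar"]
```

Note that this only helps where the OCR kept the commas. A garbled run such as "W01CESTERSHlWSMJCE(WATERW4EGAR CORN SYRUP" stays one candidate, so a substring search is still needed for that part.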

That's because "corn syrup", which is the string you're looking for, is not the same as "corn\nsyrup", which is what your wall of text contains.

You could try searching for "corn\nsyrup" or "corn \nsyrup" instead.

Notice in your picture how "corn\nsyrup" produces the same results that your wall of text is showing?

Also, replacing "\n" with " " might not be enough on its own: the text could be "corn\n syrup", which would leave two spaces between the words.
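Rather than guessing which combination of newlines and spaces separates the two words, one option is to search with a regular expression that accepts any run of whitespace between them. This is a sketch only; `containsPhrase` is a hypothetical helper name, and the word-boundary anchors are my own addition so that only the whole phrase matches.

```swift
import Foundation

/// Returns true if `phrase` occurs in `text` with any amount of whitespace
/// (spaces, tabs, newlines) between its words. Matching is case-insensitive.
/// `containsPhrase` is a hypothetical helper for illustration.
func containsPhrase(_ phrase: String, in text: String) -> Bool {
    // Escape each word so characters like "(" are matched literally.
    let words = phrase
        .split(separator: " ")
        .map { NSRegularExpression.escapedPattern(for: String($0)) }
    // \s+ between words tolerates "corn\nsyrup", "corn \nsyrup", "corn\n syrup", etc.
    let pattern = "\\b" + words.joined(separator: "\\s+") + "\\b"
    return text.range(of: pattern, options: [.regularExpression, .caseInsensitive]) != nil
}

let hit = containsPhrase("corn syrup", in: "WATERW4EGAR CORN\n\nSYRUP, SALT")
let miss = containsPhrase("corn syrup", in: "CORN STARCH, MAPLE SYRUP")
print(hit, miss)
```

This keeps the original text untouched and still requires the exact phrase, so "corn starch" and "maple syrup" on their own do not match.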

[Image: comparison]
