I have the following expression written in standard Java style with "if":
val punctuationChars = setOf('!', '?', '.')
if (text[index] in punctuationChars &&
    text[index + 1].isWhitespace() &&
    text[index + 2].isUpperCase()
) { return index }
I want to rewrite it in Kotlin style. I came up with something like this:
text.indexOfFirst { it in punctuationChars && (((it + 1).isWhitespace()) && (it + 2).isUpperCase()) }
This does not work at all. But if I use just this:
text.indexOfFirst { it in punctuationChars }
it works. So how can I combine multiple predicates in a function like this?
I think the windowed function is a good fit here. Something like:
text.windowed(3).indexOfFirst { it[0] in punctuationChars && it[1].isWhitespace() && it[2].isUpperCase() }
Your code isn't working because you're doing this:
it in punctuationChars && (it + 1).isWhitespace()
What's "it" here? This function is being used with text.indexOfFirst, which iterates over each Char in a String. So "it" is a Char, which you're using correctly with "it in punctuationChars" (is this character in this set of characters?).
But in the next condition, you're treating "it" like an index. You're trying to see if the next character in the string is whitespace, but you don't have the current character's index, you have the character itself. (it + 1).isWhitespace() effectively checks the next character in the code table (every character has a code number).
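To make that concrete, here is a small sketch of what the original predicate ends up evaluating. The names brokenPredicate and brokenSearch are made up for illustration, they are not from the question:

```kotlin
// Hypothetical helper that mirrors the predicate from the question,
// written out so each step can be inspected.
fun brokenPredicate(c: Char): Boolean {
    val punctuationChars = setOf('!', '?', '.')
    // c + 1 is the Char whose code is one higher than c's:
    // '.' + 1 is '/', '!' + 1 is '"', '?' + 1 is '@'
    return c in punctuationChars && (c + 1).isWhitespace() && (c + 2).isUpperCase()
}

// Returns -1 for any input, because none of the three punctuation
// characters is followed by a whitespace character in the code table.
fun brokenSearch(text: String): Int = text.indexOfFirst { brokenPredicate(it) }
```

So the predicate never looks at the neighbouring characters of the string at all, which is why the search always comes back empty.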
So to do it the way you're approaching it here, you need access to the index. You could do something like this:
text.withIndex().indexOfFirst { indexed ->
    indexed.value in punctuationChars &&
        // using nullable stuff here protects you from invalid index exceptions
        text.elementAtOrNull(indexed.index + 1)?.isWhitespace() == true
}
which wraps each element in an IndexedValue whose index property you can access (the element itself is the value property). Personally I'm not a fan: you're working on individual list elements but then using the index property to go poking around the rest of the list. I feel like if you're going to do that, just work with the indices directly, and access each element using that:
text.indices.firstOrNull { index ->
    // every element accessed through an index
    text[index] in punctuationChars &&
    text[index + 1].isWhitespace()
    ...
}
Again, accessing later indices like that is dangerous (what if it's the last character in the string?) so you should use elementAtOrNull. I'm just writing it like that for brevity.
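For reference, a bounds-safe version of the index-based approach could look roughly like this. It is only a sketch: the function name firstSentenceBreak is my own, and the third condition is taken from the question's original "if":

```kotlin
// Returns the index of the first punctuation mark that is followed by
// whitespace and then an uppercase letter, or null if there is none.
fun firstSentenceBreak(text: String): Int? {
    val punctuationChars = setOf('!', '?', '.')
    return text.indices.firstOrNull { index ->
        text[index] in punctuationChars &&
            // elementAtOrNull returns null past the end of the string,
            // so "?. ... == true" simply evaluates to false there
            text.elementAtOrNull(index + 1)?.isWhitespace() == true &&
            text.elementAtOrNull(index + 2)?.isUpperCase() == true
    }
}
```

With this, firstSentenceBreak("Hello world. Next one") gives 11, and a string that ends in a punctuation mark gives null instead of throwing.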
If you do want to work with the chars themselves, in a safe way, you can use windowed, which allows you to work on a sliding view of n elements from the list:
// partialWindows allows for smaller windows with elements missing as you run into
// the end of the list - you don't want that, you want it to end as soon as you can't
// fill a window (false is the default but I'm just putting it here for clarity!)
text.windowed(size = 3, partialWindows = false).indexOfFirst { window ->
    window[0] in punctuationChars &&
    window[1].isWhitespace()
    ..
}
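One thing worth knowing about the windowed approach: on a String, windowed builds a List<String> containing every 3-character substring before the search starts. If that matters for long texts, a lazy variant with windowedSequence might be preferable. A minimal sketch, with a function name of my own choosing and the full three-part condition from the question:

```kotlin
// Lazily slides a 3-character window over the text and returns the start
// index of the first matching window, or -1 if no window matches.
fun firstSentenceBreakWindowed(text: String): Int {
    val punctuationChars = setOf('!', '?', '.')
    return text.windowedSequence(size = 3, partialWindows = false)
        .indexOfFirst { window ->
            window[0] in punctuationChars &&
                window[1].isWhitespace() &&
                window[2].isUpperCase()
        }
}
```

Because the step is 1, the position of a window in the sequence is the same as the index of its first character in the original string.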
The direct equivalent to your Java-like code is the following:
fun indexOfEndOfSentenceAndStartOfNextSentence(text: String): Int =
    (0 until text.length - 2).indexOfFirst {
        text[it] in ".!?" && text[it + 1].isWhitespace() && text[it + 2].isUpperCase()
    }
But using a regex may be better:
private val pattern = """[!?.]\s\p{Lu}""".toRegex()
fun indexOfEndOfSentenceAndStartOfNextSentence(text: String): Int =
    pattern.find(text)?.range?.first ?: -1
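As a quick sanity check, the two variants can be put side by side under different names (indexByLoop and indexByRegex are hypothetical names, used only so both fit in one file). On the JVM, keep in mind that \s matches ASCII whitespace by default while Char.isWhitespace() also accepts other Unicode spaces, so the two are not guaranteed to agree on every input:

```kotlin
val sentenceBreak = """[!?.]\s\p{Lu}""".toRegex()

// Index-based variant: scans every position that still has two characters after it.
fun indexByLoop(text: String): Int =
    (0 until text.length - 2).indexOfFirst {
        text[it] in ".!?" && text[it + 1].isWhitespace() && text[it + 2].isUpperCase()
    }

// Regex variant: start index of the first match, or -1 if there is no match.
fun indexByRegex(text: String): Int =
    sentenceBreak.find(text)?.range?.first ?: -1
```

For plain ASCII input such as "Hello world. Next one" both return 11, and both return -1 for strings shorter than three characters.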