
Substring on non-ASCII characters in Java and Scala

I can't find a method in Java or Scala that takes a substring of a string containing non-ASCII characters based on its length in bytes, i.e. the length reported by getBytes.

 val string = "achâth33Franklin"

string.length
 Int = 16

string.getBytes.length
 Int = 17

 string.substring(0,7)
String = achâth3

I need a method that returns achâth, because â is a non-ASCII character that takes up 2 bytes:

 val test = "â"
test.getBytes.length
res26: Int = 2

To give more perspective on the problem:

The field has a constant length of 7 and is always supposed to contain ASCII values. Sometimes the sender includes a non-ASCII value in the string. When that happens, substring(0,7) pulls part of the next field's value into the current field.

Explanation for @VGR:

scala> val string = "achâth33Franklin"
string: String = achâth33Franklin

scala> new String(string.getBytes,0,7)
res30: String = achâth

scala> string.substring(0,7)
res31: String = achâth3

One way to do that is to combine the getBytes() method with the String(byte[] bytes, int offset, int length) constructor.

So your method would look like this:

String string = "achâth33Franklin";
string.substring(0, 7);              // achâth3 (7 characters)
new String(string.getBytes(), 0, 7); // achâth  (7 bytes)

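That constructor takes an array of bytes, an offset into the array, and the number of bytes to use. So new String(string.getBytes(), a, n) is the per-byte counterpart of string.substring(a, a + n): the third argument is a length, not an end index. Both getBytes() and this constructor use the platform's default charset, so the byte counts shown here assume that charset is UTF-8.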
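
To avoid depending on the platform default, the charset can be passed explicitly. The following is a minimal sketch, assuming the records are UTF-8 encoded; the class name ByteSubstring and the helper substringByBytes are hypothetical names made up for illustration, not part of the JDK.

```java
import java.nio.charset.StandardCharsets;

public class ByteSubstring {

    // Hypothetical helper: returns the text covered by `length` bytes,
    // starting at byte position `byteOffset`, of the UTF-8 encoding of `s`.
    // Assumption: the fixed-width fields are measured in UTF-8 bytes.
    static String substringByBytes(String s, int byteOffset, int length) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        // Same constructor as above, plus an explicit charset argument.
        return new String(bytes, byteOffset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00e2 is the character â, written as an escape so the source
        // file encoding does not matter.
        String record = "ach\u00e2th33Franklin";

        String first = substringByBytes(record, 0, 7);  // the 7-byte field
        String second = substringByBytes(record, 7, 2); // the next 2 bytes

        System.out.println(first);
        System.out.println(second);
    }
}
```

One caveat: if the requested byte range starts or ends in the middle of a multi-byte character, the decoder substitutes a replacement character for the incomplete bytes, so the field boundaries need to fall on whole characters.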

