
Is there an easy way to get a substring of a UTF-8 encoded string whose repr is shorter than N in Python?

Original post: 2013-04-28 16:08:03 | Tags: python, algorithm, utf-8, substring, repr

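For example, I have a string, and I would like an easy way to get a UTF-8 encoded substring of it such that the length of the substring's repr is <= N. Of course I could try a substring of length N / 3 and then grow it to N / 3 + 1, N / 3 + 2, ..., but is there a simpler way?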

word = u"this is a ship, and some other words".encode("utf-8")
# some way to get a substring
substring = func(word, N)
# assert len(repr(substring)) <= N

Thanks!

1 Answer

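Possible approach:

Take the first N-1 bytes of the repr of the whole string.
Check the last 3 bytes to see whether you have broken up an escape sequence, and cut bytes if necessary.
Add the closing quote, keeping in mind that it could be ' or ".
Eval the repr back to UTF-8.
Check the last few bytes to see whether you have broken the string in the middle of a Unicode code point, and cut bytes if necessary. You can distinguish leading bytes from continuation bytes by checking the bit patterns.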
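
The steps above can be sketched roughly as follows. This is only an illustration under some assumptions, not the answerer's own code: it targets Python 3, where the repr of a bytes object carries a leading b (the 2013 question was most likely about Python 2, where it does not), and the names truncate_for_repr and trim_partial_utf8 are made up for this example. Instead of looking only at the last 3 characters, it uses a regular expression that also checks whether the trailing backslash is itself escaped.

```python
import ast
import re

# An escape sequence cut off at the end of the text: a backslash that is not
# itself escaped, optionally followed by "x" and at most one hex digit.
_DANGLING_ESCAPE = re.compile(r"(?<!\\)((?:\\\\)*)\\(?:x[0-9a-fA-F]?)?\Z")


def trim_partial_utf8(chunk: bytes) -> bytes:
    """Drop a trailing UTF-8 sequence that was cut in the middle."""
    i = len(chunk)
    # Walk back over at most 3 continuation bytes (bit pattern 10xxxxxx).
    while i > 0 and len(chunk) - i < 3 and (chunk[i - 1] & 0xC0) == 0x80:
        i -= 1
    if i == 0:
        return chunk  # no leading byte found; leave the input alone
    lead = chunk[i - 1]
    if lead < 0x80:
        need = 1  # 0xxxxxxx: ASCII
    elif lead >> 5 == 0b110:
        need = 2  # 110xxxxx
    elif lead >> 4 == 0b1110:
        need = 3  # 1110xxxx
    elif lead >> 3 == 0b11110:
        need = 4  # 11110xxx
    else:
        return chunk  # not valid UTF-8 here; leave the input alone
    have = len(chunk) - (i - 1)
    return chunk if have >= need else chunk[: i - 1]


def truncate_for_repr(data: bytes, n: int) -> bytes:
    """Return a prefix of UTF-8 bytes `data` with len(repr(prefix)) <= n."""
    if n < 3:
        raise ValueError("repr of empty bytes, b'', already has length 3")
    full = repr(data)
    if len(full) <= n:
        return data
    quote = full[-1]  # repr may use ' or "
    text = full[: n - 1]  # leave room for the closing quote
    text = _DANGLING_ESCAPE.sub(r"\1", text)  # drop a broken escape
    chunk = ast.literal_eval(text + quote)  # back to bytes
    return trim_partial_utf8(chunk)


word = u"this is a ship, and some other words".encode("utf-8")
substring = truncate_for_repr(word, 10)
assert len(repr(substring)) <= 10
```

ast.literal_eval is used here in place of a plain eval so that only a literal is ever evaluated. The result is always a prefix of the input, so if a substring from another position is needed, slice first and then truncate.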

