
Issue with removing newline characters from a Unicode string in Python

I have a piece of Unicode text. I wanted to remove all newline characters from the text before printing the output. My code looks something like this:

input_string = u'\u3010JK\u3011\u9032\u5b66\u306b\u56f0\u3063\u305f\uff2a\uff2b\u304c\u5148\u751f\u306b\u52a9\u3051\u3066\u3082\u3089\u3046\u305f\u3081\u306b\uff33\uff25\uff38\uff01.mov'
output_string = ' '.join(input_string.splitlines())
print output_string

I was hoping the library method would take care of all the awkward Unicode newline cases. However, it looks like this method doesn't remove the newline character from the given input. Please suggest a way to remove newline characters from this input.

According to my Python, your string contains no characters of category Cc (control character):

>>> import unicodedata
>>> unicodedata.category(u'\n') in map(unicodedata.category, input_string)
False

so there is no newline in this string. unicodedata.name confirms this:

>>> for c in input_string: print unicodedata.name(c)
... 
LEFT BLACK LENTICULAR BRACKET
LATIN CAPITAL LETTER J
LATIN CAPITAL LETTER K
RIGHT BLACK LENTICULAR BRACKET
CJK UNIFIED IDEOGRAPH-9032
CJK UNIFIED IDEOGRAPH-5B66
HIRAGANA LETTER NI
CJK UNIFIED IDEOGRAPH-56F0
HIRAGANA LETTER SMALL TU
HIRAGANA LETTER TA
FULLWIDTH LATIN CAPITAL LETTER J
FULLWIDTH LATIN CAPITAL LETTER K
HIRAGANA LETTER GA
CJK UNIFIED IDEOGRAPH-5148
CJK UNIFIED IDEOGRAPH-751F
HIRAGANA LETTER NI
CJK UNIFIED IDEOGRAPH-52A9
HIRAGANA LETTER KE
HIRAGANA LETTER TE
HIRAGANA LETTER MO
HIRAGANA LETTER RA
HIRAGANA LETTER U
HIRAGANA LETTER TA
HIRAGANA LETTER ME
HIRAGANA LETTER NI
FULLWIDTH LATIN CAPITAL LETTER S
FULLWIDTH LATIN CAPITAL LETTER E
FULLWIDTH LATIN CAPITAL LETTER X
FULLWIDTH EXCLAMATION MARK
FULL STOP
LATIN SMALL LETTER M
LATIN SMALL LETTER O
LATIN SMALL LETTER V

There are no newlines or anything like newlines in this string. It has 33 characters, and all of them are printable; none is a control or formatting character.
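If you want to check this kind of thing yourself, here is a rough sketch (Python 3 syntax). The helper names `find_line_breaks` and `join_lines` and the `sample` string are made up for illustration; `LINE_BREAKS` is the set of line boundaries that the Python 3 documentation lists for `str.splitlines()`.

```python
# Characters that str.splitlines() treats as line boundaries
# (per the Python 3 documentation for str.splitlines).
LINE_BREAKS = u'\n\r\x0b\x0c\x1c\x1d\x1e\x85\u2028\u2029'


def find_line_breaks(text):
    """Return (index, code point) pairs for each line-boundary character."""
    return [(i, u'U+%04X' % ord(c)) for i, c in enumerate(text) if c in LINE_BREAKS]


def join_lines(text, sep=u' '):
    """Replace every line boundary with sep, as in the question's code."""
    return sep.join(text.splitlines())


# The string from the question: no line boundaries, so nothing changes.
input_string = (u'\u3010JK\u3011\u9032\u5b66\u306b\u56f0\u3063\u305f\uff2a\uff2b'
                u'\u304c\u5148\u751f\u306b\u52a9\u3051\u3066\u3082\u3089\u3046'
                u'\u305f\u3081\u306b\uff33\uff25\uff38\uff01.mov')
print(find_line_breaks(input_string))
print(join_lines(input_string) == input_string)

# A made-up string that really does contain line boundaries.
sample = u'first\nsecond\r\nthird\u2028fourth'
print(find_line_breaks(sample))
print(join_lines(sample))
```

For a string that does contain line boundaries, the `' '.join(text.splitlines())` idiom from the question does the job, so the code itself was not the problem.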

Maybe you're confused by the fact that the print statement appends a newline to its output, unlike sys.stdout.write?
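To see that difference, here is a small sketch in Python 3 syntax, where `print` is a function. The `capture` helper and the `text` value are hypothetical and exist only to collect what each call sends to standard output.

```python
import contextlib
import io
import sys


def capture(func, text):
    """Run func(text) with stdout redirected and return what was written."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        func(text)
    return buf.getvalue()


text = u'no newline in here'

# print appends a trailing newline after the text.
via_print = capture(print, text)

# sys.stdout.write emits exactly the characters it is given.
# The lambda looks up sys.stdout at call time, so the redirect applies.
via_write = capture(lambda t: sys.stdout.write(t), text)

print(repr(via_print))
print(repr(via_write))
```

The trailing newline comes from `print`, not from the string; in Python 3 you can suppress it with `print(text, end='')`.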
