
How to get rid of \n and \r in a string using Python

I have written a Python (2.7) program to retrieve data from a database table and copy it into a CSV file. Some of the data is in a non-printable format (unicode) and contains \n and \r. Because of the \n and \r, I am not able to retrieve the data as it appears in the table.

I have tried the following:

str.replace('\n','').replace('\r',' ')
str.replace('\n','\\n').replace('\r', '\\r')

but it did not work.

CSV code:

import csv

# cur is assumed to be an already-open database cursor
count = 0
cur.execute('select * from db.table_name')
with open('test.csv', 'w') as csv_file:
    csv_writer = csv.writer(csv_file)
    for row in cur:
        print "row = ", count
        count = count + 1
        newrow = []
        for index in range(0, len(row)):
            value = row[index]
            # only plain byte strings are cleaned here
            if type(row[index]) is str:
                value = row[index].replace("\n", " ").replace("\r", " ")
            newrow.append(value)
        csv_writer.writerow(newrow)

str.replace() returns a new string, so you have to assign the result back to the original variable for the change to take effect:

s = s.replace('\n','').replace('\r','')
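
A minimal sketch of why the assignment matters. It is written for Python 3 (the question uses 2.7, where strings are immutable in the same way), and the variable names are illustrative only:

```python
# Strings are immutable: replace() builds a new string and leaves the original alone.
original = "line one\r\nline two\n"

# The result of this call is discarded, so nothing changes.
original.replace("\r", "").replace("\n", "")
unchanged = original

# Keeping the return value is what actually gives the cleaned text.
cleaned = original.replace("\r", "").replace("\n", "")

print(repr(unchanged))
print(repr(cleaned))
```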

Unicode has external serialized representations such as UTF-8 and UTF-16, and language-dependent internal implementations such as WCHAR. Your database read appears to have given you a UTF-16 serialized version of the string, and all you have to do is decode it. You certainly don't want to remove the \r and \n, because they are part of the multi-byte sequence and are not really carriage returns or newlines at all.

As a simple example, I can remove all the database and looping stuff and just work with the string you posted:

>>> value = '\r\xaeJ\x92>J\xe7\x1d\n\x89`\xc6\xf8\x9c<\x18'
>>> decoded = value.decode('UTF-16')
>>> print repr(decoded)
u'\uae0d\u924a\u4a3e\u1de7\u890a\uc660\u9cf8\u183c'
>>> print decoded
긍鉊䨾ᷧ褊왠鳸ᠼ
>>> 
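
For readers on Python 3, a rough equivalent of the session above might look like the following. The explicit little-endian codec name is an assumption chosen to match the output shown; the real byte order depends on how the database stored the value. The second half illustrates why deleting the \r and \n bytes first would corrupt the text.

```python
# The same bytes as in the session above, written as a Python 3 bytes literal.
raw = b'\r\xaeJ\x92>J\xe7\x1d\n\x89`\xc6\xf8\x9c<\x18'

# Assumption: the data is little-endian UTF-16 without a byte order mark.
decoded = raw.decode("utf-16-le")
print(ascii(decoded))

# Removing the 0x0d / 0x0a bytes before decoding shifts every following
# byte pair, so the decoded text is no longer the same.
damaged_bytes = raw.replace(b"\r", b"").replace(b"\n", b"")
damaged = damaged_bytes.decode("utf-16-le")
print(decoded == damaged)
```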

You can use a regular expression to simplify your code.

For example:

import re
s = "Salut \n Comment ca va ?"
s = re.sub("\n|\r|\t", "",  s)

print(s)

The output will be:

Salut  Comment ca va ?
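
As a variation on the same idea, the alternation can be written as a character class and compiled once for reuse. This is only a sketch; the helper name `strip_control_chars` and its `replacement` parameter are made up for illustration:

```python
import re

# A character class matches any single one of the listed control characters.
CONTROL_CHARS = re.compile(r"[\n\r\t]")

def strip_control_chars(text, replacement=""):
    """Return text with every newline, carriage return and tab replaced."""
    return CONTROL_CHARS.sub(replacement, text)

sample = "Salut \n Comment ca va ?"
removed = strip_control_chars(sample)      # control characters deleted
spaced = strip_control_chars(sample, " ")  # control characters turned into spaces

print(repr(removed))
print(repr(spaced))
```

Passing a space as the replacement keeps words from running together when a line break was the only separator between them.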

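You can do this by adding .strip() to the end of the input, for example n = input().strip(). Note that this removes leading and trailing whitespace only (including \r and \n); it does not remove those characters from the middle of the string.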
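
A short sketch of what .strip() does and does not remove, compared with chained replace() calls. It uses a hard-coded sample string instead of input() so that it runs without user interaction:

```python
raw_line = "  first\r\nsecond\r\n"

# strip() trims whitespace (spaces, \r, \n, \t) from both ends only.
stripped = raw_line.strip()

# replace() removes the characters wherever they occur, but leaves spaces alone.
replaced = raw_line.replace("\r", "").replace("\n", "")

print(repr(stripped))
print(repr(replaced))
```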
