
Python script writes gibberish to file

Here's a script I am using to receive syslog and append it to a text file:

# Receives packets on udp port 514 and
# writes to syslog.txt

from socket import *

# Set the socket parameters
host = "myhost"
port = 514
buf = 1024
addr = (host,port)

# Create socket and bind to address
UDPSock = socket(AF_INET,SOCK_DGRAM)
UDPSock.bind(addr)

# Receive messages
while 1:
    data,addr = UDPSock.recvfrom(buf)
    if not data:
        print "Client has exited!"
        break
    else:
        print "\nReceived message '", data,"'"

        # Append to the file, creating it if it does not exist yet.
        with open("C:\\syslog.txt", "a") as myfile:
            myfile.write(str(data))

# Close socket
UDPSock.close()

The script works fine and the text is appended to the file. I can see it and it reads well. However, the moment I close Python, the data in the txt file turns into gibberish. Any ideas why? Am I supposed to do something else before appending socket data to a file?

Thanks.

You're not parsing the syslog packets. Syslog is a protocol; it's not just plain text. Protocol data characters are most likely ending up in your file, which may be tripping up some automatic character-encoding detection.

This might do directly what you are trying to accomplish (parse the syslog protocol and dump it): http://pypi.python.org/pypi/loggerglue/0.9
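As a rough illustration of what parsing means here, the sketch below splits off the leading priority field before anything is written to disk. It assumes the sender uses the classic BSD syslog format (RFC 3164), where each message starts with `<PRI>` and PRI = facility * 8 + severity. The names `parse_syslog` and `PRI_RE` are made up for this example, and decoding the payload as UTF-8 is an assumption about the sender. This is not a complete syslog parser and not how loggerglue works internally.

```python
import re

# Matches the leading "<PRI>" field of an RFC 3164-style message.
PRI_RE = re.compile(r"^<(\d{1,3})>")


def parse_syslog(packet):
    """Decode a raw UDP payload and split off the priority field.

    Returns (facility, severity, message). facility and severity are
    None when the payload does not start with a <PRI> field.
    """
    # Assumption: the payload is UTF-8 text. errors="replace" swaps any
    # undecodable bytes for U+FFFD instead of raising.
    text = packet.decode("utf-8", errors="replace").rstrip("\x00\r\n")
    match = PRI_RE.match(text)
    if match is None:
        return None, None, text
    pri = int(match.group(1))
    # PRI encodes both values: facility * 8 + severity.
    return pri // 8, pri % 8, text[match.end():]
```

In the receive loop from the question, the write could then use the third element of `parse_syslog(data)` instead of `str(data)`, so that only decoded message text, with trailing NUL bytes and line endings stripped, reaches the file.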

I was going to suggest doing open("C:\\syslog.txt", "at") instead of open("C:\\syslog.txt", "a"), but on re-reading the Python docs, text is the default (unlike with C, where my memory says that binary is the default, which leads to issues when running on Windows).

My other suggestion would be to put a plain-text header at the top of the file when you first create it. I'm not sure what you're using to read the file afterwards, but Notepad and WordPad use some heuristics to figure out whether UTF-8 or another encoding is being used, and I've definitely seen cases where this fails badly. (Search for "wordpad BOM guess".)
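A minimal sketch of the header idea, assuming Python 3 and assuming UTF-8 is the encoding you want on disk. The function name `append_line` and the header text are hypothetical, chosen only for this example. Whether a header line is enough to steer a particular editor's encoding guess is not guaranteed.

```python
import os


def append_line(path, text, header="# syslog capture (UTF-8 text)"):
    """Append one line of text, writing a plain-text header first
    if the file does not exist yet or is empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    # An explicit encoding avoids depending on the platform default.
    with open(path, "a", encoding="utf-8", newline="\n") as f:
        if is_new:
            f.write(header + "\n")
        # Normalize the line ending so every record ends with one "\n".
        f.write(text.rstrip("\r\n") + "\n")
```

Called once per received message, this keeps the first bytes of the file as plain ASCII text, which is the property the suggestion above relies on.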


 