Python socket receives inconsistent messages from server

Question

I am very new to networking, and I am using Python's socket library to connect to a server that transmits a stream of location data.

Here is the code I used.

import socket

BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('gump.gatech.edu', 756))

try:
    while (1):
        data = s.recv(BUFFER_SIZE).decode('utf-8')
        print(data)
except KeyboardInterrupt:
    s.close()

The issue is that the data arrives in inconsistent forms.

Most of the time it arrives in the correct form, like this:

2016-01-21 22:40:07,441,-84.404153,33.778685,5,3

Yet other times it arrives split across two lines, like so:

2016-01-21

22:40:07,404,-84.396004,33.778085,0,0

The interesting thing is that when I establish a raw connection to the server using Putty, I only ever get the correct form and never the split one. So I imagine that either something is splitting the message, or Putty is doing something to always assemble it correctly.

What I need is for the variable data to always contain a complete line. Any idea how to accomplish this?

Answer 1

It is best to think of a socket as a continuous stream of data that may arrive in dribs and drabs, or in a flood.

In particular, it is the receiver's job to break the data up into the "records" it consists of; the socket does not magically know how to do this for you. Here the records are lines, so you must read the data and split it into lines yourself.

You cannot guarantee that a single recv will return a single full line. It could be:

just part of a line;
or several lines;
or, most probably, several lines and part of another line.

Try something like this (untested):

# we'll use this to collate partial data
data = ""

while 1:
    # receive the next batch of data
    chunk = s.recv(BUFFER_SIZE)
    if not chunk:
        # an empty result means the server closed the connection
        break
    data += chunk.decode('utf-8')

    # split the data into lines
    lines = data.splitlines(keepends=True)

    # the last of these may be a part line
    full_lines, last_line = lines[:-1], lines[-1]

    # print (or do something else!) with the full lines
    for l in full_lines:
        print(l, end="")

    # was the last line received a full line, or just half a line?
    if last_line.endswith("\n"):
        # print it (or do something else!)
        print(last_line, end="")

        # and reset our partial data to nothing
        data = ""
    else:
        # reset our partial data to this part line
        data = last_line
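
The same idea can be packaged as a reusable generator. The following is only a sketch, not part of the original answer: the name iter_lines and its bufsize parameter are made up for illustration, and it assumes records are terminated by "\n" (optionally preceded by "\r") and encoded as UTF-8. It buffers raw bytes and decodes only complete lines, so a multi-byte character split across two recv calls cannot break the decoding.

```python
import socket


def iter_lines(sock, bufsize=1024):
    """Yield complete lines (without the line terminator) from a stream socket.

    Hypothetical helper: assumes "\n"-terminated, UTF-8 encoded records.
    """
    buffer = b""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            # The peer closed the connection; emit any unterminated remainder.
            if buffer:
                yield buffer.decode("utf-8")
            return
        buffer += chunk
        # Everything before the last b"\n" is complete; the rest stays buffered.
        *complete, buffer = buffer.split(b"\n")
        for line in complete:
            yield line.rstrip(b"\r").decode("utf-8")


if __name__ == "__main__":
    # Local demonstration: socketpair() stands in for the real server.
    server, client = socket.socketpair()
    server.sendall(b"2016-01-21 22:40:07,441,-84.404153,33.778685,5,3\n2016-01-21 ")
    server.sendall(b"22:40:07,404,-84.396004,33.778085,0,0\n")
    server.close()
    for line in iter_lines(client, bufsize=16):
        print(line)
    client.close()
```

With the socket s from the question, the loop would become for line in iter_lines(s): and each line would be one full record, regardless of how the bytes were chunked on the wire.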

Answer 2

The easiest way to fix your code is to print the received data without adding a newline, which both the print statement (Python 2) and the print() function (Python 3) add by default. Like this:

Python 2:

print data,

Python 3:

print(data, end='')

Now print will not add its own newline character to the end of each printed value, and only the newlines present in the received data will be printed. The result is that each line is printed intact, no matter how much data each socket.recv() call returns. For example:

from __future__ import print_function
import socket

s = socket.socket()
s.connect(('gump.gatech.edu', 756))

while True:
    data = s.recv(3).decode('utf8')
    if not data:
        break    # socket closed, all data read
    print(data, end='')

Here I have used a very small buffer size of 3, which helps to highlight the problem.
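
The effect can also be reproduced without a network connection. The sketch below is illustrative only: it cuts one sample line from the question into 3-character pieces, as recv(3) might deliver it, and compares the two ways of printing. The io.StringIO targets are there just to capture the output so it can be checked.

```python
import io

line = "2016-01-21 22:40:07,441,-84.404153,33.778685,5,3\n"
# Cut the line into 3-character pieces, as recv(3) might deliver it.
chunks = [line[i:i + 3] for i in range(0, len(line), 3)]

with_default_end = io.StringIO()
without_extra_newline = io.StringIO()
for chunk in chunks:
    # Default end="\n": an extra newline is appended after every chunk.
    print(chunk, file=with_default_end)
    # end="": the chunk is written exactly as received.
    print(chunk, end="", file=without_extra_newline)

# Only the end="" variant reproduces the original line.
assert without_extra_newline.getvalue() == line
assert with_default_end.getvalue().count("\n") == len(chunks) + 1
```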

Note that this only fixes the problem from the point of view of printing the data. If you want to process the data line by line, you need to buffer the incoming data yourself and process a line when you receive a newline or when the socket is closed.
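
One way to get that buffering without writing it by hand is the standard socket.makefile() method, which wraps a blocking socket in a file object that can be iterated line by line. This is a minimal sketch rather than part of the original answer; the helper name read_lines is made up, and it assumes the stream is UTF-8 text.

```python
import socket


def read_lines(sock):
    """Yield lines from a connected, blocking stream socket.

    Hypothetical helper: relies on the file object returned by makefile()
    to do the buffering and line splitting.
    """
    with sock.makefile("r", encoding="utf-8") as reader:
        for line in reader:
            # Each item ends at a newline, except possibly the last one
            # if the peer closed the connection mid-line.
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Local demonstration: socketpair() stands in for the real server.
    server, client = socket.socketpair()
    server.sendall(b"2016-01-21 22:40:07,441,")
    server.sendall(b"-84.404153,33.778685,5,3\n")
    server.close()
    for line in read_lines(client):
        print(line)
    client.close()
```

Closing the file object returned by makefile() does not close the underlying socket, so the caller still needs to call close() on the socket itself.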

Answer 3

Edit: socket.recv() is blocking and, as the others said, you won't get exactly one line each time you call it. The socket waits for data, takes whatever is available, and then returns. When you print this, the default end argument of Python's print() can give you more newlines than you expected. So to get the raw output from your server, use this:

import socket

BUFFER_SIZE = 1024
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('gump.gatech.edu', 756))
try:
    while True:
        data = s.recv(BUFFER_SIZE).decode('utf-8')
        if not data:
            break
        print(data, end="")
except KeyboardInterrupt:
    s.close()

3 answers

Answer 1: score 1, accepted, 2016-01-21 23:05:06

Answer 2: score 1, 2016-01-21 23:44:59

Answer 3: score -2, 2016-01-21 23:00:18