Receiving end of socket splits data when printed

Question

So while programming sockets using Java and Python, I stumbled upon something weird.

When sending a message using Java to the receiving end of the Python socket, it splits the message into 2 parts, even though this was not intended.

I probably made a mistake somewhere that's causing this problem, but I really don't know what it is.

You can see that Java sends "Test1" in one command and Python only receives parts of that message:

http://i.imgur.com/tbwa7C5.png

Python server socket source:

'''
Created on 23 okt. 2014

@author: Rano
'''

#import serial
import socket

HOST = ''
PORT = 1234
running = True;

skt = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
skt.bind((HOST, PORT))
skt.listen(1)
conne, addr = skt.accept()

#ser = serial.Serial('/dev/tty.usbmodem411', 9600)

while running == True:
    data = conne.recv(1024)

    if(data == "quit"):
        running = False
        break

    rawrecvstring = data + ""
    recvstring = rawrecvstring.split("|")
    print(recvstring[0])

#_______________________ABOVE IS RECEIVE_______________UNDER IS SEND_______________________#    

#  sendstring = ser.readline()
#   if sendstring != "":
#       conne.sendall(sendstring)


conne.close()
#ser.close()

And the Java Socket send function:

private String message;
private DataOutputStream out;
private BufferedReader in;
private Socket socket;
private boolean socketOnline;

public SocketModule(String IP, int Port){
    try {
        socket = new Socket(IP, Port);
        out = new DataOutputStream(socket.getOutputStream());
        in = new BufferedReader(new InputStreamReader(socket.getInputStream()));   
    } catch (UnknownHostException e) {
        e.printStackTrace();
    } catch (IOException e) {
        e.printStackTrace();
    }
};

void setMessage(String s){
    try {
        out.writeBytes(s);
        out.flush();
        System.out.println("message '" + s + "' sent!\n");
    } catch (IOException e) {
        e.printStackTrace();
    }
};

Any ideas as to why the message is being split?

Answer 1

TCP is a stream protocol, not a message protocol.

As far as TCP is concerned, s.send("abc"); s.send("def") is exactly the same thing as s.send("abcdef"). At the other end of the socket, when you go to receive, it may return as soon as the first send arrives and give you "abc", but it could just as easily return "abcdef", or "a", or "abcd". They're all perfectly legal, and your code has to be able to deal with all of them.
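To see this for yourself, here is a minimal sketch in Python 3. It assumes socket.socketpair() as a stand-in for a real TCP connection (it gives a connected pair of stream sockets with the same byte-stream semantics), and the function name send_twice_and_collect is made up for illustration. The chunk boundaries you get back are not guaranteed; only the concatenated bytes are.

```python
import socket

def send_twice_and_collect():
    """Send b"abc" and b"def" as two separate calls, then read until EOF."""
    a, b = socket.socketpair()  # connected pair of stream sockets
    with a, b:
        a.sendall(b"abc")
        a.sendall(b"def")
        a.shutdown(socket.SHUT_WR)  # tell the reader no more data is coming
        chunks = []
        while True:
            # A deliberately small buffer: the reads need not line up
            # with the two sends in any way.
            data = b.recv(4)
            if not data:  # empty result means the peer closed its side
                break
            chunks.append(data)
    return chunks

if __name__ == "__main__":
    chunks = send_twice_and_collect()
    print(chunks)            # boundaries may differ from the sends
    print(b"".join(chunks))  # the byte sequence itself is preserved
```

The only thing the receiver can rely on is the order and content of the bytes, not how they were grouped when sent.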

If you want to process entire messages separately, it's up to you to build a protocol that delineates messages, whether that means using some separator that can't appear in the actual data (possibly because, if it does appear in the actual data, you escape it), or length-prefixing each message, or using some self-delineating format like JSON.
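As an illustration of the length-prefixing option, here is a rough Python 3 sketch. The 4-byte big-endian header and the helper names send_message, recv_exact and recv_message are assumptions chosen for this example, not anything from the original code.

```python
import socket
import struct

# Assumed wire format: a 4-byte big-endian unsigned length, then the payload.
HEADER = struct.Struct("!I")

def send_message(sock, payload):
    """Send one length-prefixed message (payload is bytes)."""
    sock.sendall(HEADER.pack(len(payload)) + payload)

def recv_exact(sock, n):
    """Keep calling recv() until exactly n bytes have arrived."""
    buf = bytearray()
    while len(buf) < n:
        data = sock.recv(n - len(buf))
        if not data:
            raise ConnectionError("socket closed in the middle of a message")
        buf += data
    return bytes(buf)

def recv_message(sock):
    """Read the header, then read exactly as many payload bytes as it announces."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)
```

If you went this way, the Java side would have to write the same header before each message; DataOutputStream.writeInt writes a 4-byte big-endian int, which should line up with the "!I" format assumed here.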

It looks like you're part-way to building such a thing, because you've got that split('|') for some reason. But you still need to add the rest of it: loop around receiving bytes, adding them to a buffer, splitting any complete messages off the buffer to process them, and holding any incomplete message at the end for the next loop. And, of course, sending the | separators on the other side.

For example, your Java code can do this:

out.writeBytes(s + "|");

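Then, on the Python side:

buf = ""  # Python 2 str; on Python 3 use b"" and split(b"|"), since recv() returns bytes
while True:
    # Read whatever has arrived so far (up to 1024 bytes): this may be part of
    # a message, exactly one message, or several messages.
    data = conne.recv(1024)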
    if not data:
        # socket closed
        if buf:
            # but we still had a leftover message
            process_message(buf)
        break
    buf += data
    pieces = buf.split("|")
    buf = pieces.pop()
    for piece in pieces:
        process_message(piece)

That process_message function can handle the special "quit" message, print out anything else, whatever you want. (And if it's simple enough, you can inline it into the two places it's called.)
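If you want to check the buffering logic without a live connection, one possible sketch (Python 3) is to pull it into a small class. MessageSplitter is a hypothetical name, the process_message body is only a guess at what you might want, and decoding as ASCII assumes the Java side only sends ASCII text.

```python
class MessageSplitter:
    """Accumulates received bytes and returns complete '|'-terminated messages."""

    def __init__(self, separator=b"|"):
        self.separator = separator
        self.buf = b""

    def feed(self, data):
        """Add newly received bytes; return the list of messages completed so far."""
        self.buf += data
        pieces = self.buf.split(self.separator)
        # The last piece is either an incomplete message or empty; keep it
        # in the buffer until the rest of it arrives.
        self.buf = pieces.pop()
        return [piece.decode("ascii") for piece in pieces]


def process_message(msg):
    """Hypothetical handler: returns False when the peer asks to quit."""
    if msg == "quit":
        return False
    print(msg)
    return True
```

In the receive loop you would call feed(data) after each recv() and pass every returned message to process_message, stopping when it returns False.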

From a comment, it sounds like you wanted to use that | to separate fields within each message, not to separate messages. If so, just pick another character that will never appear in your data and use that in place of | above (and then do the msg.split('|') inside process_message). One really nice option is \n, because then (on the Python side) you can use socket.makefile, which gives you a file-like object that does the buffering for you and just yields lines one by one when you iterate it (or call readline on it, if you prefer).
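A minimal sketch of the makefile approach in Python 3 might look like this. It assumes the Java side ends each message with "\n" (for example out.writeBytes(s + "\n")) and sends ASCII only; read_lines is a made-up helper name.

```python
import socket

def read_lines(conn):
    """Yield newline-terminated messages from a connected socket, newline stripped."""
    # makefile() wraps the socket in a buffered file-like object, so the
    # partial-message bookkeeping is handled for us.
    with conn.makefile("r", encoding="ascii", newline="\n") as f:
        for line in f:
            yield line.rstrip("\n")

def handle_connection(conn):
    """Split each line into '|'-separated fields; stop on 'quit'."""
    for msg in read_lines(conn):
        if msg == "quit":
            break
        fields = msg.split("|")
        print(fields[0])
```

Closing the file object returned by makefile does not close the socket itself, so conne.close() is still needed afterwards.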

For more detail on this, see Sockets are byte streams, not message streams.

As a side note, I also removed the running flag, because the only time you're ever going to set it, you're also going to break, so it's not doing any good. (But if you are going to test a flag, just use while running:, not while running == True:.)

Answer 1
1, accepted, 2014-10-29 22:46:46