
urllib2.urlopen(url).read() times out in an OS X 10.11 virtual machine (VMware Workstation 12 Pro)

I have an HTTP resource whose size is 3 GB.

I have some code like the snippet below.

# the url is actually an HTTP resource which is 3 GB
import urllib2

res = urllib2.urlopen(url, timeout=10)
data = res.read(1024)
while data:
    data = res.read(1024)

In VMware Workstation 11 or below, it works fine. But in VMware Workstation 12, it gives me the error below.

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 384, in read
    data = self._sock.recv(left)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/httplib.py", line 612, in read
    s = self.fp.read(amt)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 384, in read
    data = self._sock.recv(left)
socket.timeout: timed out

If I use Safari to download the resource in VMware Workstation 12, it works fine. And if the resource is smaller than a certain size, such as 10 KB, it also works fine.

They fixed it in VMware Fusion 8.5.7! See https://communities.vmware.com/thread/544049

I can't really provide you with an answer right now, and what I have to say is a bit longer than a comment, but I'm experiencing a similar issue in VMware Fusion Pro 8.5 on 10.12 with Python's urllib2. It has nothing to do with urllib2.

I started hitting this issue randomly during transfer sessions and, after some Wireshark debugging, determined that it was due to the TCP window reaching 0 on the receiver. For some reason, it never updates again.

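If you want to spot the same thing in your own capture, Wireshark marks these segments explicitly; a display filter along these lines (assuming you capture on the interface carrying the transfer) will surface them:

tcp.analysis.zero_window
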
If you don't know what a TCP window is, it's basically the size of the receive buffer on one end of a TCP connection. That buffer should grow and shrink as a flow-control mechanism during normal transfer, but what shouldn't happen is the window getting stuck at 0.

The reason your sessions work for transfers under 10 KB is that the default TCP window is usually about 8 KB. Anything less than that and you won't even fill up the receive buffer. Anything more and you're basically hoping you process the data faster than you receive it.

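If you're curious what your end's receive buffer actually is, you can ask the socket itself; here's a minimal sketch that just prints the default SO_RCVBUF of a fresh TCP socket (the ~8 KB figure above is only a typical default, and the real value depends on the OS and its sysctl settings):

/* rcvbuf.c -- print the default receive buffer size of a fresh TCP socket */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>

int main(void)
{
    int sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0) {
        perror("socket");
        exit(1);
    }

    int rcvbuf = 0;
    socklen_t len = sizeof(rcvbuf);
    /* SO_RCVBUF is the kernel receive buffer backing the advertised TCP window */
    if (getsockopt(sockfd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, &len) < 0) {
        perror("getsockopt");
        exit(1);
    }

    printf("default SO_RCVBUF: %d bytes\n", rcvbuf);
    close(sockfd);
    return 0;
}
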
To reproduce the issue on my local machine, here are two [terribly] written C programs you can compile with cc client.c -o client and cc server.c -o server. Run the client in the VM and the server on your local machine.

server.c:

/* server.c */
/* A simple server in the internet domain using TCP
   The port number is passed as an argument */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

void error(const char *msg)
{
    perror(msg);
    exit(1);
}

int main(int argc, char *argv[])
{
    int sockfd, newsockfd, portno;
    socklen_t clilen;
    char buffer[1024];
    struct sockaddr_in serv_addr, cli_addr;
    int n, total;
    if (argc < 2) {
        fprintf(stderr,"ERROR, no port provided\n");
        exit(1);
    }
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
       error("ERROR opening socket");
    bzero((char *) &serv_addr, sizeof(serv_addr));
    portno = atoi(argv[1]);
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = INADDR_ANY;
    serv_addr.sin_port = htons(portno);
    if (bind(sockfd, (struct sockaddr *) &serv_addr,
            sizeof(serv_addr)) < 0)
        error("ERROR on binding");
    listen(sockfd,5);
    clilen = sizeof(cli_addr);
    newsockfd = accept(sockfd,
              (struct sockaddr *) &cli_addr,
              &clilen);
    if (newsockfd < 0)
        error("ERROR on accept");
    /* fill the buffer with a recognizable byte pattern */
    memset(buffer, 0xAB, sizeof(buffer));
    total = 0;
    /* write forever so the client's receive buffer eventually fills up */
    for (;;) {
        n = write(newsockfd, buffer, sizeof(buffer));
        if (n < 0) {
            error("ERROR writing to socket");
        } else {
            total += n;
            printf("wrote %d / %d\n", n, total);
        }
    }
    close(newsockfd);
    close(sockfd);
    return 0;
}

client.c:

/* client.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

void error(const char *msg)
{
    perror(msg);
    exit(0);
}

int main(int argc, char *argv[])
{
    fd_set set;
    int sockfd, portno, n, total, rv;
    struct sockaddr_in serv_addr;
    struct hostent *server;
    struct timeval timeout;

    char buffer[256];
    if (argc < 3) {
       fprintf(stderr,"usage %s hostname port\n", argv[0]);
       exit(0);
    }
    portno = atoi(argv[2]);
    sockfd = socket(AF_INET, SOCK_STREAM, 0);
    if (sockfd < 0)
        error("ERROR opening socket");
    server = gethostbyname(argv[1]);
    if (server == NULL) {
        fprintf(stderr,"ERROR, no such host\n");
        exit(0);
    }
    bzero((char *) &serv_addr, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    bcopy((char *)server->h_addr,
         (char *)&serv_addr.sin_addr.s_addr,
         server->h_length);
    serv_addr.sin_port = htons(portno);
    if (connect(sockfd,(struct sockaddr *) &serv_addr,sizeof(serv_addr)) < 0)
        error("ERROR connecting");
    bzero(buffer, 256);

    /* give the sender a one-second head start so the receive buffer fills up */
    sleep(1);

    total = 0;
    for (;;) {
        /* re-arm the fd_set and timeout on every iteration; select() may modify both */
        FD_ZERO(&set);
        FD_SET(sockfd, &set);
        timeout.tv_sec = 1;
        timeout.tv_usec = 0;
        rv = select(sockfd + 1, &set, NULL, NULL, &timeout);
        if (rv == -1) {
            perror("select\n");
        } else if(rv == 0) {
            printf("timeout\n");
            break;
        } else {
            n = read(sockfd, buffer, 256);
            if (n < 0)
                error("ERROR reading from socket");
            total += n;
            printf("read %d / %d\n", n, total);
        }
    }
    close(sockfd);
    return 0;
}

These programs are both taken directly from http://www.linuxhowtos.org/C_C++/socket.htm with modifications to report additional stats and to force a stall.
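
To run them, pick an unused port (5000 here is just an example) and point the client at whatever IP the host has on the VM's network (shown as <host-ip> below, a placeholder):

# on the host, outside the VM
./server 5000

# inside the VM
./client <host-ip> 5000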

Here is a screenshot from Wireshark demonstrating the TCP Window reducing to 0 and sticking:

[Screenshot: TCP Zero Window in Wireshark]

My current theory is that there is some kind of bug in the network stack on the VMware side of the client, but it's difficult to tell. So far I've tried three different virtual network interfaces (e1000, e1000e, vlance) and hit the same issue with each of them.

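In case it helps anyone reproducing this: the adapter type is controlled by the ethernet0.virtualDev entry in the VM's .vmx file, so switching between them (with the VM powered off) looks roughly like the line below; treat it as an illustration rather than a definitive list, since the device names VMware accepts depend on the product version and guest OS:

ethernet0.virtualDev = "e1000"

I simply swapped the value to "e1000e" or "vlance" for the other two runs.
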
I'm going to try various vmx options to reduce the likelihood that the issue occurs, but this is obviously a killer for a stable system, and my use case (virtualized Jenkins slaves for CI) simply can't tolerate this kind of bug.

I'll report back if I'm able to learn anything new.

EDIT: I filed a bug report on the VMware Communities board: https://communities.vmware.com/message/2648727

EDIT again: They fixed it in VMware Fusion 8.5.7! See the same link as above.
