64-bit Python not storing as long a string as 32-bit Python

I have two computers, both running 64-bit Windows 7. One machine has python 32-bit, one is running python 64-bit. Both machines have 8GB of RAM.

I'm using BeautifulSoup to scrape a webpage, but I've been running into issues on my 64-bit Python machine. I've narrowed it down to this: len(str(BeautifulSoup(requests.get('http://www.sampleurl.com').text))) returns only 92520 characters on 64-bit Python, but for the same static site it returns 135000 characters on my 32-bit Python machine.

At some point in the past on my python64-bit machine I had python32-bit, but uninstalled it to install python64-bit because I was having issues installing scipy using pip install (turns out that wasn't the issue).

Anyway, I'm unsure as to why my 64bit python machine isn't returning the entire html string and I was wondering if anyone can help me understand what is going on and how can I fix it.

This is not a 32-bit / 64-bit issue. You are most likely seeing a parser issue: one machine using lxml and the other using html.parser, for example.

Different parsers deal differently with broken HTML, and BeautifulSoup uses lxml by default only when it is installed; otherwise it falls back to another available parser such as html.parser.
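To see the effect on one machine, you can feed the same malformed markup to each parser and compare the resulting lengths. This is only a rough sketch: BROKEN_HTML and length_by_parser are made-up names for illustration, and the sample markup is invented, not taken from the page in the question.

```python
try:
    from bs4 import BeautifulSoup, FeatureNotFound
except ImportError:  # bs4 itself is not installed
    BeautifulSoup = None

# Deliberately malformed: a stray closing tag and several unclosed elements.
BROKEN_HTML = (
    "<html><body><p>one</p></p><div><p>two"
    "<table><tr><td>three</body></html>"
)


def length_by_parser(html, parsers=("html.parser", "lxml", "html5lib")):
    """Return {parser name: len(str(soup))}; None marks an unavailable parser."""
    if BeautifulSoup is None:
        return {name: None for name in parsers}
    lengths = {}
    for name in parsers:
        try:
            # Naming the parser explicitly removes the "whatever is installed" default.
            lengths[name] = len(str(BeautifulSoup(html, name)))
        except FeatureNotFound:
            lengths[name] = None
    return lengths


if __name__ == "__main__":
    for name, length in length_by_parser(BROKEN_HTML).items():
        print(name, length)
```

If the numbers differ between parsers, passing the parser name explicitly, for example BeautifulSoup(html, "html.parser"), on both machines should make the results match.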

Run import lxml on both machines to verify. When you replaced your Python installation on one machine with a 64-bit version, you likely didn't include a compatible lxml version.
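A small script along these lines could be run on both machines to compare the results side by side. It is a sketch only: can_import and environment_report are hypothetical helper names, and the module list is an assumption about which parsers might be involved.

```python
import importlib
import struct
import sys


def can_import(name):
    """True if the module actually imports (a broken binary build also fails here)."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


def environment_report(modules=("bs4", "lxml", "html5lib")):
    """Collect the interpreter version, pointer width, and parser availability."""
    report = {
        "python": sys.version.split()[0],
        "bits": struct.calcsize("P") * 8,  # 32 or 64, from the pointer size
    }
    for name in modules:
        report[name] = can_import(name)
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(key, value)
```

If lxml shows True on one machine and False on the other, that alone would explain the different string lengths.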
