
readlines function throwing an error when applied to a file obtained as a response from a webpage

I have the following code, which is almost the same as the code in my last question:

import sys , os
import requests, webbrowser,bs4
from PIL import Image
import pyautogui
from bs4 import BeautifulSoup
ab = "Ozil is the best"

ff = ab.find("zil")

print (ff) 
print( ab[1:len(ab)])


p = requests.get('http://www.goal.com/en-ie/news/ozil-agent-eviscerates-jealous-keown-over-stupid-comments/1javhtwzz72q113dnonn24mnr1')
j = "                                                                                                                                                          "
n = open("exml.txt" , 'wb')
for i in p.iter_content(1000) :
    n.write(i)


n.close()
n = open("exml.txt",'rb')
lis_lines = n.readlines()
#print (lis_lines[0])
#print(yy.encode("ascii"))
yy = lis_lines[0]
k = yy.find(".png")
#print(yy.decode("ascii"))
#yy = lis_lines[0].split(".png" , lis_lines[0].count(".png"))
#print(yy.encode("ascii"))
soupy= bs4.BeautifulSoup(p,"lxml")
#print(yy.encode("ascii"))
#print(yy)

What I intend to do is write a script that saves all the images on the webpage to my system.

In the script from my last question I set out to do this with the "select" method of BeautifulSoup.

However, I got stuck on some errors there, so I thought I would instead read the xml file, find every place where ".png" occurs, and from each one move back one character at a time until I reach "WWW". That way I would build a list of strings containing the links to the images on the webpage. Then, one by one, I would open these links with the webbrowser module, take a screenshot, and save it to some directory on my computer.

However, I am getting an error on the following line:

 k = yy.find(".png")

It states:

  File "C:\perl\webscratcher.py", line 27, in <module>
    k = yy.find(".png")
TypeError: a bytes-like object is required, not 'str'

I need to understand this error in depth. I think that because I am reading the file in binary mode, it expects the value being searched for to be bytes rather than a string. So how do I avoid this? I would like to understand the concept clearly.

Don't use rb mode when reading the file. Replace n = open("exml.txt",'rb') with n = open("exml.txt",'r'). In text mode, readlines() returns str objects rather than bytes, so yy.find(".png") works as written.
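To make the difference concrete, here is a minimal sketch of what the two file modes hand back. It does not touch the network: the sample bytes, the example.com URL and the temporary directory are stand-ins I made up for illustration, and I am assuming the saved page is UTF-8 encoded (the real page may use a different encoding, in which case the encoding argument has to change).

```python
import os
import tempfile

# Hypothetical stand-in for the downloaded page (the original script
# writes the chunks from p.iter_content() instead).
sample = b'<img src="http://www.example.com/logo.png">\n<p>Ozil</p>\n'

path = os.path.join(tempfile.mkdtemp(), "exml.txt")
with open(path, "wb") as f:
    f.write(sample)

# Binary mode: readlines() returns a list of bytes objects.
with open(path, "rb") as f:
    first_bytes = f.readlines()[0]

# Searching bytes with a str argument is what raises the TypeError.
raised_type_error = False
try:
    first_bytes.find(".png")
except TypeError:
    raised_type_error = True

# Option 1: stay in binary mode and search with a bytes literal.
pos_bytes = first_bytes.find(b".png")

# Option 2: open in text mode, so readlines() returns str objects.
# encoding="utf-8" is an assumption about the saved page.
with open(path, "r", encoding="utf-8") as f:
    first_text = f.readlines()[0]
pos_text = first_text.find(".png")

# Option 3: keep the bytes but decode them before searching.
pos_decoded = first_bytes.decode("utf-8").find(".png")

print(type(first_bytes).__name__, type(first_text).__name__)
print(raised_type_error, pos_bytes == pos_text == pos_decoded)
```

The rule of thumb is that bytes methods want bytes arguments and str methods want str arguments. Which option fits best depends on whether the rest of the script needs text or raw bytes.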

Btw, when posting questions on Stack Overflow, try to make your example as minimal as possible. For instance, remove the commented-out lines and use more descriptive variable names.
