
How to extract part of the data returned by urllib.urlopen()

I wrote a program that connects to this website:

http://mbox.dr-chuck.net/sakai.devel/1/2

I need to parse it and extract the email from that page.

import urllib

url = 'http://mbox.dr-chuck.net/sakai.devel/1/2'
data = urllib.urlopen(url).read()
for line in data:
    templine = line.strip()
    print templine

But it prints individual letters instead of whole lines. For example, when I try to print a particular line from it, I get:

F
r
o
m

n
e
w
s

How can I fix this? I need my program to print whole lines.

  • Sorry about my language; this is my first question.

If you are using Python 3, you can do something like this:

from urllib.request import urlopen

data = urlopen("http://mbox.dr-chuck.net/sakai.devel/1/2").read().decode("utf8").split("\n")

for k in data:
    print(k)
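
The reason the original loop printed one letter at a time is that read() returns the whole page as a single string, and iterating over a string yields characters rather than lines. A minimal offline sketch of the difference, where `sample` is a made-up stand-in for the downloaded text (an assumption, so no network access is needed):

```python
# Hypothetical stand-in for the text returned by read().decode("utf8").
sample = (
    "From news@gmane.org Tue Mar 04 03:33:20 2003\n"
    'From: "Glenn R. Golden" <ggolden@umich.edu>\n'
)

# Iterating over a str yields single characters, which is what the
# original loop was doing.
chars = [c for c in sample][:4]

# splitlines() yields whole lines without the trailing newline.
lines = sample.splitlines()

print(chars)
print(lines[1])
```

Splitting on "\n", as in the answer above, behaves similarly, except that it leaves an empty string at the end when the text finishes with a newline.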

Update:

If you want to print only the second line from the given url, you can do something like this:

print(data[1])
>>> 'From: "Glenn R. Golden" <ggolden@umich.edu>'

Otherwise, if you want to print all the lines that start with From or From:, you can do something like this:

for k in data:
    if k.split(" ")[0] == "From" or k.split(" ")[0] == "From:":
        print(k)

Output:

From news@gmane.org Tue Mar 04 03:33:20 2003
From: "Glenn R. Golden" <ggolden@umich.edu>

import urllib  # Python 2; in Python 3 use urllib.request.urlopen instead

url = 'http://mbox.dr-chuck.net/sakai.devel/1/2'
data = urllib.urlopen(url).readlines()  # list of lines, each ending in '\n'
for line in data:
    if line.startswith('From'):
        print(line)

Output:

From news@gmane.org Tue Mar 04 03:33:20 2003

From: "Glenn R. Golden" <ggolden@umich.edu>

Use readlines() to get each line of the response.

Use startswith() to select the lines that start with From.
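
Since the question asks for the email itself, one possible next step is to pull the address out of each From: line. The sketch below uses the standard library's email.utils.parseaddr. The function name `extract_senders` and the `sample_lines` list are hypothetical: the first two lines are copied from the output above, and the Subject line is invented for illustration.

```python
from email.utils import parseaddr


def extract_senders(lines):
    """Return the email addresses found on 'From:' header lines."""
    senders = []
    for line in lines:
        # The 'From ' envelope line has no colon, so it is skipped here.
        if line.startswith("From:"):
            # parseaddr splits 'Name <addr>' into a (name, addr) pair.
            _, addr = parseaddr(line[len("From:"):].strip())
            if addr:
                senders.append(addr)
    return senders


# Hypothetical sample; in practice these would be the decoded lines
# read from the URL.
sample_lines = [
    "From news@gmane.org Tue Mar 04 03:33:20 2003",
    'From: "Glenn R. Golden" <ggolden@umich.edu>',
    "Subject: made-up subject line",
]

print(extract_senders(sample_lines))
```

With real data, the lines would need to be decoded to str first (for example with .decode("utf8")), because urlopen in Python 3 returns bytes.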
