How to print a certain number of characters before a string in Python

Hi, I have a Python script that fetches a website, searches for strings inside certain tags, and prints them. After it runs, my screen looks like this: textidontwant textiwanthere.com. How can I search for the .com and print a fixed number of characters before it, so that only textiwanthere.com shows up instead of everything? Here is my code:

import urllib.request
import re
import os

url = "http://www.throwawaymail.com/"

request = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
sourcecode = urllib.request.urlopen(request).read()
output = sourcecode.decode("utf-8")

findemail = re.findall('>(.*?)</span>', str(output))

print(findemail)

os.system("pause")

I want to search "findemail" and print only the address, here phamepracl@throwam.com. The address is different every time, but its length is always the same. This is what my console shows:

['Toggle navigation', '', '', '', '', 'phamepracl@throwam.com']

Just print the last entry of the list

print(findemail[-1])

You could also assign this value to findemail if you don't want the other stuff:

findemail = re.findall('>(.*?)</span>', str(output))[-1]
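One caveat with indexing [-1] directly: if the page changes and the pattern matches nothing, re.findall returns an empty list and [-1] raises IndexError. A minimal sketch of a guarded version follows. The helper name last_span_text and the sample HTML strings are made up for illustration and are not from the original script.

```python
import re

def last_span_text(html):
    """Return the text of the last '>...</span>' match, or None if nothing matched."""
    matches = re.findall(r'>(.*?)</span>', html)
    # An empty list is falsy, so this avoids IndexError on [-1]
    return matches[-1] if matches else None

# Hypothetical sample input standing in for the downloaded page
sample = '<span>Toggle navigation</span><span id="email">phamepracl@throwam.com</span>'
print(last_span_text(sample))
print(last_span_text('<p>no spans here</p>'))
```

This still assumes the address is always the last span on the page, as in the console output shown in the question.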

This worked for me:

import urllib.request
import re
import os

url = "http://www.throwawaymail.com/"

request = urllib.request.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
sourcecode = urllib.request.urlopen(request).read()
output = sourcecode.decode("utf-8")

findemail = re.findall('>(.*?)</span>', str(output))

print(findemail[-1])

This is my solution:

for i in findemail:
    if '.com' in i:
        print(i)

Output:

hudininona@throwam.com
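To answer the question as literally asked (find .com and keep a number of characters before it), here is a rough sketch of two options: slicing a fixed count of characters before the marker, and using a regex to grab the whole whitespace-free token that ends in .com. The function names chars_before and token_ending_with_com are hypothetical, and the count of 13 simply matches the example string from the question.

```python
import re

def chars_before(text, marker, count):
    """Return `count` characters before `marker`, plus the marker itself."""
    idx = text.find(marker)
    if idx == -1:
        return None  # marker not present
    start = max(0, idx - count)  # do not slice past the start of the string
    return text[start:idx + len(marker)]

def token_ending_with_com(text):
    """Return the first run of non-space characters that ends in '.com'."""
    match = re.search(r'\S+\.com\b', text)
    return match.group(0) if match else None

line = 'textidontwant textiwanthere.com'
print(chars_before(line, '.com', 13))
print(token_ending_with_com(line))
```

The slicing version only works while the length really stays fixed; the regex version does not depend on the length, which is probably the safer choice if the site ever changes its address format.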
