
Writing text to a txt file in Python on new lines?

So I am trying to check whether a URL exists, and if it does I would like to write the URL to a file using Python. I would also like each URL to be on its own line within the file. Here is the code I already have:

import urllib2  

CREATE A BLANK TXT FILE ON THE DESKTOP

urlhere = "http://www.google.com"   
print "for url: " + urlhere + ":"  

try: 
    fileHandle = urllib2.urlopen(urlhere)
    data = fileHandle.read()
    fileHandle.close()
    print "It exists"

Then, if the URL does exist, write the URL on a new line in the text file.

except urllib2.URLError, e:
    print 'PAGE 404: It Doesnt Exist', e

If the URL doesn't exist, don't write anything to the file.


How about something like this:

import urllib2

url  = 'http://www.google.com'
data = ''

try:
    data = urllib2.urlopen(url).read()
except urllib2.URLError, e:
    # str(e) is needed: concatenating a str with the exception object raises TypeError
    data = 'PAGE 404: It Doesnt Exist ' + str(e)

with open('outfile.txt', 'w') as out_file:
   out_file.write(data)

The way you worded your question is a bit confusing, but if I understand you correctly, all you're trying to do is test whether a URL is valid using urllib2 and, if it is, write the URL to a file? If that is correct, the following should work.

import urllib2
f = open("url_file.txt","a+")
urlhere = "http://www.google.com"   
print "for url: " + urlhere + ":"  

try: 
    fileHandle = urllib2.urlopen(urlhere)
    data = fileHandle.read()
    fileHandle.close()
    f.write(urlhere + "\n")
    f.close()
    print "It exists"

except urllib2.URLError, e:
    print 'PAGE 404: It Doesnt Exist', e
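For readers on Python 3, where urllib2 no longer exists, the part of the script above that puts each URL on its own line might look roughly like this. This is only a sketch: the helper name append_url and the temporary demo file are hypothetical and not part of the original answer.

```python
import os
import tempfile


def append_url(path, url):
    """Append url to the file at path, on its own line (hypothetical helper)."""
    # Mode "a" creates the file if it is missing and always writes at the end,
    # so earlier URLs are kept. The trailing "\n" is what starts a new line.
    with open(path, "a", encoding="utf-8") as out_file:
        out_file.write(url + "\n")


if __name__ == "__main__":
    # Demo file in a temporary directory, used here only for illustration.
    demo_path = os.path.join(tempfile.mkdtemp(), "url_file.txt")
    append_url(demo_path, "http://www.google.com")
    append_url(demo_path, "http://example.com")
    with open(demo_path, encoding="utf-8") as in_file:
        print(in_file.read())
```

Using a with block closes the file even if the write fails, which the open()/close() pair above does not guarantee when the URL check raises.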

If you want to test multiple URLs without editing the Python script, you can use the following script by typing python python_script.py "http://url_here.com". This works through the sys module: sys.argv[1] is the first argument passed to python_script.py, which in this example is the URL ('http://url_here.com').

import urllib2,sys
f = open("url_file.txt","a+")
urlhere = sys.argv[1]   
print "for url: " + urlhere + ":"  

try: 
    fileHandle = urllib2.urlopen(urlhere)
    data = fileHandle.read()
    fileHandle.close()
    f.write(urlhere+ "\n")
    f.close()
    print "It exists"

except urllib2.URLError, e:
    print 'PAGE 404: It Doesnt Exist', e
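One thing the script above does not handle is being run with no argument at all, in which case sys.argv[1] raises IndexError. A small guard could look something like the following Python 3 sketch; the function name url_from_argv is made up for this example.

```python
import sys


def url_from_argv(argv):
    """Return the first command-line argument, or None if it is missing."""
    # argv[0] is the script name, so the first real argument is argv[1].
    if len(argv) < 2:
        return None
    return argv[1]


if __name__ == "__main__":
    url = url_from_argv(sys.argv)
    if url is None:
        print("usage: python python_script.py <url>")
    else:
        print("for url: " + url + ":")
```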

Or, if you really want to make your job easy, you can type python python_script.py http://url1.com,http://url2.com on the command line to use the following script, where all the URLs you wish to test are separated by commas with no spaces.

import urllib2,sys
f = open("url_file.txt","a+")
urlhere_list = sys.argv[1].split(",")   

for urls in urlhere_list:
    print "for url: " + urls + ":" 
    try: 
        fileHandle = urllib2.urlopen(urls)
        data = fileHandle.read()
        fileHandle.close()
        f.write(urls+ "\n")

        print "It exists"

    except urllib2.URLError, e:
        print 'PAGE 404: It Doesnt Exist', e
    except:
        print "invalid url"
f.close()
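As a rough Python 3 counterpart to the loop above, the sketch below uses urllib.request in place of urllib2. The names url_exists and record_existing are hypothetical, and treating "urlopen succeeded" as "the URL exists" is an assumption carried over from the scripts in this answer, not a general rule.

```python
import urllib.request


def url_exists(url):
    """Return True if the URL can be opened (assumption: that means it exists)."""
    try:
        with urllib.request.urlopen(url, timeout=10) as response:
            response.read(1)
        return True
    except (OSError, ValueError):
        # URLError and HTTPError are subclasses of OSError;
        # a malformed URL such as "foo" raises ValueError.
        return False


def record_existing(arg, out_path, exists=url_exists):
    """Split a comma-separated argument and append the URLs that exist."""
    urls = [u.strip() for u in arg.split(",") if u.strip()]
    found = [u for u in urls if exists(u)]
    with open(out_path, "a", encoding="utf-8") as out_file:
        for url in found:
            out_file.write(url + "\n")
    return found
```

It could be called as record_existing(sys.argv[1], "url_file.txt") after importing sys. The exists parameter is there so the check can be swapped out, for example with a stub when no network is available.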

sys.argv[1].split(",") can also be replaced by a Python list within the script if you don't want to use the command-line functionality. Hope this is of some use to you, and good luck with your program.

Note: the scripts using command-line input were tested on Ubuntu Linux, so if you are using Windows or another operating system I can't guarantee that they will work with the instructions given, but they should.

Use requests:

import requests

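def url_checker(urls):
    with open('somefile.txt', 'a') as f:
        for url in urls:
            try:
                r = requests.get(url)
            except requests.RequestException:
                # Unreachable host, bad URL, etc.: treat as "does not exist"
                continue
            if r.status_code == 200:
                f.write('{0}\n'.format(url))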

url_checker(['http://www.google.com','http://example.com'])


 