I have a function that checks whether a given text is already in file.txt.
It is supposed to work like this: if the text is contained in the file, the file is simply closed. If the text is not contained in the file, it is appended.
But it doesn't work.
import urllib2, re
from bs4 import BeautifulSoup as BS

def SaveToFile(fileToSave, textToSave):
    datafile = file(fileToSave)
    for line in datafile:
        if textToSave in line:
            datafile.close()
        else:
            datafile.write(textToSave + '\n')
            datafile.close()

urls = ['url1', 'url2'] # I don't want to publish the links.
patGetTitle = re.compile(r'<title>(.*)</title>')
for url in urls:
    u = urllib2.urlopen(url)
    webpage = u.read()
    title = re.findall(patGetTitle, webpage)
    SaveToFile('articles.txt', title)
    # So here: if the title of the website is already in articles.txt,
    # the function should close the file.
    # But if the title is not found in articles.txt, the function should add it.
Your title is a list and not a string, so you should call the function like this: SaveToFile('articles.txt', title[0]), to pass the first element of the list.
You can then change the SaveToFile function like this:
def SaveToFile(fileToSave, textToSave):
    with open(fileToSave, "r+") as datafile:
        for line in datafile:
            if textToSave in line:
                break
        else:
            datafile.write(textToSave + '\n')
Notes:

If the file is empty, the body of the for loop never runs, because there is nothing to iterate over. i.e.

for i in []:
    print i # This prints nothing, since it iterates over an empty list, the same as your empty file

title is a list and not a string: since re.findall returns a list object, you have to pass the first element of the list to the function.

for..else: the else clause runs when the loop finishes without hitting break. i.e.

for i in []:
    print i
else:
    print "Nooooo"

Output:

Nooooo
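The for..else behaviour described in the notes can also be checked in isolation. This is a minimal sketch, assuming a hypothetical helper named contains_line (not part of the original code) and a plain list standing in for the lines of the file:

```python
def contains_line(lines, text):
    """Return True if text occurs in any of the lines, using for..else."""
    for line in lines:
        if text in line:
            found = True
            break            # leaving via break skips the else clause
    else:
        found = False        # runs only when the loop ended without break
    return found
```

With an empty list the loop body never runs, so the else clause decides the result, which is the case that matters for an empty articles.txt.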
You should refactor your SaveToFile function like this:
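def SaveToFile(fileToSave, titleList):
    with open(fileToSave, 'a+') as f:
        f.seek(0)  # 'a+' may start at the end of the file; rewind before reading
        data = f.read()
        for titleText in titleList:
            if titleText not in data:
                f.write(titleText + '\n')
    # no explicit f.close() is needed: the with block closes the file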
This function reads the contents of the file (the file is created if it does not exist) and checks whether each title in titleList is already in those contents. A title that is found is skipped; otherwise it is written to the end of the file.
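One thing to keep in mind: `titleText not in data` is a substring test, so a title such as "Python" counts as already saved when the file contains "Python tips". If that matters, a sketch along the following lines compares whole lines instead. The name save_new_titles is hypothetical, and the explicit seek(0) is there on the assumption that the initial read position in 'a+' mode can differ between platforms and Python versions:

```python
def save_new_titles(file_name, titles):
    """Append each title that is not already a complete line of the file."""
    with open(file_name, 'a+') as f:
        f.seek(0)                            # read from the start of the file
        seen = set(f.read().splitlines())    # existing lines, without '\n'
        for title in titles:
            if title not in seen:
                f.write(title + '\n')        # append mode writes at the end
                seen.add(title)              # avoid duplicates within one call
```

Keeping the existing lines in a set also means the file is read only once, however many titles are passed in.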
Just use r+ mode, like this:
def SaveToFile(fileToSave, textToSave):
    with open(fileToSave, 'r+') as datafile:
        if textToSave not in datafile.read():
            datafile.write(textToSave + '\n')
About that file mode, from this answer:
``r+'' Open for reading and writing. The stream is positioned at the beginning of the file.
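The practical difference between the modes is easy to verify. A minimal sketch, assuming a throwaway file in a temporary directory (the path and the helper name try_open are made up for illustration): 'r+' fails when the file does not exist, whereas 'a+' creates it.

```python
import os
import tempfile

def try_open(path, mode):
    """Return True if the file could be opened in the given mode."""
    try:
        with open(path, mode):
            return True
    except IOError:   # same as OSError on Python 3; raised for a missing file
        return False

path = os.path.join(tempfile.mkdtemp(), 'articles.txt')
missing_ok = try_open(path, 'r+')   # False: 'r+' needs an existing file
created_ok = try_open(path, 'a+')   # True: 'a+' creates the file
exists_now = try_open(path, 'r+')   # True: the file exists by now
```

So with 'r+' the very first run fails unless articles.txt has been created beforehand.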
Also, re.findall() always returns a list, so if you try to write that list instead of a string you'll get an error.
So you could use:
def SaveToFile(fileToSave, textToSave):
    # textToSave is the list returned by re.findall; use its first element
    if len(textToSave) >= 1:
        textToSave = textToSave[0]
    else:
        return  # no title found, nothing to save
    with open(fileToSave, 'r+') as datafile:
        if textToSave not in datafile.read():
            datafile.write(textToSave + '\n')
This seems closer to your problem.
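To see why the list has to be unpacked, here is a small sketch with a made-up HTML string (not one of the real pages). re.findall returns a list, which is empty when nothing matches, while re.search returns a match object or None:

```python
import re

pat_title = re.compile(r'<title>(.*)</title>')
page = '<html><head><title>Example page</title></head></html>'  # made-up input

titles = pat_title.findall(page)                 # list of captured groups
no_titles = pat_title.findall('<html></html>')   # empty list when nothing matches

match = pat_title.search(page)                   # match object, or None
first_title = match.group(1) if match else None
```

Either check `titles` for emptiness before indexing it, or use the re.search form, which gives a single string directly.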
This checks whether the text is in the file:
def is_text_in_file(file_name, text):
    with open(file_name) as fobj:
        for line in fobj:
            if text in line:
                return True
    return False
This uses the function above to check, and writes the text to the end of the file if it is not in the file yet:
def save_to_file(file_name, text):
    # call the helper; the original "is_text_in_file in (file_name, text)" never called it
    if not is_text_in_file(file_name, text):
        with open(file_name, 'a') as fobj:
            fobj.write(text + '\n')
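One caveat with this approach: open(file_name) raises an error if the file does not exist yet, so the very first save fails. A hedged variant, with hypothetical function names carrying a _safe suffix, treats a missing file as "text not found":

```python
import os

def is_text_in_file_safe(file_name, text):
    """Like is_text_in_file, but a missing file counts as 'not found'."""
    if not os.path.exists(file_name):
        return False
    with open(file_name) as fobj:
        return any(text in line for line in fobj)

def save_to_file_safe(file_name, text):
    """Append text as a new line unless some line already contains it."""
    if not is_text_in_file_safe(file_name, text):
        with open(file_name, 'a') as fobj:   # 'a' creates the file if needed
            fobj.write(text + '\n')
```

Calling save_to_file_safe twice with the same text leaves a single copy of that line in the file.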