How to save multiple outputs to multiple files, where each file's title comes from an object in Python?

I'm scraping an RSS feed from a website ( http://www.gfrvitale.altervista.org/index.php/autismo-in?format=feed&type=rss ). I have written a script to extract and clean up the text of every item in the feed. My main problem is saving the text of each item to a separate file; I also need to name each file with the title extracted from its item. My code is:

for item in myFeed["items"]:
    time_structure=item["published_parsed"]
    dt = datetime.fromtimestamp(mktime(time_structure))

    if dt>t:

     link=item["link"]           
     response= requests.get(link)
     doc=Document(response.text)
     doc.summary(html_partial=False)

     # extracting text
     h = html2text.HTML2Text()

     # converting
     h.ignore_links = True  # ignore links
     h.skip_internal_links=True  # skip internal links
     h.inline_links=True
     h.ignore_images=True  # ignore image links
     h.ignore_emphasis=True
     h.ignore_anchors=True
     h.ignore_tables=True

     testo= h.handle(doc.summary())  # extracted text

     s = doc.title()+"."+" "+testo  # content to write to the final file

     tit=item["title"]

     # save each file with its proper title
     with codecs.open("testo_%s", %tit "w", encoding="utf-8") as f:
         f.write(s)
         f.close()

The error is:

  File "<ipython-input-57-cd683dec157f>", line 34
    with codecs.open("testo_%s", %tit "w", encoding="utf-8") as f:
                                 ^
SyntaxError: invalid syntax

You need to put the comma after %tit.

It should be:

# save each file with its proper title
with codecs.open("testo_%s" %tit, "w", encoding="utf-8") as f:
     f.write(s)
     f.close()

However, if the file name contains invalid characters, opening the file will raise an error (e.g. [Errno 22]).

You can try this code:

...
tit = item["title"]
tit = tit.replace(' ', '').replace("'", "").replace('?', '') # Not the best way, but it could help for now (will be better to create a list of stop characters)

with codecs.open("testo_%s" %tit, "w", encoding="utf-8") as f:
     f.write(s)
     f.close()
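
The comment above suggests keeping a list of characters to strip. A minimal sketch of that idea follows, assuming Python 3; the helper name safe_filename and the FORBIDDEN set are hypothetical and not part of the original code, and the set should be adjusted for your platform.

```python
# Characters commonly rejected in file names (notably on Windows), plus path
# separators. This set is an assumption for illustration, not an exhaustive list.
FORBIDDEN = '<>:"/\\|?*'


def safe_filename(title, replacement="_"):
    """Return a copy of title that is safer to use as a file name."""
    # Map every forbidden character to the replacement string.
    table = {ord(ch): replacement for ch in FORBIDDEN}
    cleaned = title.translate(table)
    # Collapse runs of whitespace (spaces, tabs, newlines) into single underscores.
    cleaned = "_".join(cleaned.split())
    # Drop leading/trailing dots and underscores; fall back to a fixed name if empty.
    return cleaned.strip("._") or "untitled"
```

With this helper, the file could be opened as codecs.open("testo_%s" % safe_filename(tit), "w", encoding="utf-8").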

Another way, using nltk:

from nltk.tokenize import RegexpTokenizer
tokenizer = RegexpTokenizer(r'\w+')
tit = item["title"]
tit = tokenizer.tokenize(tit)
tit = ''.join(tit)
with codecs.open("testo_%s" %tit, "w", encoding="utf-8") as f:
     f.write(s)
     f.close()
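
If installing nltk only for this step is undesirable, the same filtering can probably be done with the standard library's re module. This is a sketch, assuming Python 3 and that extracting every run of \w characters is all the tokenizer is used for here; the function name title_to_stem is hypothetical.

```python
import re


def title_to_stem(title):
    """Keep only letters, digits and underscores from title, joined together."""
    # re.findall returns every non-overlapping run of word characters;
    # spaces, apostrophes, question marks and other punctuation are dropped.
    return "".join(re.findall(r"\w+", title))
```

The result can then be used in the file name, for example "testo_%s" % title_to_stem(tit).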

First off, you misplaced the comma: it should come after %tit, not before.

Secondly, you don't need to close the file, because the with statement you use does it for you automatically. And where did codecs come from? I don't see it anywhere else. Anyway, the correct with statement would be:

with open("testo_%s" %tit, "w", encoding="utf-8") as f:
     f.write(s)
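
To illustrate the point about with closing the file, here is a small self-contained sketch. It assumes Python 3, where the built-in open accepts an encoding argument (under Python 2 the codecs.open call from the question would still be needed). The write_text helper, the temporary folder and the sample strings are hypothetical, chosen only so the example runs on its own.

```python
import os
import tempfile


def write_text(path, text):
    """Write text to path and report whether the file is closed afterwards."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    # The with block has ended, so the file has already been closed;
    # no explicit f.close() is required.
    return f.closed


# Hypothetical usage: write one sample file into a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "testo_%s.txt" % "example")
closed = write_text(path, "Titolo. testo")
```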
