Python: Write all files in a directory to one csv file
I am trying to create a bimodal graph of a collection of texts so that I can project a network of either texts by words or words by texts. A colleague of mine has indicated that if I can get all my files into a single csv file in the format below, they have a workflow that will take care of the rest:
textfile1, words words words
textfile2, words words words
I have written the following script:
#! /usr/bin/env python
# a script to convert all text files in a directory to the format:
# filename, words from file (no punctuation)
import glob
import re
files = {}
for fpath in glob.glob("*.txt"):
    with open(fpath) as f:
        just_words = re.sub("[^a-zA-Z'-]"," ",f.read())
with open("mastertext.csv", "w") as f:
    for fname in files:
        print >> f , "%s,%s"%(fname,just_words)
This script runs and produces the output file, but the output file is blank and I get no error message, which has been the source of much learning for me as a Python newbie. Am I on the right track here, and if so, what am I missing?
You need to save the data in just_words to files. As written, files is never filled, so the final loop has nothing to iterate over and the output file stays empty. In this case, I use a list of tuples instead of a dictionary, but you can still use a dictionary if you prefer. :-)
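import glob
import re

# collect (filename, cleaned text) pairs, one per text file
files = []
for fpath in glob.glob("*.txt"):
    with open(fpath) as f:
        just_words = re.sub("[^a-zA-Z'-]"," ",f.read())
        files.append((fpath, just_words))

# write one "filename,words" line per file (Python 2 print syntax)
with open("mastertext.csv", "w") as f:
    for fname, just_words in files:
        print >> f , "%s,%s"%(fname,just_words)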
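The code above uses the Python 2 form print >> f, which does not work under Python 3. For anyone on Python 3, here is a rough sketch of the same idea using a dictionary, as mentioned above, and the standard csv module. The function names collect_words and write_master_csv, and the directory and out_path parameters, are made up for illustration and are not part of the original script. It also assumes the text files can be read with the platform's default encoding.

```python
import csv
import glob
import os
import re


def collect_words(directory="."):
    """Map each .txt file name in `directory` to its text, punctuation replaced by spaces.

    Hypothetical helper for illustration, not part of the original script.
    """
    files = {}
    for fpath in sorted(glob.glob(os.path.join(directory, "*.txt"))):
        with open(fpath) as f:
            # Same character class as the original script:
            # keep letters, apostrophes and hyphens, turn everything else into a space.
            just_words = re.sub(r"[^a-zA-Z'-]", " ", f.read())
        # Store the result under the file name; this is the step the question's script skipped.
        files[os.path.basename(fpath)] = just_words
    return files


def write_master_csv(files, out_path="mastertext.csv"):
    """Write one 'filename,words' row per entry of `files` (hypothetical helper)."""
    # newline="" is what the csv module documentation recommends for files passed to csv.writer.
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        for fname, just_words in files.items():
            writer.writerow([fname, just_words])
```

Calling write_master_csv(collect_words(".")) from the directory that holds the text files should produce mastertext.csv there. Because the regular expression also replaces commas and newlines with spaces, the words column itself never needs quoting; csv.writer only matters if a file name happens to contain a comma.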