Why is my glob.glob loop not iterating through all text files in folder?
I am attempting to read from a folder containing text documents with Python 3. Specifically, this is a modification of the LingSpam email spam dataset. I am expecting the code I wrote to return all 1893 text document names; however, it returns only the first 420 filenames. I do not understand why it stops short of the total number of filenames. Any ideas?
    import os
    import glob

    if not os.path.exists('train'):  # download data
        from urllib.request import urlretrieve
        import tarfile
        urlretrieve('http://cs.iit.edu/~culotta/cs429/lingspam.tgz', 'lingspam.tgz')
        tar = tarfile.open('lingspam.tgz')
        tar.extractall()
        tar.close()

    abc = []
    for f in glob.glob("train/*.txt"):
        print(f)
        abc.append(f)
    print(len(abc))
I've tried changing the glob parameters but still no success.
Edit: Apparently my code works for everyone but me. Here's my output.
Success! The problem was this line:

    if not os.path.exists('train'):  # download data
To check my output, I had previously downloaded some of the files onto my computer. Since this line only checks whether the folder exists, and it did exist (but held an incomplete set of files), the download and extraction were skipped, so glob only saw the 420 files that were already there. I deleted the files from my machine and now it works as it should, though I suspect running
    from urllib.request import urlretrieve
    import tarfile
    urlretrieve('http://cs.iit.edu/~culotta/cs429/lingspam.tgz', 'lingspam.tgz')
    tar = tarfile.open('lingspam.tgz')
    tar.extractall()
    tar.close()
without the if statement would have had the same result.
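To guard against a stale or partially extracted `train` folder causing the same symptom again, one option is to key the cache check on the expected file count rather than on the folder's mere existence. A minimal sketch under the question's assumptions (the `1893` count comes from the question; `ensure_dataset` and `EXPECTED` are hypothetical names, not part of the original code):

```python
import glob
import os
import shutil
import tarfile
from urllib.request import urlretrieve

EXPECTED = 1893  # total .txt files a complete LingSpam extract should contain

def ensure_dataset(url='http://cs.iit.edu/~culotta/cs429/lingspam.tgz',
                   folder='train', archive='lingspam.tgz'):
    """Re-download and re-extract unless `folder` already holds all files."""
    if len(glob.glob(os.path.join(folder, '*.txt'))) == EXPECTED:
        return  # cache is complete, nothing to do
    if os.path.exists(folder):
        shutil.rmtree(folder)  # stale or partial extract: start over
    urlretrieve(url, archive)
    with tarfile.open(archive) as tar:
        tar.extractall()
```

Calling `ensure_dataset()` before the glob loop would then re-fetch the archive whenever the folder is missing or incomplete, instead of silently trusting whatever happens to be on disk.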