
Reading a txt file as a list instead of a string in Python

I have a file made up of lines in the following format:

[123, something, some other thing, "text that i want", more details]

For example:

[1393349463, u'Tue Feb 25 17:31:03 +0000 2014', 438365537261735936, u'A Falcon character poster for Captain America: The Winter Soldier has swooped in', [], [u'totalfilm'], [u'//1bJdCJ2'], [u'http://pbs.twimg.com/media/BhViUNICQAAoBue.jpg'], 369, 362]

Now I want to read each line directly into Python as a list, rather than reading it as a string, splitting the string on ',' and joining the pieces back together. The text field can itself contain a ',' and I don't want to split it.

I am looking for something like this:

with open("input.txt") as fp:
   for line in fp:
       corpus.append(line[3]) #read only text

Your input was obviously generated by just printing out Python lists (or by calling str or repr on them).

This particular example can be handled with ast.literal_eval:

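import ast

corpus = []
with open("input.txt") as fp:
    for line in fp:
        # Safely evaluate the line as a Python literal (here, a list)
        obj = ast.literal_eval(line)
        corpus.append(obj[3])  # keep only the text field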

However, that won't work for every Python list display: the repr of an arbitrary object (for example, an instance of a custom class) is usually not a valid literal, so literal_eval rejects it. And when it doesn't work, there's not much you can do in general. You can just use literal_eval until you get an error and then, for each error, laboriously work out how to pre-process the input to work around it.
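To find out which lines literal_eval rejects, it can help to collect the failures rather than stop at the first one. The following is only a minimal sketch: the helper name parse_lines and the sample lines are made up for illustration, and it assumes the failures show up as ValueError or SyntaxError, which are the errors literal_eval most commonly raises on bad input.

```python
import ast

def parse_lines(lines):
    """Parse each line with ast.literal_eval, collecting failures instead of stopping."""
    parsed, failed = [], []
    for lineno, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # skip blank lines
        try:
            parsed.append(ast.literal_eval(line))
        except (ValueError, SyntaxError) as exc:
            # Remember where it failed so the line can be inspected later
            failed.append((lineno, line, exc))
    return parsed, failed

# Hypothetical sample: the second line contains an object repr that is not a literal
sample = [
    "[1, u'some, text with a comma', [], 369]\n",
    "[2, <object at 0x10>, 'text']\n",
]
parsed, failed = parse_lines(sample)
corpus = [row[1] for row in parsed]
```

The failed list then shows exactly which lines need pre-processing, and the comma inside the text field survives intact in corpus.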

The right thing to do is to generate output that's actually meant to be parsed, like JSON, which you can then parse trivially.
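If you control the code that writes the file, one way to do that is to write one JSON document per line. This is a sketch under that assumption: the rows data is a shortened, made-up version of the example above, and the in-memory lines list stands in for the file.

```python
import json

# Hypothetical rows, shaped like the example but shortened
rows = [
    [1393349463, "Tue Feb 25 17:31:03 +0000 2014",
     "A Falcon character poster, with a comma", ["totalfilm"]],
]

# Writer side: serialize each row as one JSON document per line
lines = [json.dumps(row) for row in rows]

# Reader side: parse each line back into a list and keep only the text field
corpus = [json.loads(line)[2] for line in lines]
```

With a real file, the writer would call fp.write(json.dumps(row) + "\n") for each row, and the reader would loop over the file and call json.loads(line) on each line.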
