
Read file as a list in Python

What is the most Pythonic way to import a raw string from a txt file into a list? The contents of "file.txt" look like this (all on a single line):

["string1","anotha one","more text","foo","2the","bar","fin"]

I could easily copy/paste the string into my script but am sure there is a more dynamic method.

In basic pseudocode:

my_list = *contents of file.txt*

Read it as JSON:

import json
with open('file.txt', 'r') as list_file:
    my_list = json.load(list_file)

print(my_list)

The output should be:

['string1', 'anotha one', 'more text', 'foo', '2the', 'bar', 'fin']
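
For a self-contained check, the sketch below writes the sample line from the question to a temporary file and reads it back with json.load. The helper name load_list and the temporary path are made up for illustration, and the check that the parsed value is a list is an extra assumption rather than something the answer requires.

```python
import json
import os
import tempfile


def load_list(path):
    """Parse a file holding one JSON array and return it as a list."""
    with open(path, "r", encoding="utf-8") as list_file:
        data = json.load(list_file)
    # json.load returns whatever the top-level JSON value is, so verify it.
    if not isinstance(data, list):
        raise ValueError("expected a JSON array at the top level")
    return data


# Sample content copied from the question.
sample = '["string1","anotha one","more text","foo","2the","bar","fin"]'

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "file.txt")  # hypothetical location
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    my_list = load_list(path)

print(my_list)
```

This only works because the file content happens to be valid JSON (double-quoted strings inside square brackets); a file using single quotes would make json.load raise an error.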

From the Python input-output tutorial:

To read a file's contents, call f.read(size), which reads some quantity of data and returns it as a string. size is an optional numeric argument. When size is omitted or negative, the entire contents of the file will be read and returned; it's your problem if the file is twice as large as your machine's memory. Otherwise, at most size bytes are read and returned. If the end of the file has been reached, f.read() will return an empty string ("").

>>> f.read()
'This is the entire file.\n'
>>> f.read()
''
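
As a rough illustration of the behaviour quoted above, the following sketch reads a throwaway file in pieces. The file name demo.txt is made up, and note that in Python 3 text mode the size argument counts characters rather than bytes.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write("This is the entire file.\n")

    with open(path, "r", encoding="utf-8") as f:
        first_chunk = f.read(4)  # at most 4 characters
        rest = f.read()          # everything that is left
        at_eof = f.read()        # nothing left, so an empty string

print(repr(first_chunk))
print(repr(rest))
print(repr(at_eof))
```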

I got this in a couple of lines:

# Read your single line into the workspace.
fname = 'file.txt'
with open(fname) as f:
    # readline() returns one string; strip the trailing newline, if any.
    content = f.readline().strip()
# Process the line to get your list of strings.
processed = [s.strip('[]"') for s in content.split(',')]
# processed = ['string1', 'anotha one', 'more text', 'foo', '2the', 'bar', 'fin']

Breaking that second part down:

  • content.split(',') gives ['["string1"', '"anotha one"', '"more text"', '"foo"', '"2the"', '"bar"', '"fin"]'], so it splits your input into a list at each comma character, but leaves some ugly extra characters in place from your raw input string
  • s.strip('[]"') will remove any instances of bracket characters or the double quotes character " from the string s
  • [s.strip(...) for s in stuff] applies the strip to every string in the newly separated list
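
The steps in this breakdown can be traced on the sample line without touching a file. This is only a sketch: the variable names pieces and tricky are made up, and the last part shows a limitation of the approach, namely that a comma inside a quoted string is also treated as a separator.

```python
# The sample line from the question, held directly in a string.
content = '["string1","anotha one","more text","foo","2the","bar","fin"]'

# Step 1: split at each comma; brackets and quotes are still attached.
pieces = content.split(",")
print(pieces[0])   # first piece still carries the bracket and the quotes
print(pieces[-1])  # last piece still carries the quotes and the bracket

# Step 2: strip bracket and double-quote characters from both ends of each piece.
processed = [s.strip('[]"') for s in pieces]
print(processed)

# Limitation: a comma inside a string splits that string in two.
tricky = '["a,b","c"]'
print([s.strip('[]"') for s in tricky.split(",")])
```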

If you have multiple lines in your file:

# Read your file into the workspace.
with open(fname) as f:
    content = f.readlines()
# Process each line to get your list of strings.
processed = []
for line in content:
    processed.append([s.strip('[]"\n ') for s in line.split(sep=',')])

Note I had to add the newline character and a white space character to the characters to be stripped to fully clean the multi-line case.
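
To see what the multi-line version produces, here is a small sketch that uses io.StringIO in place of a real file. The two input lines are made-up sample data, and skipping blank lines is an added assumption. The result is a list of lists, one inner list per line.

```python
import io

# io.StringIO stands in for an open file; the content is invented for the demo.
fake_file = io.StringIO('["string1","anotha one"]\n\n["foo", "bar"]\n')

processed = []
for line in fake_file:
    if not line.strip():
        continue  # skip blank lines (an assumption about the input)
    # Strip brackets, quotes, newlines and spaces from each comma-separated piece.
    processed.append([s.strip('[]"\n ') for s in line.split(",")])

print(processed)
```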

Try this:

import re

with open('filename.txt', 'r') as f:
    # Collapse runs of whitespace into single spaces, then split on spaces.
    print(re.sub(r'\s+', ' ', f.read()).strip().split(' '))
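
Run on a string instead of a file, this approach behaves as sketched below. It splits on whitespace, so it seems suited to a file of whitespace-separated words rather than to the bracketed, comma-separated line in the question. The variable names are made up for the demo.

```python
import re

# A shortened version of the question's sample line, with a trailing newline.
text = '["string1","anotha one","more text"]\n'

# Collapse whitespace runs to single spaces, trim the ends, split on spaces.
tokens = re.sub(r"\s+", " ", text).strip().split(" ")
print(tokens)

# str.split() with no arguments gives the same tokens for this input.
print(text.split() == tokens)
```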
