Read a newline-delimited file in Python, but ignore the last \n

Question

I have a file called list.txt that looks like this:

input1
input2
input3

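I am sure there is no blank line after the last line (input3). I then have a Python script that reads this file line by line and inserts each line's text into a larger block of text to create 3 files (one per line):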

import os
os.chdir("/Users/user/Desktop/Folder")

with open('list.txt','r') as f:
    lines = f.read().split('\n')

    for l in lines:
        header = "#!/bin/bash \n#BSUB -J %s.sh \n#BSUB -o /scratch/DBC/user/%s.sh.out \n#BSUB -e /scratch/DBC/user/%s.sh.err \n#BSUB -n 1 \n#BSUB -q normal \n#BSUB -P DBCDOBZAK \n#BSUB -W 168:00\n"%(l,l,l)
        script = "cd /scratch/DBC/user\n"
        script2 = 'grep "input" %s > result.%s.txt\n'%(l,l)
        all= "\n".join([header,script,script2])

        with open('script_{}.sh'.format(l), 'w') as output:
            output.write(all)

My problem is that this creates 4 files instead of 3: script_input1.sh, script_input2.sh, script_input3.sh and script_.sh. The last file has no text where the others have input1, input2 or input3.

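It seems Python reads my list.txt line by line, but when it reaches "input3" it somehow keeps going? How do I tell Python to read the file line by line, split on "\n", but stop after the last piece of text?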

Answer 1

First, don't read a whole file into memory when you don't have to. Files are iterable, so the proper way to read a file line by line is:

with open("/path/to/file.ext") as f:
    for line in f:
        do_something_with(line)

Now, in your for loop, you just have to strip the line and, if it is empty, ignore it:

with open("/path/to/file.ext") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        do_something_with(line)
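
As a minimal, self-contained sketch of the strip-and-skip loop above: the helper name read_nonblank_lines and the temporary demo file are assumptions made for illustration and are not part of the original script.

```python
import os
import tempfile


def read_nonblank_lines(path):
    """Return the stripped, non-empty lines of the file at `path`."""
    names = []
    with open(path) as f:
        for line in f:
            line = line.strip()  # drop the newline and surrounding whitespace
            if not line:         # skip blank lines, including a trailing one
                continue
            names.append(line)
    return names


# Hypothetical demo file: ends with a newline plus an extra blank line.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "list.txt")
    with open(demo_path, "w") as f:
        f.write("input1\ninput2\ninput3\n\n")
    names = read_nonblank_lines(demo_path)

print(names)  # ['input1', 'input2', 'input3']
```

Because empty lines are skipped, no script_.sh would be generated from such a list, whether or not the file ends with a newline.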

Slightly unrelated, but Python has multiline strings, so you don't need the concatenation either:

# not sure I got it right actually ;)
# the backslash after the opening quotes keeps "#!/bin/bash" on the first line of the output
script_tpl = """\
#!/bin/bash 
#BSUB -J {line}.sh 
#BSUB -o /scratch/DBC/user/{line}.sh.out 
#BSUB -e /scratch/DBC/user/{line}.sh.err 
#BSUB -n 1 
#BSUB -q normal 
#BSUB -P DBCDOBZAK 
#BSUB -W 168:00
cd /scratch/DBC/user
grep "input" {line} > result.{line}.txt
"""

with open("/path/to/file.ext") as f:
    for line in f:
        line = line.strip()
        if not line:
            continue
        script = script_tpl.format(line=line)
        with open('script_{}.sh'.format(line), 'w') as output:
            output.write(script)

One last point: avoid changing directory in your scripts; use os.path.join() to work with absolute paths instead.
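
A hedged sketch of that last point: rather than calling os.chdir(), each output path is built from a base directory with os.path.join(). The names base_dir and write_script are hypothetical, and the demo uses a temporary directory; in the question the base directory would be /Users/user/Desktop/Folder.

```python
import os
import tempfile


def write_script(base_dir, name, text):
    """Write `text` to script_<name>.sh inside `base_dir` and return the full path."""
    path = os.path.join(base_dir, "script_{}.sh".format(name))
    with open(path, "w") as output:
        output.write(text)
    return path


# Demo in a throwaway directory; the working directory is never changed.
cwd_before = os.getcwd()
with tempfile.TemporaryDirectory() as base_dir:
    path = write_script(base_dir, "input1", "#!/bin/bash\n")
    created = os.path.basename(path)
    exists = os.path.exists(path)
cwd_after = os.getcwd()

print(created, exists, cwd_before == cwd_after)
```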

Answer 2

With your current approach, you will want to:

check whether the last element of lines is empty ( lines[-1] == '' )
if so, discard it ( lines = lines[:-1] ).

with open('list.txt','r') as f:
    lines = f.read().split('\n')

if lines[-1] == '':
    lines = lines[:-1]

for line in lines:    
    print(line)

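Keep in mind that it is legal for a file not to end with a newline character. Checking the last element before discarding it handles that case too.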
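
To illustrate, here is a small sketch of that check wrapped in a function (the name split_lines is made up for this example). It gives the same result whether or not the text ends with a newline.

```python
def split_lines(text):
    """Split on a newline and drop the empty element left by a final newline."""
    lines = text.split('\n')
    if lines[-1] == '':
        lines = lines[:-1]
    return lines


with_newline = split_lines('input1\ninput2\ninput3\n')
without_newline = split_lines('input1\ninput2\ninput3')

print(with_newline)     # ['input1', 'input2', 'input3']
print(without_newline)  # ['input1', 'input2', 'input3']
```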

Alternatively, as @setsquare pointed out, you may want to try readlines():

with open('list.txt', 'r') as f:
    lines = [ line.rstrip('\n') for line in f.readlines() ]

for line in lines:
    print(line)

Answer 3

Have you considered using readlines() instead of read()? That lets Python handle the question of whether the last line ends with \n for you.

Keep in mind that if the last line of the input file does end with \n, using read() and splitting on '\n' will create an extra value. For example:

my_string = 'one\ntwo\nthree\n'
my_list = my_string.split('\n')
print(my_list)
# >> ['one', 'two', 'three', '']
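
For comparison, a short sketch using str.splitlines(). This method is not mentioned in the answer, but it is a standard string method: it splits on line boundaries and does not produce a trailing empty string when the text ends with a newline.

```python
my_string = 'one\ntwo\nthree\n'

by_split = my_string.split('\n')        # keeps an empty string after the final '\n'
by_splitlines = my_string.splitlines()  # no trailing empty string

print(by_split)       # ['one', 'two', 'three', '']
print(by_splitlines)  # ['one', 'two', 'three']
```

Note that splitlines() also treats other line boundaries, such as \r\n, as separators.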

Potential solution

lines = f.readlines()
# remove newlines
lines = [line.strip() for line in lines]
# remove any empty values, just in case
# on Python 3, filter() returns an iterator, so wrap it in list()
lines = list(filter(bool, lines))

For a simple example, see here: How do I read a file line by line into a list?

Answer 4

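f.read() returns a string that ends with a newline, which split dutifully treats as the separator between the last line and an empty string. It is not clear why you are reading the entire file into memory explicitly. Just iterate over the file object and let it handle the line breaks.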

with open('list.txt','r') as f:
    for l in f:
        l = l.rstrip('\n')  # each line still carries its trailing newline
        # ...
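
A small sketch of what iterating over a file object yields, using io.StringIO as a stand-in for the real file (an assumption for the demo, so that nothing is read from disk). Each item keeps its line ending, and no extra empty item appears after the last line.

```python
import io

# io.StringIO stands in for open('list.txt'); it is iterated the same way.
f = io.StringIO('input1\ninput2\ninput3\n')

raw = []
names = []
for l in f:
    raw.append(l)                 # the line as yielded, newline included
    names.append(l.rstrip('\n'))  # the same line without its newline

print(raw)    # ['input1\n', 'input2\n', 'input3\n']
print(names)  # ['input1', 'input2', 'input3']
```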

Answer 5

I think you are using split incorrectly.

If you have the following:

text = 'xxx yyy'
text.split(' ') # or simply text.split()

the result will be

['xxx', 'yyy']

Now, if you have:

text = 'xxx yyy ' # extra space at the end
text.split(' ') # note: text.split() with no argument would drop the empty string

the result will be

['xxx', 'yyy', '']

because splitting on ' ' takes what comes before and after each space. In this case, what follows the last space is an empty string.
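
One detail worth checking here, shown as a quick sketch: calling split() with no argument splits on runs of whitespace and discards empty strings, so only the explicit split(' ') form leaves a trailing empty string.

```python
text = 'xxx yyy '  # extra space at the end

explicit = text.split(' ')  # splits at every single space
default = text.split()      # splits on runs of whitespace, drops empty strings

print(explicit)  # ['xxx', 'yyy', '']
print(default)   # ['xxx', 'yyy']
```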

Some functions you may find useful:

strip([chars]) # removes the given chars from the beginning and end of a string

Example:

text = '___text_about_something___'
text.strip('_')

The result will be:

'text_about_something'

In your specific problem, you can simply:

lines = f.readlines() # read all lines of the file; each line keeps its trailing '\n'
for l in lines:
    l = l.strip() # remove the newline and any extra whitespace at the start or end of the line

5 answers

Answer 1: 3, accepted, 2017-10-11 15:01:41
Answer 2: 1, 2017-10-11 14:47:38
Answer 3: 1, 2017-10-11 14:52:44
Answer 4: 1, 2017-10-11 15:08:37
Answer 5: 0, 2017-10-11 15:07:52
