
Convert list of strings from file to list of integers

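I have a large file full of integers separated by commas and spaces. I am trying to read it 1 KB at a time and convert each chunk into a list of integers.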

This code works fine:

with open('test_age.txt', 'r+') as inf:
    with open('test_age_out.txt', 'r+') as outf:
        sorted_list =[]
        a = [x.strip() for x in inf.read(1000).split(',')]
        int_a = map(int, a)
        f = tempfile.TemporaryFile()
        outf_array = sorted(int_a)
        f.write(str(outf_array))
        f.seek(0)
        #etc...

Output:

[1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, etc...

But as soon as I add a while loop to read the next 1 KB:

with open('test_age.txt', 'r+') as inf:
    with open('test_age_out.txt', 'r+') as outf:
        sorted_list =[]
        while True:
            a = [x.strip() for x in inf.read(1000).split(',')]
            int_a = map(int, a)
            if not a:
                break
            f = tempfile.TemporaryFile()
            outf_array = sorted(int_a)
            print outf_array
            f.write(str(outf_array))
            f.seek(0)      

I get the output followed by a ValueError:

[1, 1, 2, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 
8, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10, 11, 11, 11, 12, 12, 12,
12, 12, 12, 13, 13, 13, 14, 14, 14, 14, 14, 14, 14, 15, 15, 16, 17, 18,
19, 19, 20, 20, 20, 20, 21, 21, 22, 22, 22, 23, 23, 24, 24, 24, 24, 25, 
25, 25, 25, 25, 26, 26, 26, 26, 27, 27, 27, 28, 28, 29, 30, 30, 30, 30,
31, 31, 31, 32, 32, 33, 33, 33, 33, 33, 33, 34, 34, 34, 34, 34, 35, 35,
35, 35, 35, 36, 36, 37, 37, 37, 37, 38, 38, 39, 39, 39, 39, 39, 39, 40,
40, 40, 40, 41, 41, 42, 43, 43, 43, 44, 44, 44, 44, 44, 45, 46, 46, 46,
46, 47, 47, 47, 47, 47, 48, 48, 48, 48, 48, 48, 49, 49, 49, 50, 50, 50,
50, 50, 50, 51, 51, 51, 51, 51, 51, 52, 52, 52, 52, 52, 52, 53, 53, 54,
54, 54, 55, 55, 55, 55, 56, 56, 56, 56, 56, 57, 57, 57, 57, 58, 58, 58,
59, 59, 60, 60, 60, 61, 62, 62, 62, 62, 63, 63, 63, 63, 63, 63, 63, 64,
64, 64, 65, 66, 66, 67, 67, 67, 67, 68, 68, 68, 68, 68, 69, 69, 69, 69, 
69, 69, 69, 70, 70, 70, 70, 71, 71, 72, 72, 73, 74, 74, 74, 75, 76, 76,
76, 76, 77, 77, 77, 77, 78, 78, 79, 79, 79, 79, 81, 81, 81, 81, 82, 82, 
82, 82, 82, 83, 83, 83, 83, 84, 85, 85, 85, 85, 86, 86, 86, 87, 87, 87,
87, 87, 87, 88, 88, 88, 88, 88, 88, 88, 89, 89, 89, 89, 90, 90, 90, 91,
91, 91, 91, 91, 91, 91, 92, 92, 93, 93, 93, 94, 94, 94, 94, 95,  95,
96, 96, 96, 97, 97, 98, 99, 100, 100, 100, 100, 100]
[2, 3, 3, 3, 3, 4, 4, 5, 5, 6, 8, 9, 10, 10, 11, 11, 11, 11, 12, 12,12, 
13, 14, 15, 17, 17, 17, 17, 17, 17, 18, 18, 18, 20, 21, 22, 22, 22, 22, 
23, 23, 24, 24, 24, 26, 27, 27, 27, 27, 28, 28, 29, 29, 29, 29, 30, 32, 
32, 32, 32, 33, 33, 34, 34, 36, 37, 37, 37, 37, 38, 39, 41, 41, 42, 43,   
44, 44, 46, 46, 47, 48, 49, 49, 49, 49, 51, 51, 52, 52, 52, 52, 53, 54, 
54, 54, 55, 55, 56, 60, 60, 61, 61, 61, 62, 63, 63, 64, 65, 65, 65, 65, 
66, 66, 67, 68, 68, 68, 70, 70, 73, 73, 73, 74, 74, 75, 75, 75, 77, 77, 
77, 77, 78, 78, 78, 78, 79, 80, 81, 81, 82, 82, 83, 83, 83, 83, 84, 84, 
85, 85, 85, 85, 86, 87, 88, 90, 91, 91, 91, 92, 93, 93, 93, 94, 95, 97, 
98, 98, 99, 100]
    int_a = map(int, a)
ValueError: invalid literal for int() with base 10: ''

I'm not sure why this happens. When I call print, the lists appear to be created and sorted, yet the ValueError is still raised. What gives?

Look at the output of str.split when the delimiter passed to it appears at the head or tail of the string:

>>> ', 3, 5'.split(', ')
['', '3', '5']
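
The same effect shows up in a few other places that matter for the loop in the question. The following is a small sketch (written for Python 3; the variable names and the `can_parse` helper are made up for illustration) showing that a trailing delimiter and an empty read both produce an empty-string token. Note in particular that `''.split(',')` returns `['']`, which is a non-empty list, so a check like `if not a` would not fire at end of file. That is one likely reason the error only appears after the last chunk has been printed.

```python
# Splitting yields an empty string wherever the delimiter touches the
# start or end of the text, and also when the input itself is empty.
leading = ', 3, 5'.split(', ')   # delimiter at the head
trailing = '3, 5,'.split(',')    # delimiter at the tail
at_eof = ''.split(',')           # read() returns '' at end of file


def can_parse(tokens):
    """Return True if every token converts cleanly with int()."""
    try:
        for token in tokens:
            int(token)
        return True
    except ValueError:
        return False


print(leading, trailing, at_eof)
print(bool(at_eof), can_parse(at_eof))
```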

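That empty string is what your program is trying (and failing) to parse as an integer. ''.strip() doesn't help (and, by the way, it isn't needed for int(), which automatically ignores leading and trailing whitespace). I recommend reading chunks that are guaranteed to be complete and valid, such as lines. If the file is just one big line, you will have to do some extra work: save the last characters of each chunk and carry them over into the processing of the next one. Don't forget to handle the leftover characters after the loop.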

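# new holds the unfinished tail of the previous chunk (start with new = '' before the loop)
chunk = inf.read(1000)
new += chunk
# split at the last delimiter: current is complete, new is the incomplete tail
current, delimiter, new = new.rpartition(', ')
# process current
# continue loop to add more content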
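
The carry-over idea above can be written out as a complete loop. This is only a sketch under a few assumptions: it targets Python 3, the function name `iter_int_chunks` and its parameters are hypothetical, it splits on a bare `','` and relies on int() ignoring the surrounding spaces, and it uses `io.StringIO` in place of the real file so the example is self-contained.

```python
import io


def iter_int_chunks(stream, chunk_size=1000, delimiter=','):
    """Yield one list of ints per chunk without cutting a number in half."""
    leftover = ''
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # read() returns '' at end of file
            break
        leftover += chunk
        # Text before the last delimiter is complete; the tail after it
        # may be a partial number, so keep it for the next iteration.
        complete, _, leftover = leftover.rpartition(delimiter)
        tokens = [t for t in complete.split(delimiter) if t.strip()]
        if tokens:
            yield [int(t) for t in tokens]
    # Handle whatever is left after the loop.
    if leftover.strip():
        yield [int(leftover)]


# Stand-in for the real file; a tiny chunk size forces numbers to straddle chunks.
sample = io.StringIO('1, 22, 333, 4, 55')
chunks = list(iter_int_chunks(sample, chunk_size=4))
flat = [n for chunk in chunks for n in chunk]
print(flat)
```

With a real file, the stream would be the object returned by open(), and each yielded list could be sorted and written to its own temporary file as in the question.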

If the file fits comfortably in system memory, you can read the whole file and split it in one go:

numbers = map(int, inf.read().split(', '))
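
Two caveats apply to the one-liner. In Python 3, map returns a lazy iterator rather than a list, and a trailing delimiter or newline at the end of the file would still produce an empty token. A slightly more tolerant variant might look like the sketch below, where `read_all_ints` is a hypothetical helper name and `io.StringIO` again stands in for the open file.

```python
import io


def read_all_ints(stream, delimiter=','):
    """Read the whole stream and convert every non-blank token to int."""
    # int() ignores surrounding whitespace, so only blank tokens need skipping.
    return [int(tok) for tok in stream.read().split(delimiter) if tok.strip()]


numbers = read_all_ints(io.StringIO('3, 1, 2,\n'))
print(sorted(numbers))
```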

