Python: reading a file and parsing lines using substrings

Question

In Python, I am reading a large file with many lines. Each line contains a number followed by a string, for example:

[37273738] Hello world!
[83847273747] Hey my name is James!

and so on...

After reading the txt file and putting it into a list, how can I extract the number and then sort the whole lines by that number?

file = open("info.txt","r")
myList = []

for line in file:
    line = line.split()
    myList.append(line)

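What I want to do:

Since the number in message 1 is between 37273700 and 38000000, I want to sort it (and every other line that follows that rule) into a separate list.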

Answer 1

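This does exactly what you need (for the sorting part):

# line[0] is the first token, e.g. '[37273738]'; [1:-1] strips both brackets
my_sorted_list = sorted(myList, key=lambda line: int(line[0][1:-1]))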
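As a quick check of the slicing, here is a minimal sketch. It assumes each list entry is the result of line.split() as in the question, so the first token looks like '[37273738]'. The sample rows and the helper name bracket_number are made up for illustration.

```python
# Hypothetical sample rows, shaped like the output of line.split() in the question.
rows = [
    "[83847273747] Hey my name is James!".split(),
    "[37273738] Hello world!".split(),
]

def bracket_number(tokens):
    # tokens[0] looks like "[37273738]"; drop the first and last character
    # (the two brackets) and convert what is left to an int.
    return int(tokens[0][1:-1])

sorted_rows = sorted(rows, key=bracket_number)
for tokens in sorted_rows:
    print(" ".join(tokens))
```

Note that a line without a bracketed number (or an empty line) would raise an error here, so this only suits files where every line follows the format.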

Answer 2

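Use a tuple of (key, value):

for line in file:
    line = line.split()
    # Convert the bracketed number to int so it sorts numerically, not as text
    keyval = (int(line[0].replace('[','').replace(']','')), line[1:])
    print(keyval)
    myList.append(keyval)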

Then sort:

my_sorted_list = sorted(myList, key=lambda line: line[0])
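One thing to watch for with this approach: if the key is kept as a string, the sort is lexicographic rather than numeric, which matters when the numbers have different lengths. A small sketch with made-up values showing the difference:

```python
# Made-up keys of different lengths, stored as strings.
keys = ["37273738", "83847273747", "9"]

# Plain string sort compares character by character.
as_text = sorted(keys)

# Converting with int for the comparison sorts by numeric value.
as_numbers = sorted(keys, key=int)

print(as_text)
print(as_numbers)
```

Here the string sort puts "9" last because "9" compares greater than "3" and "8", while the numeric sort puts it first.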

Answer 3

How about:

# ---
# Function which gets a number from a line like so:
#  - searches for the pattern: start_of_line, [, sequence of digits
#  - if that's not found (e.g. empty line) return 0
#  - if it is found, try to convert it to a number type
#  - return the number, or 0 if that conversion fails

import re

def extract_number(line):
    # Raw string so the backslashes reach the regex engine unchanged
    search_result = re.findall(r'^\[(\d+)\]', line)
    if not search_result:
        num = 0
    else:
        try:
            num = int(search_result[0])
        except ValueError:
            num = 0

    return num

# ---

# Read all the lines into a list
with open("info.txt") as f:
    lines = f.readlines()

# Sort them using the number function above, and print them
lines = sorted(lines, key=extract_number)
print(''.join(lines))

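It is more resilient to lines that have no number, and easier to adjust if the number might appear in a different position (e.g. whitespace at the start of the line).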
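For instance, tolerating leading whitespace could be a matter of allowing optional whitespace before the opening bracket. This is a minimal sketch, not the only way to do it; the name extract_number_lenient is hypothetical.

```python
import re

# Optional leading whitespace, then '[', digits, ']'.
_NUMBER_AT_START = re.compile(r'^\s*\[(\d+)\]')

def extract_number_lenient(line):
    # Return the bracketed number at the start of the line, or 0 if absent.
    match = _NUMBER_AT_START.match(line)
    return int(match.group(1)) if match else 0

print(extract_number_lenient("   [37273738] Hello world!"))
print(extract_number_lenient("no number here"))
```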

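(Obligatory advice: don't use file as a variable name, since it is already a built-in name in Python 2 and that gets confusing.)

Now that there is an extract_number() function, filtering becomes easier: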

lines2 = [L for L in lines if 37273700 < extract_number(L) < 38000000]
print(''.join(lines2))
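If both groups are needed (the lines inside the range and everything else), one option is to split them in a single pass. The sketch below is self-contained, so it carries its own compact variant of extract_number; the helper name partition_by_range and the in-memory sample list are assumptions for illustration, and the range bounds are the ones from the question.

```python
import re

def extract_number(line):
    # Compact variant: number inside brackets at the start of the line, or 0.
    match = re.match(r'\[(\d+)\]', line)
    return int(match.group(1)) if match else 0

def partition_by_range(lines, low, high):
    # Split lines into (inside, outside) depending on low < number < high.
    inside, outside = [], []
    for line in lines:
        if low < extract_number(line) < high:
            inside.append(line)
        else:
            outside.append(line)
    return inside, outside

# Stand-in for the contents of info.txt.
sample = [
    "[83847273747] Hey my name is James!\n",
    "[37273738] Hello world!\n",
]

inside, outside = partition_by_range(sample, 37273700, 38000000)
inside.sort(key=extract_number)
print(''.join(inside), end='')
```

Because partition_by_range only iterates over its argument, an open file object could be passed in place of the list, which avoids holding a second full copy of a large file.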

Answer 1
1 2015-11-17 21:28:19

Answer 2
1 2015-11-17 21:47:57

Answer 3
1 2015-11-17 21:52:03
