Read a text file and parse it in Python

Question

I have a text file (.txt) that looks like this:

Date, Day, Sect, 1, 2, 3

1, Sun, 1-1, 123, 345, 678

2, Mon, 2-2, 234, 585, 282

3, Tue, 2-2, 231, 232, 686

With this data I want to do the following:

1) Read the text file line by line, storing each line as a separate element in a list

Split elements by comma
Delete unnecessary elements ('\n') in the list

For these two steps, I did this:

file = open('abc.txt', mode = 'r', encoding = 'utf-8-sig')
lines = file.readlines()
file.close()
my_dict = {}
my_list = []
for line in lines:
    line = line.split(',')
    line = [i.strip() for i in line]

2) Set the first row (Date, Day, Sect, 1, 2, 3) as the keys and the other rows as the values in a dictionary.

    my_dict['Date'] = line[0]
    my_dict['Day'] = line[1]
    my_dict['Sect'] = line[2]
    my_dict['1'] = line[3]
    my_dict['2'] = line[4]
    my_dict['3'] = line[5]

The above code has two issues: 1) It also turns the first (header) row into a dictionary. 2) If I add the dictionary to the list as shown below, every element of the list ends up showing only the last row.

3) Create a list including the dictionaries as elements.

    my_list.append(my_dict)

4) Subset the elements that I want.

I couldn't write any code from here. What I want to do is select the elements that meet a condition, for example the dictionaries where Sect is 2-2. The wanted result would be as follows:

>> [{'Date': '2', 'Day': 'Mon', 'Sect': '2-2', '1': '234', '2': '585', '3': '282'}, {'Date': '3', 'Day': 'Tue', 'Sect': '2-2', '1': '231', '2':'232', '3':'686'}]
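The second issue described above (every list element showing the last row) comes from creating one dictionary before the loop and appending it repeatedly, so the list holds several references to the same object. A minimal sketch of the effect and one possible fix, using made-up two-column rows rather than the real file:

```python
rows = [["1", "Sun"], ["2", "Mon"]]  # illustrative data, not read from abc.txt

# Reusing one dict: every append stores a reference to the same object,
# so later assignments overwrite what earlier list entries display.
shared = {}
aliased = []
for row in rows:
    shared["Date"], shared["Day"] = row
    aliased.append(shared)

# Creating a new dict inside the loop gives each row its own object.
separate = []
for row in rows:
    separate.append({"Date": row[0], "Day": row[1]})

print(aliased)   # both entries show the last row
print(separate)  # one entry per row
```

Skipping the header row is a separate matter; it can be handled by treating the first line differently from the rest.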

Thanks,

Answer 1

@supremed14, you can also try the code below to prepare the list of dictionaries after reading the file.

data.txt

The text file contains white space around the values. The strip() method defined on strings takes care of this.

Date, Day, Sect, 1, 2, 3

1, Sun, 1-1, 123, 345, 678

2, Mon, 2-2, 234, 585, 282

3, Tue, 2-2, 231, 232, 686

Source code:

Here you do not need to worry about closing the file. The with statement closes it automatically when the block ends.

import json
my_list = []

with open('data.txt') as f:
    lines = f.readlines() # list containing lines of file
    columns = [] # To store column names

    i = 1
    for line in lines:
        line = line.strip() # remove leading/trailing white spaces
        if line:
            if i == 1:
                columns = [item.strip() for item in line.split(',')]
                i = i + 1
            else:
                d = {} # dictionary to store file data (each line)
                data = [item.strip() for item in line.split(',')]
                for index, elem in enumerate(data):
                    d[columns[index]] = elem

                my_list.append(d) # append dictionary to list

# pretty printing list of dictionaries
print(json.dumps(my_list, indent=4))

Output:

[
    {
        "Date": "1",
        "Day": "Sun",
        "Sect": "1-1",
        "1": "123",
        "2": "345",
        "3": "678"
    },
    {
        "Date": "2",
        "Day": "Mon",
        "Sect": "2-2",
        "1": "234",
        "2": "585",
        "3": "282"
    },
    {
        "Date": "3",
        "Day": "Tue",
        "Sect": "2-2",
        "1": "231",
        "2": "232",
        "3": "686"
    }
]
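The list built above still needs the subset step from the question. One way to do it, sketched here against a literal copy of the output above, is a list comprehension. The helper name `filter_by` is made up for illustration and is not part of the answer's code:

```python
def filter_by(records, key, value):
    """Return the dictionaries in records whose key maps to value."""
    return [rec for rec in records if rec.get(key) == value]


# Literal copy of the list of dictionaries printed above.
my_list = [
    {"Date": "1", "Day": "Sun", "Sect": "1-1", "1": "123", "2": "345", "3": "678"},
    {"Date": "2", "Day": "Mon", "Sect": "2-2", "1": "234", "2": "585", "3": "282"},
    {"Date": "3", "Day": "Tue", "Sect": "2-2", "1": "231", "2": "232", "3": "686"},
]

subset = filter_by(my_list, "Sect", "2-2")
print(subset)
```

Using `rec.get(key)` rather than `rec[key]` means a row that lacks the column is simply left out instead of raising a KeyError.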

Answer 2

Using pandas this is pretty easy:

Input:

$ cat test.txt
Date, Day, Sect, 1, 2, 3
1, Sun, 1-1, 123, 345, 678
2, Mon, 2-2, 234, 585, 282
3, Tue, 2-2, 231, 232, 686

Operations:

import pandas as pd
df = pd.read_csv('test.txt', skipinitialspace=True)
df.loc[df['Sect'] == '2-2'].to_dict(orient='records')

Output:

[{'1': 234, '2': 585, '3': 282, 'Date': 2, 'Day': 'Mon', 'Sect': '2-2'},
 {'1': 231, '2': 232, '3': 686, 'Date': 3, 'Day': 'Tue', 'Sect': '2-2'}]

Answer 3

If your .txt file is in the CSV format:

Date, Day, Sect, 1, 2, 3

1, Sun, 1-1, 123, 345, 678

2, Mon, 2-2, 234, 585, 282

3, Tue, 2-2, 231, 232, 686

You can use the csv library:

from csv import reader
from pprint import pprint

result = []
with open('file.txt') as in_file:

    # create a csv reader object
    csv_reader = reader(in_file)

    # extract headers
    headers = [x.strip() for x in next(csv_reader)]

    # go over each line 
    for line in csv_reader:

        # if line is not empty
        if line:

            # create dict for line
            d = dict(zip(headers, map(str.strip, line)))

            # append dict if it matches your condition
            if d['Sect'] == '2-2':
                result.append(d)

pprint(result)

Which gives the following list:

[{'1': '234', '2': '585', '3': '282', 'Date': '2', 'Day': 'Mon', 'Sect': '2-2'},
 {'1': '231', '2': '232', '3': '686', 'Date': '3', 'Day': 'Tue', 'Sect': '2-2'}]
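A shorter variant of the same idea, offered as a sketch rather than a replacement: csv.DictReader accepts the skipinitialspace option, which drops the blank that follows each comma, so neither the header nor the values need an explicit strip(). The sample text is inlined through io.StringIO so the snippet runs without file.txt, and it assumes the values carry no trailing spaces:

```python
import csv
import io

# Inline copy of the sample data, including a blank line between rows.
sample = """Date, Day, Sect, 1, 2, 3
1, Sun, 1-1, 123, 345, 678

2, Mon, 2-2, 234, 585, 282
3, Tue, 2-2, 231, 232, 686
"""

# skipinitialspace=True ignores whitespace immediately after each delimiter;
# DictReader uses the first row as keys and skips empty rows.
reader = csv.DictReader(io.StringIO(sample), skipinitialspace=True)
result = [dict(row) for row in reader if row["Sect"] == "2-2"]
print(result)
```

With a real file, `io.StringIO(sample)` would be replaced by the file object returned from `open(...)`.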

Answer 4

I recommend you make the file a .csv (comma-separated values) file. A parser for that file would look something like this:

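import csv

def parseCsvFile(dataFile):
    result = {}  # maps the first column's value to that row's dictionary
    with open(dataFile) as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            key = None
            for k in row:
                # header names and values carry the space that follows each comma
                stripK = k.strip()
                stripV = row[k].strip()
                if key is None:
                    # the first column's value becomes the key of the outer dictionary
                    key = stripV
                    result[key] = {}
                result[key][stripK] = stripV
    return result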

This returns a dictionary of dictionaries, keyed by the value in the first column of each row.
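For the sample data, tracing the function by hand suggests a result keyed by the Date column. The sketch below inlines that shape as a literal (the variable name `parsed` is made up here) and shows how the subset from the question could be taken from a dictionary of dictionaries:

```python
# Hand-written stand-in for what the parser is expected to return
# for the sample data; not produced by actually reading a file.
parsed = {
    "1": {"Date": "1", "Day": "Sun", "Sect": "1-1", "1": "123", "2": "345", "3": "678"},
    "2": {"Date": "2", "Day": "Mon", "Sect": "2-2", "1": "234", "2": "585", "3": "282"},
    "3": {"Date": "3", "Day": "Tue", "Sect": "2-2", "1": "231", "2": "232", "3": "686"},
}

# Look up a single row by its first-column value.
print(parsed["2"]["Day"])

# Collect the inner dictionaries that match the condition.
subset = [row for row in parsed.values() if row["Sect"] == "2-2"]
print(subset)
```

One thing to keep in mind with this layout: two rows sharing the same first-column value would land on the same key, so the later row would overwrite the earlier one.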

Answer 5

If you are allowed to use pandas, you can achieve your task with:

import pandas as pd
df = pd.read_csv('abc.txt', skipinitialspace=True) # reads your CSV file into a DataFrame
d = df.loc[df['Sect'] == '2-2'].to_dict('records') # keeps the records whose `Sect` value is '2-2' and returns them as a list of dictionaries

To install pandas, run:

python3 -m pip install pandas

Assuming the contents of abc.txt are what you provided, d will be:

[{'Date': 2, 'Day': 'Mon', 'Sect': '2-2', '1': 234, '2': 585, '3': 282},
 {'Date': 3, 'Day': 'Tue', 'Sect': '2-2', '1': 231, '2': 232, '3': 686}]

5 answers

Answer 1
2 2018-07-15 04:17:35

Answer 2
1 2018-07-15 03:54:13

Answer 3
1 2018-07-15 04:12:32

Answer 4
0 2018-07-15 03:39:04

Answer 5
0 2018-07-15 03:48:56
