How to handle data from xlsx file in Python

These are the named ranges in an uploaded xlsx sheet. The titles are unwieldy, so I want to give them shorter names that are easier to refer to throughout the code.

I'm fairly new to this and unsure how to make the code below cleaner and more efficient, especially if I add more named ranges.

VIC_Male = 'Estimated Resident Population ;  Male ;  Victoria ;'
QL_Male = 'Estimated Resident Population ;  Male ;  Queensland ;'
SA_Male = 'Estimated Resident Population ;  Male ;  South Australia ;'
WA_Male = 'Estimated Resident Population ;  Male ;  Western Australia ;'
TAS_Male = 'Estimated Resident Population ;  Male ;  Tasmania ;'
NT_Male = 'Estimated Resident Population ;  Male ;  Northern Territory ;'
ACT_Male = 'Estimated Resident Population ;  Male ;  Australian Capital Territory ;'
TOTAL_Male = 'Estimated Resident Population ;  Male ;  Australia ;'
NSW_Female = 'Estimated Resident Population ;  Female ;  New South Wales ;'
VIC_Female = 'Estimated Resident Population ;  Female ;  Victoria ;'
QL_Female = 'Estimated Resident Population ;  Female ;  Queensland ;'
SA_Female = 'Estimated Resident Population ;  Female ;  South Australia ;'
WA_Female = 'Estimated Resident Population ;  Female ;  Western Australia ;'
TAS_Female = 'Estimated Resident Population ;  Female ;  Tasmania ;'
NT_Female = 'Estimated Resident Population ;  Female ;  Northern Territory ;'
ACT_Female = 'Estimated Resident Population ;  Female ;  Australian Capital Territory ;'
TOTAL_Female = 'Estimated Resident Population ;  Female ;  Australia ;'
NSW_Persons = 'Estimated Resident Population ;  Persons ;  New South Wales ;'
VIC_Persons = 'Estimated Resident Population ;  Persons ;  Victoria ;'
QL_Persons = 'Estimated Resident Population ;  Persons ;  Queensland ;'
SA_Persons = 'Estimated Resident Population ;  Persons ;  South Australia ;'
WA_Persons = 'Estimated Resident Population ;  Persons ;  Western Australia ;'
TAS_Persons = 'Estimated Resident Population ;  Persons ;  Tasmania ;'
NT_Persons = 'Estimated Resident Population ;  Persons ;  Northern Territory ;'
ACT_Persons = 'Estimated Resident Population ;  Persons ;  Australian Capital Territory ;'
TOTAL_Persons = 'Estimated Resident Population ;  Persons ;  Australia ;'

Let's say you have this csv file. I added column titles in the first line, but the file can also come without them; in the code below I commented the part you can remove if your file has no title line:

"ResultType;Gender;Country
Estimated Resident Population ;  Male ;  Victoria ;
Estimated Resident Population ;  Male ;  Queensland ;
Estimated Resident Population ;  Male ;  South Australia ;
Estimated Resident Population ;  Male ;  Western Australia ;
Estimated Resident Population ;  Male ;  Tasmania ;
Estimated Resident Population ;  Male ;  Northern Territory ;
"

You can begin by making a data structure that corresponds to your data:


class Record():
    def __init__(self, ResultType, Gender, Country):
        self.ResultType = ResultType
        self.Gender = Gender
        self.Country = Country
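
As a side note, the same structure can probably be written with less typing as a dataclass. This is only a sketch and assumes Python 3.7 or later; the field names are copied from the class above and the sample values are just for illustration:

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One line of the file: result type, gender and region."""
    ResultType: str
    Gender: str
    Country: str


# The generated __init__ takes the same three arguments as the hand-written class.
example = Record("Estimated Resident Population", "Male", "Victoria")
print(example)
```

A dataclass also gives you a readable printout and equality comparison for free, which can help while debugging.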

Then create an empty list:

My_records = []

Then open the csv file with the csv library and, for each line, create an instance of your data structure (here the Record class).

import csv

with open('records.txt') as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=';')
    line_count = 0
    for row in csv_reader:
        # You can remove this branch if your csv file has no column-name line
        if line_count == 0:
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            # One Record per data line; the fields keep their surrounding spaces
            instance = Record(row[0], row[1], row[2])
            My_records.append(instance)

All in one:


import csv

class Record():
    def __init__(self, ResultType, Gender, Country):
        self.ResultType = ResultType
        self.Gender = Gender
        self.Country = Country
My_records = []
with open('records.txt') as csv_file:

    csv_reader = csv.reader(csv_file, delimiter=';')
    line_count = 0
    for row in csv_reader:
        if line_count == 0:
            print(f'Column names are {", ".join(row)}')
            line_count += 1
        else:
            instance = Record(row[0], row[1], row[2])
            My_records.append(instance)
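
If the stray spaces around each field get in the way, a variant along these lines might help. It is only a sketch: it reads a two-line sample from an in-memory string instead of records.txt so that it runs on its own, and `clean_rows` and `SAMPLE` are names I made up for illustration.

```python
import csv
import io

# Stand-in for records.txt so the sketch runs without a file on disk.
SAMPLE = """ResultType;Gender;Country
Estimated Resident Population ;  Male ;  Victoria ;
Estimated Resident Population ;  Male ;  Queensland ;
"""


def clean_rows(csv_file):
    """Yield (ResultType, Gender, Country) tuples with whitespace stripped."""
    reader = csv.reader(csv_file, delimiter=';')
    next(reader, None)  # skip the header line; remove this if the file has none
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or incomplete lines
        yield tuple(field.strip() for field in row[:3])


rows = list(clean_rows(io.StringIO(SAMPLE)))
print(rows)
```

With a real file you would pass the object returned by `open('records.txt')` instead of the `io.StringIO` wrapper.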

Now My_records is a list that holds each line of your CSV file as an instance of the Record class, so you can manipulate it as you wish.

For example:

All_countries = set([record.Country.strip() for record in My_records])
print(All_countries)

OUTPUT (all unique countries present in your data):

{'Northern Territory', 'Tasmania', 'South Australia', 'Queensland', 'Western Australia', 'Australia', 'Australian Capital Territory', 'New South Wales', 'Victoria'}
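
Another thing you could do with the list is group the regions by gender. A rough sketch, where the three records are made-up stand-ins for whatever your file actually contains:

```python
from collections import defaultdict


class Record:
    def __init__(self, ResultType, Gender, Country):
        self.ResultType = ResultType
        self.Gender = Gender
        self.Country = Country


# Hypothetical stand-in for the list built from the file.
My_records = [
    Record('Estimated Resident Population ', '  Male ', '  Victoria '),
    Record('Estimated Resident Population ', '  Male ', '  Tasmania '),
    Record('Estimated Resident Population ', '  Female ', '  Victoria '),
]

# Map each gender to the list of regions that have a record for it.
countries_by_gender = defaultdict(list)
for record in My_records:
    countries_by_gender[record.Gender.strip()].append(record.Country.strip())

print(dict(countries_by_gender))
```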

Of course, there are many useful libraries that aim to deal with this kind of thing, like pandas, but here I gave you an example in plain Python (using only the built-in csv library).

By the way, if your file is an Excel file, libraries like pandas have tools for reading it (but you'll have to pip install pandas first, and pandas also needs an Excel engine such as openpyxl to read .xlsx files):


import pandas as pd
dfs = pd.read_excel("record.xlsx", sheet_name="sheet1")

This code replaces the following lines in the above example:


with open('records.txt') as csv_file:

    csv_reader = csv.reader(csv_file, delimiter=';')
    ...

The rest is much the same, except that you loop over the rows of the DataFrame rather than over csv_reader.

Maybe something like this:

VIC_Male       = 'Estimated Resident Population ;  Male ;  Victoria ;'
QL_Male        = 'Estimated Resident Population ;  Male ;  Queensland ;'
SA_Male        = 'Estimated Resident Population ;  Male ;  South Australia ;'
WA_Male        = 'Estimated Resident Population ;  Male ;  Western Australia ;'
TAS_Male       = 'Estimated Resident Population ;  Male ;  Tasmania ;'
NT_Male        = 'Estimated Resident Population ;  Male ;  Northern Territory ;'
ACT_Male       = 'Estimated Resident Population ;  Male ;  Australian Capital Territory ;'
TOTAL_Male     = 'Estimated Resident Population ;  Male ;  Australia ;'
NSW_Female     = 'Estimated Resident Population ;  Female ;  New South Wales ;'
VIC_Female     = 'Estimated Resident Population ;  Female ;  Victoria ;'
QL_Female      = 'Estimated Resident Population ;  Female ;  Queensland ;'
SA_Female      = 'Estimated Resident Population ;  Female ;  South Australia ;'
WA_Female      = 'Estimated Resident Population ;  Female ;  Western Australia ;'
TAS_Female     = 'Estimated Resident Population ;  Female ;  Tasmania ;'
NT_Female      = 'Estimated Resident Population ;  Female ;  Northern Territory ;'
ACT_Female     = 'Estimated Resident Population ;  Female ;  Australian Capital Territory ;'
TOTAL_Female   = 'Estimated Resident Population ;  Female ;  Australia ;'
NSW_Persons    = 'Estimated Resident Population ;  Persons ;  New South Wales ;'
VIC_Persons    = 'Estimated Resident Population ;  Persons ;  Victoria ;'
QL_Persons     = 'Estimated Resident Population ;  Persons ;  Queensland ;'
SA_Persons     = 'Estimated Resident Population ;  Persons ;  South Australia ;'
WA_Persons     = 'Estimated Resident Population ;  Persons ;  Western Australia ;'
TAS_Persons    = 'Estimated Resident Population ;  Persons ;  Tasmania ;'
NT_Persons     = 'Estimated Resident Population ;  Persons ;  Northern Territory ;'
ACT_Persons    = 'Estimated Resident Population ;  Persons ;  Australian Capital Territory ;'
TOTAL_Persons  = 'Estimated Resident Population ;  Persons ;  Australia ;'
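
If more named ranges are coming, it may be easier to generate the titles than to type each one. The sketch below is one possible way, not the only one: `STATES`, `GENDERS` and `TITLES` are names I made up, and it assumes an NSW_Male title exists as well, even though the listing above does not show one.

```python
# Short codes mapped to the region names used in the sheet titles.
STATES = {
    'NSW': 'New South Wales',
    'VIC': 'Victoria',
    'QL': 'Queensland',
    'SA': 'South Australia',
    'WA': 'Western Australia',
    'TAS': 'Tasmania',
    'NT': 'Northern Territory',
    'ACT': 'Australian Capital Territory',
    'TOTAL': 'Australia',
}
GENDERS = ('Male', 'Female', 'Persons')

# Build every title once instead of typing each one by hand.
TITLES = {
    f'{code}_{gender}': f'Estimated Resident Population ;  {gender} ;  {name} ;'
    for gender in GENDERS
    for code, name in STATES.items()
}

print(TITLES['VIC_Male'])
```

You would then write `TITLES['VIC_Male']` wherever the code used the `VIC_Male` variable, and adding a new state or category means adding a single entry.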
