
Read data in OpenOffice xlsx file then convert to a list using python

Good day. I am using pandas to read a column of data in an xlsx file. I only have one column, and it is filled with people's ages. This is my code:

import math
import pandas as pd
import openpyxl

df = pd.read_excel('/run/media/soumi/Luna (HDD)/CF-0001/PSHS/Statistics/Module 1/SLG 3.0 Data set 3.1 DOH COVID Data Drop Case Information.xlsx') # can also index sheet by name or fetch all sheets
num = df['B2:B57001;'].tolist()

num.sort() 
print(f'\n{num}')

fdt = pd.Series(num).value_counts().sort_index().reset_index().reset_index(drop=True)
fdt.columns = ['Element', 'Frequency']
print (fdt)

I expected it to create a frequency table for me. However, when I run it, it shows me this:

 python -u "/run/media/soumi/Luna (HDD)/CF-0001/Programming/Python/Stat.py"
Traceback (most recent call last):
  File "/home/soumi/.local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 5198, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 5206, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'B2:B57001;'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/run/media/soumi/Luna (HDD)/CF-0001/Programming/Python/Stat.py", line 6, in <module>
    num = df['B2:B57001;'].tolist()
  File "/home/soumi/.local/lib/python3.9/site-packages/pandas/core/frame.py", line 3458, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/soumi/.local/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
    raise KeyError(key) from err
KeyError: 'B2:B57001;'

What should I do to fix this? Thanks in advance.

Edit 1: I'd also like to add that I use only Manjaro, not Windows, so I don't have the Excel app, if that helps.

Your code tries to find a column named B2:B57001; but df[...] looks up column labels, not Excel cell ranges, which is why pandas raises the KeyError. What you should do, IIUC:

num = df.iloc[:, 0].tolist()
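To illustrate the difference between a label lookup and a positional lookup, here is a minimal sketch. The DataFrame is a made-up stand-in for the spreadsheet (the real header name is not shown in the question), and `load_column` is a hypothetical helper, not something from the original code. It assumes the ages sit in Excel column B, as the range in the question suggests.

```python
import pandas as pd

# Hypothetical stand-in for the spreadsheet: one column whose header is "Age".
df = pd.DataFrame({"Age": [34, 21, 34, 58, 21, 34]})

# df[...] looks up a column *label*, so an Excel-style range is just an unknown key.
try:
    df["B2:B57001;"]
    range_lookup_failed = False
except KeyError:
    range_lookup_failed = True

# Selecting by position works whatever the header is called.
num = df.iloc[:, 0].tolist()
num.sort()
print(num)


def load_column(path, excel_col="B"):
    """Hypothetical helper: read one Excel column (by letter) into a list."""
    # usecols accepts Excel column letters; reading .xlsx requires openpyxl.
    return pd.read_excel(path, usecols=excel_col).iloc[:, 0].tolist()
```

If the file has more than one column, passing the column letter through `usecols` is probably the closest equivalent to the Excel range the question was trying to use.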

If you want to construct a frequency table from the column Age, then you can do this:

frequency = df['Age'].value_counts().to_frame().reset_index()

If you just want a list of frequencies, then do this:

freq_list = df['Age'].value_counts().tolist()
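The column names produced by `reset_index()` after `value_counts()` differ between pandas releases, so naming them explicitly may be safer. The following is a sketch with made-up sample ages, not the asker's data. The `Element` and `Frequency` names are taken from the question's own code.

```python
import pandas as pd

# Made-up sample data standing in for the Age column.
ages = pd.Series([34, 21, 34, 58, 21, 34], name="Age")

# Count each distinct age, then order the table by age rather than by count.
counts = ages.value_counts().sort_index()

# Name the index and the count column explicitly so the result does not
# depend on the default names a given pandas version chooses.
fdt = counts.rename_axis("Element").reset_index(name="Frequency")
print(fdt)

# Plain list of frequencies, in the same (age-sorted) order.
freq_list = counts.tolist()
```

Calling `sort_index()` keeps the table ordered by age, which matches what the question's `num.sort()` step was aiming for.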


 