Extract specific row from xlsx file to a list using python

Question

I want to extract specific rows (assume for now that I already have the row number) from an .xlsx file into a list. In addition, I don't know if it is possible, but I would like to use the first column as the list's name.

For example: the table I want to extract info from:

                   12/31/2020    12/31/2019    12/31/2018    12/31/2017
Revenue          1.823500e+11  1.614020e+11  1.369580e+11  1.110240e+11
Revenue Growth   1.298000e-01  1.785000e-01  2.336000e-01  2.373000e-01
Cost of Revenue  8.473200e+10  7.189600e+10  5.954900e+10  4.558300e+10
Gross Profit     9.761800e+10  8.950600e+10  7.740900e+10  6.544100e+10

If possible, I want to get the data in this form: Revenue = ["1.8235E+11", "1.61402E+11", "1.36958E+11", "1.11024E+11"]

I have already tried using xlrd for this, but I always get this message:

xlrd.biffh.XLRDError: Excel xlsx file; not supported

Thanks in advance and thank you for your help!

Answer 1

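xlrd 2.0 and later only read .xls files, which is why you get that error. Install openpyxl, then use pandas' read_excel with it as the engine: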

# Python env: pip install openpyxl
# Anaconda env: conda install openpyxl

import pandas as pd

# index_col=0 turns the first column (Revenue, ...) into the row labels
df = pd.read_excel('data.xlsx', index_col=0, engine='openpyxl')
print(df)

# Output:
                   12/31/2020    12/31/2019    12/31/2018    12/31/2017
Revenue          1.823500e+11  1.614020e+11  1.369580e+11  1.110240e+11
Revenue Growth   1.298000e-01  1.785000e-01  2.336000e-01  2.373000e-01
Cost of Revenue  8.473200e+10  7.189600e+10  5.954900e+10  4.558300e+10
Gross Profit     9.761800e+10  8.950600e+10  7.740900e+10  6.544100e+10

To extract the row Revenue (as a pandas Series), use:

Revenue = df.loc['Revenue']
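Since the question asks for a plain list, here is a minimal sketch of how the row could be converted. It assumes the frame looks like the output above; to keep the example runnable without the spreadsheet, df is rebuilt in memory from the values in the question rather than read from data.xlsx. The names revenue, second_row, rows and revenue_str are only illustrative. Python has no clean way to create a variable named after a cell value, so a dict keyed by the first column is used as the stand-in for "the list's name".

```python
import pandas as pd

# Stand-in for the result of pd.read_excel(..., index_col=0);
# the values are copied from the table in the question.
df = pd.DataFrame(
    {
        "12/31/2020": [1.8235e11, 0.1298, 8.4732e10, 9.7618e10],
        "12/31/2019": [1.61402e11, 0.1785, 7.1896e10, 8.9506e10],
        "12/31/2018": [1.36958e11, 0.2336, 5.9549e10, 7.7409e10],
        "12/31/2017": [1.11024e11, 0.2373, 4.5583e10, 6.5441e10],
    },
    index=["Revenue", "Revenue Growth", "Cost of Revenue", "Gross Profit"],
)

# Row selected by its label -> plain Python list of floats
revenue = df.loc["Revenue"].tolist()

# Row selected by position (0-based; the header row is not counted)
second_row = df.iloc[1].tolist()

# All rows at once, keyed by the first column
rows = {label: values.tolist() for label, values in df.iterrows()}

# Optional: strings in scientific notation instead of floats
revenue_str = [f"{v:E}" for v in revenue]

print(revenue)
print(rows["Gross Profit"])
```

With the dict, rows["Revenue"] plays the role of a list called Revenue. If you only have the row number, df.iloc[n] is the positional counterpart of df.loc[label]. The string format used for revenue_str is just one choice; adjust the format spec if you need a different number of digits.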

1 answer

solution1
0 ACCEPTED 2021-12-15 10:11:35
