简体   繁体   中英

Import rows from Excel file in to Python as a list for each row

My excel file looks like this:

             X_0          X_1  y
0    2917.211242  3289.522533  0
1    1888.937716   781.528356  0
2    4188.521414  1554.476261  0
3    8145.555339  9804.066728  0
4    9584.488981  6176.337189  0

I'm trying to read the data to look like this (ie each row in a list within a big list, don't mind the actual entries in the example):

[[3.393533211,2.331273381,0],
    [3.110073483,1.781539638,0],
    [1.343808831,3.368360954,0],
    [3.582294042,4.67917911,0],
    [2.280362439,2.866990263,0],
    [7.423436942,4.696522875,1],
    [5.745051997,3.533989803,1],
    [9.172168622,2.511101045,1],
    [7.792783481,3.424088941,1],
    [7.939820817,0.791637231,1]]

What I've tried so far is this:

import pandas as pd

df = pd.read_excel("nameOfMyDatabase.xlsx")
list1 = list(df['X_0'])
list2 = list(df['X_1'])
list3 = list(df['y'])

print(list1)
print(list2)
print(list3)

And this is the result I'm getting:

[2917.2112422149, 1888.93771568063, 4188.52141357231, 8145.55533888016, 9584.48898099062, 4039.63375666484, ....]

[3289.52253268053, 781.528355771809, 1554.47626074928, 9804.06672844197, 6176.33718886522, 167.607142550757, ....]

[0, 0, 0, 0, 0, 0, ....]

So not quite what i want yet but trial and error is not working for me. Any ideas?

If you just want to parse data from Excel and don't care about using the pandas DataFrame itself, you should be using openpyxl .

For example, for a spreadsheet like this:

在此处输入图像描述

and assuming that your desired spreadsheet is the first/only sheet in the workbook, you could do:

import openpyxl

wb = openpyxl.load_workbook("nameOfMyDatabase.xlsx")
ws = wb.active
cell_range = 'A2:C5'
data = [[cell.value for cell in row] for row in ws[cell_range]]
print(data)
# output: [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

You could try using

big_list = []
for a, b in zip(df["x_0"], df["x_0"]):
   list = []
   list.append(a)
   list.append(b)
   big_list = big_list.append(list)

You can try:

import pandas as pd

df = pd.read_excel("nameOfMyDatabase.xlsx")

data = df.values.tolist()

This is my suggestion using numpy and openpyxl:

import openpyxl
import numpy as np

wb = openpyxl.load_workbook('test.xlsx')
ws = wb['Sheet1']

lst = []
for v in ws.values:
    lst.append(v)
print(lst)

values =  = np.array(lst)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM