
How to extract columns from a customized data frame?

I used Python to build a customized data frame from Excel files by way of a class. This is my current code:

import os
import pandas as pd
import math
cwd = os.path.abspath('') 
files = os.listdir(cwd)  
frames = []
for file in files:
    if file.endswith('.XLSX'):
        frames.append(pd.read_excel(file))
# DataFrame.append was removed in pandas 2.0, so concatenate the frames once.
df = pd.concat(frames, ignore_index=True)
df = df[['name', 'cost', 'used_by', 'prime']]

header = list(df.columns.values)
print(header) 

df = df.where(df.notnull(), None)
array = df.values.tolist()
print(array)
class Item():
    __name = ""
    __cost = 0
    __gender = ""
    __prime = ""

    def has_all_properties(self):
        return bool(self.__name and not math.isnan(self.__cost) and self.__gender and self.__prime)

    def clean(self,wanted_cost,wanted_gender,wanted_prime):
        return bool(self.__name and self.__gender == wanted_gender and self.__cost <= wanted_cost and self.__prime == wanted_prime)
    
    def __init__(self, name, cost, gender, prime):
        self.__name = name
        self.__cost = cost
        self.__gender = gender
        self.__prime = prime

    def __eq__(self, other):
        return (self.__name == other.__name and self.__cost == other.__cost and self.__gender == other.__gender and self.__prime == other.__prime)   
    def __hash__(self):
        return hash((self.__name, self.__cost, self.__gender, self.__prime))

    def __repr__(self):
        return f"Item({self.__name},{self.__cost},{self.__gender},{self.__prime})"

    def tuple(self): 
        return self.__name, self.__cost, self.__gender, self.__prime

mylist = {Item(*k) for k in array}
print(mylist)

filtered = {obj for obj in mylist if obj.has_all_properties()}
clean = {obj for obj in filtered if obj.clean(20,"male","yes")}
result = list(clean)
print(result)


t_list = [obj.tuple() for obj in result]
output = pd.DataFrame(t_list, columns = header)
output.to_excel('clean_data.xlsx', index = False, header = True)

The Excel file I import from looks like this:

    product cost   used_by prime
    name    price  gender  yes or no
    name    price  gender  yes or no
    ... and so on 

The Item objects created by the class look like this:

mylist = {Item(UNO,15.0,None,None), 
          Item(pen,5.0,female,yes), 
          Item(underwear,15.0,male,yes), 
          Item(google,25.0,male,no), 
          Item(mug,58.0,male,no), 
          Item(None,10.0,female,no),
          ... and so on}

What I want is a def (method) in the class that can retrieve one column of data.

So I think it would look something like this:

def get_value(self,title):
     ...  # this is the code I am looking for

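When I call it for a column, for example get_value(product), I should get a list of all the product names, which should look like this:

list = [UNO, pen, underwear, google, mug, None, ... and so on]

If there is a built-in function for classes that can do this, I would like to see it.

Could you give me some advice? Thanks in advance.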
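
One way the requested get_value could be sketched is shown below. It is only an illustration under several assumptions: Item is simplified to a frozen dataclass with public attributes (the original uses name-mangled private ones), COLUMN_TO_ATTR is a hypothetical mapping from the spreadsheet headers to those attributes, and get_value is written as a plain function over a collection of items rather than as a method.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List


@dataclass(frozen=True)  # frozen dataclasses provide __eq__ and __hash__
class Item:
    """Simplified stand-in for the Item class in the question."""
    name: Any
    cost: Any
    gender: Any
    prime: Any


# Hypothetical mapping: spreadsheet header -> attribute name on Item.
COLUMN_TO_ATTR = {
    "product": "name",
    "cost": "cost",
    "used_by": "gender",
    "prime": "prime",
}


def get_value(items: Iterable[Item], title: str) -> List[Any]:
    """Collect one 'column' of values from a collection of Item objects."""
    attr = COLUMN_TO_ATTR[title]  # raises KeyError for an unknown header
    return [getattr(item, attr) for item in items]


# A list is used here so the order is predictable; a set would also work,
# but its iteration order is not guaranteed.
items = [
    Item("UNO", 15.0, None, None),
    Item("pen", 5.0, "female", "yes"),
    Item("underwear", 15.0, "male", "yes"),
]

products = get_value(items, "product")
print(products)  # ['UNO', 'pen', 'underwear']
```

Because mylist in the question is a set, the returned values would come back in arbitrary order there; keeping the items in a list preserves the row order of the spreadsheet.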

Try pandas.Series.unique

Example:

    product     cost    used_by     prime
0   comic       50.55   female      yes
1   paint       14.00   male        no
2   headphone   45.00   female      no
3   phone case  20.23   male        yes
4   pen         66.00   female      no

df['product'].unique()

  >> array(['comic', 'paint', 'headphone', 'phone case', 'pen'], dtype=object)

df['used_by'].unique()

  >> array(['female', 'male'], dtype=object)
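
Note that unique() drops repeated values. If every row of the column is wanted, as the question seems to ask, Series.tolist() may be closer to the goal. The sketch below rebuilds the example frame above by hand; get_column is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# The example frame from above, typed in by hand for illustration.
df = pd.DataFrame({
    "product": ["comic", "paint", "headphone", "phone case", "pen"],
    "cost": [50.55, 14.00, 45.00, 20.23, 66.00],
    "used_by": ["female", "male", "female", "male", "female"],
    "prime": ["yes", "no", "no", "yes", "no"],
})


def get_column(frame: pd.DataFrame, title: str) -> list:
    """Return every value in the named column, duplicates included."""
    return frame[title].tolist()


all_users = get_column(df, "used_by")
distinct_users = df["used_by"].unique().tolist()

print(all_users)       # ['female', 'male', 'female', 'male', 'female']
print(distinct_users)  # ['female', 'male']
```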

