How do I access a variable defined inside a function from another file in Python?

Question

I have the following file named calculo_indice.py:

import pandas as pd

def limites(df,n):
    n_sigma = n * df.valor_unitario.std()
    mean = df.valor_unitario.mean()
    lower_bound: float = mean - n_sigma
    upper_bound: float = mean + n_sigma
    return (lower_bound,upper_bound)


def indice(df):
    df['isOutlier'] = df['valor_unitario'].apply(lambda x: True if x < lower_bound or x > upper_bound else False)
    df = df[~df.isOutlier]
    df['indice'] = df['valor_unitario'].apply(lambda x: ((x-lower_bound)/(upper_bound-lower_bound))*2000)
    df = df.astype({'indice': 'int64'})

It is meant to compute the lower and upper bounds of a DataFrame column (the first function, limites) and then compute an index from those bounds (the function indice).

Running calculo_indice.py on its own works fine, but when I run the original file that calls these functions, I get a NameError.

I import the file with import calculo_indice as indice and then call the functions like this:

indice.limites(df, 2)

indice.indice(df)

I also tried print(lower_bound), which is why I tried returning the values.

Traceback (most recent call last):
  File "C:\Users\...\Indice.py", line 19, in <module>
    indice.indice(df)
  File "C:\Users\...\calculo_indice.py", line 12, in indice
    df['isOutlier'] = df['valor_unitario'].apply(lambda x: True if x < lower_bound or x > upper_bound else False)
  File "C:\ProgramData\Anaconda3\envs\Indice\lib\site-packages\pandas\core\series.py", line 4138, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\_libs\lib.pyx", line 2467, in pandas._libs.lib.map_infer
  File "C:\Users\...\calculo_indice.py", line 12, in <lambda>
    df['isOutlier'] = df['valor_unitario'].apply(lambda x: True if x < lower_bound or x > upper_bound else False)
NameError: name 'lower_bound' is not defined

Process finished with exit code 1

What am I doing wrong? Thanks for your help.

Answer 1

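lower_bound and upper_bound are defined only in the local scope of limites. If they also need to be available inside indice, you have to pass them in as arguments so that they are in scope there.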
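The scoping rule can be seen in isolation with a minimal sketch. The function names here (make_bounds, use_bounds_wrong, use_bounds_right) are hypothetical and are not part of the question's code:

```python
def make_bounds():
    # lower_bound exists only while make_bounds() is running.
    lower_bound = 1.0
    return lower_bound


def use_bounds_wrong():
    # lower_bound is not defined in this function or at module level.
    return lower_bound


def use_bounds_right(lower_bound):
    # The value arrives as a parameter, so the name is defined here.
    return lower_bound


make_bounds()  # the return value is discarded, as in the question

try:
    use_bounds_wrong()
    error_message = None
except NameError as exc:
    error_message = str(exc)

# Passing the returned value explicitly makes it available to the callee.
value = use_bounds_right(make_bounds())
print(error_message)
print(value)
```

Returning a value from one function does not create a name in any other function; the caller has to capture the return value and hand it on.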

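I also modified your indice function. First, it needs to return the DataFrame so that you can assign the result to a variable and have your changes actually take effect. Second, your Series.apply calls are inefficient; there are vectorized alternatives that operate on the whole Series at once.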

calculo_indice.py

def limites(df,n):
    n_sigma = n * df.valor_unitario.std()
    mean = df.valor_unitario.mean()
    lower_bound: float = mean - n_sigma
    upper_bound: float = mean + n_sigma
    return (lower_bound, upper_bound)


def indice(df, lower_bound, upper_bound):
    # Vectorized check
    df['isOutlier'] = ~df['valor_unitario'].between(lower_bound, upper_bound)
    # Copy the filtered rows so the assignment below does not raise SettingWithCopyWarning
    df = df[~df.isOutlier].copy()
    
    # Vectorized calculation
    df['indice'] = (df['valor_unitario']-lower_bound)/(upper_bound-lower_bound)*2000
    df = df.astype({'indice': 'int64'})
    
    return df
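As a quick sanity check, the vectorized outlier test can be compared with the original apply-based version on a small sample. The data and the bounds below are made-up illustration values, not taken from the question:

```python
import pandas as pd

# Hypothetical sample data; the bounds are arbitrary illustration values.
df = pd.DataFrame({"valor_unitario": [1.0, 5.0, 10.0, 15.0, 50.0]})
lower_bound, upper_bound = 5.0, 15.0

# Original approach: a Python-level lambda called once per row.
via_apply = df["valor_unitario"].apply(
    lambda x: x < lower_bound or x > upper_bound
)

# Vectorized approach: between() is inclusive on both ends by default,
# so its negation flags exactly the values outside [lower_bound, upper_bound].
via_between = ~df["valor_unitario"].between(lower_bound, upper_bound)

# Compare the two results element by element.
same_result = bool((via_apply == via_between).all())
print(same_result)
```

Both versions treat values exactly equal to a bound as non-outliers, so swapping one for the other should not change which rows are kept.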

Then you call limites, assign its return values to variables (it returns both the lower and the upper bound), and pass those variables to indice:

import calculo_indice as indice

# Assign lower bound and upper bound to variables `lb` and `ub` respectively
lb,ub = indice.limites(df, 2)

df = indice.indice(df, lower_bound=lb, upper_bound=ub)
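To try the whole flow without two separate files, the sketch below puts condensed variants of both functions in a single script and runs them on made-up data. The sample values are hypothetical, and this indice variant skips the isOutlier column and filters directly, so treat it as an illustration and not a drop-in replacement:

```python
import pandas as pd


def limites(df, n):
    # Bounds at n standard deviations around the mean.
    n_sigma = n * df["valor_unitario"].std()
    mean = df["valor_unitario"].mean()
    return mean - n_sigma, mean + n_sigma


def indice(df, lower_bound, upper_bound):
    # Keep only rows within the bounds; copy so the new column
    # is added to an independent DataFrame, not to the caller's.
    inside = df["valor_unitario"].between(lower_bound, upper_bound)
    out = df[inside].copy()
    # Scale to the 0..2000 range and truncate to integers.
    scaled = (out["valor_unitario"] - lower_bound) / (upper_bound - lower_bound) * 2000
    out["indice"] = scaled.astype("int64")
    return out


# Hypothetical sample data: nine typical prices and one extreme value.
df = pd.DataFrame({"valor_unitario": [10.0] * 9 + [1000.0]})

lb, ub = limites(df, 2)
result = indice(df, lower_bound=lb, upper_bound=ub)
print(len(df), len(result))
```

With this sample, the extreme value lies more than two standard deviations above the mean, so it is dropped and the remaining rows receive an index between 0 and 2000.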
