
How to apply a function to each row of one column in a pandas dataframe?

I have a dataframe df of stock prices with about 600k rows, which I downloaded from here.

I have renamed the last column from 'Name' to 'Ticker', and created a new blank column called 'Name':

df = df.rename(columns={'Name': 'Ticker'})
df['Name'] = ''

I have written the following function to return the company name for a given ticker symbol:

! pip3 install yfinance
import yfinance as yf

def return_company_name(ticker):
    return yf.Ticker(ticker).info['longName']

return_company_name('MSFT')
>>> 'Microsoft Corporation'

Now, I want to populate the column 'Name' with the company name of the corresponding ticker symbol. For that, I have written the following lambda function:

df.Name = df.Ticker.apply(lambda x: return_company_name(x))

But this last line of code just keeps on running. Is something going wrong? If so, how do I fix it?

I tried the same with map instead of apply, but got the same result.

First, you don't need a lambda or apply.

 df.Name = df.Ticker.map(return_company_name)

This is better. Second, as others have pointed out, looking up the name once per row is grotesquely inefficient. You are making the call 600,000 times, even though the number of distinct tickers is much smaller. The following sledgehammer approach will work:

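class my_return():
    """Callable that caches the company name for each ticker it has seen."""
    def __init__(self):
        self.tickdict = {}

    def __call__(self, ticker):
        ans = self.tickdict.get(ticker, None)
        if ans is not None:
            return ans
        else:
            # First time this ticker is seen: look it up once and remember it.
            self.tickdict[ticker] = return_company_name(ticker)
            return self.tickdict[ticker]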

Then map an instance of my_return (that is, my_return(), not the class itself) over your Ticker column.
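For comparison, the same caching idea can be sketched with functools.lru_cache from the standard library instead of a hand-written class. This is only an illustration: cached_company_name, FAKE_NAMES and call_count are hypothetical names, and the lookup is an offline stand-in for return_company_name so that the example runs without network access.

```python
from functools import lru_cache

import pandas as pd

# Hypothetical offline stand-in for return_company_name (no network access).
FAKE_NAMES = {"MSFT": "Microsoft Corporation", "AAPL": "Apple Inc."}
call_count = 0  # counts how many real lookups actually happen


@lru_cache(maxsize=None)
def cached_company_name(ticker):
    """Look up a company name; repeated tickers are served from the cache."""
    global call_count
    call_count += 1
    return FAKE_NAMES[ticker]


df = pd.DataFrame({"Ticker": ["MSFT", "AAPL", "MSFT", "MSFT", "AAPL"]})

# The function runs once per distinct ticker, not once per row.
df["Name"] = df["Ticker"].map(cached_company_name)
```

With five rows but only two distinct tickers, the underlying lookup is executed twice; the remaining rows hit the cache.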

Looking at the yfinance source, you can see here that the get_info method calls _get_fundamentals, which in turn appears to make quite a few API calls to different sites to get the information it needs.

Since this is executed for every row, you will run into trouble, as the sites might rate-limit you. Maybe you could add a preliminary step: collect all the unique ticker symbols, look each one up once, and save the results in some kind of lookup CSV or the like.
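The preliminary step described above might look roughly like the following sketch. lookup_company_name is a hypothetical offline stand-in for the slow network call, and the CSV file name in the comment is likewise only an assumption.

```python
import pandas as pd


def lookup_company_name(ticker):
    """Hypothetical stand-in for the slow, rate-limited network lookup."""
    return {"MSFT": "Microsoft Corporation", "AAPL": "Apple Inc."}[ticker]


df = pd.DataFrame({"Ticker": ["MSFT", "AAPL", "MSFT", "AAPL", "MSFT"]})

# Step 1: one lookup per distinct ticker instead of one per row.
unique_tickers = df["Ticker"].unique()
name_by_ticker = {t: lookup_company_name(t) for t in unique_tickers}

# Step 2 (optional): keep the result as a small lookup table for reuse.
lookup = pd.DataFrame(
    {"Ticker": list(name_by_ticker), "Name": list(name_by_ticker.values())}
)
# lookup.to_csv("ticker_names.csv", index=False)  # hypothetical file name

# Step 3: fill the column from the dictionary; no further lookups are made.
df["Name"] = df["Ticker"].map(name_by_ticker)
```

Passing a dictionary to Series.map replaces each ticker with its stored name, so the cost of the expensive call scales with the number of distinct tickers rather than the number of rows.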

You can use DataFrame.apply() to apply a function to each row or column of a DataFrame.

You can also apply a lambda function to each column. For example:
modDfObj = dfObj.apply(lambda x: x + 10)


Another example (here, the function is applied only to column z):

modDfObj = dfObj.apply(lambda x: np.square(x) if x.name == 'z' else x)
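As a runnable illustration of the two snippets above, here is a small sketch on made-up data. The contents of dfObj and the variable names plus_ten, squared_z and row_sums are assumptions chosen for the example; the last line additionally shows axis=1, which passes each row (rather than each column) to the function.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only.
dfObj = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 8, 9]})

# By default (axis=0) the function receives one column (a Series) at a time.
plus_ten = dfObj.apply(lambda col: col + 10)

# The column name is available as col.name, so one column can be singled out.
squared_z = dfObj.apply(lambda col: np.square(col) if col.name == "z" else col)

# With axis=1 the function receives one row at a time instead.
row_sums = dfObj.apply(lambda row: row["x"] + row["y"], axis=1)
```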
