What is the correct way of selecting a value from a pandas DataFrame using column name and row index?
What is the most efficient way to select a value from a pandas DataFrame using a column name and a row index (by which I mean the row number)?
I have a case where I have to iterate through rows.
I have a working solution:
i = 0
while i < len(dataset) - 1:
    if dataset.target[i] == 1:
        dataset.sum_lost[i] = dataset['to_be_repaid_principal'][i] + dataset['to_be_repaid_interest'][i]
        dataset.ratio_lost[i] = dataset.sum_lost[i] / dataset['expected_returned_sum'][i]
    else:
        dataset.sum_lost[i] = 0
        dataset.ratio_lost[i] = 0
    i += 1
But this solution is very RAM-hungry. I am also getting the following warning:
"A value is trying to be set on a copy of a slice from a DataFrame."
So I am trying to come up with another one:
i = 0
while i < len(dataset) - 1:
    if dataset.iloc[i, :].loc['target'] == 1:
        dataset.iloc[i, :].loc['sum_lost'] = dataset.iloc[i, :].loc['to_be_repaid_principal'] + dataset.iloc[i, :].loc['to_be_repaid_interest']
        dataset.iloc[i, :].loc['ratio_lost'] = dataset.iloc[i, :].loc['sum_lost'] / dataset.iloc[i, :].loc['expected_returned_sum']
    else:
        dataset.iloc[i, :].loc['sum_lost'] = 0
        dataset.iloc[i, :].loc['ratio_lost'] = 0
    i += 1
But it does not work. I would like to come up with a faster, less RAM-hungry solution, because this will actually be a web app that a few users could use simultaneously.
Thanks a lot.
If you are thinking about "looping through rows", you are not using pandas right. You should think in terms of columns instead.
Use np.where, which is vectorized (read: fast):
import numpy as np

# Boolean mask: True for the rows where target == 1
cond = dataset['target'] == 1
dataset['sum_lost'] = np.where(cond, dataset['to_be_repaid_principal'] + dataset['to_be_repaid_interest'], 0)
dataset['ratio_lost'] = np.where(cond, dataset['sum_lost'] / dataset['expected_returned_sum'], 0)
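To try the column-wise approach end to end, here is a minimal sketch on a made-up three-row frame. The column names come from the question; the values and the `row_number` variable are assumptions for illustration only. The last step also shows one way to read a single value by row number and column name in a single indexing call, by combining `iloc` with `columns.get_loc`. That avoids the chained `dataset.iloc[i, :].loc[...]` form from the question, which most likely fails because the assignment lands on a temporary row object rather than on `dataset` itself.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column names follow the question,
# the values are invented for illustration.
dataset = pd.DataFrame({
    "target": [1, 0, 1],
    "to_be_repaid_principal": [100.0, 50.0, 30.0],
    "to_be_repaid_interest": [20.0, 5.0, 10.0],
    "expected_returned_sum": [240.0, 110.0, 80.0],
})

# Boolean mask: True for the rows where target == 1
cond = dataset["target"] == 1

# Whole-column assignment: no row loop and no chained indexing.
dataset["sum_lost"] = np.where(
    cond,
    dataset["to_be_repaid_principal"] + dataset["to_be_repaid_interest"],
    0,
)
dataset["ratio_lost"] = np.where(
    cond,
    dataset["sum_lost"] / dataset["expected_returned_sum"],
    0,
)

# Single value by row number and column name, in one indexing step.
row_number = 2  # assumed position, counted from 0
value = dataset.iloc[row_number, dataset.columns.get_loc("ratio_lost")]

print(dataset[["sum_lost", "ratio_lost"]])
print(value)
```

Note that `np.where` evaluates both branches for every row and only then picks between them, so the division is computed for rows where `target` is not 1 as well; their results are simply discarded.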