Can I use lambda, map, apply, or applymap to fill a dataframe?

This is a simplified version of my data. I have a dataframe of coordinates and an empty dataframe that should be filled with the distance between each pair of points, computed with the function provided.

What is the quickest way to fill this dataframe? As far as possible, I want to stay away from nested for loops (slow!). Can I use apply or applymap? You may modify the function or other parts accordingly. Thanks.

import pandas as pd

def get_distance(point1, point2):
    """Gets the coordinates of two points as two lists, and outputs their distance"""
    return (((point1[0] - point2[0]) ** 2 + (point1[1] - point2[1]) ** 2 + (point1[2] - point2[2]) ** 2) ** 0.5)

#Dataframe of coordinates.    
df = pd.DataFrame({"No.": [25, 36, 70, 95, 112, 101, 121, 201], "x": [1,2,3,4,2,3,4,5], "y": [2,3,4,5,3,4,5,6], "z": [3,4,5,6,4,5,6,7]})
df.set_index("No.", inplace = True)

#Dataframe to be filled with each pair distance.
df_dist = pd.DataFrame({'target': [112, 101, 121, 201]}, columns=["target", 25, 36, 70, 95])
df_dist.set_index("target", inplace = True)

If you don't want to use for loops, you can compute the distances between all possible pairs in the following way.

You first need to take the Cartesian product of df with itself to get all possible pairs of points.

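import numpy as np

i, j = np.where(1 - np.eye(len(df)))  # positions of all off-diagonal cells
# reset_index() without drop=True keeps "No." as a column, as in the output below
df = df.iloc[i].reset_index().join(
    df.iloc[j].reset_index(), rsuffix='_2')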

Here i and j are the integer row and column positions of the off-diagonal entries (the upper and lower triangles) of a square matrix of size len(df). After that, you just need to apply your distance function:

df['distance'] = get_distance([df['x'],df['y'],df['z']], [df['x_2'],df['y_2'],df['z_2']])
df.head()

   No.  x  y  z  No._2  x_2  y_2  z_2  distance
0   25  1  2  3     36    2    3    4  1.732051
1   25  1  2  3     70    3    4    5  3.464102
2   25  1  2  3     95    4    5    6  5.196152
3   25  1  2  3    112    2    3    4  1.732051
4   25  1  2  3    101    3    4    5  3.464102

If you want to compute only the pairs that appear in df_dist, you can modify the matrix 1 - np.eye(len(df)) accordingly.
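
As a rough sketch of that restriction, one option is to skip the mask entirely and enumerate only the (target, source) combinations that df_dist needs. This is a swapped-in technique rather than a modification of the eye matrix itself; the names coords, targets, sources, dist_long and dist_wide are made up for illustration, and the coordinates are rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Same coordinates as in the question, indexed by "No.".
coords = pd.DataFrame(
    {"x": [1, 2, 3, 4, 2, 3, 4, 5],
     "y": [2, 3, 4, 5, 3, 4, 5, 6],
     "z": [3, 4, 5, 6, 4, 5, 6, 7]},
    index=pd.Index([25, 36, 70, 95, 112, 101, 121, 201], name="No."),
)
targets = [112, 101, 121, 201]  # rows of df_dist
sources = [25, 36, 70, 95]      # columns of df_dist

# Every (target, source) combination and nothing else.
pairs = pd.MultiIndex.from_product([targets, sources],
                                   names=["target", "source"])
a = coords.loc[pairs.get_level_values("target")].to_numpy()
b = coords.loc[pairs.get_level_values("source")].to_numpy()

# Row-wise Euclidean distance in long form, one row per pair.
dist_long = pd.Series(((a - b) ** 2).sum(axis=1) ** 0.5, index=pairs)

# Reshape to a matrix; unstack sorts the labels, so restore the wanted order.
dist_wide = dist_long.unstack("source").reindex(index=targets, columns=sources)
print(dist_wide)
```

The long form (dist_long) is handy if you want one row per pair, while dist_wide has the same layout as df_dist.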

AFAIK there is no clear speed benefit of a lambda over a for loop, and it's very hard to write a double lambda; lambdas are usually reserved for straightforward row operations.

However, with some engineering, we can reduce our code to a few simple, self-explanatory lines:

import numpy as np

get = lambda i: df.loc[i,:].values
dist = lambda i, j: np.sqrt(sum((get(i) - get(j))**2))
# Fills your df_dist
for i in df_dist.columns:
    for j in df_dist.index:
        df_dist.loc[j,i] = dist(i, j)

The resulting df_dist:

              25        36        70        95
target                                        
112     1.732051  0.000000  1.732051  3.464102
101     3.464102  1.732051  0.000000  1.732051
121     5.196152  3.464102  1.732051  0.000000
201     6.928203  5.196152  3.464102  1.732051
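
If the double loop becomes a bottleneck on larger data, the same table can probably be produced without any Python-level loop by using NumPy broadcasting. The snippet below is a minimal sketch under that assumption, not part of the answer above; targets, sources, t, s and diff are illustrative names, and the frames are rebuilt so it runs standalone.

```python
import numpy as np
import pandas as pd

# Coordinates from the question, indexed by "No.".
df = pd.DataFrame({"No.": [25, 36, 70, 95, 112, 101, 121, 201],
                   "x": [1, 2, 3, 4, 2, 3, 4, 5],
                   "y": [2, 3, 4, 5, 3, 4, 5, 6],
                   "z": [3, 4, 5, 6, 4, 5, 6, 7]}).set_index("No.")

targets = [112, 101, 121, 201]  # row labels of df_dist
sources = [25, 36, 70, 95]      # column labels of df_dist

t = df.loc[targets].to_numpy()  # shape (4, 3)
s = df.loc[sources].to_numpy()  # shape (4, 3)

# Broadcasting: (4, 1, 3) - (1, 4, 3) gives all pairwise differences, (4, 4, 3).
diff = t[:, None, :] - s[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))  # shape (4, 4)

df_dist = pd.DataFrame(dist,
                       index=pd.Index(targets, name="target"),
                       columns=sources)
print(df_dist)
```

The intermediate array holds len(targets) × len(sources) × 3 values, so for very large inputs memory is the thing to watch. scipy.spatial.distance.cdist computes the same pairwise Euclidean distances if SciPy is available.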
