I have two dataframes, df1 and df2, each holding coordinate pairs. I want to compute the absolute distance between every df1 pair and every df2 pair, and populate a new dataframe that has a row for each df1 coordinate pair and a column for each df2 coordinate pair.
Each cell would then hold the absolute distance between one df1 pair and one df2 pair. This is my code so far; I'm struggling to figure out how to populate the new dataframe on each iteration.
`df_new = pd.DataFrame(index=df1.index.copy())
for idx_crime, x_crime in enumerate(df2['X_COORD']):
    y_crime = df2['Y_COORD'].iloc[idx_crime]
    for idx_subway, x_subway in enumerate(df1['X_COORD']):
        y_subway = df1['Y_COORD'].iloc[idx_subway]
        dist = np.sqrt((x_crime - x_subway)**2 + (y_crime - y_subway)**2)
        append.df_new
return df_new`
It isn't running. Any ideas of how to fill out this new dataframe?
EDIT: Sample data
DF2 Coordinates:
X_COORD Y_COORD
0 1007314.0 241257.0
1 1043991.0 193406.0
2 999463.0 231690.0
3 1060183.0 177862.0
4 987606.0 208148.0
DF1 Coordinates:
X_COORD Y_COORD
0 1020671.0 248680.0
1 1019420.0 245867.0
2 1017558.0 245632.0
So df_new would look like this. Just the index numbers would work for column headings. I just wanted to show you how the data would look:
df2_coord0 df2_coord1 df2_coord2
df1_coord0 13356.72213 23318.81485 21207.59944
df1_coord1 12105.8096 24569.93244 19956.64481
Apparently, `append.df_new` is wrong. If that is pseudocode, what you need is to insert cells into the DataFrame. Here are two ways: using positional indexing or using conditional indexing.
Sample code:
import pandas as pd
lst = [
{"a":1,"b":1},
{"a":2,"b":2}
]
df = pd.DataFrame(lst)
df.loc[2] = [3, 3]  # 2 is the index label of the new row; use your desired index
df.loc[3] = {"a": 4, "b": 4}  # 3 is the index label of the new row; use your desired index
print(df)
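The sample above adds whole rows by label with `.loc`. As a minimal sketch of writing individual cells by position and by condition, assuming a small pre-allocated frame (the frame, its column names and the values are made up for illustration):

```python
import pandas as pd

# Hypothetical 2x2 frame, pre-allocated with NaN so cells can be filled in later
df = pd.DataFrame(index=[0, 1], columns=["a", "b"], dtype=float)

# Positional indexing: write the cell at row position 0, column position 1
df.iloc[0, 1] = 5.0

# Conditional indexing: fill column "a" in every row where it is still NaN
df.loc[df["a"].isna(), "a"] = -1.0

print(df)
```

In the distance problem, the positional form is the natural fit, since the two loop counters are already row and column positions.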
I had to break df2 down into smaller dataframes to avoid a memory error. I changed the for loop to this and it works; it just took a while to get there:
df_new = pd.DataFrame(index=df1.index.copy(), columns=df2.index.copy())
for idx_crime, x_crime in enumerate(df2['X_COORD']):
    y_crime = df2['Y_COORD'].iloc[idx_crime]
    for idx_subway, x_subway in enumerate(df1['X_COORD']):
        y_subway = df1['Y_COORD'].iloc[idx_subway]
        dist = np.sqrt((x_crime - x_subway)**2 + (y_crime - y_subway)**2)
        df_new.iloc[idx_subway, idx_crime] = dist
return df_new
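The nested loop works, but the same table can probably be filled much faster with NumPy broadcasting. The following is only a sketch, not the code from the post: the function name `pairwise_distances` and the `chunk_size` parameter are made up here, and it assumes both frames have numeric `X_COORD` and `Y_COORD` columns as in the sample data. It processes df2 in slices so the temporary difference array stays small.

```python
import numpy as np
import pandas as pd


def pairwise_distances(df1, df2, chunk_size=1000):
    """Euclidean distance between every df1 row and every df2 row.

    chunk_size is a hypothetical tuning knob: df2 is handled in slices of
    this many rows to limit the size of the temporary arrays.
    """
    xy1 = df1[["X_COORD", "Y_COORD"]].to_numpy(dtype=float)
    xy2 = df2[["X_COORD", "Y_COORD"]].to_numpy(dtype=float)
    out = np.empty((len(xy1), len(xy2)))
    for start in range(0, len(xy2), chunk_size):
        block = xy2[start:start + chunk_size]
        # Broadcasting: (n1, 1, 2) - (1, m, 2) gives an (n1, m, 2) array of differences
        diff = xy1[:, None, :] - block[None, :, :]
        out[:, start:start + chunk_size] = np.sqrt((diff ** 2).sum(axis=2))
    # Rows follow df1's index, columns follow df2's index
    return pd.DataFrame(out, index=df1.index, columns=df2.index)


# Sample data from the question
df1 = pd.DataFrame({
    "X_COORD": [1020671.0, 1019420.0, 1017558.0],
    "Y_COORD": [248680.0, 245867.0, 245632.0],
})
df2 = pd.DataFrame({
    "X_COORD": [1007314.0, 1043991.0, 999463.0, 1060183.0, 987606.0],
    "Y_COORD": [241257.0, 193406.0, 231690.0, 177862.0, 208148.0],
})

df_new = pairwise_distances(df1, df2, chunk_size=2)
print(df_new)
```

Chunking only bounds the temporary arrays. The full result still has `len(df1) * len(df2)` cells, so if that matrix itself does not fit in memory, each chunk would have to be written out or reduced separately.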