Use a value from one dataframe to look up the value in another, return an adjacent cell value, and update the first dataframe
I have two datasets (dataframes), one called source and the other crossmap. I am trying to find rows where a specific column value starts with "999". When one is found, I need to look up the complete value of that column (e.g. "99912345") in the crossmap dataframe and return the value from a column on that row in the crossmap.
# Source Dataframe
        0         1   2          3      4
   ------  --------  --  ---------  -----
0  303290    544981   2  408300622  85882
1  321833  99910722   1  408300902  85897
2  323241  99902978   3  408056001  95564
# Cross Map Dataframe
     ID  NDC ID  DIN(NDC)            GTIN                    NAME  PRDID
-------  ------  --------  --------------  ----------------------  -----
  44563  321833  99910722  99910722000000  SALBUTAMOL SULFATE (A)  90367
  69281  321833  99910722  99910722000000  SALBUTAMOL SULFATE (A)  90367
6002800  323241  99902978  75402850039706         EPINEPHRINE (A)  95564
8001116  323241  99902978  99902978000000         EPINEPHRINE (A)  95564
The 'straw dog' logic I am working with is this:
df_source[df_source['Column1'].str.contains('999')]
It is these last two logic pieces that I am struggling with. I would appreciate any direction or guidance anyone can provide.
Is there perhaps a better or easier way to do this using Python but not pandas/dataframes?
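For what it's worth, a pandas-free version of this lookup can be sketched with a plain dictionary. This is only a minimal sketch under assumptions: the rows are hard-coded from the tables above (in practice they might come from `csv.reader`), all values are treated as strings, the names `source_rows`, `crossmap_rows` and `prdid_by_din` are made up for illustration, and column 4 of the source is assumed to be the value that should be replaced by PRDID.

```python
# Sample rows copied from the tables above, kept as strings.
source_rows = [
    ["303290", "544981", "2", "408300622", "85882"],
    ["321833", "99910722", "1", "408300902", "85897"],
    ["323241", "99902978", "3", "408056001", "95564"],
]
# (DIN(NDC), PRDID) pairs taken from the cross map.
crossmap_rows = [
    ("99910722", "90367"),
    ("99910722", "90367"),
    ("99902978", "95564"),
    ("99902978", "95564"),
]

# Build the lookup once; setdefault keeps the first PRDID seen per DIN(NDC).
prdid_by_din = {}
for din, prdid in crossmap_rows:
    prdid_by_din.setdefault(din, prdid)

for row in source_rows:
    key = row[1]
    # Only touch rows whose column 1 starts with "999" and has a cross-map entry.
    if key.startswith("999") and key in prdid_by_din:
        row[4] = prdid_by_din[key]  # assumption: column 4 is the one to update

print(source_rows)
```

The dictionary makes each lookup constant-time, so this scales reasonably even when the cross map is large; rows without a match are left unchanged.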
If I understand you correctly: we look in column 1 of the 'Source Dataframe' for values whose first digits are 999. Next, we find these values in the 'DIN(NDC)' column of the 'Cross Map' and take the values of the 'PRDID' column on those rows. If all of that is correct, then I don't understand what your further steps are supposed to be.
import pandas as pd
import more_itertools as mit
Cross_Map = pd.DataFrame({'DIN(NDC)': [99910722, 99910722, 99902978, 99902978],
                          'PRDID': [90367, 90367, 95564, 95564]})
df = pd.DataFrame({0: [303290, 321833, 323241],
                   1: [544981, 99910722, 99902978],
                   2: [2, 1, 3],
                   3: [408300622, 408300902, 408056001],
                   4: [85882, 85897, 95564]})

# Values in column 1 of the source that start with '999'
m = [i for i in df[1] if str(i)[:3] == '999']
# Positions of the Cross_Map rows whose 'DIN(NDC)' is one of those values
index = list(mit.locate(list(Cross_Map['DIN(NDC)']), lambda x: x in m))
print(Cross_Map['PRDID'][index])
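The snippet above prints the matching PRDID values but does not write anything back to the source. If the goal is also to update the source dataframe, one possible sketch using only pandas is shown below. It rests on a few assumptions that are not stated in the question: the key columns hold strings (as the question's use of `.str` suggests), duplicate 'DIN(NDC)' rows in the cross map carry the same 'PRDID', and column 4 of the source is the value to overwrite. It uses `str.startswith` rather than `str.contains`, since `contains('999')` would also match "999" in the middle of a value.

```python
import pandas as pd

# Sample data copied from the tables in the question (key columns kept as strings).
source = pd.DataFrame({
    0: ["303290", "321833", "323241"],
    1: ["544981", "99910722", "99902978"],
    2: [2, 1, 3],
    3: ["408300622", "408300902", "408056001"],
    4: [85882, 85897, 95564],
})
crossmap = pd.DataFrame({
    "DIN(NDC)": ["99910722", "99910722", "99902978", "99902978"],
    "PRDID": [90367, 90367, 95564, 95564],
})

# One PRDID per DIN(NDC); assumes duplicate rows in the cross map agree.
lookup = crossmap.drop_duplicates("DIN(NDC)").set_index("DIN(NDC)")["PRDID"]

# Rows whose column 1 starts with "999" (prefix match, not "contains").
mask = source[1].str.startswith("999")

# Assumption: column 4 is the value to overwrite.
# Keys missing from the cross map fall back to the existing value.
source.loc[mask, 4] = source.loc[mask, 1].map(lookup).fillna(source.loc[mask, 4])

print(source)
```

Building `lookup` as a Series indexed by 'DIN(NDC)' lets `Series.map` do the lookup for all flagged rows at once, without an explicit loop.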