Assign a number from another DataFrame - Series([], Name: id, dtype: float64)
I have two pandas DataFrames. The first one, named source, holds IDs and names (columns id and username):
source.head()
The second one, named data_code, has a column of usernames (named 0) and a code column, which I want to fill with the matching IDs:
data_code.head()
What I did was write a loop that looks for usernames present in both DataFrames and takes each username's ID from the source DataFrame; if the username does not exist there, it generates a random ID. In my solution I first create a dictionary keyed by the unique usernames:
uniqueIDs = data_code[0].unique()
FofToID= {}
Then I fill the dictionary with IDs using this loop:
for i in range(len(uniqueIDs)):
    if uniqueIDs[i] in list(source["username"]):
        FofToID[uniqueIDs[i]] = np.float_(source[source["username"]==i]["id"])
    else:
        FofToID[uniqueIDs[i]] = int(random.random()*10000000)
The output was as follows:
My problem is that every username that exists in the source DataFrame gets the value Series([], Name: id, dtype: float64) instead of its ID. I tried to fix this problem but failed.
Use merge with a left join, replace the missing id values with fillna, then create a Series with set_index and convert it with to_dict:
import random
import pandas as pd

source = pd.DataFrame({'id':[111111,222222,666666,888888], 'username':['aa','ss','dd','ff']})
data_code = pd.DataFrame({'code':[0]*4, 0:['ss','dd','rr','yy']})
FofToID = (data_code.merge(source, left_on=0, right_on='username', how='left')
                    .fillna({'id': int(random.random()*10000000)})
                    .set_index(0)['id']
                    .to_dict()
           )
print (FofToID)
{'ss': 222222.0, 'dd': 666666.0, 'rr': 367044.0, 'yy': 367044.0}
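Note that in the output above 'rr' and 'yy' share the same value: the random number passed to fillna is evaluated once and reused for every missing row. If each missing username should get its own ID, one possible variation is sketched below. It assumes usernames are unique within data_code; the names merged, missing and rng are illustrative and not part of the original answer, and the random IDs are not checked against IDs already present in source.

```python
import numpy as np
import pandas as pd

source = pd.DataFrame({'id': [111111, 222222, 666666, 888888],
                       'username': ['aa', 'ss', 'dd', 'ff']})
data_code = pd.DataFrame({'code': [0] * 4, 0: ['ss', 'dd', 'rr', 'yy']})

# Left join keeps every row of data_code; unmatched usernames get NaN in 'id'.
merged = data_code.merge(source, left_on=0, right_on='username', how='left')

# Draw one random ID per missing row, without replacement so they all differ.
missing = merged['id'].isna()
rng = np.random.default_rng()
merged.loc[missing, 'id'] = rng.choice(10_000_000, size=int(missing.sum()), replace=False)

# No NaN is left, so the column can be converted back to integers.
FofToID = merged.set_index(0)['id'].astype(int).to_dict()
print(FofToID)
```

The matched usernames keep their IDs from source ('ss' and 'dd' here), while 'rr' and 'yy' receive different random integers on each run.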
I would like to thank @jezrael for his contribution. Here is the final solution I ended up with:
for i in range(len(uniqueIDs)):
    if uniqueIDs[i] in list(source["username"]):
        FofToID[uniqueIDs[i]] = int(source[source["username"]==uniqueIDs[i]]["id"])
    else:
        FofToID[uniqueIDs[i]] = int(random.random()*10000000)
The output looks like this:
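For context, the difference from the first attempt is the comparison value: the original loop compared source["username"] with the integer counter i rather than with uniqueIDs[i], so the filter matched no rows and returned an empty Series. The sketch below reproduces that on the sample data from the answer and shows a dictionary-based lookup as one possible alternative to filtering the DataFrame on every iteration. The names empty and lookup are illustrative, and the sketch assumes usernames are unique in source.

```python
import random
import pandas as pd

source = pd.DataFrame({'id': [111111, 222222, 666666, 888888],
                       'username': ['aa', 'ss', 'dd', 'ff']})
data_code = pd.DataFrame({'code': [0] * 4, 0: ['ss', 'dd', 'rr', 'yy']})

# Comparing usernames with an integer position matches nothing,
# so the selection is an empty Series.
empty = source[source['username'] == 0]['id']
print(len(empty))  # 0

# Build a username -> id mapping once, then look each name up in it.
lookup = dict(zip(source['username'], source['id']))

FofToID = {}
for name in data_code[0].unique():
    if name in lookup:
        FofToID[name] = int(lookup[name])
    else:
        # Fresh random ID for every username missing from source.
        FofToID[name] = random.randrange(10_000_000)
print(FofToID)
```

Building the mapping once avoids scanning the username column for each entry, which may matter when source is large.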