
Extracting a value from a pandas DataFrame in reference to other values

I have a dataframe:

import numpy as np
import pandas as pd
d = {'page_number': [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
     'text': ['aa', 'ii', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii', 'jj']}
df = pd.DataFrame(data=d)
df
 
   page_number   text
0     0           aa
1     0           ii
2     0           cc
3     0           dd
4     0           ee
5     0           ff
6     1           gg
7     1           hh
8     1           ii
9     1           jj

I want to find the page_number where 'gg' appears. The same page_number can hold many different substrings, but I am only interested in the row number where 'ii' appears on the same page_number as 'gg' (I do not want the other appearances of 'ii').

idx=np.where(df['text'].str.contains(r'gg', na=True))[0][0]

won't necessarily help here, as it retrieves the row number of 'gg' but not its page_number.

Many thanks

First keep only the rows whose text is 'ii' or 'gg':

df = df[df['text'].isin(['ii', 'gg'])]

Then group by page number. Assuming each of 'ii' and 'gg' appears at most once per page, a count of 2 means both are on the same page:

df2 = df.groupby('page_number').count()
df2[df2['text'] == 2]
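
The counting approach stops at the page number, while the question asks for the row number of 'ii'. A minimal sketch of how the two steps might be completed follows; the names subset, counts, pages_with_both and ii_rows are illustrative and not from the original answer, and it keeps the same assumption that each value appears at most once per page.

```python
import pandas as pd

df = pd.DataFrame({
    'page_number': [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    'text': ['aa', 'ii', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii', 'jj'],
})

# Keep only the rows of interest; the original row labels are preserved.
subset = df[df['text'].isin(['ii', 'gg'])]

# Count rows per page; a count of 2 is read as "one 'ii' and one 'gg'".
# Assumption: neither value repeats on a page.
counts = subset.groupby('page_number')['text'].count()
pages_with_both = counts[counts == 2].index

# Row labels of 'ii' restricted to the pages found above.
ii_rows = subset[(subset['text'] == 'ii')
                 & subset['page_number'].isin(pages_with_both)].index.tolist()
print(ii_rows)  # [8]
```

Filtering into a separate variable instead of overwriting df keeps the full frame available for later lookups.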

You can use pandas to retrieve a column value on the basis of another column's value. I hope this retrieves what you are looking for:

df[df['text']=='gg']['page_number']
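
That lookup yields the page number(s) of 'gg'. One possible way to carry it through to the row number of 'ii' is sketched below; gg_pages, mask and ii_row_numbers are hypothetical names introduced here, and exact string equality is assumed rather than substring matching.

```python
import pandas as pd

df = pd.DataFrame({
    'page_number': [0, 0, 0, 0, 0, 0, 1, 1, 1, 1],
    'text': ['aa', 'ii', 'cc', 'dd', 'ee', 'ff', 'gg', 'hh', 'ii', 'jj'],
})

# Page numbers on which 'gg' appears (a Series, since there may be several).
gg_pages = df.loc[df['text'] == 'gg', 'page_number']

# Rows whose text is 'ii' and whose page is one of the 'gg' pages.
mask = (df['text'] == 'ii') & df['page_number'].isin(gg_pages)
ii_row_numbers = df.index[mask].tolist()
print(ii_row_numbers)  # [8]
```

The 'ii' on page 0 (row 1) is excluded because page 0 has no 'gg'.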

In case you have several 'gg's and 'ii's on any page:如果您在任何页面上有多个 'gg' 和 'ii':

This returns a DataFrame indexed by page_number whose text column is True for pages that contain both 'gg' and 'ii':

df = df.groupby(by='page_number').agg(lambda x: 'gg' in x.values
                                      and 'ii' in x.values)

And this will get you the page numbers:

df[df.text].index
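
If the row numbers are wanted rather than the page numbers, and pages may hold several matches, a per-row flag can be built with a grouped transform. This is only a sketch under stated assumptions: the sample data below is made up to include repeats, the names has_gg, has_ii, page_has_gg and ii_rows are illustrative, and substring matching via str.contains is used because the question mentions substrings.

```python
import pandas as pd

# Made-up sample with repeated 'ii' on one page and a 'gg' page without 'ii'.
df = pd.DataFrame({
    'page_number': [0, 0, 0, 1, 1, 1, 1, 2, 2],
    'text': ['aa', 'ii', 'cc', 'gg', 'ii', 'hh', 'ii', 'gg', 'jj'],
})

# Row-level flags; na=False treats missing text as "no match".
has_gg = df['text'].str.contains('gg', na=False)
has_ii = df['text'].str.contains('ii', na=False)

# For every row, flag whether its page contains at least one 'gg'.
page_has_gg = has_gg.groupby(df['page_number']).transform('any')

# Row labels of every 'ii' that shares a page with a 'gg'.
ii_rows = df.index[has_ii & page_has_gg].tolist()
print(ii_rows)  # [4, 6]
```

Because the flags stay aligned with the original index, df itself is not overwritten and the row labels come out directly.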
