How to keep only a certain set of rows by index in a pandas DataFrame

I have a DataFrame that I created by applying the following manipulations to the contents of a .fits file:

import pandas as pd

# sortedpab holds the records read from the .fits file (defined elsewhere)
data_dict = dict()
for obj in sortedpab:
    for key in ['FIELD', 'ID', 'RA', 'DEC', 'Z_50', 'Z_84', 'Z_16', 'PAB_FLUX', 'PAB_FLUX_ERR']:
        data_dict.setdefault(key, list()).append(obj[key])

gooddf = pd.DataFrame(data_dict)
# Mean of the upper and lower offsets from Z_50, divided by Z_50
gooddf['Z_ERR'] = ((gooddf['Z_84'] - gooddf['Z_50']) + (gooddf['Z_50'] - gooddf['Z_16'])) / (2 * gooddf['Z_50'])
gooddf['OBS_PAB'] = 12820 * (1 + gooddf['Z_50'])
gooddf.loc[gooddf['FIELD'] == "ERS", 'FIELD'] = "ERSPRIME"
gooddf = gooddf[['FIELD', 'ID', 'RA', 'DEC', 'Z_50', 'Z_ERR', 'PAB_FLUX', 'PAB_FLUX_ERR', 'OBS_PAB']]
# Keep only the rows with OBS_PAB at or below 16500
gooddf = gooddf[gooddf.OBS_PAB <= 16500]

This gives me a DataFrame with 351 rows and 9 columns. I would like to keep only the rows with certain index values, and I thought of doing something like this:

indices = [5 , 6 , 9 , 10]
gooddf = gooddf[gooddf.index == indices]

where I would like it to keep only the rows whose index values are listed in the array indices, but this is giving me issues.

I found a way to do this with a for loop:

good = np.array([5 , 6 , 9 , 12 , 14 , 15 , 18 , 21 , 24 , 29 , 30 , 35 , 36 , 37 , 46 , 48 ])

gooddf50 = pd.DataFrame()
for i in range(len(good)):
    gooddf50 = gooddf50.append(gooddf[gooddf.index == good[i]])

Any thoughts on how to do this in a better way, preferably using just pandas?

This will do the trick:

gooddf.loc[indices]
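
As a minimal sketch of that line in action, the snippet below uses a small made-up frame (the name df and its values are placeholders, not the asker's data). It also shows a boolean-mask alternative built on Index.isin, which may be handy when some of the requested labels might not be present, since .loc with a list raises a KeyError for missing labels in recent pandas versions.

```python
import pandas as pd

# Hypothetical stand-in for gooddf: 12 rows with the default 0..11 index.
df = pd.DataFrame({"ID": range(100, 112)})

indices = [5, 6, 9, 10]

# Label-based selection: keeps the rows whose index labels are in `indices`.
subset = df.loc[indices]

# Boolean-mask alternative: labels that are absent from the index (999 here)
# are simply ignored instead of raising an error.
subset_mask = df[df.index.isin(indices + [999])]

print(subset)
print(subset_mask)
```

Both selections return the same four rows here; the mask version always returns rows in the frame's own order, while .loc follows the order of the list.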

An important note: .iloc and .loc do slightly different things, which is why you may be getting unexpected results.

The pandas documentation on indexing covers the details, but the key thing to understand is that .iloc returns rows according to the positions specified, whereas .loc returns rows according to the index labels specified. So whenever the index labels differ from the row positions, for example after filtering out rows as in your code or when the index is not sorted, .loc and .iloc will return different rows.
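
To make the difference concrete, here is a small sketch on invented data (the values and the names df, filtered, by_label and by_position are illustrative only). Filtering leaves gaps in the index labels, so the same kind of integer list picks different rows depending on the accessor.

```python
import pandas as pd

# Hypothetical data: six rows with the default index labels 0..5.
df = pd.DataFrame({"OBS_PAB": [15000, 17000, 16000, 18000, 16400, 15500]})

# Filtering drops labels 1 and 3, so the surviving labels are 0, 2, 4, 5.
filtered = df[df["OBS_PAB"] <= 16500]

# .loc looks up index labels: the rows labelled 2 and 4.
by_label = filtered.loc[[2, 4]]

# .iloc looks up positions: the third and fourth remaining rows (labels 4 and 5).
by_position = filtered.iloc[[2, 3]]

print(by_label)
print(by_position)
```

If position-based access is what you actually want after filtering, calling reset_index(drop=True) on the filtered frame makes labels and positions coincide again.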
