
How to extract the last 3 index numbers before a specific category

UPDATE

I have the following dataset, and I wish to get a list that includes the last three indices before each 'YES' label. My dataset:

i            category
0               NO
1               NO
2               NO
3               NO
4               NO
5               YES
6               YES
7               YES
8               NO
9               NO
10              NO
11              YES
12              YES

I expect the outcome to be:

list = [2, 3, 4, 8, 9, 10]

Please note that YES usually occurs in consecutive runs of samples (2-6 samples). I wish to get the last three indices before the first YES in each run.

PS: The dataset is stored in a CSV file, which I imported using pandas.
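For anyone who wants to reproduce the data, here is a minimal sketch of loading it with pandas. The in-memory `csv_text` string stands in for the real file, and the column names `i` and `category` are assumptions taken from the table above; the actual CSV layout is not shown in the question.

```python
import io

import pandas as pd

# Stand-in for the CSV file from the question (layout is an assumption).
csv_text = """i,category
0,NO
1,NO
2,NO
3,NO
4,NO
5,YES
6,YES
7,YES
8,NO
9,NO
10,NO
11,YES
12,YES
"""

# Read the `i` column as the index and keep `category` as plain strings.
df = pd.read_csv(io.StringIO(csv_text), index_col="i", dtype={"category": str})
print(df["category"].tolist())
```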

Probably not the most Pythonic way, but I couldn't think of a way to do this without a for loop and some slicing. It feels like a hacky method:

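import pandas as pd

# df is the DataFrame from the question, with the default 0..n-1 index.
# Index labels where a run of YES starts: the row is YES and differs from the previous row.
a = df[df.category.ne(df.category.shift()) & (df.category == 'YES')].index

indices = []
for x in a:
    # up to three rows immediately before the start of the run
    indices.append(df.iloc[max(0, x - 3):x])
new_df = pd.concat(indices)  # if you wanted this as a df.

list(new_df.index)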


[2, 3, 4, 8, 9, 10]
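The explicit loop can probably be avoided with NumPy broadcasting. The following is only a sketch under assumptions: the DataFrame is rebuilt inline from the question's table, and the names `is_yes`, `starts` and `positions` are illustrative. It works on row positions and maps them back to index labels at the end.

```python
import numpy as np
import pandas as pd

# The question's data, rebuilt inline so the sketch is self-contained.
df = pd.DataFrame({"category": ["NO"] * 5 + ["YES"] * 3 + ["NO"] * 3 + ["YES"] * 2})

is_yes = df["category"].eq("YES").to_numpy()

# A run starts where the current row is YES and the previous row is not.
prev_yes = np.concatenate(([False], is_yes[:-1]))
starts = np.flatnonzero(is_yes & ~prev_yes)

# For every start s, build the positions s-3, s-2, s-1 in one broadcasted step.
positions = (starts[:, None] + np.arange(-3, 0)).ravel()

# Drop positions that would fall before the first row.
positions = positions[positions >= 0]

result = df.index[positions].tolist()
print(result)  # [2, 3, 4, 8, 9, 10]
```

If two runs of YES are fewer than three rows apart, the windows overlap and this sketch does not de-duplicate them.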

Let's assume, as you stated in your comment, that there are always at least 3 items before every YES. A possible solution is:

import pandas as pd

df = pd.DataFrame({"category":['NO', 'NO', 'NO', 'NO', 'NO',
                               'YES', 'NO', 'NO', 'NO', 'NO',
                               'NO','YES','NO']})
# take only the indices where a run of YES starts
# (the row is YES and differs from the previous row)
is_yes = df["category"] == "YES"
idx = df[is_yes & df["category"].ne(df["category"].shift())].index

# for every i in idx take the previous 3 indices
lst = [list(range(i - 3, i)) for i in idx]

# flatten lst
lst = [item for sublist in lst for item in sublist]
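If the "at least 3 items before every YES" assumption does not hold, `range(i - 3, i)` would produce negative numbers. A possible variant, sketched here with made-up sample data, clamps the lower bound at 0 so that fewer than three indices are returned near the top of the frame. It assumes the default 0..n-1 index.

```python
import pandas as pd

# Made-up sample in which the first YES has only one row before it.
df = pd.DataFrame({"category": ["NO", "YES", "YES", "NO", "NO", "NO", "NO", "YES"]})

is_yes = df["category"] == "YES"

# Index labels where a run of YES starts.
starts = df[is_yes & df["category"].ne(df["category"].shift())].index

# Up to three indices before each start, never going below 0.
lst = [i for start in starts for i in range(max(0, start - 3), start)]
print(lst)  # [0, 4, 5, 6]
```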

Here's some code that's easy to read and does what you want. It iterates over the indices of the list and pulls out what you need.

The second for loop simply flattens the nested list stored in result.

li= ['1','2','3','4','YES','6','7','8','9','0','YES']
result = []
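for x in range(len(li)):
  if li[x] == 'YES':  # compare by value; `is` tests identity and is unreliable for strings
    result.append(li[max(0, x-3):x])  # up to three items before this YES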


final= []
for x in result:
  for y in x:
    final.append(y)

final = ['2', '3', '4', '8', '9', '0']
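Since the question says YES usually comes in consecutive runs, a plain-list version probably needs to react only to the first YES of each run. A sketch of that idea follows; the sample list is adapted from the one above with one extra consecutive 'YES' added, and `windows` is an illustrative name. `itertools.chain.from_iterable` replaces the second loop for flattening.

```python
from itertools import chain

# Sample list with a run of two consecutive 'YES' values.
li = ['1', '2', '3', '4', 'YES', 'YES', '7', '8', '9', '0', 'YES']

# One window of up to three items for each 'YES' that starts a run.
windows = [
    li[max(0, x - 3):x]
    for x in range(len(li))
    if li[x] == 'YES' and (x == 0 or li[x - 1] != 'YES')
]

# Flatten the list of windows into a single list.
final = list(chain.from_iterable(windows))
print(final)  # ['2', '3', '4', '8', '9', '0']
```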
