Get the rows of a dataframe based on the consecutive values of one column

Is there a way to get consecutive rows according to the values of a specific column? For example:

      column1  column2 View
row1        1        2    c
row2        3        4    a
row3        5        6    p
row4        7        8    p
row5        9       10    n

I need to get the rows whose View values spell out the letters of the word 'app', so in this example I need to save row2, row3 and row4 in a list.

Here is a generalizable approach. I use index_slice_by_substring() to generate a tuple of integers representing the beginning and ending row. The function rows_by_consecutive_letters() takes your dataframe, the column name to check, and the string you want to look for; for the return value it uses .iloc to grab a slice of the table by integer position.

The key to getting the slice indices is joining the "View" column values into a single string using ''.join(df[column]) and checking substrings of the same length as the condition string, from left to right, until there is a match. This relies on every value in the column being a single character, so that string positions line up with row positions.

import pandas as pd

def index_slice_by_substring(full_string, substring) -> tuple:
    len_substring = len(substring)
    len_full_string = len(full_string)
    # + 1 so the final window, which ends at the last character, is also checked
    for x0, x1 in enumerate(range(len_substring, len_full_string + 1)):
        if full_string[x0:x1] == substring:
            return (x0, x1)
    # no match: return an empty slice so the caller gets an empty dataframe
    return (0, 0)

def rows_by_consecutive_letters(df, column, condition) -> pd.DataFrame:
    row_begin, row_end = index_slice_by_substring(''.join(df[column]), condition)
    return df.iloc[row_begin:row_end, :]

print(rows_by_consecutive_letters(your_df, "View", "app"))

Returns:

   column1  column2 View
1        3        4    a
2        5        6    p
3        7        8    p
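
If the column can hold values longer than one character, the joined string no longer lines up with row positions. A minimal sketch of a variation that compares list windows element by element instead is below; find_run is a hypothetical helper name, not something from the answer above, and it returns None rather than a slice when nothing matches.

```python
def find_run(values, pattern):
    """Return (start, end) of the first run in `values` equal to `pattern`, or None."""
    values, pattern = list(values), list(pattern)
    width = len(pattern)
    # The + 1 lets the last window reach the end of `values`.
    for start in range(len(values) - width + 1):
        if values[start:start + width] == pattern:
            return start, start + width
    return None

# Single-character cells, as in the question.
print(find_run(list("cappn"), "app"))

# Multi-character cells: ''.join(...) would give "appx" and wrongly report
# a match, while the element-wise comparison finds none.
print(find_run(["ap", "p", "x"], "app"))

# With pandas, the result would be used as:
#   span = find_run(df["View"].tolist(), "app")
#   rows = df.iloc[span[0]:span[1]] if span else df.iloc[0:0]
```

The pandas lines are left as comments so the helper itself has no dependencies; the (start, end) pair is meant to be passed to .iloc exactly like the tuple from index_slice_by_substring().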

Not the most Pythonic way, but it does the job:

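keep = []
for i in range(len(df) - 2):
    # .iloc selects by position, so this works whatever the index labels are
    if df.View.iloc[i] == 'a' and df.View.iloc[i+1] == 'p' and df.View.iloc[i+2] == 'p':
        keep.append(df.iloc[i])
        keep.append(df.iloc[i+1])
        keep.append(df.iloc[i+2])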

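
For what it's worth, the same check can be written without an explicit row loop by comparing shifted copies of the column. This is only a sketch under assumptions: the dataframe below is my reconstruction of the example table, and the names word, starts and keep_mask are made up for illustration.

```python
import pandas as pd

# Hypothetical reconstruction of the example table from the question.
df = pd.DataFrame(
    {"column1": [1, 3, 5, 7, 9],
     "column2": [2, 4, 6, 8, 10],
     "View": list("cappn")},
    index=["row1", "row2", "row3", "row4", "row5"],
)

word = "app"
views = df["View"]

# starts[i] is True when rows i, i+1, ... spell out `word`.
starts = pd.Series(True, index=df.index)
for offset, letter in enumerate(word):
    # shift(-offset) aligns the value `offset` rows below with row i;
    # rows shifted past the end become missing and compare as False.
    starts = starts & (views.shift(-offset) == letter)

# Expand each start position to cover the whole run of len(word) rows.
keep_mask = pd.Series(False, index=df.index)
for offset in range(len(word)):
    keep_mask = keep_mask | starts.shift(offset, fill_value=False)

keep = df[keep_mask]
print(keep)
```

Here keep is a single dataframe rather than a list of rows, and it picks up every occurrence of the word, not just the first.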

You can use str.find, but this only finds the first occurrence of your search term.

search = 'app'
i = ''.join(df.View).find(search)
if i>-1:
    print(df.iloc[i: i+len(search)])

Output

      column1  column2 View                         
row2        3        4    a
row3        5        6    p
row4        7        8    p

To handle zero, one, or many occurrences (no extra error checking is needed when nothing matches), you can use re.finditer. The result is a list of dataframe slices.

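import re
search = 'p'   # searching for 'p' to get more than one match
# re.escape makes the search term match literally, even if it contains regex metacharacters
[df.iloc[x.start():x.end()] for x in re.finditer(re.escape(search), ''.join(df.View))]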

Output

[      column1  column2 View                        
 row3        5        6    p,
       column1  column2 View                         
 row4        7        8    p]
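
One thing to keep in mind is that re.finditer only reports non-overlapping matches. If overlapping runs matter, a lookahead pattern is one way to get them. The sketch below works on the joined letters only and assumes one character per row; all_runs is a hypothetical helper name.

```python
import re

def all_runs(letters, word):
    """Return (start, end) row positions of every occurrence of `word`, overlaps included."""
    text = "".join(letters)  # assumes one character per row
    # re.escape treats `word` literally; the lookahead (?=...) consumes no
    # characters, so occurrences that overlap are all reported.
    pattern = "(?=" + re.escape(word) + ")"
    return [(m.start(), m.start() + len(word)) for m in re.finditer(pattern, text)]

print(all_runs(list("cappn"), "app"))
print(all_runs(list("ppp"), "pp"))
print(all_runs(list("a.c"), "."))
```

Each (start, end) pair can be passed to df.iloc[start:end] to get the matching rows, with list(df.View) as the letters argument.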


