Python Dataframe: How to check specific columns for elements

I want to check whether all elements in certain columns are the number 0.

I have a dataset that I read with df=pd.read_table('ad-data')
From this I get a data frame with the following elements:

[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  ... 1559
[1]   3    2    3    0    0    0    0
[2]   2    3    2    0    0    0    0
[3]   3    2    2    0    0    0    0
[4]   6    7    3    0    0    0    0
[5]   3    2    1    0    0    0    0
...
3220

I would like to check whether the data set from column 4 to column 1559 contains only 0 or also other values.

You can check for equality with 0 element-wise and use all across each row:

df['all_zeros'] = (df.iloc[:, 4:1560] == 0).all(axis=1)

Small example to demonstrate it (based on columns 1 to 3 here):

import numpy as np
import pandas as pd

N = 5
# random 0/1 matrix, so the values differ from run to run
df = pd.DataFrame(np.random.binomial(1, 0.4, size=(N, N)))
df['all_zeros'] = (df.iloc[:, 1:4] == 0).all(axis=1)
df

Output:

   0  1  2  3  4  all_zeros
0  0  1  1  0  0      False
1  0  0  1  1  1      False
2  0  1  1  0  0      False
3  0  0  0  0  0       True
4  1  0  0  0  0       True
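
If a single yes/no answer for the whole block of columns is wanted, rather than one flag per row, the same comparison can be reduced twice. A minimal sketch, assuming a small hypothetical frame in place of the real data (the names block, only_zeros and nonzero_columns are illustrative, not from the original answer):

```python
import pandas as pd

# Hypothetical small frame standing in for the real data
df = pd.DataFrame({
    0: [3, 2, 3],
    1: [2, 3, 2],
    2: [0, 0, 0],
    3: [0, 5, 0],
    4: [0, 0, 0],
})

# Positional slice: columns at positions 2, 3 and 4 (the end is exclusive)
block = df.iloc[:, 2:5]

# First .all() reduces each column, the second reduces across columns
only_zeros = bool((block == 0).all().all())

# Labels of the columns that hold at least one non-zero value
has_nonzero = (block != 0).any()
nonzero_columns = has_nonzero.index[has_nonzero.to_numpy()].tolist()

print(only_zeros)       # False
print(nonzero_columns)  # [3]
```

For the data in the question the slice would be df.iloc[:, 4:1560], as in the answer above.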

Update: Keeping only the rows that contain non-zero values:

df[~df['all_zeros']]

Output:

   0  1  2  3  4  all_zeros
0  0  1  1  0  0      False
1  0  0  1  1  1      False
2  0  1  1  0  0      False

Update 2: To show only the non-zero values:

df_filtered = df[~df['all_zeros']]  # rows with at least one non-zero value
pd.melt(
    df_filtered.iloc[:, 1:4].reset_index(),
    id_vars='index', var_name='column'
).query('value != 0').sort_values('index')

Output:

   index column  value
0      0      1      1
3      0      2      1
4      1      2      1
7      1      3      1
2      2      1      1
5      2      2      1
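
Another option is to sum the columns from 4 onward for each row; a sum of 0 means the row holds only zeros there, provided no values are negative:

df['Check'] = df.loc[:, 4:].sum(axis=1)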
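
A row sum can hide non-zero values when positive and negative entries cancel. The sketch below, on a small hypothetical frame, contrasts the sum with a count of non-zero cells per row, which does not depend on the sign of the values (the names row_sum and nonzero_count are illustrative):

```python
import pandas as pd

# Hypothetical frame; columns 2 and 3 play the role of "column 4 onward"
df = pd.DataFrame({
    0: [1, 2, 3],
    1: [5, 6, 7],
    2: [0, 1, 0],
    3: [0, -1, 2],
})

# Label-based slice: with integer column labels, 2: means "label 2 to the end"
row_sum = df.loc[:, 2:].sum(axis=1)

# Count of cells that differ from 0 in each row
nonzero_count = df.loc[:, 2:].ne(0).sum(axis=1)

print(row_sum.tolist())        # [0, 0, 2]
print(nonzero_count.tolist())  # [0, 2, 1]
```

The second row sums to 0 although it contains 1 and -1, while the count reports two non-zero cells.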

Here is a way to check whether all of the values are zero. It is simple and does not need advanced functions like the answers above, only basic operations such as filtering, if statements, loops and variable assignment.

First is the way to check whether one column has only zeros, and second is how to find out whether all of the columns have only zeros. It then prints an answer statement.

The method to check whether one column has only zero values:

First make a series:

has_zero = df[4] == 0
# has_zero is a Series of bool values, one per row, e.g. True, False.
# If the value in column 4 of a row is 0, the entry for that row is True.

Next:

rows_which_have_zero = df[has_zero]
# stores the rows which have zero as a data frame 

Next:

total_number_rows = len(df)  # 3220 in the question
if len(rows_which_have_zero) == total_number_rows:
    print("contains only zeros")
else:
    print("contains other numbers than zero")

The above method only checks whether the number of rows in rows_which_have_zero equals the number of rows in the column.
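
The three steps above can be wrapped into one small function. This is only a sketch on a hypothetical two-column frame; the helper name column_has_only_zeros is made up for illustration, and the last line shows the equivalent built-in reduction with all:

```python
import pandas as pd

# Hypothetical frame: column 4 holds only zeros, column 5 does not
df = pd.DataFrame({4: [0, 0, 0], 5: [0, 7, 0]})


def column_has_only_zeros(frame, column):
    """Return True when every value in the given column equals 0."""
    has_zero = frame[column] == 0          # bool Series, one entry per row
    rows_which_have_zero = frame[has_zero]  # rows where the column is 0
    return len(rows_which_have_zero) == len(frame)


print(column_has_only_zeros(df, 4))  # True
print(column_has_only_zeros(df, 5))  # False

# The same test without filtering rows
print(bool((df[5] == 0).all()))      # False
```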

The method to see whether all of the columns have only zeros:

It uses the above check and puts it into a while loop.

no_of_columns = 1559
total_number_rows = len(df)    # 3220 in the question
no_of_rows_with_only_zero = 0  # despite the name, this counts all-zero columns
value_1 = 4                    # the first 3 columns do not matter, so start at column 4

while value_1 <= no_of_columns:
    has_zero = df[value_1] == 0
    rows_which_have_zero = df[has_zero]
    if len(rows_which_have_zero) == total_number_rows:
        no_of_rows_with_only_zero += 1
    else:
        break  # a non-zero value was found, so stop early
    value_1 += 1

To check whether all of the columns have only zeros:

# columns 4 to 1559 were checked, which is 1559 minus 3 columns
if no_of_rows_with_only_zero == no_of_columns - 3:
    print("there are only zero values")
else:
    print("there are numbers which are not zero")

The above checks whether no_of_rows_with_only_zero equals the number of columns checked (1559 minus 3, because only columns 4 to 1559 need to be checked).

Update:

  # convert the value_1 to str if the column title is a str instead of int 
  # when updating value_1 by adding: convert it back to int and then back to str 
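
One way to sidestep the str/int conversion, sketched here under the assumption that the columns to skip are the first three by position, is to iterate over the column labels themselves. The frame and the names columns_to_check, columns_with_only_zero and message are hypothetical:

```python
import pandas as pd

# Hypothetical frame with string column titles
df = pd.DataFrame({
    "a": [3, 2], "b": [2, 3], "c": [3, 2],
    "d": [0, 0], "e": [0, 4], "f": [0, 0],
})

# Skip the first 3 columns by position, whatever their labels are
columns_to_check = df.columns[3:]

# Labels of the checked columns that contain only zeros
columns_with_only_zero = [
    name for name in columns_to_check if (df[name] == 0).all()
]

if len(columns_with_only_zero) == len(columns_to_check):
    message = "there are only zero values"
else:
    message = "there are numbers which are not zero"

print(columns_with_only_zero)  # ['d', 'f']
print(message)                 # there are numbers which are not zero
```

Because the loop variable is the label itself, the same code works for int and str column titles.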
