Counting from a column in a Pandas Dataframe
I am attempting to count the number of instances of an element in a column of a Pandas Dataframe based on a set of criteria. I am running into difficulty in a few places.
Here is what I have up to this point. It reads the CSV, drops the duplicates, and sorts df2. I am performing all of these steps in order to isolate the criteria I want to use in the future. Frankly, this may even be an extra step I do not need.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
# importing all required modules numpy, pyplot, and pandas
df= pd.read_csv('file.csv')
# reading the CSV file as a pandas dataframe
df2 = df.drop_duplicates(subset="MRCEmp")
df2 = df2.sort_values(["CLNum"])
# creating duplicate dataframe eliminating duplicate pairs
# sorting df2 in ascending order by column "CLNum"
clmax = df2["CLNum"].max()
clmin = df2["CLNum"].min()
# creating int variables holding the maximum and minimum of the "CLNum" column
for n in df2["CLNum"]:
    if n not in df2["CLNum"]:
        n = n + 1
    elif n in df2["CLNum"]:
        print(df2.loc[df2["CLNum"] == n])
        n = n + 1
I should note that not all integers are represented in df2["CLNum"]; that is why I inserted the first check in the for loop.
When running this script, however, not all of the rows are displayed. clmax = 728 and clmin = 1, but the final row displayed holds an n value of 283. I cannot find why not all rows are displayed.
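One thing that may be worth checking here (an assumption, since the CSV itself is not shown): the in operator applied to a pandas Series tests the index labels, not the values. After drop_duplicates and sort_values the index still holds the original row labels, so a membership test on df2["CLNum"] can return False for a value that is present in the column. A minimal sketch with made-up data:

```python
import pandas as pd

# Hypothetical data: index labels are 0, 1, 2 while the column values differ.
df2 = pd.DataFrame({"CLNum": [5, 1, 728]}, index=[0, 1, 2])

# Membership on the Series itself looks at the index labels (0, 1, 2).
in_index = 728 in df2["CLNum"]

# Membership on the underlying array looks at the column values.
in_values = 728 in df2["CLNum"].values

# isin is the vectorised way to test values.
via_isin = bool(df2["CLNum"].isin([728]).any())

print(in_index, in_values, via_isin)
```

If this is what is happening, values of n larger than the largest index label would never reach the print call.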
Try the pandas value_counts function.
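A minimal sketch of how that could look, assuming the column names from the question; the small DataFrame below is made-up data standing in for file.csv. value_counts only lists values that actually occur, so no loop or check for missing integers is needed:

```python
import pandas as pd

# Hypothetical stand-in for pd.read_csv('file.csv'); column names follow the question.
df = pd.DataFrame({
    "MRCEmp": ["a", "b", "c", "c", "d", "e"],
    "CLNum": [1, 1, 3, 3, 3, 728],
})

# Same de-duplication step as in the question.
df2 = df.drop_duplicates(subset="MRCEmp")

# Count how many rows carry each CLNum value, ordered by CLNum.
counts = df2["CLNum"].value_counts().sort_index()
print(counts)
```

To count under an additional criterion, a boolean mask can be applied before counting, for example df2.loc[df2["CLNum"] > 1, "CLNum"].value_counts().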