How many data points are plotted on my matplotlib graph?

I want to count the number of data points plotted on my graph so I can keep a running total of the graphed data. The problem is that my data table has NaN values scattered across columns: a row may have NaN in one column while another column in the same row may or may not have one. For example:

# I use num1 as my y-coordinate and num1-num2 for my x-coordinate.
num1 num2 num3 
1    NaN  25 
NaN  7    45
3    8    63
NaN  NaN  23
5    10   42
NaN  4    44

# So in this case, there should be only 2 data points on the graph between num1 and num2. For num1 and num3, there should be 3. There should be 4 data points between num2 and num3.

I believe Matplotlib doesn't plot rows that contain a NaN value in either column, since the value is null (please correct me if I'm wrong; I can only infer this because no dots appear at the 0 coordinate of the x and y axes). At first, I thought I could get away with calling .count() on both columns and using the smaller result as my tracker, but as my example above shows, that won't work: the real number can be even lower, since one column may have NaN in a row where the other has an actual value. Some of the code I tried:

# colA and colB are columns of the DataFrame; this tries to "count" how many data points
# are being graphed.
def findAmountOfDataPoints(colA, colB):
    if colA.count() < colB.count():
        print(colA.count())  # colA has fewer non-NaN values, so print its count.
    else:
        print(colB.count())  # colB has fewer (or equally many) non-NaN values, so print its count.

Also, I thought about using .value_counts(), but I'm not sure that's the function I need for this. Any suggestions?

Edit 1: Changed the DataFrame names to hopefully make the example clearer.

If I understood your problem correctly, and assuming that your table is a pandas DataFrame df, the following code should work:

sum((~np.isnan(df['num1']) & (~np.isnan(df['num2']))))

How it works:

np.isnan returns True if a cell is NaN. ~np.isnan is the inverse, so it returns True when the cell is not NaN.

The code checks where both column "num1" AND column "num2" contain a non-NaN value; in other words, it returns True for the rows where both values exist.

Finally, those good rows are counted with sum, which counts each True as 1 and each False as 0.
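For reference, here is a minimal self-contained sketch of the same idea using pandas' own notna() instead of np.isnan. The DataFrame is rebuilt from the table in the question, and the helper name count_plotted_points is hypothetical, picked only for this illustration:

```python
import numpy as np
import pandas as pd

# Sample table from the question.
df = pd.DataFrame({
    "num1": [1, np.nan, 3, np.nan, 5, np.nan],
    "num2": [np.nan, 7, 8, np.nan, 10, 4],
    "num3": [25, 45, 63, 23, 42, 44],
})

def count_plotted_points(frame, x_col, y_col):
    """Count rows where both columns hold a non-NaN value."""
    both_present = frame[x_col].notna() & frame[y_col].notna()
    # Summing a boolean Series counts True as 1 and False as 0.
    return int(both_present.sum())

print(count_plotted_points(df, "num1", "num2"))  # 2
print(count_plotted_points(df, "num1", "num3"))  # 3
print(count_plotted_points(df, "num2", "num3"))  # 4
```

These match the counts the question expects for each pair. notna() also copes with object or string columns, where np.isnan would raise a TypeError.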

The way I understood it, what is needed is the number of combinations of points that are not NaN. Using a function I found, I came up with this:

import pandas as pd
import numpy as np

def choose(n, k):
    """
    A fast way to calculate binomial coefficients by Andrew Dalke (contrib).
    https://stackoverflow.com/questions/3025162/statistics-combinations-in-python
    """
    if 0 <= k <= n:
        ntok = 1
        ktok = 1
        for t in range(1, min(k, n - k) + 1):
            ntok *= n
            ktok *= t
            n -= 1
        return ntok // ktok
    else:
        return 0


data = {'num1': [1, np.nan,3,np.nan,5,np.nan],
        'num2': [np.nan,7,8,np.nan,10,4],
        'num3': [25,45,63,23,42,44]
        }

df = pd.DataFrame(data)

df['notnulls'] = df.notnull().sum(axis=1)

df['plotted'] = df.apply(lambda row: choose(int(row.notnulls), 2), axis=1)
print(df)
print("Total data points: ", df['plotted'].sum())

With this result:

   num1  num2  num3  notnulls  plotted
0   1.0   NaN    25         2        1
1   NaN   7.0    45         2        1
2   3.0   8.0    63         3        3
3   NaN   NaN    23         1        0
4   5.0  10.0    42         3        3
5   NaN   4.0    44         2        1
Total data points:  9
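If a breakdown per pair of columns is wanted rather than per row, the same total can be reached by counting, for each pair, the rows where both columns are present. This is only a sketch under that reading of the question; the names present and pair_counts are made up for the example:

```python
from itertools import combinations

import numpy as np
import pandas as pd

# Sample table from the question.
df = pd.DataFrame({
    "num1": [1, np.nan, 3, np.nan, 5, np.nan],
    "num2": [np.nan, 7, 8, np.nan, 10, 4],
    "num3": [25, 45, 63, 23, 42, 44],
})

# Boolean mask: True where a cell holds a real value.
present = df.notna()

# For every pair of columns, count the rows where both values exist.
pair_counts = {
    (a, b): int((present[a] & present[b]).sum())
    for a, b in combinations(df.columns, 2)
}

for pair, n in pair_counts.items():
    print(pair, n)
print("Total data points:", sum(pair_counts.values()))
```

For the sample data this gives 2 for (num1, num2), 3 for (num1, num3) and 4 for (num2, num3), which adds up to the same 9 as the per-row calculation.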
