Why is there a difference between n1 and n2?

Question

I read CSV data in two ways and get different results. One way extracts the 'value' column directly from the CSV in a single step using pandas. The other extracts 'value' class by class and appends the pieces together. Ideally the two results should be the same, but I see a difference. The classes appear in the order U1 U2 U7 U8 U9 U10 U98 U5 U4 U3; I am not sure whether the order matters. Any ideas?

input.csv is available at https://drive.google.com/file/d/1qND1NM6BK3py2ZjYw294GjhJVDzIOlHj/view?usp=sharing

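import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

inputfilename = 'input.csv'
df = pd.read_csv(inputfilename)

# n1: collect 'value' class by class, in order of first appearance
data = []
classes = pd.unique(df['class'])
for c in classes:
    df2 = df[df['class'] == c]
    data += list(df2['value'].values)
n1 = np.array(data)

# n2: the whole 'value' column in file order
n2 = df['value']

plt.plot(n1 - n2)
plt.show()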

Answer 1

The two arrays will only be the same if all the rows with the same class are grouped together in the CSV.

n1 is created by grouping all the values with the same class together. So it contains all U1 values, then all U2 values, and so on.

n2 just has all the values in the order that they appear in the CSV.
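To make the distinction concrete, here is a minimal sketch on made-up data. The classes "A" and "B" and their values are hypothetical and are not taken from input.csv; the point is only that an interrupted class changes the order of n1 but not of n2.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: class "A" is interrupted by a "B" row.
df = pd.DataFrame({"class": ["A", "B", "A"], "value": [1.0, 2.0, 3.0]})

# n1: values gathered class by class, in order of first appearance.
n1 = np.concatenate([df.loc[df["class"] == c, "value"].to_numpy()
                     for c in pd.unique(df["class"])])

# n2: values in the order the rows appear.
n2 = df["value"].to_numpy()

print(n1)  # [1. 3. 2.]
print(n2)  # [1. 2. 3.]
```

Both arrays hold the same values, so the difference n1 - n2 is nonzero only at positions where the grouping moved a row.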

The classes are contiguous for U1, U2, U7, U8, U9, U10, and U98. But U3, U4, and U5 are all mixed together. You have a sequence of rows starting like this:

U4,-0.6
U4,-0.8
U4,-0.1
U4,-0.6
U3,-0.2
U3,0.2
U5,-0.3
U5,0.1
U3,0
U5,0.2
U5,-0.2

These will be ordered differently in the two arrays.
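If you want to find out which classes are split up, one possible check is to label each run of consecutive identical classes and count the runs per class. The sketch below uses only the rows quoted above rather than the full file; the names run_id, runs_per_class and split_classes are made up for illustration.

```python
import pandas as pd

# The rows quoted above, as (class, value) pairs.
rows = [("U4", -0.6), ("U4", -0.8), ("U4", -0.1), ("U4", -0.6),
        ("U3", -0.2), ("U3", 0.2), ("U5", -0.3), ("U5", 0.1),
        ("U3", 0.0), ("U5", 0.2), ("U5", -0.2)]
df = pd.DataFrame(rows, columns=["class", "value"])

# Give each run of consecutive identical classes its own id.
run_id = (df["class"] != df["class"].shift()).cumsum()

# A class that appears in more than one run is not contiguous.
runs_per_class = run_id.groupby(df["class"]).nunique()
split_classes = runs_per_class[runs_per_class > 1].index.tolist()

print(split_classes)  # ['U3', 'U5']
```

In this excerpt U4 happens to form a single run, so only U3 and U5 are reported. Running the same check on the whole file would show every class that is interrupted.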

You could solve this by sorting the dataframe by class first.
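A rough sketch of that fix, using a small hypothetical frame in place of input.csv. The stable sort is my choice here so that rows keep their original relative order within each class; it is not something the question requires.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for input.csv with interleaved classes.
df = pd.DataFrame({"class": ["U4", "U3", "U5", "U3", "U5"],
                   "value": [-0.6, -0.2, -0.3, 0.0, 0.2]})

# Stable sort keeps the original row order within each class.
df = df.sort_values("class", kind="stable").reset_index(drop=True)

# Same two extractions as in the question, now on the sorted frame.
n1 = np.concatenate([df.loc[df["class"] == c, "value"].to_numpy()
                     for c in pd.unique(df["class"])])
n2 = df["value"].to_numpy()

print(np.array_equal(n1, n2))  # True
```

Note that the class labels are strings, so the sort is lexicographic: U10 lands before U2. That does not affect whether n1 equals n2, but the rows will no longer be in the order they have in the file.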
