
Swap parts of columns in Pandas dataframe

I have to manage a dataset that looks like this when I plot it. As the graph shows, around x=17 the values of the "orange" column appear where the "green" column's data should be. Likewise, the "blue" data take the place of the "orange" ones, and the "green" ones take the place of the "blue" ones. Later in the graph (around x=24) the swap is different.

My question is how I can get the data into the right columns. The swap point is not constant, so I can't just swap parts of the columns iteratively. My thought is to check the difference between two consecutive points: when the difference is larger than some value, that is probably a swap point. This does not always hold, though, since most of the graphs behave non-linearly. A typical dataset contains many more rows, so I am looking for a solution that is as parameterized as possible.

This is the dataset for the above graph:

col1 = [8724.96757035, 8720.86855769, 8713.03560178, 8711.77188717,
       8723.40967556, 8717.95864342, 8719.46206709, 8716.15746255,
       8715.83456161, 8722.05038594, 8721.822529  , 8714.29076839,
       8721.68118216, 8714.94677413, 8706.33839393, 8719.94888389,
       8715.71175774, 8480.37544428, 9151.63757245, 9138.71268152,
       9127.43234993, 9146.51437639, 9148.00997757, 9130.06677617,
       9151.43128313, 8481.34668127, 8482.40548913, 8481.96440291,
       8481.39530663, 8482.7611363 , 8481.26267875, 8480.71911933,
       8481.02279341]

col2 = [8718.4606092 , 9150.29254687, 9130.86473512, 9140.34929925,
       9142.43843709, 9158.33993226, 9148.70914607, 9164.89441174,
       9145.08470894, 9147.82723909, 9132.61236281, 9200.58503831,
       9129.96054189, 9135.65207477, 9165.43826932, 9145.35463759,
       9134.02400092, 8481.58635709, 8480.90717793, 8479.96295137,
       8483.73891949, 8481.93224816, 8482.40478411, 8481.96627135,
       8481.34086757, 8722.99646005, 8736.61137791, 8724.85719973,
       8721.86321039, 8723.91810368, 8720.82987529, 8720.19864748,
       8720.00514769]

col3 = [9157.20772734, 8481.17028812, 8479.95897581, 8481.66854465,
       8481.12688288, 8481.30670312, 8480.84656953, 8483.54011535,
       8481.81742774, 8479.23373517, 8480.44659188, 8480.90515565,
       8481.35596211, 8479.94614036, 8480.12735803, 8482.70698043,
       8481.50464731, 8725.55716505, 8712.41651697, 8737.46352274,
       8719.20402175, 8710.77791026, 8721.07604204, 8718.88881952,
       8720.0611123 , 9158.13239686, 9158.70309418, 9185.89920375,
       9189.72527817, 9153.04424809, 9152.17774172, 9148.59275477,
       9133.33557359]

import pandas as pd

df = pd.DataFrame({"A": col1, "B": col2, "C": col3})

Any suggestions would be much appreciated. Thanks in advance.

Attempt:

df.plot(y=["A", "B","C"])
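The difference-threshold idea described in the question could be sketched roughly as follows. This is only an illustration on hypothetical toy data. The `threshold` value is an assumption that would need tuning, and as the question itself notes, a large jump does not always mean a swap.

```python
import pandas as pd

# Hypothetical toy data: the levels held by "A" and "B" change at rows 2 and 4.
df = pd.DataFrame({
    "A": [8720.0, 8715.0, 9150.0, 9140.0, 8481.0],
    "B": [9150.0, 9140.0, 8480.0, 8482.0, 8722.0],
})

threshold = 100.0  # assumed value; must be tuned to the data

# Absolute row-to-row change per column; the first row has no predecessor (NaN).
jumps = df.diff().abs()

# A row is a candidate swap point if any column jumps by more than the threshold.
candidate_rows = jumps.index[(jumps > threshold).any(axis=1)].tolist()
print(candidate_rows)
```

This only locates candidate rows. Deciding which columns to exchange at each of them is a separate step.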

One simple solution would be to sort the data row by row, i.e. the smallest value always goes in column "A" and the largest always in column "C".

df2 = pd.DataFrame(df.apply(sorted, axis=1).to_list()).rename(columns={0:'A', 1:'B', 2:'C'})
df2.plot()

The resulting plot would look like this:

[image: plot of the sorted data]
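For larger datasets, the same row-wise sort can also be written with NumPy instead of `apply`. The sketch below uses hypothetical toy data rather than the dataset from the question. Like the sort above, it assumes the three series stay on separate levels and never cross, because sorting would otherwise mix them up.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: three levels whose column assignment changes at row 2.
df = pd.DataFrame({
    "A": [8720.0, 8715.0, 9150.0, 8481.0],
    "B": [9150.0, 9140.0, 8480.0, 8722.0],
    "C": [8481.0, 8480.0, 8725.0, 9158.0],
})

# np.sort with axis=1 sorts each row independently in ascending order.
df_sorted = pd.DataFrame(
    np.sort(df.to_numpy(), axis=1),
    columns=df.columns,
    index=df.index,
)
print(df_sorted)
```

Passing the original `columns` and `index` keeps the labels, so no `rename` step is needed.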
