Python: group multiple lines into one vector line by a specific column value

How can I group multiple lines into one vector line by a specific column value? As an example, I have this data:

| Latitude  | Longitude  | group |
|-----------|------------|-------|
| 46.852397 | -72.02586  | A     |
| 47.059016 | -70.907962 | A     |
| 46.897785 | -71.140082 | A     |
| 46.99328  | -70.986152 | A     |
| 46.64613  | -71.934034 | A     |
| 46.622638 | -71.994857 | A     |
| 46.968093 | -71.284281 | B     |
| 47.422739 | -70.32361  | B     |
| 46.878963 | -71.717918 | B     |
| 46.91002  | -71.108395 | C     |
| 47.465175 | -70.337958 | C     |
| 46.6936   | -71.862257 | C     |
| 47.40885  | -70.390739 | C     |
| 47.00737  | -71.232117 | C     |
| 47.013901 | -70.965815 | C     |
| 46.824111 | -71.554997 | C     |
| 47.003765 | -71.193865 | C     |
| 46.665319 | -72.15102  | C     |
| 47.129865 | -70.842406 | C     |
| 46.932361 | -71.994677 | C     |

that I would like to convert to this:

| group | Latitude                                                                                                    | Longitude                                                                                                                 |
|-------|-------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|
| A     | [46.852397,47.059016,46.897785,46.99328,46.64613,46.622638]                                                 | [-72.02586,-70.907962,-71.140082,-70.986152,-71.934034,-71.994857]                                                        |
| B     | [46.968093,47.422739,46.878963]                                                                             | [-71.284281,-70.32361,-71.717918]                                                                                         |
| C     | [46.91002,47.465175,46.6936,47.40885,47.00737,47.013901,46.824111,47.003765,46.665319,47.129865,46.932361]  | [-71.108395,-70.337958,-71.862257,-70.390739,-71.232117,-70.965815,-71.554997,-71.193865,-72.15102,-70.842406,-71.994677] |

Let's say you have a data frame that looks like this:

>>> df
   v1  v2 v3
0   1   2  a
1   3   4  a
2   1   2  b
3   3   4  b

Then you can get what you want with:

>>> df.groupby('v3').agg(lambda m: list(m)).reset_index()
  v3      v1      v2
0  a  [1, 3]  [2, 4]
1  b  [1, 3]  [2, 4]
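
Applied to the data from the question, the same call might look like the sketch below. Only a few rows of the question's table are reproduced (group A is truncated to two rows, group B is complete), and the variable names `df` and `grouped` are chosen here for illustration.

```python
import pandas as pd

# A few rows from the question's table: two of the A rows and all three B rows.
df = pd.DataFrame({
    "Latitude": [46.852397, 47.059016, 46.968093, 47.422739, 46.878963],
    "Longitude": [-72.02586, -70.907962, -71.284281, -70.32361, -71.717918],
    "group": ["A", "A", "B", "B", "B"],
})

# Same pattern as above: collect every column's values into a list per group,
# then turn the group labels back into an ordinary column.
grouped = df.groupby("group").agg(lambda s: list(s)).reset_index()
print(grouped)
```

Each cell of `Latitude` and `Longitude` in `grouped` then holds a plain Python list, in the original row order within each group.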

However, this is generally a bad idea, because pandas doesn't handle lists as cell values very well; it wasn't designed for that. If it works for you, though, go ahead and use it.
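
If the per-group vectors are all you need, one possible alternative (not part of the approach above, just a sketch) is to keep the data frame flat and build a dictionary of NumPy arrays by iterating over the groups. The column names come from the question; the name `vectors` and the truncated sample data are assumptions made for illustration.

```python
import pandas as pd

# A few rows from the question's table: two of the A rows and all three B rows.
df = pd.DataFrame({
    "Latitude": [46.852397, 47.059016, 46.968093, 47.422739, 46.878963],
    "Longitude": [-72.02586, -70.907962, -71.284281, -70.32361, -71.717918],
    "group": ["A", "A", "B", "B", "B"],
})

# Map each group label to a (latitudes, longitudes) pair of NumPy arrays,
# leaving the original flat data frame untouched.
vectors = {
    name: (part["Latitude"].to_numpy(), part["Longitude"].to_numpy())
    for name, part in df.groupby("group")
}

print(vectors["B"][0])  # latitudes of group B
```

This keeps no lists inside data frame cells, at the cost of leaving pandas for the per-group data.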
