
How to reformat horizontal csv to a more vertical format

[Image: enter image description here]

The CSV above is just a small snippet of the data; there are many more entries.

A simple transpose will not work.

I need to get it into the following format:

[Image: enter image description here]

I have tried a few approaches with pandas, including transpose, but cannot figure it out. The CSV could potentially be thousands of lines long.

Read the data with pandas, then try the following code:

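import pandas as pd

df = pd.read_csv('name_file.csv')

# idx numbers each repeat of an 'Entry' value, so every block of rows gets its own id.
# melt goes to long form; pivot then turns each 'Entry' value into a column.
(df.assign(idx=df.groupby('Entry').cumcount()).melt(['Entry', 'idx'])
   .pivot(index=['idx', 'variable'], columns='Entry', values='value')
   .droplevel('idx').rename_axis(index=None, columns=None)
)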
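To see what each step of that chain does, here is a minimal sketch on a made-up miniature of the data. The two-colour frame below is an assumption based on the output shown further down the page (the real CSV has more colour columns), and passing a list to `index=` in `pivot` needs a reasonably recent pandas (1.1 or later, as far as I know).

```python
import pandas as pd

# Hypothetical miniature of the input: 'Entry' repeats 0..3 for every block of rows.
df = pd.DataFrame({
    "Entry": [0, 1, 2, 3, 0, 1, 2, 3],
    "Blue": ["3/20/20", "3:09 PM", "O", "12", "5/1/20", "9:13 PM", "O", "23"],
    "Red": ["3/20/20", "9:13 PM", "C", "0", "5/1/20", "9:13 PM", "C", "0"],
})

result = (
    # idx is 0 for the first occurrence of each Entry value, 1 for the second, ...
    df.assign(idx=df.groupby("Entry").cumcount())
      # long form: one row per (Entry, idx, colour) with its value
      .melt(["Entry", "idx"])
      # wide again, but now each Entry value is a column
      .pivot(index=["idx", "variable"], columns="Entry", values="value")
      # drop the helper level and the leftover axis names
      .droplevel("idx")
      .rename_axis(index=None, columns=None)
)
print(result)
```

One thing to be aware of: `pivot` sorts the row index, so within each block the colours come out in alphabetical order rather than in the original column order.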

You can use:

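# If 'Entry' is already the index, drop the set_index('Entry') calls.
# Rows 0-3 are the first block and the remaining rows the second; transpose each block and stack them.
final = pd.concat([df[:4].set_index('Entry').T, df[4:].set_index('Entry').T])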

Output:

|          | 0        | 1        | 2   |   3 |
|:---------|:---------|:---------|:----|----:|
| Blue     | 3/20/20  | 3:09 PM  | O   |  12 |
| Red      | 3/20/20  | 9:13 PM  | C   |   0 |
| Purple   | 11/26/22 | 3:09 PM  | O   |  34 |
| Green    | 3/20/20  | 3:09 PM  | O   |  24 |
| Black    | 3/20/20  | 3:09 PM  | O   | 133 |
| Orange   | 3/20/20  | 3:09 PM  | O   |  72 |
| Yellow   | 3/20/20  | 3:09 PM  | O   |   2 |
| Gold     | 3/20/20  | 3:00 PM  | O   |  13 |
| White    | 3/20/20  | 3:00 PM  | O   |  31 |
| Silver   | 3/20/20  | 8:49 PM  | O   |  43 |
| Bronze   | 3/20/20  | 2:22 PM  | C   |  13 |
| Platinum | 3/20/20  | 3:00 PM  | O   |  59 |
| Titanium | 3/20/20  | 3:00 PM  | O   |  63 |
| Blue     | 5/1/20   | 9:13 PM  | O   |  23 |
| Red      | 5/1/20   | 9:13 PM  | C   |   0 |
| Purple   | 5/1/20   | 5:24 PM  | O   |  45 |
| Green    | 5/1/20   | 12:09 PM | O   |  67 |
| Black    | 5/1/20   | 3:09 PM  | O   |  56 |
| Orange   | 5/1/20   | 3:09 PM  | O   | 754 |
| Yellow   | 5/1/20   | 3:09 PM  | O   |  23 |
| Gold     | 5/1/20   | 3:00 PM  | O   |  56 |
| White    | 5/1/20   | 3:00 PM  | O   | 121 |
| Silver   | 5/1/20   | 8:49 PM  | O   |  92 |
| Bronze   | 5/1/20   | 2:22 PM  | C   |  13 |
| Platinum | 5/1/20   | 3:00 PM  | O   |  59 |
| Titanium | 5/1/20   | 3:00 PM  | O   |  63 |
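The slice at row 4 in the `concat` answer assumes exactly two blocks. Since the question mentions a file with potentially thousands of lines, here is a rough sketch of the same idea for any number of blocks. It assumes every block repeats the same `Entry` values; the small frame and the `block_id` name are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature of the input: 'Entry' repeats 0..3 for every block of rows.
df = pd.DataFrame({
    "Entry": [0, 1, 2, 3, 0, 1, 2, 3],
    "Blue": ["3/20/20", "3:09 PM", "O", "12", "5/1/20", "9:13 PM", "O", "23"],
    "Red": ["3/20/20", "9:13 PM", "C", "0", "5/1/20", "9:13 PM", "C", "0"],
})

# The n-th occurrence of each Entry value belongs to block n.
block_id = df.groupby("Entry").cumcount()

# Transpose each block on its own, then stack the transposed blocks.
final = pd.concat(
    [block.set_index("Entry").T for _, block in df.groupby(block_id)]
).rename_axis(columns=None)
print(final)
```

Unlike the `pivot` version, this keeps the colours in their original column order within each block.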

@Bushmaster's solution works fine. Another option is to transpose the DataFrame, then reshape it with pivot_longer from pyjanitor:

# pip install pyjanitor
import janitor
import pandas as pd

df = pd.read_csv('Downloads/original.csv')

(df
.astype({"Entry":str})
.set_index('Entry')
.T
.pivot_longer(
    index=None, 
    ignore_index=False,
    names_to = '.value', 
    names_pattern='(.)')
)

Output:

                 0         1  2    3
Blue       3/20/20   3:09 PM  O   12
Red        3/20/20   9:13 PM  C    0
Purple    11/26/22   3:09 PM  O   34
Green      3/20/20   3:09 PM  O   24
Black      3/20/20   3:09 PM  O  133
Orange     3/20/20   3:09 PM  O   72
Yellow     3/20/20   3:09 PM  O    2
Gold       3/20/20   3:00 PM  O   13
White      3/20/20   3:00 PM  O   31
Silver     3/20/20   8:49 PM  O   43
Bronze     3/20/20   2:22 PM  C   13
Platinum   3/20/20   3:00 PM  O   59
Titanium   3/20/20   3:00 PM  O   63
Blue        5/1/20   9:13 PM  O   23
Red         5/1/20   9:13 PM  C    0
Purple      5/1/20   5:24 PM  O   45
Green       5/1/20  12:09 PM  O   67
Black       5/1/20   3:09 PM  O   56
Orange      5/1/20   3:09 PM  O  754
Yellow      5/1/20   3:09 PM  O   23
Gold        5/1/20   3:00 PM  O   56
White       5/1/20   3:00 PM  O  121
Silver      5/1/20   8:49 PM  O   92
Bronze      5/1/20   2:22 PM  C   13
Platinum    5/1/20   3:00 PM  O   59
Titanium    5/1/20   3:00 PM  O   63

