Pandas: how to unpivot df correctly?

I have the following DataFrame df:

  A  B  Var    Value
0 A1 B1 T1name T1
1 A2 B2 T1name T1
2 A1 B1 T2name T2
3 A2 B2 T2name T2
4 A1 B1 T1res  1
5 A2 B2 T1res  1
6 A1 B1 T2res  2
7 A2 B2 T2res  2

I now want to 'halve' my DataFrame, because Var mixes variables that should not share a column. My intended outcome is:

  A  B  Name   Value
0 A1 B1 T1     1
1 A2 B2 T1     1
2 A1 B1 T2     2
3 A2 B2 T2     2

What should I use to unpivot this correctly?

Then drop the T1name and T2name rows:

df = df[~df['Var'].isin(['T1name','T2name'])]

Output:

    A   B    Var Value
4  A1  B1  T1res     1
5  A2  B2  T1res     1
6  A1  B1  T2res     2
7  A2  B2  T2res     2
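The filter above lists the name labels by hand. As a minimal sketch, the same rows can be selected by suffix instead, which avoids extending the list when more T...name labels appear. The construction of df below is an assumption reconstructed from the printed table (with Value stored as strings), and the suffix rule assumes every unwanted label ends in "name".

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed: Value holds strings).
df = pd.DataFrame({
    "A": ["A1", "A2"] * 4,
    "B": ["B1", "B2"] * 4,
    "Var": ["T1name", "T1name", "T2name", "T2name",
            "T1res", "T1res", "T2res", "T2res"],
    "Value": ["T1", "T1", "T2", "T2", "1", "1", "2", "2"],
})

# Filter from the answer: drop rows whose Var is one of the listed labels.
by_list = df[~df["Var"].isin(["T1name", "T2name"])]

# Variant: drop every row whose Var ends in "name", without listing the labels.
by_suffix = df[~df["Var"].str.endswith("name")]

print(by_suffix)
```

Both selections keep rows 4 to 7 of the sample data; Var still holds T1res / T2res and has to be turned into Name in a separate step.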

Just filter the rows where Var contains res, and assign a new column Name holding the first two characters of Var:

df[df['Var'].str.contains('res')].assign(Name=df['Var'].str[:2]).drop(columns='Var')

    A   B Value Name
4  A1  B1     1   T1
5  A2  B2     1   T1
6  A1  B1     2   T2
7  A2  B2     2   T2

Note that this returns a new DataFrame and leaves the original df unchanged, so assign the result to a variable if you want to keep it.
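A self-contained sketch of the filter-and-assign approach, for anyone who wants to run it. The construction of df is an assumption rebuilt from the question's table, and the final reset_index and column reordering are additions (not part of the answer) that make the result match the intended outcome. Slicing with str[:2] assumes every prefix is exactly two characters long.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed: Value holds strings).
df = pd.DataFrame({
    "A": ["A1", "A2"] * 4,
    "B": ["B1", "B2"] * 4,
    "Var": ["T1name", "T1name", "T2name", "T2name",
            "T1res", "T1res", "T2res", "T2res"],
    "Value": ["T1", "T1", "T2", "T2", "1", "1", "2", "2"],
})

result = (
    df[df["Var"].str.contains("res")]             # keep only the ...res rows
      .assign(Name=lambda d: d["Var"].str[:2])    # first two characters, e.g. "T1"
      .drop(columns="Var")
      .reset_index(drop=True)                     # addition: renumber rows from 0
      .loc[:, ["A", "B", "Name", "Value"]]        # addition: intended column order
)

print(result)
```

The lambda in assign works on the already filtered frame, so it does not depend on index alignment with the full df.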

There are several options for a df like this, and a regex is probably the first one to try. If a regex doesn't work, consider restating the problem:

Keep the rows whose Value is a numeric string, strip the trailing res with a regex replace, and rename Var to Name. Code below:

df[df['Value'].str.isnumeric()].replace(regex=r'res$', value='').rename(columns={'Var':'Name'})

    A   B Name Value
4  A1  B1   T1     1
5  A2  B2   T1     1
6  A1  B1   T2     2
7  A2  B2   T2     2
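The answers above all discard the name rows. If a real reshape is wanted, one possible sketch (not taken from any of the answers) is to split Var into a key and a kind and then pivot, so that name and res become separate columns. The construction of df, the helper columns key and kind, and the regex are assumptions for illustration; passing a list to the index argument of pivot needs pandas 1.1 or newer.

```python
import pandas as pd

# Sample data rebuilt from the table in the question (assumed: Value holds strings).
df = pd.DataFrame({
    "A": ["A1", "A2"] * 4,
    "B": ["B1", "B2"] * 4,
    "Var": ["T1name", "T1name", "T2name", "T2name",
            "T1res", "T1res", "T2res", "T2res"],
    "Value": ["T1", "T1", "T2", "T2", "1", "1", "2", "2"],
})

# Split Var into a key ("T1") and a kind ("name" or "res"); both names are hypothetical.
parts = df["Var"].str.extract(r"^(?P<key>.+?)(?P<kind>name|res)$")

wide = (
    df.join(parts)
      .pivot(index=["A", "B", "key"], columns="kind", values="Value")
      .reset_index()
      .rename(columns={"name": "Name", "res": "Value"})
      .drop(columns="key")
      .sort_values(["Name", "A"])
      .reset_index(drop=True)
)
wide.columns.name = None  # drop the leftover "kind" label on the column axis

print(wide)
```

Here Name comes from the Value of the name rows rather than from slicing the Var string, so it does not rely on a fixed prefix length. pivot raises an error if an (A, B, key) combination has more than one row of the same kind.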
