Split one column into two columns depending on the content in a pandas DataFrame

I have a pandas DataFrame like this:

import pandas as pd
df = pd.DataFrame(['A',1,2,3,'B',4,5,'C',6,7,8,9])

    0
0   A
1   1
2   2
3   3
4   B
5   4
6   5
7   C
8   6
9   7
10  8
11  9

It's a mix of strings and numbers. I want to split this DataFrame into two columns like this:

   name value
0   A   1
1   A   2
2   A   3
3   B   4
4   B   5
5   C   6
6   C   7
7   C   8
8   C   9

What's an efficient way to do this?

You can use:

df = pd.DataFrame({0: ['A',1,2,3,'B',4,5,'C',6,7,8,9]})
# mark the rows that hold a name (purely alphabetic strings)
mask = df[0].astype(str).str.isalpha()
# alternative for mixed numeric/string data: test the Python type instead
#mask = df[0].apply(lambda x: isinstance(x, str))
# insert 'name' as the first column: keep the names, set other rows to NaN, then forward fill
df.insert(0, 'name', df[0].where(mask).ffill())
# drop the rows that hold the names themselves, rename the value column
df = df[df['name'] != df[0]].rename(columns={0:'value'}).reset_index(drop=True)
print(df)
  name value
0    A     1
1    A     2
2    A     3
3    B     4
4    B     5
5    C     6
6    C     7
7    C     8
8    C     9
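
A variant of the same idea, sketched under the assumption that every entry that is not a name is an integer: `pd.to_numeric` with `errors='coerce'` turns the names into NaN, which serves as the mask and also lets the value column end up with an integer dtype instead of `object`. The helper name `split_names_values` is hypothetical and not part of the answer above.

```python
import pandas as pd

def split_names_values(series):
    """Hypothetical helper: split a mixed Series into name/value columns."""
    # Entries that cannot be parsed as numbers (the names) become NaN
    numeric = pd.to_numeric(series, errors='coerce')
    is_name = numeric.isna()
    # Keep the names on their own rows only, then carry each one downwards
    names = series.where(is_name).ffill()
    # Keep only the value rows; assumes the remaining values are integers
    out = pd.DataFrame({'name': names[~is_name],
                        'value': numeric[~is_name].astype(int)})
    return out.reset_index(drop=True)

df = pd.DataFrame({0: ['A', 1, 2, 3, 'B', 4, 5, 'C', 6, 7, 8, 9]})
result = split_names_values(df[0])
print(result)
```

Unlike the `isalpha` mask, this also handles names that contain digits or punctuation, as long as they do not parse as numbers.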

Or, with a plain loop:

out = []
acc = None
for x in df[0]:
    # a string starts a new group: remember it as the current name
    if isinstance(x, str):
        acc = x
    else:
        # a number: pair it with the current name
        out.append((acc, x))
print(out)

df = pd.DataFrame(out, columns=['name','value'])
print (df)
  name  value
0    A      1
1    A      2
2    A      3
3    B      4
4    B      5
5    C      6
6    C      7
7    C      8
8    C      9

This will give you the data structure to get what you want:

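values = ['A',1,2,3,'B',4,5,'C',6,7,8,9]  # renamed from `input` to avoid shadowing the builtin
letter = None
output = []
for item in values:
    if isinstance(item, str):
        # a string is the name for the numbers that follow
        letter = item
    elif isinstance(item, int) and letter is not None:
        # skip any numbers that appear before the first name
        output.append((letter, item))
print(output)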

The output is now a sequence of tuples, paired as you wish. I don't use pandas, but I hope this is helpful to you.

IIUC (if I understand correctly):

df['New']=df[df.your.str.isalpha().fillna(False)]
df.ffill().loc[df.your!=df.New,:]
Out[217]: 
   your New
1     1   A
2     2   A
3     3   A
5     4   B
6     5   B
8     6   C
9     7   C
10    8   C
11    9   C

Data input:

df = pd.DataFrame({'your':['A',1,2,3,'B',4,5,'C',6,7,8,9]})
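
The result above keeps the original index and the column names `your` and `New`. A minimal sketch of how it could be brought into the layout the question asks for, assuming the names are the only string entries in the column (the variable names `is_name` and `tidy` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'your': ['A', 1, 2, 3, 'B', 4, 5, 'C', 6, 7, 8, 9]})

# True only for the rows that hold a string (the group names)
is_name = df['your'].map(lambda x: isinstance(x, str))

tidy = (
    df.assign(name=df['your'].where(is_name).ffill())  # carry each name down
      .loc[~is_name, ['name', 'your']]                 # keep only the value rows
      .rename(columns={'your': 'value'})
      .reset_index(drop=True)
)
print(tidy)
```

The `value` column stays `object` dtype here because it comes from the mixed column; `tidy['value'].astype(int)` would convert it if all remaining values are integers.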
