
Convert a column in a pandas DataFrame into multiple columns

I have a pandas DataFrame with a column that contains the following values:

Identifier
[1;12;7;3;0]
[4;5;2;6;0]

I want to convert the values in square brackets in this column into 5 new columns. Essentially, I want to split those values into 5 new columns while keeping the index of the new columns the same as that of the original column.

Identifier,a,b,c,d,e
[1;12;7;3;0],1,12,7,3,0
[4;5;2;6;0],4,5,2,6,0

pattern = re.compile(r'(\d+)')
for g in raw_data["Identifier"]:
    new_id = raw_data.Identifier.str.findall(pattern) # this converts the Identifier into a list of the 5 values
raw_data.append({'a':new_id[0],'b':new_id[1],'c':new_id[2],'d':new_id[3],'e':new_id[4]}, ignore_index=True)

The above code adds the values extracted from the "Identifier" column to the end of the DataFrame rather than to the corresponding rows. How can I add the extracted values to the same row/index as the original column ('Identifier')?

One way would be to use the str methods to get the numbers, make a new DataFrame from them, and then join (or concatenate) the results. For example:

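import pandas as pd
# Strip the brackets and split on ";" to get one list of strings per row
id_data = df.Identifier.str.strip("[]").str.split(";").tolist()
# Reuse the original index so the rows line up, then cast the strings to int
df_id = pd.DataFrame(id_data, columns=list("abcde"), index=df.index).astype(int)
df2 = df.join(df_id)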

produces something like

      Identifier  a   b  c  d  e
10  [1;12;7;3;0]  1  12  7  3  0
20   [4;5;2;6;0]  4   5  2  6  0
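For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. The sample frame is an assumption: it is rebuilt from the values in the question, with the index labels 10 and 20 taken from the output above. It also uses a small variation, `str.split(..., expand=True)`, which returns the pieces as columns directly instead of going through a list of lists; this is just one possible way to write it.

```python
import pandas as pd

# Assumed sample data: values from the question, index labels from the output above.
df = pd.DataFrame(
    {"Identifier": ["[1;12;7;3;0]", "[4;5;2;6;0]"]},
    index=[10, 20],
)

# Strip the surrounding brackets, then split on ";" straight into separate columns.
# With expand=True the result is a DataFrame that keeps the index of df.
parts = df["Identifier"].str.strip("[]").str.split(";", expand=True)
parts.columns = list("abcde")

# The pieces are still strings, so cast them to int before joining.
# join() aligns on the index, so each value lands in its original row.
df2 = df.join(parts.astype(int))
print(df2)
```

Because the new columns share the index of the original frame, the join attaches them to the matching rows rather than adding rows at the end, which was the problem with the append-based attempt.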
