Python Pandas DataFrame: Using Values in a Column to Create New Columns

I've searched several books and sites and can't find anything that quite matches what I'm trying to do. I would like to create itemized lists from a dataframe and reshape the data like so:

      A     B                A     B     C     D  
0     1     aa          0    1     aa  
1     2     bb          1    2     bb  
2     3     bb          2    3     bb    aa  
3     3     aa     --\  3    4     aa    bb    dd  
4     4     aa     --/  4    5     cc  
5     4     bb  
6     4     dd  
7     5     cc  

I've experimented with grouping, stacking, unstacking, etc., but nothing I've attempted has produced the desired result. If it's not obvious, I'm very new to Python; a solution would be great, but an understanding of the process I need to follow would be perfect.

Thanks in advance.

With pandas you can select all rows that match a condition, e.g. where A == 4.
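As a minimal sketch of that kind of query, assuming the question's data is held in a frame named `df` (the names `matches` and `values_for_4` are only illustrative), a boolean mask picks out the rows for one value of A:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 4, 4, 4, 5],
                   'B': ['aa', 'bb', 'bb', 'aa', 'aa', 'bb', 'dd', 'cc']})

# Boolean mask: True for the rows whose A value equals 4.
matches = df[df['A'] == 4]

# The B values that belong to A == 4, as a plain Python list.
values_for_4 = matches['B'].tolist()
print(values_for_4)  # ['aa', 'bb', 'dd']
```

Repeating this for each distinct value of A yields one list per row of the desired output.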

A crude but working method would be to iterate through the distinct values of A, gather all 'like' results into a list (or numpy array), and convert this into a new dataframe.

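A rough sketch to demonstrate the idea (adapt the column names to your data):

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 4, 4, 4, 5],
                   'B': ['aa', 'bb', 'bb', 'aa', 'aa', 'bb', 'dd', 'cc']})

# Gather the B values for each distinct A into one row per A value.
rows = []
for value in sorted(df['A'].unique()):
    rows.append([value] + df.loc[df['A'] == value, 'B'].tolist())

# Rows of unequal length are padded with missing values.
result = pd.DataFrame(rows)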

I hope that helps.

Update from comments:

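import pandas as pd

animal_list = []

for animal in ['cat', 'dog']:  # ... plus any other names to match in column A
    # Keep only the rows whose A value matches this animal.
    newdf = df[df['A'] == animal]

    # Start the row with the animal name, then append each of its B values.
    body = [animal]
    for item in newdf['B']:
        body.append(item)

    animal_list.append(body)

df = pd.DataFrame(animal_list)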

A quick and dirty method that will work with strings. Customize the column naming as needed.

import pandas as pd

data = {'A': [1, 2, 3, 3, 4, 4, 4, 5],
        'B': ['aa', 'bb', 'bb', 'aa', 'aa', 'bb', 'dd', 'cc']}
df = pd.DataFrame(data)

# size of the largest group; used to pad every list to the same length
maxlen = df.A.value_counts().max()

newdata = {}
for n, gdf in df.groupby('A'):
    # B values for this A, padded with empty strings
    newdata[n] = list(gdf.B.values) + [''] * (maxlen - len(gdf.B))

# recreate DF with Col 'A' as index; experiment with other orientations
newdf = pd.DataFrame.from_dict(newdata, orient='index')

# customize this section
newdf.columns = list('BCD')
newdf['A'] = newdf.index
newdf.index = range(len(newdf))
newdf = newdf[list('ABCD')]  # to set the desired order

print(newdf)

The result is:

   A   B   C   D
0  1  aa
1  2  bb
2  3  bb  aa
3  4  aa  bb  dd
4  5  cc
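The same reshaping can also be sketched without the manual padding loop. This is an alternative technique, not the method above: it numbers the rows within each A group with `groupby(...).cumcount()` and then spreads them across columns with `pivot`. The names `position` and `wide` are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 3, 4, 4, 4, 5],
                   'B': ['aa', 'bb', 'bb', 'aa', 'aa', 'bb', 'dd', 'cc']})

# Number each row within its A group: 0 for the first B value, 1 for the next, ...
position = df.groupby('A').cumcount()

# Spread the B values across columns, one column per position within the group.
wide = (df.assign(position=position)
          .pivot(index='A', columns='position', values='B')
          .fillna(''))

# Rename the positional columns to B, C, D, ... and turn A back into a column.
wide.columns = [chr(ord('B') + int(i)) for i in wide.columns]
wide = wide.reset_index()
print(wide)
```

Because the column count follows from the largest group, this version should not need the hard-coded `list('BCD')` when the data changes.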
