Display columns in matrix format using a pandas DataFrame in Python

I have the following table:

[image: input table]

I want to convert it into a matrix using Python, so that it looks something like this:

[image: desired matrix output]

Can I get some direction on where to start? I used pandas to read two dataframes (one of them with two columns) and merged them to create the initial table shown above.

The code I am using is below:

import pandas as pd
from pyexcelerate import Workbook
import numpy as np
import time
start = time.process_time()
excel_file = 'Test.xlsx'
df = pd.read_excel(excel_file, sheet_name=0, index_col=0)
print(df.columns)
print(df.index)

newdf= (df.pivot(index='ColumnB',columns='ColumnA', values='ColumnB'))
myNewDF = newdf.transform(lambda x: np.where(x.isnull(), '', 'yes'))
aftercalc = time.process_time()
print(aftercalc - start)

myNewDF.to_excel("1.xlsx")
print(time.process_time() - aftercalc)

The output of the print statements is:

Index(['ColumnB'], dtype='object')
Index(['TypeA', 'TypeA', 'TypeA', 'TypeA', 'TypeA', 'TypeB', 'TypeB', 'TypeC', 'TypeC', 'TypeC', 'TypeD'], dtype='object', name='ColumnA')

The error I get when running this is:

Traceback (most recent call last):
  File "C:_data\\learn\\Miniconda\\lib\\site-packages\\pandas\\core\\indexes\\base.py", line 2657, in get_loc
    return self._engine.get_loc(key)
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'ColumnA'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    newdf= (df.pivot(index='ColumnB',columns='ColumnA', values='ColumnB'))
  File "C:_data\\learn\\Miniconda\\lib\\site-packages\\pandas\\core\\frame.py", line 5628, in pivot
    return pivot(self, index=index, columns=columns, values=values)
  File "C:_data\\learn\\Miniconda\\lib\\site-packages\\pandas\\core\\reshape\\pivot.py", line 379, in pivot
    index = MultiIndex.from_arrays([index, data[columns]])
  File "C:_data\\learn\\Miniconda\\lib\\site-packages\\pandas\\core\\frame.py", line 2927, in __getitem__
    indexer = self.columns.get_loc(key)
  File "C:_data\\learn\\Miniconda\\lib\\site-packages\\pandas\\core\\indexes\\base.py", line 2659, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 132, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1601, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1608, in pandas._libs.hashtable.PyObjectHashTable.get_item

Does this solve it?

newdf= (df.pivot(index='ColumnB',columns='ColumnA', values='ColumnB'))

newdf
Out[28]: 
ColumnA TypeA TypeB TypeC TypeD
ColumnB                        
A           A     A   NaN     A
B           B   NaN     B   NaN
C           C   NaN     C   NaN
D           D   NaN   NaN   NaN
E           E   NaN   NaN   NaN
F         NaN     F   NaN   NaN
Z         NaN   NaN     Z   NaN

newdf.transform(lambda x: np.where(x.isnull(), '', 'yes'))
Out[29]: 
ColumnA TypeA TypeB TypeC TypeD
ColumnB                        
A         yes   yes         yes
B         yes         yes      
C         yes         yes      
D         yes                  
E         yes                  
F               yes            
Z                     yes      

Modified code:

import pandas as pd
#from pyexcelerate import Workbook
import time
import numpy as np
start = time.process_time()
excel_file = 'C:\\Users\\ss\\Desktop\\check.xlsx'
df = pd.read_excel(excel_file, sheet_name=0, index_col=0)
print(df.columns)
print(df.index)

newdf= (df.pivot(index='ColumnB',columns='ColumnA', values='ColumnB'))
myNewDF = newdf.transform(lambda x: np.where(x.isnull(), '', 'yes'))
aftercalc = time.process_time()
print(aftercalc - start)

myNewDF.to_excel("C:\\Users\\ss\\Desktop\\output.xlsx")
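One thing worth checking: the printed output in the question shows ColumnA as the name of the index rather than as a column (a side effect of index_col=0), which is what KeyError: 'ColumnA' points at. Below is a minimal sketch, assuming the sheet really contains only the two columns ColumnA and ColumnB. The in-memory frame stands in for the Excel file, its ColumnB values are taken from the pivot output shown above, and the name flat is made up for illustration.

```python
import numpy as np
import pandas as pd

# Stand-in for pd.read_excel(excel_file, sheet_name=0, index_col=0):
# ColumnA ends up as the index, as in the printed output of the question.
df = pd.DataFrame(
    {"ColumnB": ["A", "B", "C", "D", "E", "A", "F", "B", "C", "Z", "A"]},
    index=pd.Index(
        ["TypeA"] * 5 + ["TypeB"] * 2 + ["TypeC"] * 3 + ["TypeD"],
        name="ColumnA",
    ),
)

# Move ColumnA out of the index so that pivot can find it as a column.
flat = df.reset_index()

newdf = flat.pivot(index="ColumnB", columns="ColumnA", values="ColumnB")

# Mark filled cells with 'yes' and leave the others empty.
myNewDF = pd.DataFrame(
    np.where(newdf.notna(), "yes", ""),
    index=newdf.index,
    columns=newdf.columns,
)
print(myNewDF)
```

Dropping index_col=0 from the read_excel call should have the same effect as reset_index() here, provided the sheet has no separate index column of its own.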

We can do:

pd.crosstab(df.ColumnA,df.ColumnB).astype(bool)
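A self-contained sketch of the crosstab approach, assuming ColumnA and ColumnB are both ordinary columns (not the index) and using sample data that mirrors the pivot output shown earlier. Note that the first argument to pd.crosstab becomes the rows, so ColumnB is passed first here to match the layout in the question; the names counts, present and labels are made up for illustration.

```python
import pandas as pd

# Sample data: one row per (ColumnA, ColumnB) pair.
df = pd.DataFrame(
    {
        "ColumnA": ["TypeA"] * 5 + ["TypeB"] * 2 + ["TypeC"] * 3 + ["TypeD"],
        "ColumnB": ["A", "B", "C", "D", "E", "A", "F", "B", "C", "Z", "A"],
    }
)

# Count how often each pair occurs: rows are ColumnB, columns are ColumnA.
counts = pd.crosstab(df["ColumnB"], df["ColumnA"])

# Any non-zero count means the pair is present.
present = counts.astype(bool)

# Optional: turn True/False into 'yes'/'' as in the desired output.
labels = present.apply(lambda col: col.map({True: "yes", False: ""}))
print(labels)
```

Unlike pivot, crosstab does not raise an error when the same pair appears more than once; the count simply goes above 1 and still converts to True.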
