Transforming Database with Pandas Dataframe
I'm trying to work with some databases. This is a picture of the database:
Code:
import numpy as np
import pandas as pd

# cba and cbt are DataFrames loaded earlier (not shown)
canastas = pd.concat([cba, cbt])
canastas.index = np.arange(12)
canastas = canastas.stack()
Aiio Mes GSA Pampeana Noroeste Noreste Cuyo Patagonia Canasta
2016 7 1666.48 1660.19 1458.24 1496.21 1494.04 1713.67 CBA
2016 8 1675.05 1662.99 1459.38 1501.91 1496.09 1723.86 CBA
2016 9 1711.22 1705.31 1498.62 1540.68 1536.29 1767.89 CBA
2016 10 1739.34 1731.84 1516.82 1559.51 1559.00 1797.44 CBA
2016 11 1762.65 1753.27 1532.67 1573.18 1577.67 1819.64 CBA
2016 12 1766.62 1754.08 1526.86 1571.59 1574.22 1822.96 CBA
2016 7 4032.88 4017.66 3281.04 3396.40 3854.62 4712.59 CBT
2016 8 4036.87 4007.81 3269.01 3394.32 3844.95 4723.38 CBT
2016 9 4089.82 4075.69 3326.94 3451.12 3917.54 4790.98 CBT
2016 10 4191.81 4173.73 3397.68 3524.49 4006.63 4924.99 CBT
2016 11 4247.99 4225.38 3417.85 3539.66 4038.84 4967.62 CBT
2016 12 4257.55 4227.33 3420.17 3551.79 4045.75 4994.91 CBT
And I need to get a database like this.
I was using the .stack and .pivot_table functions, but they don't work. Which pandas function do you recommend?
Use DataFrame.melt, then derive Trimestre from Mes with integer division by 3:
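# Reshape from wide to long: one row per region
df = df.melt(['Aiio','Mes','Canasta'], var_name='Region', value_name='Valor')
# Quarter number (1-4) from the month number
df['Trimestre'] = (df['Mes'] - 1) // 3 + 1
# Year plus quarter as a decimal, e.g. 2016.3
df['Periodo'] = df['Aiio'] + df['Trimestre'] / 10
print(df)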
Aiio Mes Canasta Region Valor Trimestre Periodo
0 2016 7 CBA GSA 1666.48 3 2016.3
1 2016 8 CBA GSA 1675.05 3 2016.3
2 2016 9 CBA GSA 1711.22 3 2016.3
3 2016 10 CBA GSA 1739.34 4 2016.4
4 2016 11 CBA GSA 1762.65 4 2016.4
.. ... ... ... ... ... ... ...
67 2016 8 CBT Patagonia 4723.38 3 2016.3
68 2016 9 CBT Patagonia 4790.98 3 2016.3
69 2016 10 CBT Patagonia 4924.99 4 2016.4
70 2016 11 CBT Patagonia 4967.62 4 2016.4
71 2016 12 CBT Patagonia 4994.91 4 2016.4
[72 rows x 7 columns]
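For anyone who wants to try this without the original files, here is a minimal self-contained sketch. The `sample` frame and the name `long_df` are assumptions for illustration; the values are copied from the first two months of each basket in the question's table, with only two of the region columns kept.

```python
import pandas as pd

# Small sample copied from the question's table (July and August 2016 only).
sample = pd.DataFrame({
    "Aiio": [2016, 2016, 2016, 2016],
    "Mes": [7, 8, 7, 8],
    "GSA": [1666.48, 1675.05, 4032.88, 4036.87],
    "Pampeana": [1660.19, 1662.99, 4017.66, 4007.81],
    "Canasta": ["CBA", "CBA", "CBT", "CBT"],
})

# Wide to long: the id columns are kept, every other column becomes
# a (Region, Valor) pair, giving one row per region per original row.
long_df = sample.melt(["Aiio", "Mes", "Canasta"],
                      var_name="Region", value_name="Valor")

# Months 1-3 map to quarter 1, 4-6 to 2, 7-9 to 3, 10-12 to 4.
long_df["Trimestre"] = (long_df["Mes"] - 1) // 3 + 1

print(long_df)
```

Subtracting 1 before the integer division is what keeps months 3, 6, 9 and 12 in the correct quarter; `Mes // 3 + 1` alone would push them into the next one.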
Here is another solution using stack. The two perform similarly, but pay attention to the order of the output.
s = df.set_index(['Aiio','Mes','Canasta']).stack()
s.name = 'Valor'
df = s.reset_index().rename(columns={"level_3":"Region"})
df['Trimestre'] = df['Mes'].sub(1) // 3 + 1
df['Periodo'] = df['Aiio'] + df['Trimestre'] / 10
df
Aiio Mes Canasta Region Valor Trimestre Periodo
0 2016 7 CBA GSA 1666.48 3 2016.3
1 2016 7 CBA Pampeana 1660.19 3 2016.3
2 2016 7 CBA Noroeste 1458.24 3 2016.3
3 2016 7 CBA Noreste 1496.21 3 2016.3
4 2016 7 CBA Cuyo 1494.04 3 2016.3
... ... ... ... ... ... ... ...
67 2016 12 CBT Pampeana 4227.33 4 2016.4
68 2016 12 CBT Noroeste 3420.17 4 2016.4
69 2016 12 CBT Noreste 3551.79 4 2016.4
70 2016 12 CBT Cuyo 4045.75 4 2016.4
71 2016 12 CBT Patagonia 4994.91 4 2016.4
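The difference in output order can be checked directly. The sketch below is only an illustration: the `wide` frame is a hypothetical two-row sample with values taken from the question's table, and `rename_axis` is used here in place of renaming `level_3` afterwards. It reshapes with both melt and stack, then sorts both results on the same keys to confirm they hold the same rows.

```python
import pandas as pd

# Two rows copied from the question's table, two region columns only.
wide = pd.DataFrame({
    "Aiio": [2016, 2016],
    "Mes": [7, 8],
    "GSA": [1666.48, 1675.05],
    "Pampeana": [1660.19, 1662.99],
    "Canasta": ["CBA", "CBA"],
})
keys = ["Aiio", "Mes", "Canasta"]

# melt: output is grouped by region (all GSA rows, then all Pampeana rows).
melted = wide.melt(keys, var_name="Region", value_name="Valor")

# stack: output keeps each original row's regions together.
stacked = (
    wide.set_index(keys)
        .rename_axis(columns="Region")  # names the new index level "Region"
        .stack()
        .rename("Valor")
        .reset_index()
)

print(melted["Region"].tolist())
print(stacked["Region"].tolist())

# After sorting on the same keys, both results contain identical records.
sort_cols = keys + ["Region"]
same = (
    melted.sort_values(sort_cols).to_dict("records")
    == stacked.sort_values(sort_cols).to_dict("records")
)
print(same)
```

If a particular row order matters downstream, an explicit `sort_values` after either approach removes the dependence on which reshaping method was used.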