I have an Excel file containing a single column (the number of rows is not fixed). Using Python 3, I want to,
I have tried the following code:
import pandas as pd
import numpy as np
df = pd.read_excel('sample.xlsx')
first_column = pd.DataFrame(df.iloc[:,0])
arr = np.array(first_column)
newArr = arr.reshape(10, -1)
However, I am getting the following error:
newArr = arr.reshape(arr, (10, -1))
TypeError: only integer scalar arrays can be converted to a scalar index
Looking for someone to help me out achieving this in Python 3.
To read the Excel file in Python, it would be better to first save it as a CSV file and then read the CSV. You can save the Excel file as CSV using the "Save As" option in Excel.
>>> import pandas as pd
>>> df = pd.read_csv('fazool.csv')
Then, to print the first few rows of the dataframe:
>>> df.head()
kMEblue kMEgreen kMEturquoise kMEblack kMEbrown kMEred kMEyellow data$X count moduleColors
0 -0.762233 -0.115623 0.836647 -0.418418 -0.688068 -0.078625 0.316798 VWA5A 1 turquoise
1 -0.714720 -0.145856 0.802115 -0.420983 -0.670826 -0.039813 0.424616 EIF4G2 1 turquoise
2 -0.785788 -0.259762 0.777330 -0.301520 -0.585565 0.021812 0.412960 CFL1 1 turquoise
3 -0.736677 -0.296203 0.776179 -0.266430 -0.517727 0.109923 0.526707 NSUN2 1 turquoise
4 -0.697293 0.030126 0.772833 -0.621229 -0.733419 -0.341270 0.088465 ANXA2 1 turquoise
>>> first_column_df = pd.DataFrame(df.iloc[:,0])
>>> first_column_df.head()
kMEblue
0 -0.762233
1 -0.714720
2 -0.785788
3 -0.736677
4 -0.697293
>>> first_column_df.columns # shows the column name
Index(['kMEblue'], dtype='object')
>>> import numpy as np
>>> n = 10 # number to be used as chunk size for the first column
>>> values = first_column_df['kMEblue'].to_numpy()
>>> chunks = np.split(values, range(n, len(values), n)) # chunks of n rows; the last one may be shorter
>>> first_column_df_split = pd.concat([pd.Series(chunk, name='y' + str(i)) for i, chunk in enumerate(chunks)], axis=1)
>>> first_column_df_split.head()
y0 y1 y2 y3 y4 y5 ... y478 y479 y480 y481 y482 y483
0 -0.762233 -0.639253 -0.673571 -0.652639 -0.703227 -0.666183 ... 0.633533 0.628803 0.716792 0.783900 0.725757 0.791240
1 -0.714720 -0.680753 -0.696416 -0.686810 -0.636661 -0.613642 ... 0.678854 0.807758 0.736286 0.627988 0.853333 0.887149
2 -0.785788 -0.638530 -0.607706 -0.613452 -0.701420 -0.583315 ... 0.663671 0.649068 0.741015 0.847084 0.718821 0.786994
3 -0.736677 -0.728837 -0.665220 -0.613386 -0.596789 -0.614878 ... 0.722638 0.587891 0.658215 0.668980 0.794392 0.835687
4 -0.697293 -0.731756 -0.627547 -0.653920 -0.641218 -0.679153 ... 0.618696 0.740690 0.737382 0.679931 0.706449 0.919852
[5 rows x 484 columns]
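If the number of rows is not a multiple of n, the last chunk is shorter than the others. The following is a minimal sketch of the same np.split / pd.concat approach wrapped in a function and run on made-up data (25 values, n = 10), to show that pandas pads the short last column with NaN. The function name split_into_columns and the demo values are assumptions for illustration only; they are not part of the original data.

```python
import numpy as np
import pandas as pd


def split_into_columns(values, n):
    """Split a 1-D sequence into columns of at most n rows each (hypothetical helper)."""
    values = np.asarray(values)
    # Split points n, 2n, ... give chunks of n items; the last chunk may be shorter.
    chunks = np.split(values, range(n, len(values), n))
    # Series of different lengths are aligned on their index when concatenated
    # along axis=1, so a short last chunk is padded with NaN.
    return pd.concat(
        [pd.Series(chunk, name=f"y{i}") for i, chunk in enumerate(chunks)],
        axis=1,
    )


# Made-up data: 25 values split into chunks of 10 gives columns of 10, 10 and 5 values.
demo = split_into_columns(np.arange(25, dtype=float), n=10)
print(demo)
```

In this demo the result has 10 rows and 3 columns (y0, y1, y2), and the last 5 entries of y2 are NaN.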
To save the result to a CSV file, use DataFrame.to_csv():
>>> first_column_df_split.to_csv("first_column_split.csv")
Adapted from here.
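As a side note on the error in the question: the traceback shows arr.reshape(arr, (10, -1)), which passes the array itself as part of the shape. The method form takes only the shape (arr.reshape(10, -1)), while the function form is np.reshape(arr, (10, -1)); mixing the two would explain the TypeError. Even with the call fixed, reshape only works when the length is a multiple of 10, and reshape(10, -1) gives 10 rows filled row by row rather than consecutive chunks of 10. Below is a rough NumPy-only sketch, assuming the column is numeric: it pads the data with NaN up to a multiple of n and then reshapes so that each column holds one chunk. The name reshape_into_columns and the sample values are hypothetical.

```python
import numpy as np


def reshape_into_columns(values, n):
    """Pad a numeric 1-D sequence with NaN and reshape it into columns of n rows (hypothetical helper)."""
    # ravel() also flattens the (N, 1) array obtained from a one-column DataFrame.
    values = np.asarray(values, dtype=float).ravel()
    # Number of NaN values needed to make the length a multiple of n.
    pad = (-len(values)) % n
    padded = np.concatenate([values, np.full(pad, np.nan)])
    # Each row of reshape(-1, n) is one chunk of n values; transpose so chunks become columns.
    return padded.reshape(-1, n).T


# Made-up data: 25 values with n = 10 are padded to 30 and reshaped to 10 rows x 3 columns.
result = reshape_into_columns(np.arange(25), n=10)
print(result.shape)
```

The resulting array can be wrapped in pd.DataFrame(result) and written out with to_csv() in the same way as above.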