I am working on a dataframe of shape 146 rows x 48 columns. The columns are
['Region','Rank 2015','Score 2015','Economy 2015','Family 2015','Health 2015','Freedom 2015','Generosity 2015','Trust 2015','Rank 2016','Score 2016','Economy 2016','Family 2016','Health 2016','Freedom 2016','Generosity 2016','Trust 2016','Rank 2017','Score 2017','Economy 2017','Family 2017','Health 2017','Freedom 2017','Generosity 2017','Trust 2017','Rank 2018','Score 2018','Economy 2018','Family 2018','Health 2018','Freedom 2018','Generosity 2018','Trust 2018','Rank 2019','Score 2019','Economy 2019','Family 2019','Health 2019','Freedom 2019','Generosity 2019','Trust 2019','Score Mean','Economy Mean','Family Mean','Health Mean','Freedom Mean','Generosity Mean','Trust Mean']
I want to access a particular row and convert it to the following dataframe:
   Year  Rank  Score  Family  Health  Freedom  Generosity  Trust
0  2015   NaN    NaN     NaN     NaN      NaN         NaN    NaN
1  2016   NaN    NaN     NaN     NaN      NaN         NaN    NaN
2  2017   NaN    NaN     NaN     NaN      NaN         NaN    NaN
3  2018   NaN    NaN     NaN     NaN      NaN         NaN    NaN
4  2019   NaN    NaN     NaN     NaN      NaN         NaN    NaN
Any help is welcome. Thank you in advance.
An alternate way:
cols=['Region','Rank 2015','Score 2015','Economy 2015','Family 2015','Health 2015','Freedom 2015','Generosity 2015', 'Trust 2015','Rank 2016','Score 2016','Economy 2016','Family 2016','Health 2016','Freedom 2016','Generosity 2016','Trust 2016', 'Rank 2017','Score 2017','Economy 2017','Family 2017','Health 2017','Freedom 2017','Generosity 2017','Trust 2017','Rank 2018','Score 2018','Economy 2018','Family 2018','Health 2018','Freedom 2018','Generosity 2018','Trust 2018','Rank 2019','Score 2019','Economy 2019','Family 2019','Health 2019','Freedom 2019','Generosity 2019','Trust 2019','Score Mean','Economy Mean','Family Mean','Health Mean','Freedom Mean','Generosity Mean','Trust Mean']
import pandas as pd

# source dataframe: a single row filled with 1s
df1 = pd.DataFrame(columns=cols)
df1.loc[0] = [1] * len(cols)

# target dataframe, indexed by year (plus a 'Mean' row)
df2 = pd.DataFrame(columns=['Year','Rank','Score','Family','Health','Freedom','Generosity','Trust','Economy'])
df2['Year'] = ['2015','2016','2017','2018','2019','Mean']
df2.set_index('Year', inplace=True)

idx = 0  # source row to copy
for col in df1.columns[1:]:  # skip 'Region'
    # 'Score 2015' -> c='Score', r='2015'
    c, r = col.split(" ")
    df2.at[r, c] = df1.at[idx, col]
print(df2)
     Rank Score Family Health Freedom Generosity Trust Economy
Year
2015    1     1      1      1       1          1     1       1
2016    1     1      1      1       1          1     1       1
2017    1     1      1      1       1          1     1       1
2018    1     1      1      1       1          1     1       1
2019    1     1      1      1       1          1     1       1
Mean  NaN     1      1      1       1          1     1       1
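The same reshape can probably be done without an explicit loop by turning the column labels into a two-level index and unstacking. This is only a sketch under assumptions: the sample row is made up (the values 1 to 47 are placeholders), and it relies on every column except 'Region' having the form "<metric> <year-or-Mean>".

```python
import pandas as pd

metrics = ['Rank', 'Score', 'Economy', 'Family', 'Health', 'Freedom', 'Generosity', 'Trust']
years = ['2015', '2016', '2017', '2018', '2019']

# Rebuild the 48 column names: Region, 8 metrics x 5 years, 7 means (no 'Rank Mean')
cols = (['Region']
        + [f'{m} {y}' for y in years for m in metrics]
        + [f'{m} Mean' for m in metrics[1:]])

# Made-up single-row frame; the numbers are placeholders, not real data
df1 = pd.DataFrame([['Somewhere'] + list(range(1, 48))], columns=cols)

# Take one row, drop the non-numeric column, and split each label into (metric, year)
row = df1.drop(columns='Region').iloc[0]
row.index = pd.MultiIndex.from_tuples(
    [tuple(label.split(' ')) for label in row.index],
    names=['Metric', 'Year'],
)

# Move the metric level into the columns; missing pairs such as ('Rank', 'Mean') become NaN
wide = row.unstack('Metric').reindex(columns=metrics)
print(wide)
```

The `reindex` call only restores the original metric order, since `unstack` sorts the new column labels alphabetically.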
Here's a solution utilizing list comprehension:
The input:
import numpy as np
import pandas as pd
cols = ['Region','Rank 2015','Score 2015','Economy 2015','Family 2015','Health 2015','Freedom 2015','Generosity 2015','Trust 2015','Rank 2016','Score 2016','Economy 2016','Family 2016','Health 2016','Freedom 2016','Generosity 2016','Trust 2016','Rank 2017','Score 2017','Economy 2017','Family 2017','Health 2017','Freedom 2017','Generosity 2017','Trust 2017','Rank 2018','Score 2018','Economy 2018','Family 2018','Health 2018','Freedom 2018','Generosity 2018','Trust 2018','Rank 2019','Score 2019','Economy 2019','Family 2019','Health 2019','Freedom 2019','Generosity 2019','Trust 2019','Score Mean','Economy Mean','Family Mean','Health Mean','Freedom Mean','Generosity Mean','Trust Mean']
df = pd.DataFrame(np.random.randint(1,10,(3,48)))
df.columns = cols
print(df.iloc[:, :4])
   Region  Rank 2015  Score 2015  Economy 2015
0       7          9           9             9
1       8          7           2             3
2       3          3           4             5
And the new dataframe would be:
target_cols = ['Rank', 'Score', 'Family', 'Health', 'Freedom', 'Generosity', 'Trust']
years = ['2015', '2016', '2017', '2018', '2019']
newdf = pd.DataFrame([df.loc[1, [x + ' ' + year for x in target_cols]].values for year in years])
newdf.columns = target_cols
newdf['year'] = years
print(newdf)
   Rank  Score  Family  Health  Freedom  Generosity  Trust  year
0     7      2       6       9        3           4      9  2015
1     2      8       1       1        7           6      1  2016
2     7      4       2       5        1           7      4  2017
3     9      7       1       4        7           5      2  2018
4     5      4       4       9        1           6      2  2019
This assumes that the only target years are 2015 through 2019 and that the target columns are known.
I would proceed as follows:
(1) define the target columns and years
target_columns = ['Rank', 'Score', 'Family', 'Health', 'Freedom', 'Generosity', 'Trust']
target_years = ['2015', '2016', '2017', '2018', '2019']
(2) retrieve the particular row; I assume your starting dataframe is named initial_dataframe
particular_row = initial_dataframe.iloc[0]
(3) retrieve and reshape the information from the particular_row
reshaped_row = {'Year': target_years}
reshaped_row.update({
    column_name: [particular_row[column_name + ' ' + year_name] for year_name in target_years]
    for column_name in target_columns
})
(4) assign the reshaped row to the output_dataframe
output_dataframe = pd.DataFrame(reshaped_row)
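Steps (1) to (4) can be put together into one runnable script. This is a minimal sketch: the one-row `initial_dataframe` built here is a made-up stand-in (its values are placeholders chosen so each cell is easy to trace), not the asker's data.

```python
import pandas as pd

# (1) target columns and years
target_columns = ['Rank', 'Score', 'Family', 'Health', 'Freedom', 'Generosity', 'Trust']
target_years = ['2015', '2016', '2017', '2018', '2019']

# Hypothetical stand-in for initial_dataframe: cell value = 10 * year position + column position
sample = {'Region': 'Somewhere'}
for i, year in enumerate(target_years):
    for j, column in enumerate(target_columns):
        sample[f'{column} {year}'] = 10 * i + j
initial_dataframe = pd.DataFrame([sample])

# (2) retrieve the particular row
particular_row = initial_dataframe.iloc[0]

# (3) reshape: one list of per-year values for each target column
reshaped_row = {'Year': target_years}
reshaped_row.update({
    column_name: [particular_row[f'{column_name} {year_name}'] for year_name in target_years]
    for column_name in target_columns
})

# (4) build the output dataframe
output_dataframe = pd.DataFrame(reshaped_row)
print(output_dataframe)
```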
Have you tried using a 2D array? I would find that to be the easiest. Otherwise, you could also use a dictionary. https://www.w3schools.com/python/python_dictionaries.asp
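One way to read the 2D-array suggestion is to reshape the 40 per-year values of a row into a 5 x 8 array. The sketch below assumes the column order from the question (Region first, then eight metrics per year, then the means), so it would break if the columns were ordered differently; the row values are placeholders.

```python
import numpy as np
import pandas as pd

metrics = ['Rank', 'Score', 'Economy', 'Family', 'Health', 'Freedom', 'Generosity', 'Trust']
years = ['2015', '2016', '2017', '2018', '2019']
cols = (['Region']
        + [f'{m} {y}' for y in years for m in metrics]
        + [f'{m} Mean' for m in metrics[1:]])

# Placeholder frame with one row holding 0..47
df = pd.DataFrame([np.arange(48)], columns=cols)

# Positions 1..40 hold the per-year values; reshape them to (years, metrics)
n_values = len(years) * len(metrics)
block = df.iloc[0, 1:1 + n_values].to_numpy().reshape(len(years), len(metrics))

table = pd.DataFrame(block, columns=metrics, index=pd.Index(years, name='Year'))
print(table)
```

Because this relies purely on position, the label-based approaches in the other answers are likely safer when the column order is not guaranteed.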
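I didn't fully understand your question, but here is a hint on how to reshape the data.
# li is assumed to be the list of column names
df = pd.DataFrame(li)
# split each name around its 4-digit year: 'Rank 2015' -> 'Rank ', '2015', ''
df = df[0].str.split(r"(\d{4})", expand=True)
# keep only the names that contained a year
df = df[df[2]==""]
col_name = df[0].unique()
df_new = df.pivot(index=1, columns=0, values=2)
df_new.drop(df_new.index[0], inplace=True)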
df_new:
Economy Family Freedom Generosity Health Rank Score Trust
1
2016
2017
2018
2019
You can write your own logic.
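A variation on the same split-and-pivot idea that also carries the row's values along could look like this. It is a sketch only: the `row` Series is a hypothetical, shortened example (two metrics, two years), and `str.extract` with named groups is swapped in for `str.split`.

```python
import pandas as pd

# Hypothetical, shortened row; the real one would have all 48 entries
row = pd.Series({
    'Region': 'Somewhere',
    'Rank 2015': 3, 'Score 2015': 7.5,
    'Rank 2016': 4, 'Score 2016': 7.4,
    'Score Mean': 7.45,
})

# Long format: one line per original column, with its label and value
long = row.drop('Region').rename_axis('column').reset_index(name='value')

# Split '<metric> <4-digit year>' labels; labels without a year (e.g. 'Score Mean') give NaN
parts = long['column'].str.extract(r'^(?P<Metric>\w+) (?P<Year>\d{4})$')
long = pd.concat([parts, long['value']], axis=1).dropna(subset=['Year'])

# One row per year, one column per metric
df_new = long.pivot(index='Year', columns='Metric', values='value')
print(df_new)
```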
This needs some manipulation. A simple idea is to build the required dict first
and then create the dataframe from it (text is assumed to be the list of column names).
In [61]: dicts = {}
In [62]: for t in text[1:]:
    ...:     n, y = t.split(" ")
    ...:     if n not in dicts:
    ...:         dicts[n] = []
    ...:     if y != "Mean":
    ...:         if n == 'Rank':
    ...:             dicts[n].append(y)
    ...:         else:
    ...:             dicts[n].append(float("nan"))
    ...:
In [63]: df = pd.DataFrame(dicts)
In [64]: df['Year'] = df['Rank']
In [65]: df['Rank'] = df['Family']
In [66]: df
Out[66]:
   Rank  Score  Economy  Family  Health  Freedom  Generosity  Trust  Year
0   NaN    NaN      NaN     NaN     NaN      NaN         NaN    NaN  2015
1   NaN    NaN      NaN     NaN     NaN      NaN         NaN    NaN  2016
2   NaN    NaN      NaN     NaN     NaN      NaN         NaN    NaN  2017
3   NaN    NaN      NaN     NaN     NaN      NaN         NaN    NaN  2018
4   NaN    NaN      NaN     NaN     NaN      NaN         NaN    NaN  2019