Python : Remove all data in a column of a dataframe and keep the last value in the first row

Question

Let's say that I have a simple Dataframe.

import pandas as pd

data1 = [12,34,'fsdf',678,'','','dfs','','']
df1 = pd.DataFrame(data1, columns= ['Data'])
print(df1)

   Data
0    12
1    34
2  fsdf
3   678
4
5
6   dfs
7
8

I want to delete all the data in the column except the last value found, which I want to keep in the first row. The column can have thousands of rows. For the sample above, the result should have 'dfs' in row 0 and every other row empty.

I also have to keep the shape of the dataframe, so no rows may be removed.

What are the simplest functions to do that efficiently?

Thank you

Answer 1

Get the index of the last value that is not an empty string, then assign that value to the first row of the column:

s = df1.loc[df1['Data'].iloc[::-1].ne('').idxmax(), 'Data']
print (s)
dfs

df1['Data'] = ''
df1.loc[0, 'Data'] = s
print (df1)
  Data
0  dfs
1     
2     
3     
4     
5     
6     
7     
8
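
To see why the reversed idxmax lookup lands on the right row, the one-liner can be split into steps. This is only a sketch on the question's sample data; the intermediate names (reversed_col, mask, last_label) are made up for illustration, and the mask.any() guard is an addition that the answer itself does not include.

```python
import pandas as pd

df1 = pd.DataFrame({'Data': [12, 34, 'fsdf', 678, '', '', 'dfs', '', '']})

# Reverse the column so the last row comes first; labels travel with the values.
reversed_col = df1['Data'].iloc[::-1]

# True where the cell is not an empty string.
mask = reversed_col.ne('')

# idxmax returns the label of the first True in reversed order,
# which is the last non-empty row of the original column.
last_label = mask.idxmax()

# Added guard (an assumption, not part of the answer): if nothing is True,
# idxmax still returns a label, so check mask.any() before trusting it.
s = df1.loc[last_label, 'Data'] if mask.any() else ''

df1['Data'] = ''
df1.loc[0, 'Data'] = s
print(df1)
```

Without the guard, a column that contains only empty strings would still give a label back, and the value picked up would simply be another empty string.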

If the empty cells are missing values (NaN) rather than empty strings:

import numpy as np

data1 = [12,34,'fsdf',678,np.nan,np.nan,'dfs',np.nan,np.nan]
df1 = pd.DataFrame(data1, columns= ['Data'])
print(df1)
   Data
0    12
1    34
2  fsdf
3   678
4   NaN
5   NaN
6   dfs
7   NaN
8   NaN

s = df1.loc[df1['Data'].iloc[::-1].notna().idxmax(), 'Data']
print (s)
dfs

df1['Data'] = ''
df1.loc[0, 'Data'] = s
print (df1)
  Data
0  dfs
1     
2     
3     
4     
5     
6     
7     
8
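
If a column might mix empty strings and NaN, both checks can be combined. The helper below is a hypothetical sketch, not part of the answer: the name keep_last_in_first_row, the blank parameter, and the choice to return a copy are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def keep_last_in_first_row(df, col, blank=''):
    """Return a copy of df where `col` holds only its last real value, in row 0."""
    values = df[col]
    # A cell counts as "real" if it is neither NaN nor an empty string.
    valid = values.notna() & values.ne('')
    out = df.copy()
    if len(out) == 0:
        return out
    last = values[valid].iloc[-1] if valid.any() else blank
    # Rebuild the column as object dtype so any kind of value fits in row 0.
    out[col] = pd.Series([last] + [blank] * (len(out) - 1),
                         index=out.index, dtype=object)
    return out

mixed = pd.DataFrame({'Data': [12, 34, 'fsdf', 678, '', np.nan, 'dfs', np.nan, '']})
result = keep_last_in_first_row(mixed, 'Data')
print(result)
```

Returning a copy leaves the input frame untouched, which makes it easy to confirm that the shape is unchanged.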

Answer 2

You can replace '' with NaN using df.replace, then use df.last_valid_index to get the label of the last non-missing row:

import numpy as np

val = df1.loc[df1.replace('', np.nan).last_valid_index(), 'Data']

# Below two lines taken from @jezrael's answer
df1.loc[0, 'Data'] = val
df1.loc[1:, 'Data'] = ''

Or

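You can use np.full with fill_value set to np.nan to rebuild the frame, so the cleared rows hold NaN instead of empty strings.

val = df1.loc[df1.replace("", np.nan).last_valid_index(), "Data"]
# dtype=object lets the column hold the string assigned below
# instead of being a float column of NaN.
df1 = pd.DataFrame(np.full(df1.shape, np.nan, dtype=object),
                   index=df1.index,
                   columns=df1.columns)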

df1.loc[0, "Data"] = val
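
One edge case with last_valid_index: it returns None when the column has no valid value at all, so the following .loc lookup has no row to find. A minimal sketch of a guard follows; the helper name last_value_or_default and its default parameter are hypothetical.

```python
import numpy as np
import pandas as pd

def last_value_or_default(df, col, default=''):
    """Return the last non-empty value of `col`, or `default` if there is none."""
    # Treat empty strings as missing, then ask for the last valid label.
    last = df[col].replace('', np.nan).last_valid_index()
    # last_valid_index returns None when every value is missing.
    return default if last is None else df.loc[last, col]

sample = pd.DataFrame({'Data': [12, 34, 'fsdf', 678, '', '', 'dfs', '', '']})
all_empty = pd.DataFrame({'Data': ['', '', '']})

print(last_value_or_default(sample, 'Data'))
print(repr(last_value_or_default(all_empty, 'Data')))
```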

Answer 3

A simple pandas condition check like this can help:

df1['Data'] = [df1.loc[df1['Data'].ne(""), "Data"].iloc[-1]] + [''] * (len(df1) - 1)
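
Since the question mentions columns with thousands of rows, here is a quick sanity check that the three lookups agree on a longer column and that the shape is preserved. The 10,000-row frame and the names big, a1, a2, a3 are invented for this sketch; it checks agreement only and makes no claim about relative speed.

```python
import numpy as np
import pandas as pd

n = 10_000
data = [''] * n
data[0] = 12
data[n - 3] = 'dfs'  # last non-empty value, followed by two empty rows
big = pd.DataFrame({'Data': data})

# Answer 1: reversed mask + idxmax
a1 = big.loc[big['Data'].iloc[::-1].ne('').idxmax(), 'Data']
# Answer 2: replace '' with NaN + last_valid_index
a2 = big.loc[big.replace('', np.nan).last_valid_index(), 'Data']
# Answer 3: boolean filter + iloc[-1]
a3 = big.loc[big['Data'].ne(''), 'Data'].iloc[-1]

result = big.copy()
result['Data'] = [a3] + [''] * (len(result) - 1)
print(a1, a2, a3, result.shape)
```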


3 answers

solution1
2 ACCEPTED 2020-10-28 13:45:57

solution2
1 2020-10-28 13:49:54

solution3
1 2020-10-28 13:50:14
