I have a pandas DataFrame like this, with 200k to 400k rows:
Index value
1 a
2
3 v
4
5
6 6077
7
8 h
and I want the rows below each string in the value column to be filled with that string, where the strings to keep are identified by their length (in this table the strings are 1 character long). I want my DataFrame to look like this:
Index value
1 a
2 a
3 v
4 v
5 v
6 v
7 v
8 h
If you need to repeat strings of length 1, you can use Series.str.match with the regex [a-zA-Z]{1} to check for strings of length 1, replace the non-matching values with NaN using Series.where, and finally forward fill the missing values with ffill:
df['value'] = df['value'].where(df['value'].str.match('^[a-zA-Z]{1}$', na=False)).ffill()
print (df)
Index value
0 1 a
1 2 a
2 3 v
3 4 v
4 5 v
5 6 v
6 7 v
7 8 h
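For anyone who wants to try this end to end, here is a minimal self-contained sketch. The frame is a hypothetical reconstruction of the question's data: it assumes the blank cells are NaN and that 6077 is stored as the string '6077', neither of which the question states.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's data (assumptions:
# blank cells are NaN, and 6077 is stored as the string '6077').
df = pd.DataFrame({
    'Index': range(1, 9),
    'value': ['a', np.nan, 'v', np.nan, np.nan, '6077', np.nan, 'h'],
})

# True only for cells that are exactly one ASCII letter; NaN counts as False.
is_single_letter = df['value'].str.match(r'^[a-zA-Z]$', na=False)

# Turn every other cell into NaN, then carry the last kept letter downward.
df['value'] = df['value'].where(is_single_letter).ffill()

print(df['value'].tolist())
```

If the blanks in the real data are empty strings rather than NaN, the same mask still applies, because an empty string does not match the pattern either.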
Another idea:
m1 = df['value'].str.len().eq(1)
m2 = df['value'].str.isalpha()
df['value'] = df['value'].where(m1 & m2).ffill()
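One difference between the two masks is worth checking on your own data: str.isalpha follows Python's Unicode-aware definition of a letter, while the regex [a-zA-Z] only accepts ASCII letters. A small sketch on a made-up Series (the sample values are purely illustrative):

```python
import pandas as pd

# Made-up sample values, chosen only to show where the two masks differ.
s = pd.Series(['a', 'é', '7', 'ab'])

# Regex mask: exactly one ASCII letter.
regex_mask = s.str.match(r'^[a-zA-Z]$')

# Length + isalpha mask: exactly one character that Python considers a letter.
alpha_mask = s.str.len().eq(1) & s.str.isalpha()

print(regex_mask.tolist())  # [True, False, False, False]
print(alpha_mask.tolist())  # [True, True, False, False]
```

If the column can only contain ASCII letters, both masks should select the same rows.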
Forward filling is meant for exactly this, provided the cells to be overwritten are already NaN. fillna(method='ffill') is deprecated in recent pandas versions; the equivalent call is:
df = df.ffill()
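To illustrate what a plain forward fill does and does not change, here is a small sketch on a made-up Series. It assumes a mix of NaN, an empty string, and a numeric string, which may or may not match how the real column is stored.

```python
import numpy as np
import pandas as pd

# Made-up values: two real NaN cells, one empty string, one numeric string.
s = pd.Series(['a', np.nan, 'v', '', '6077', np.nan, 'h'])

# ffill only replaces NaN; '' and '6077' are ordinary values and are kept,
# and '6077' is even propagated into the NaN cell below it.
filled = s.ffill()

print(filled.tolist())  # ['a', 'a', 'v', '', '6077', '6077', 'h']
```

So the empty strings and numbers have to be converted to NaN before a forward fill gives the result asked for in the question.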
Try this:
import numpy as np
import pandas as pd
df['value'].replace(r'\d+', np.nan, regex=True).ffill()
0 a
1 a
2 v
3 v
4 v
5 v
6 v
7 h
Name: value, dtype: object
Once you have removed all the numbers, do this:
import numpy as np
df.loc[df['value'] == '', 'value'] = np.nan
df = df.ffill()
Assuming that any value that is not an empty string or a number should be forward filled, the regular expression r'^\d*$' will match either an empty string or a number. These values can be replaced by np.nan, and then ffill can be called:
import numpy as np
# Assign the result back; inplace=True on a selected column may not update df
df['value'] = df['value'].replace(r'^\d*$', np.nan, regex=True)
df['value'] = df['value'].ffill()
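A self-contained sketch of this regex approach follows. The sample frame is an assumption: it stores the blanks as empty strings and 6077 as the string '6077', which the question does not confirm.

```python
import numpy as np
import pandas as pd

# Hypothetical data: blanks are empty strings and 6077 is the string '6077'.
df = pd.DataFrame({'value': ['a', '', 'v', '', '', '6077', '', 'h']})

# r'^\d*$' matches a cell made only of digits, including the empty string;
# every matching cell becomes NaN.
cleaned = df['value'].replace(r'^\d*$', np.nan, regex=True)

# Carry the last remaining string downward over the NaN cells.
df['value'] = cleaned.ffill()

print(df['value'].tolist())
```

Unlike the single-letter mask, this keeps any non-numeric string regardless of length, so which approach fits depends on what else can appear in the column.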