I have a dataframe:
name col1
satya 12
satya abc
satya 109.12
alex apple
alex 1000
Now I need to display the rows where column 'col1' holds an int value. The output should look like:
name col1
satya 12
alex 1000
If I search for string values instead:
name col1
satya abc
alex apple
And likewise for other types. Please suggest some code (maybe using a regex).
Let's start with a simple regex that will evaluate to True
if you have an integer and False
otherwise:
import re
regexp = re.compile('^-?[0-9]+$')
bool(regexp.match('1000'))
True
bool(regexp.match('abc'))
False
Once you have such a regex you can proceed as follows:
mask = df['col1'].map(lambda x: bool(regexp.match(x)) )
df.loc[mask]
name col1
0 satya 12
4 alex 1000
To search for strings you'll do:
regexp_str = re.compile('^[a-zA-Z]+$')
mask_str = df['col1'].map(lambda x: bool(regexp_str.match(x)))
df.loc[mask_str]
name col1
1 satya abc
3 alex apple
EDIT
The above code would work if dataframe were created by:
df = pd.read_clipboard()
(or, alternatively, all variables were supplied as strings).
Whether the regex approach works depends on how the df
was created. E.g., if it were created with:
df = pd.DataFrame({'name': ['satya','satya','satya', 'alex', 'alex'],
'col1': [12,'abc',109.12,'apple',1000] },
columns=['name','col1'])
the above code would fail with TypeError: expected string or bytes-like object.
To make it work in any case, one would need to explicitly coerce the type to str:
mask = df['col1'].astype('str').map(lambda x: bool(regexp.match(x)) )
df.loc[mask]
name col1
0 satya 12
4 alex 1000
and the same for strings:
regexp_str = re.compile('^[a-zA-Z]+$')
mask_str = df['col1'].astype('str').map(lambda x: bool(regexp_str.match(x)))
df.loc[mask_str]
name col1
1 satya abc
3 alex apple
EDIT2
To find a float:
regexp_float = re.compile(r'^[-+]?[0-9]*(\.[0-9]+)$')
mask_float = df['col1'].astype('str').map(lambda x: bool(regexp_float.match(x)))
df.loc[mask_float]
name col1
2 satya 109.12
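The three regex filters above can also be written without an explicit lambda. The following is only a sketch, assuming a pandas version that provides Series.str.fullmatch (which anchors the pattern at both ends, so ^ and $ are not needed); the names col, int_rows, float_rows and str_rows are made up for illustration.

```python
import pandas as pd

# Mixed-type column, as in the second construction shown above.
df = pd.DataFrame({
    "name": ["satya", "satya", "satya", "alex", "alex"],
    "col1": [12, "abc", 109.12, "apple", 1000],
})

# Coerce everything to str once so the regexes never see a non-string.
col = df["col1"].astype(str)

# fullmatch requires the whole value to match the pattern.
int_rows = df[col.str.fullmatch(r"-?[0-9]+")]
float_rows = df[col.str.fullmatch(r"[-+]?[0-9]*\.[0-9]+")]
str_rows = df[col.str.fullmatch(r"[a-zA-Z]+")]

print(int_rows)
print(float_rows)
print(str_rows)
```

The patterns are the same ones used above, so they share the same limits: a value such as 1e5 or one with surrounding spaces matches none of the three.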
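In pandas you would do something like this:
mask = df.col1.apply(lambda x: type(x) == int)
print(df[mask])
This yields your expected output as long as col1 holds Python objects (int, float, str) rather than values that were all read in as strings.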
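A runnable sketch of the type-check idea, assuming the dataframe is built from a mixed Python list. It swaps type(x) == int for isinstance with numbers.Integral so that NumPy integer scalars are accepted too, and excludes bool explicitly because bool is a subclass of int. The names is_int and is_str are illustrative only.

```python
import numbers

import pandas as pd

df = pd.DataFrame({
    "name": ["satya", "satya", "satya", "alex", "alex"],
    "col1": [12, "abc", 109.12, "apple", 1000],
})

# True for Python/NumPy integers, but not for booleans.
is_int = df["col1"].map(
    lambda x: isinstance(x, numbers.Integral) and not isinstance(x, bool)
)
# True for values stored as Python strings.
is_str = df["col1"].map(lambda x: isinstance(x, str))

print(df[is_int])
print(df[is_str])
```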
You can check whether the value contains only digits:
In [104]: df
Out[104]:
name col1
0 satya 12
1 satya abc
2 satya 109.12
3 alex apple
4 alex 1000
Integers:
In [105]: df[~df.col1.str.contains(r'\D')]
Out[105]:
name col1
0 satya 12
4 alex 1000
Non-integers:
In [106]: df[df.col1.str.contains(r'\D')]
Out[106]:
name col1
1 satya abc
2 satya 109.12
3 alex apple
If you want to filter all numeric values (integers/floats/decimals), you can use pd.to_numeric(..., errors='coerce'):
In [75]: df
Out[75]:
name col1
0 satya 12
1 satya abc
2 satya 109.12
3 alex apple
4 alex 1000
In [76]: df[pd.to_numeric(df.col1, errors='coerce').notnull()]
Out[76]:
name col1
0 satya 12
2 satya 109.12
4 alex 1000
In [77]: df[pd.to_numeric(df.col1, errors='coerce').isnull()]
Out[77]:
name col1
1 satya abc
3 alex apple
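pd.to_numeric alone does not separate integers from floats. One possible way to split them, sketched here under the assumption that "integer" means "numeric with no fractional part", is to test the remainder modulo 1. The names numeric, is_number and is_whole are illustrative.

```python
import pandas as pd

# All values stored as strings, as when the frame is read from text.
df = pd.DataFrame({
    "name": ["satya", "satya", "satya", "alex", "alex"],
    "col1": ["12", "abc", "109.12", "apple", "1000"],
})

# Non-numeric strings become NaN instead of raising.
numeric = pd.to_numeric(df["col1"], errors="coerce")

is_number = numeric.notna()
# NaN % 1 is NaN, and NaN == 0 is False, so non-numbers drop out here too.
is_whole = is_number & (numeric % 1 == 0)

int_rows = df[is_whole]
float_rows = df[is_number & ~is_whole]
str_rows = df[~is_number]

print(int_rows)
print(float_rows)
print(str_rows)
```

Note that under this definition a value written as 12.0 would count as an integer, which may or may not be what you want.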
def is_integer(element):
    try:
        int(element)  # raises ValueError if element is a non-numeric string
        return 1
    except (ValueError, TypeError):
        return 0
With that helper, you can define functions like the ones below and collect the matching items in a for loop.
def list_str(list_of_data):
    str_list = []
    for item in list_of_data:  # list_of_data = [[names],[col1s]] if just col1s replace item[2] with item[1]
        if not is_integer(item[2]):
            str_list.append(item)
    return str_list
def list_int(list_of_data):
    int_list = []
    for item in list_of_data:
        if is_integer(item[2]):
            int_list.append(item)
    return int_list
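The same try/except idea can be applied directly to the dataframe column with Series.map. This is a sketch only: parses_as_int is a hypothetical variant of is_integer that converts the value to str first, because int(109.12) silently truncates a float to 109 instead of raising.

```python
import pandas as pd


def parses_as_int(value):
    """Return True if the text form of value is a valid integer literal."""
    try:
        int(str(value))  # int("109.12") and int("abc") both raise ValueError
    except ValueError:
        return False
    return True


df = pd.DataFrame({
    "name": ["satya", "satya", "satya", "alex", "alex"],
    "col1": [12, "abc", 109.12, "apple", 1000],
})

mask = df["col1"].map(parses_as_int)
int_rows = df[mask]
other_rows = df[~mask]

print(int_rows)
print(other_rows)
```

Be aware that int() is fairly lenient with strings: it also accepts surrounding whitespace and a leading sign.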
Hope this can help you
You can use df.applymap(np.isreal)
df = pd.DataFrame({'col1': [12,'abc',109.12,'apple',1000], 'name': ['satya','satya','satya', 'alex', 'alex']})
df
col1 name
0 12 satya
1 abc satya
2 109.12 satya
3 apple alex
4 1000 alex
df2 = df[df.applymap(np.isreal)]
df2
col1 name
0 12 NaN
1 NaN NaN
2 109.12 NaN
3 NaN NaN
4 1000 NaN
df2 = df2[df2.col1.notnull()]
df2
col1 name
0 12 NaN
2 109.12 NaN
4 1000 NaN
index_list = df2.index.tolist()
index_list
[0, 2, 4]
df = df.loc[index_list]  # index_list holds index labels, so select with .loc
df
col1 name
0 12 satya
2 109.12 satya
4 1000 alex