I'm an environmental geologist and I'm just learning Python/Pandas. I have a dataframe of analytical data in Pandas similar to the example below:
I want to remove only the plain numbers from Total_dl, leaving the detection limits (values prefixed with <). This would be the final dataframe I'm looking for:
Since the column holds strings, I'm not sure how to parse it. Any help would be appreciated.
Thanks
The following should do the trick:
import numpy as np
# Assumes Total_dl is numeric; the comparison will not work on strings
mask = df.Total_dl < 1.
df.loc[mask, 'Total_dl'] = np.nan
If Total_dl is of type string, you can try the following:
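import numpy as np
# True where the value is a detection limit (starts with '<')
is_limit = df['Total_dl'].str.startswith('<', na=False)
# Replace everything that is not a detection limit with NaN
df.loc[~is_limit, 'Total_dl'] = np.nan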
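If you are not sure which of the two cases applies (numeric or string), one option is to let pandas try to parse each value as a number. This is only a sketch on made-up sample data: the three rows below are assumptions, not the asker's real data. pd.to_numeric with errors="coerce" returns NaN for anything that does not parse, such as '<0.021', so whatever does parse can be treated as a plain number.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a float, a detection limit and a numeric string
df = pd.DataFrame({
    "SampleID": ["A-1-0'", "A-1-0.5'", "A-2-0'"],
    "Total_dl": [2.5, "<0.021", "0.3"],
})

# Values that parse as numbers are plain results;
# '<0.021' fails to parse and is coerced to NaN
is_plain_number = pd.to_numeric(df["Total_dl"], errors="coerce").notna()

# Blank out the plain numbers, leaving the detection limits untouched
df.loc[is_plain_number, "Total_dl"] = np.nan
```

Note that this keeps any value that is not a number, not just those starting with '<', so it may or may not be what you want if the column holds other text.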
One way to do it. Not sure how good a solution it is:
import numpy as np
df['Total_dl'] = df['Total_dl'].apply(lambda o: o if '<' in str(o) else np.nan)
Using a function that does the same instead:
>>> df
SampleID Total_dl
0 A-1-0' 2.5
1 A-1-0.5' <0.021
>>> df.dtypes
SampleID object
Total_dl object
dtype: object
>>> def foo(o):
...     if '<' in str(o):
...         return o
...     else:
...         return np.nan
...
>>> df['Total_dl'] = df['Total_dl'].apply(foo)
>>> df
SampleID Total_dl
0 A-1-0' NaN
1 A-1-0.5' <0.021
>>>
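The same result can probably be had without apply. A minimal sketch, assuming the two-row frame from the session above with both values stored as strings: Series.where keeps the values where the condition is True and puts NaN everywhere else. The name keep is just illustrative.

```python
import pandas as pd

# Same two rows as in the session above, assumed to be stored as strings
df = pd.DataFrame({
    "SampleID": ["A-1-0'", "A-1-0.5'"],
    "Total_dl": ["2.5", "<0.021"],
})

# True where the value contains a literal '<' (regex=False: plain substring match)
keep = df["Total_dl"].astype(str).str.contains("<", regex=False)

# where() keeps values where the mask is True and fills the rest with NaN
df["Total_dl"] = df["Total_dl"].where(keep)
```

On a large column this tends to be faster than apply, since the string test runs over the whole Series at once.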
Say your data frame is called df; then this will do the trick:
import numpy as np
# True for rows whose value does not contain '<' (i.e. plain numbers)
nan_condition = ~df["Total_dl"].astype(str).str.contains("<")
df.loc[nan_condition, "Total_dl"] = np.nan
You can use this:
data = data.loc[data[column] > x]
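Keep in mind that this pattern filters rows rather than blanking values. A rough sketch of how it might look for the detection-limit case, using made-up sample data (the rows and the names is_limit and limits_only are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data modelled on the question
data = pd.DataFrame({
    "SampleID": ["A-1-0'", "A-1-0.5'", "A-2-0'"],
    "Total_dl": ["2.5", "<0.021", "<0.5"],
})

# True where the value is a detection limit
is_limit = data["Total_dl"].astype(str).str.startswith("<")

# Boolean indexing keeps only the matching rows; the others are dropped,
# not replaced with NaN
limits_only = data.loc[is_limit]
```

If the other rows need to stay in the frame with NaN in Total_dl, as in the question, assigning through a mask is the better fit.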