I have data like:
In [1]: d = {'ID': [14, 14, 14, 14, 14, 14, 14, 15, 15],
'NAME': ['KWI', 'NED', 'RICK', 'NICH', 'DIONIC', 'RICHARD', 'ROCKY', 'CARLOS', 'SIDARTH'],
'ID_COUNTRY':[1, 2, 3,4,5,6,7,8,9],
'COUNTRY':['MEXICO', 'ITALY', 'CANADA', 'ENGLAND', 'GERMANY', 'UNITED STATES', 'JAPAN', 'SPAIN', 'BRAZIL'],
'ID_CITY':[np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
'CITY':[np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
'STATUS': ['OK', 'OK', 'OK', 'OK', 'OK', 'NOT', 'OK', 'NOT', 'OK']}
df = pd.DataFrame(data=d)
Out[2]:
   ID     NAME  ID_COUNTRY        COUNTRY  ID_CITY  CITY  STATUS
0  14      KWI           1         MEXICO      NaN   NaN      OK
1  14      NED           2          ITALY      NaN   NaN      OK
2  14     RICK           3         CANADA      NaN   NaN      OK
3  14     NICH           4        ENGLAND      NaN   NaN      OK
4  14   DIONIC           5        GERMANY      NaN   NaN      OK
5  14  RICHARD           6  UNITED STATES      NaN   NaN     NOT
6  14    ROCKY           7          JAPAN      NaN   NaN      OK
7  15   CARLOS           8          SPAIN      NaN   NaN     NOT
8  15  SIDARTH           9         BRAZIL      NaN   NaN      OK
Then I need to set the dtype of each column for later use:
df.iloc[:, [0, 2, 4]] = df.iloc[:, [0, 2, 4]].astype("Int64")
df.iloc[:, [1, 3, 5, 6]] = df.iloc[:, [1, 3, 5, 6]].astype("string")
After doing this, I want to drop the columns that are entirely NaN and get the names of the dropped columns, so that I can remove them from another dataframe with the same column names, like this:
In [3]: d1 = {'ID': [14, 14, 14],
'NAME': ['KWI', 'NED', 'RICK'],
'ID_COUNTRY':[1, 2, 3],
'COUNTRY':['MEXICO', 'ITALY', 'CANADA'],
'ID_CITY':[20, 22, 24],
'CITY':['MX', 'AT', 'CA'],
'STATUS': ['OK', 'OK', 'OK']}
df1 = pd.DataFrame(data=d1)
Out [4]:
   ID  NAME  ID_COUNTRY  COUNTRY  ID_CITY  CITY  STATUS
0  14   KWI           1   MEXICO       20    MX      OK
1  14   NED           2    ITALY       22    AT      OK
2  14  RICK           3   CANADA       24    CA      OK
The issue is that df['CITY'].isna() gives me False for all the values in the column. I do not know why, because df['ID_CITY'].isna() gives me True. I guess it is because one column is Int64 and the other is object. Examples:
In [5]: df4['ID_CITY'].isna()
Out[6]:
0 True
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
Name: ID_CITY, dtype: bool
In [7]: df4['CITY'].isna()
Out[8]:
0 False
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
Name: CITY, dtype: bool
After correcting the issue described above, the desired output for df and df1 will be:
Out[9]:
   ID     NAME  ID_COUNTRY        COUNTRY  STATUS
0  14      KWI           1         MEXICO      OK
1  14      NED           2          ITALY      OK
2  14     RICK           3         CANADA      OK
3  14     NICH           4        ENGLAND      OK
4  14   DIONIC           5        GERMANY      OK
5  14  RICHARD           6  UNITED STATES     NOT
6  14    ROCKY           7          JAPAN      OK
7  15   CARLOS           8          SPAIN     NOT
8  15  SIDARTH           9         BRAZIL      OK
Out [10]:
   ID  NAME  ID_COUNTRY  COUNTRY  STATUS
0  14   KWI           1   MEXICO      OK
1  14   NED           2    ITALY      OK
2  14  RICK           3   CANADA      OK
Thanks for reading.
Assuming your input is the following (for clarity, I use column names instead of column indices):
import numpy as np
import pandas as pd

d = {'ID': [14, 14, 14, 14, 14, 14, 14, 15, 15],
'NAME': ['KWI', 'NED', 'RICK', 'NICH', 'DIONIC', 'RICHARD', 'ROCKY', 'CARLOS', 'SIDARTH'],
'ID_COUNTRY':[1, 2, 3,4,5,6,7,8,9],
'COUNTRY':['MEXICO', 'ITALY', 'CANADA', 'ENGLAND', 'GERMANY', 'UNITED STATES', 'JAPAN', 'SPAIN', 'BRAZIL'],
'ID_CITY':[np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
'CITY':[np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan],
'STATUS': ['OK', 'OK', 'OK', 'OK', 'OK', 'NOT', 'OK', 'NOT', 'OK']}
df = pd.DataFrame(data=d)
You can cast a pandas object to a specified dtype with astype. For that, you can use "Int64" and "str" (instead of "string" as in your code) [see the link].
df[['ID', 'ID_COUNTRY', 'ID_CITY']] = df[['ID', 'ID_COUNTRY', 'ID_CITY']].astype("Int64")
df[['NAME', 'COUNTRY', 'CITY', 'STATUS']] = df[['NAME', 'COUNTRY', 'CITY', 'STATUS']].astype("str")
After the cast to "str", the missing values in CITY are stored as the text nan, which isna() does not treat as missing. A temporary cast to float lets you detect them, because float accepts the string nan (with an optional + or - prefix) as Not a Number (NaN).
df['CITY'].astype("float").isna()
The output:
0 True
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
Name: CITY, dtype: bool
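To see why the temporary cast is needed, here is a minimal sketch. It assumes the column ended up holding the text 'nan' rather than a real missing value; the series below is a hypothetical stand-in for CITY, not your actual data.

```python
import pandas as pd

# Hypothetical stand-in for a column whose missing values became the text 'nan'
s = pd.Series(["nan", "nan"], dtype="object")

# The text 'nan' is an ordinary string, so isna() does not flag it
direct = s.isna().tolist()

# float('nan') is a real NaN, so the temporary cast makes isna() work
via_float = s.astype("float").isna().tolist()

# The same cast raises as soon as the column holds real text such as 'MX'
try:
    pd.Series(["MX", "nan"], dtype="object").astype("float")
    cast_failed = False
except ValueError:
    cast_failed = True

print(direct, via_float, cast_failed)
```

Note the limitation: the float cast only works here because every value in CITY is nan. On a column with actual city names, such as the one in df1, it would raise a ValueError.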
Either
df['ID_CITY'].isna()
or
df['ID_CITY'].astype("float").isna()
results in:
0 True
1 True
2 True
3 True
4 True
5 True
6 True
7 True
8 True
Name: ID_CITY, dtype: bool
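The remaining step from the question, dropping the all-NaN columns and reusing their names on the second dataframe, could be sketched as follows. This is one possible approach rather than the only one: it assumes a pandas version that supports the nullable "Int64" and "string" dtypes (1.0 or later), uses shortened versions of df and df1, and the names empty_cols, df_clean and df1_clean are made up for illustration.

```python
import numpy as np
import pandas as pd

# Shortened versions of the two frames from the question
df = pd.DataFrame({
    "ID": [14, 14, 15],
    "NAME": ["KWI", "NED", "CARLOS"],
    "ID_CITY": [np.nan, np.nan, np.nan],
    "CITY": [np.nan, np.nan, np.nan],
    "STATUS": ["OK", "OK", "NOT"],
})
df1 = pd.DataFrame({
    "ID": [14, 14],
    "NAME": ["KWI", "NED"],
    "ID_CITY": [20, 22],
    "CITY": ["MX", "AT"],
    "STATUS": ["OK", "OK"],
})

# Cast whole columns at once; with the nullable dtypes, missing values
# stay missing instead of turning into text
df = df.astype({
    "ID": "Int64", "ID_CITY": "Int64",
    "NAME": "string", "CITY": "string", "STATUS": "string",
})

# Names of the columns in which every value is missing
empty_cols = df.columns[df.isna().all()].tolist()

# Drop those columns from both frames
df_clean = df.drop(columns=empty_cols)
df1_clean = df1.drop(columns=empty_cols)

print(empty_cols)
print(list(df_clean.columns), list(df1_clean.columns))
```

If the second dataframe might not contain all of the listed columns, drop accepts errors="ignore" to skip the missing ones.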