I have a DataFrame (fairly large, hard to reproduce, etc.) for which I observe this behavior:
>>> df.info(verbose=True,memory_usage=True,null_counts=True)
<class 'pandas.core.frame.DataFrame'>
Int64Index: 49841 entries, 0 to 49878
Data columns (total 70 columns):
...
channel 25101 non-null object
...
dtypes: bool(10), datetime64[ns](6), float64(2), int64(32), object(20)
memory usage: 23.7+ MB
>>> df.channel.fillna("Unknown",inplace=True)
>>> df.info(verbose=True,memory_usage=True,null_counts=True)
<class 'pandas.core.frame.DataFrame'>
Int64Index: 49841 entries, 0 to 49878
Data columns (total 70 columns):
...
channel 25101 non-null object
...
dtypes: bool(10), datetime64[ns](6), float64(2), int64(32), object(20)
memory usage: 23.7+ MB
In other words, it appears that df.channel.fillna("Unknown",inplace=True) has no effect.
How can that be? Is this a bug? What am I doing wrong?
PS. Summary from the comments:
df.is_copy is None.
df._is_view is False.
channel is a column, not an attribute, because it is listed by info.
From the documentation:
You can use attribute access to modify an existing element of a Series or column of a DataFrame, but be careful; if you try to use attribute access to create a new column, it fails silently, creating a new attribute rather than a new column.
We suspect you assigned df.channel first, then df['channel'], and this creates the unexpected behavior.
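The pitfall described in the documentation quote can be reproduced with a small sketch. The frame and values here are made up for illustration and are not the asker's data; the warning filter is only there because newer pandas versions emit a UserWarning for this assignment.

```python
import warnings

import pandas as pd

# Hypothetical toy frame; it has no "channel" column yet.
df = pd.DataFrame({"a": [1, 2, 3]})

with warnings.catch_warnings():
    # Newer pandas versions warn that columns cannot be created this way.
    warnings.simplefilter("ignore")
    # This sets a plain Python attribute on the object, not a column.
    df.channel = ["x", "y", "z"]

created_column = "channel" in df.columns  # the column was not created

# Creating the real column afterwards does not remove the attribute,
# so df.channel keeps returning the list instead of the column.
df["channel"] = ["p", "q", "r"]
attribute_shadows_column = isinstance(df.channel, list)
```

If this were the cause, df["channel"] and df.channel would refer to different objects, which is why bracket access is the safer habit when modifying columns.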
The reason turned out to be the following SQLAlchemy query:
select *
from table1
join table2
on table1.id = table2.id
The resulting DataFrame has two columns named id, and total havoc ensues.
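A quick way to check whether a frame is in this state is to look for repeated column labels. This is a minimal sketch on a hypothetical three-column frame standing in for the join result; it only shows how to detect the duplicates, not why fillna misbehaved in the original case.

```python
import pandas as pd

# Hypothetical stand-in for the join result: "id" appears twice.
df = pd.DataFrame(
    [[1, None, 1], [2, "web", 2]],
    columns=["id", "channel", "id"],
)

# False when any column label is repeated.
labels_unique = df.columns.is_unique

# Names of the labels that occur more than once.
dupes = df.columns[df.columns.duplicated()].tolist()

# Selecting a duplicated label returns a DataFrame, not a Series.
selection = df["id"]
```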
Solution:
select *
from table1
join (select id as id2, ... from table2) t2
on table1.id = t2.id2
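If the SQL cannot be changed, a possible alternative on the pandas side is to drop the repeated labels after loading. This sketch assumes the duplicated columns hold the same values (as they would for an equality join on id) and uses a hypothetical toy frame, not the original tables.

```python
import pandas as pd

# Hypothetical stand-in for the join result: "id" appears twice.
df = pd.DataFrame(
    [[1, None, 1], [2, "web", 2]],
    columns=["id", "channel", "id"],
)

# Keep only the first occurrence of each column label.
deduped = df.loc[:, ~df.columns.duplicated()].copy()

# Assign the result back instead of relying on inplace=True.
deduped["channel"] = deduped["channel"].fillna("Unknown")
```

Renaming in the query, as in the solution above, remains the cleaner fix because it also works when the two id columns differ.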