I have a pandas DataFrame object train_df with, say, a column called "ColA" and a column called "ColB". It was loaded from a CSV file with a header row using read_csv.
I get the same result whether I write:
pd.crosstab(train_df['ColA'], train_df['ColB'])
or
pd.crosstab(train_df.ColA, train_df.ColB)
Is there any difference between these two ways of selecting columns?
When I print the type, it is the same in both cases: pandas.core.series.Series
There is no difference in the result, but the bracket form
pd.crosstab(train_df['ColA'], train_df['ColB'])
is recommended because it avoids possible errors.
For example, if you have a column named count and you type train_df.count, you do not get the column: you get the DataFrame's built-in count method, so code that expects a Series will fail. train_df['count'] always returns the column.
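A minimal sketch of the name clash described above. The DataFrame contents are made up for illustration; only the column name count matters.

```python
import pandas as pd

# Hypothetical data; the column is deliberately named "count"
train_df = pd.DataFrame({"count": [1, 2, 3], "ColB": ["x", "y", "x"]})

by_bracket = train_df["count"]  # the column, as a Series
by_dot = train_df.count         # the DataFrame.count method, not the column

print(type(by_bracket))  # a pandas Series
print(callable(by_dot))  # the attribute is a method, so it is callable
```

The same clash applies to any column whose name matches an existing DataFrame attribute or method, which is why bracket notation is the safer default.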
If you only want to select a single column, there is no difference between the two ways.
However, dot notation cannot select multiple columns, whereas dataframe[['col1', 'col2']] can. That form returns a pandas.core.frame.DataFrame instead of a pandas.core.series.Series.
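A short sketch comparing the selection forms discussed above. The data and the extra column ColC are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical data standing in for the CSV loaded with read_csv
train_df = pd.DataFrame({
    "ColA": ["a", "b", "a"],
    "ColB": ["x", "y", "x"],
    "ColC": [1, 2, 3],
})

single_bracket = train_df["ColA"]      # Series
single_dot = train_df.ColA             # Series, same data as above
multiple = train_df[["ColA", "ColB"]]  # DataFrame with two columns

print(single_bracket.equals(single_dot))
print(type(multiple))
```

Note the double brackets in the last selection: the inner list names the columns, and passing a list is what makes pandas return a DataFrame rather than a Series.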