
pandas dtypes column coercion

What would cause pandas to set a column's dtype to 'object' when the values I have checked are all strings? I explicitly set that column to "string" in the dtype dictionary passed to the read_excel call that loads the data. I have checked for NaN, NULL, etc., since I know those can cause an object dtype to be set, but haven't found any. I recall reading that string types need a max length, but I was under the impression that pandas sets that to the max length of the column.

Edit 1: This seems to happen only in fields holding email addresses. I don't think it has an effect, but could the @ character be triggering this behavior?

The object dtype comes from NumPy; it describes the type of the elements in an ndarray. Every element in an ndarray must have the same size in bytes. For int64 and float64, that is 8 bytes. For strings, however, the length is not fixed. So instead of storing the string bytes in the ndarray directly, pandas uses an object ndarray, which stores pointers to Python objects; that is why the dtype of this kind of ndarray is object.
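A minimal sketch of the point above, using made-up sample values (the email addresses and variable names are hypothetical, not taken from the question). It builds the columns with an explicit dtype=object so the result does not depend on the pandas version's default string inference, and it assumes pandas 1.0 or later for the "string" extension dtype. It does not reproduce the read_excel call from the question, so it only illustrates what an object column holding strings looks like.

```python
import numpy as np
import pandas as pd

# Fixed-size numeric dtype: every element takes the same number of bytes.
ints = np.array([1, 2, 3], dtype="int64")
print(ints.dtype, ints.itemsize)

# Strings have no fixed length, so they are held as Python objects
# and the array only stores pointers to them.
emails = pd.Series(["alice@example.com", "bob@example.org"], dtype=object)
plain = pd.Series(["alice", "bob"], dtype=object)

# Both columns report the same dtype; the @ character plays no role here.
print(emails.dtype, plain.dtype)

# An object column can still contain nothing but str values.
all_str = all(isinstance(value, str) for value in emails)
print(all_str)

# Opting in to the dedicated pandas string extension dtype.
emails_string = emails.astype("string")
print(emails_string.dtype)
```

So an object dtype by itself does not mean the values are not strings. Checking the element types directly, as in the all_str line, is one way to confirm what a column actually holds.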
