
How to remove the last two digits in a column that is of integer type?

How can I remove the last two digits of a DataFrame column of type int64?

For example, df['DATE'] includes:

DATE
20110708
20110709
20110710
20110711
20110712
20110713
20110714
20110815
20110816
20110817

What I would like is:

DATE
201107
201107
201107
201107
201107
201107
201107
201108
201108
201108

What is the simplest way of achieving this?

Convert the column to str with astype, use the vectorised str accessor to slice off the last two characters, then convert back to the int64 dtype:

In [184]:
df['DATE'] = df['DATE'].astype(str).str[:-2].astype(np.int64)
df

Out[184]:
     DATE
0  201107
1  201107
2  201107
3  201107
4  201107
5  201107
6  201107
7  201108
8  201108
9  201108

In [185]:    
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 9
Data columns (total 1 columns):
DATE    10 non-null int64
dtypes: int64(1)
memory usage: 160.0 bytes
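
For anyone who wants to run the string-slicing approach outside an IPython session, here is a minimal self-contained sketch. The sample values are copied from the question; the intermediate names (as_text, trimmed, result) are only illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question, stored explicitly as int64.
df = pd.DataFrame(
    {"DATE": [20110708, 20110709, 20110710, 20110711, 20110712,
              20110713, 20110714, 20110815, 20110816, 20110817]},
    dtype="int64",
)

# Step 1: turn every integer into its decimal string, e.g. "20110708".
as_text = df["DATE"].astype(str)

# Step 2: slice off the last two characters of each string, e.g. "201107".
trimmed = as_text.str[:-2]

# Step 3: convert the shortened strings back to 64-bit integers.
result = trimmed.astype(np.int64)

print(result)
```

Note that this only works when every value has at least three digits: a one- or two-digit value would be sliced down to an empty string, which cannot be converted back to an integer.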

Hmm...

It turns out there is a built-in method, floordiv:

In [191]:
df['DATE'].floordiv(100)

Out[191]:
0    201107
1    201107
2    201107
3    201107
4    201107
5    201107
6    201107
7    201108
8    201108
9    201108
Name: DATE, dtype: int64
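
The call above returns a new Series and leaves df untouched. A minimal sketch of writing the result back into the column, assuming the same sample data as in the question:

```python
import pandas as pd

# Sample data copied from the question, stored explicitly as int64.
df = pd.DataFrame(
    {"DATE": [20110708, 20110709, 20110710, 20110711, 20110712,
              20110713, 20110714, 20110815, 20110816, 20110817]},
    dtype="int64",
)

# floordiv(100) divides by 100 and discards the remainder, which drops the
# last two decimal digits. Assign the result back to update the column.
df["DATE"] = df["DATE"].floordiv(100)

print(df)
print(df["DATE"].dtype)
```

Because the arithmetic stays in integers, no conversion to str and back is needed.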

Update

For a 1000-row df, the floordiv method is considerably faster:

%timeit df['DATE'].astype(str).str[:-2].astype(np.int64)
%timeit df['DATE'].floordiv(100)

100 loops, best of 3: 2.92 ms per loop
1000 loops, best of 3: 203 µs per loop

Here we observe a speedup of more than 10x (2.92 ms vs. 203 µs, roughly 14x).
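
The %timeit magic only exists inside IPython. As a rough sketch, the same comparison can be reproduced in a plain script with the standard timeit module. The 1000-row frame is built here by repeating the ten sample values 100 times, which is an assumption about how the test frame was made; the helper names via_str and via_floordiv are illustrative. Absolute timings will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

sample = [20110708, 20110709, 20110710, 20110711, 20110712,
          20110713, 20110714, 20110815, 20110816, 20110817]

# Assumed test frame: the ten sample values repeated 100 times (1000 rows).
big = pd.DataFrame({"DATE": np.tile(np.array(sample, dtype="int64"), 100)})


def via_str():
    # String round trip: int -> str -> slice -> int.
    return big["DATE"].astype(str).str[:-2].astype(np.int64)


def via_floordiv():
    # Pure integer arithmetic.
    return big["DATE"].floordiv(100)


runs = 100
str_seconds = timeit.timeit(via_str, number=runs) / runs
div_seconds = timeit.timeit(via_floordiv, number=runs) / runs

print(f"astype/str slice: {str_seconds * 1e6:.0f} µs per call")
print(f"floordiv:         {div_seconds * 1e6:.0f} µs per call")
```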

You could use floor division (//) to drop the last two digits and preserve the integer type:

>>> df['DATE'] // 100
0    201107
1    201107
2    201107
3    201107
4    201107
5    201107
6    201107
7    201108
8    201108
9    201108
Name: DATE, dtype: int64
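
The // operator and the floordiv method perform the same operation on a Series, so either can be used. As a small sketch, the idea generalises to dropping any number of trailing digits by dividing by a power of ten. The helper drop_last_digits below is hypothetical, written for illustration only, and is not a pandas function.

```python
import pandas as pd


def drop_last_digits(values: pd.Series, n: int = 2) -> pd.Series:
    """Hypothetical helper: drop the last n decimal digits of an integer Series."""
    # Floor division rounds toward negative infinity, so this behaves like
    # "chopping digits" only for non-negative values such as these dates.
    return values // 10 ** n


dates = pd.Series([20110708, 20110815], name="DATE", dtype="int64")

year_month = drop_last_digits(dates)    # drops the day
year_only = drop_last_digits(dates, 4)  # drops the month and the day

print(year_month.tolist())
print(year_only.tolist())
```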
