I have a pandas data frame like df:
df=pd.DataFrame([[111, 7,8], [409,6,4], [333, 9,0],[111,3,2],[111,0,0], [409,7,0]], columns=['A','B','C'])
df
A B C
0 111 7 8
1 409 6 4
2 333 9 0
3 111 3 2
4 111 0 0
5 409 7 0
How can I map column A to 10-digit random integers such that the same value in column A (such as 111) always gets the same 10-digit random integer in the new column? For example, I want something like this:
A B C
0 8765479834 7 8
1 7653780954 6 4
2 9400211346 9 0
3 8765479834 3 2
4 8765479834 0 0
5 7653780954 7 0
Thank you!
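One way is via hashlib. Note that the result is a deterministic hash of each value rather than a truly random number, and the modulus 10 ** 8 yields integers of at most 8 digits: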
import hashlib
df['A'] = df['A'].apply(lambda s: int(hashlib.sha1(str(s).encode("utf-8")).hexdigest(), 16) % (10 ** 8))
A B C
0 22445762 7 8
1 63857454 6 4
2 61248669 9 0
3 22445762 3 2
4 22445762 0 0
5 63857454 7 0
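Since the question asks for exactly 10 digits, the hash can instead be folded into the range 10**9 to 10**10 - 1. The following is only a sketch of that idea: the helper name stable_10_digit is made up for illustration, and the values are still deterministic hashes, so two different inputs could in principle collide.

```python
import hashlib

import pandas as pd


def stable_10_digit(value):
    """Map a value to a deterministic integer in [10**9, 10**10 - 1].

    Hypothetical helper for illustration, not part of hashlib or pandas.
    """
    digest = hashlib.sha1(str(value).encode("utf-8")).hexdigest()
    # Fold the 160-bit hash into the 9 * 10**9 possible 10-digit integers.
    return 10**9 + int(digest, 16) % (9 * 10**9)


df = pd.DataFrame(
    [[111, 7, 8], [409, 6, 4], [333, 9, 0], [111, 3, 2], [111, 0, 0], [409, 7, 0]],
    columns=["A", "B", "C"],
)
# Equal values in A hash to the same 10-digit integer.
df["A"] = df["A"].apply(stable_10_digit)
```

Because the mapping depends only on the value, rerunning it on another data frame gives the same integer for 111 each time, which may or may not be what you want.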
NOTE: If the integers do not need a fixed number of digits, you can also use the unsigned 64-bit hashes from pandas:
df['A'] = pd.util.hash_pandas_object(df['A'], index=False)
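You can use map and numpy:
import numpy as np
# find unique values in A
unique = df['A'].unique()
# generate one random 10-digit int per unique value (the upper bound is exclusive;
# dtype=np.int64 avoids an out-of-bounds error where the default int is 32-bit)
data = np.random.randint(10**9, 10**10, len(unique), dtype=np.int64)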
# zip the random int with your unique values and map to col A
df['A'] = df['A'].map(dict(zip(unique, data)))
A B C
0 8059444826 7 8
1 2465745168 6 4
2 8408792865 9 0
3 8059444826 3 2
4 8059444826 0 0
5 2465745168 7 0
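One caveat with drawing the integers independently is that two different values in A could, very rarely, receive the same random number. If that matters, a variant that samples without replacement rules it out. This is only a sketch using the standard library's random.sample; the variable names codes and mapping are illustrative.

```python
import random

import pandas as pd

df = pd.DataFrame(
    [[111, 7, 8], [409, 6, 4], [333, 9, 0], [111, 3, 2], [111, 0, 0], [409, 7, 0]],
    columns=["A", "B", "C"],
)

# find unique values in A
unique = df["A"].unique()
# random.sample draws without replacement, so no two unique values share a code
codes = random.sample(range(10**9, 10**10), len(unique))
# pair each unique value with its code and map column A through that dict
mapping = dict(zip(unique, codes))
df["A"] = df["A"].map(mapping)
```

Keeping the mapping dict around lets you apply the same codes to other data frames or translate them back later.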