I am generating some random numbers and trying to apply a condition so that any value greater than 80 is replaced with None, but I cannot get the result I want. My code is as follows:
import pandas as pd
import numpy as np
from numpy import random
total = 200
rand_numbers = np.random.randint(0, 100, total)
corrupt_values = np.random.randint(0, 100, total) > 80
flag = False
if flag:
    rand_numbers = [v for flag, v in zip(corrupt_values, rand_numbers)]
else:
    rand_numbers = None
print 'rand_numbers: ', rand_numbers
I am trying to get a result like
rand_numbers [20, 50, NaN, NaN, 40, 10] so that values greater than 80 are replaced by NaN.
I am trying to generate 200 random numbers in rand_numbers and then apply a condition: if a value exceeds 80, put NaN in its place, otherwise keep the value. I tried to zip the two arrays to make that condition work, but I am struggling with it. I am new to coding. Any help would be greatly appreciated.
It seems you first need to cast the values to float (because NaN is a float) and then replace values by condition:
np.random.seed(100)
total = 100
rand_numbers = np.random.randint(0, 100, total)
corrupt_values = rand_numbers > 80
print (rand_numbers)
[ 8 24 67 87 79 48 10 94 52 98 53 66 98 14 34 24 15 60 58 16 9 93 86 2 27
4 31 1 13 83 4 91 59 67 7 49 47 65 61 14 55 71 80 2 94 19 98 63 53 27
56 30 48 47 39 38 44 18 64 56 34 53 74 17 72 13 30 17 53 68 50 91 91 83 53
78 0 13 57 76 3 70 3 84 79 10 87 60 3 48 52 43 36 5 71 38 86 94 98 42]
print (corrupt_values)
[False False False True False False False True False True False False
True False False False False False False False False True True False
False False False False False True False True False False False False
False False False False False False False False True False True False
False False False False False False False False False False False False
False False False False False False False False False False False True
True True False False False False False False False False False True
False False True False False False False False False False False False
True True True False]
rand_numbers = rand_numbers.astype(float)
rand_numbers[corrupt_values] = None
print (rand_numbers)
[ 8. 24. 67. nan 79. 48. 10. nan 52. nan 53. 66. nan 14. 34.
24. 15. 60. 58. 16. 9. nan nan 2. 27. 4. 31. 1. 13. nan
4. nan 59. 67. 7. 49. 47. 65. 61. 14. 55. 71. 80. 2. nan
19. nan 63. 53. 27. 56. 30. 48. 47. 39. 38. 44. 18. 64. 56.
34. 53. 74. 17. 72. 13. 30. 17. 53. 68. 50. nan nan nan 53.
78. 0. 13. 57. 76. 3. 70. 3. nan 79. 10. nan 60. 3. 48.
52. 43. 36. 5. 71. 38. nan nan nan 42.]
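To see why the cast to float is needed, here is a minimal sketch. The small array and the names `ints`, `floats` and `mask` are made up for illustration and are not from the question. An integer array has no way to store NaN, so the assignment should fail until the data is converted to float.

```python
import numpy as np

ints = np.array([20, 50, 95, 81, 40, 10])  # illustrative values only
mask = ints > 80                           # True where the value should be dropped

try:
    ints[mask] = np.nan                    # integer dtype cannot represent NaN
    failed = False
except ValueError:
    failed = True

floats = ints.astype(float)                # astype returns a new float64 copy
floats[mask] = np.nan                      # now the assignment is accepted
print(failed, floats)
```

Note that `astype` leaves the original integer array untouched, so `ints` can still be used afterwards if the raw values are needed.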
A similar solution (from a deleted answer) with numpy.where:
rand_numbers = rand_numbers.astype(float)
rand_numbers = np.where(corrupt_values, np.nan, rand_numbers)
print (rand_numbers)
[ 8. 24. 67. nan 79. 48. 10. nan 52. nan 53. 66. nan 14. 34.
24. 15. 60. 58. 16. 9. nan nan 2. 27. 4. 31. 1. 13. nan
4. nan 59. 67. 7. 49. 47. 65. 61. 14. 55. 71. 80. 2. nan
19. nan 63. 53. 27. 56. 30. 48. 47. 39. 38. 44. 18. 64. 56.
34. 53. 74. 17. 72. 13. 30. 17. 53. 68. 50. nan nan nan 53.
78. 0. 13. 57. 76. 3. 70. 3. nan 79. 10. nan 60. 3. 48.
52. 43. 36. 5. 71. 38. nan nan nan 42.]
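As a rough sketch of the same idea wrapped in a reusable function: `mask_above` and its `threshold` parameter are hypothetical names chosen for this example. It relies on `np.where` returning a float array when one branch is `np.nan` and the other is an integer array, so the explicit cast becomes optional in this variant.

```python
import numpy as np

def mask_above(values, threshold=80):
    """Return a float array where entries greater than threshold become NaN."""
    values = np.asarray(values)
    # np.nan is a float, so the result is promoted to a float dtype
    return np.where(values > threshold, np.nan, values)

result = mask_above([20, 50, 95, 81, 40, 10])
print(result)
```

Unlike the in-place assignment, this builds a new array and leaves the input unchanged.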
You can use a list comprehension:
import numpy as np
total = 200
rand_numbers = np.random.randint(0, 100, total)
result=[i if i<=80 else float('NaN') for i in rand_numbers]
That will give you:
>>> result
[64, 23, 12, 8, 70, nan, 13, 19, 73, 18, 78, 25, 77, 45, nan, 6, 15, nan, nan, 47, nan, 39, 5, 9, 22, 59, 57, 71, 8, 24, 76, 33, 66, nan, 21, 39, 48, 23, 40, nan, nan, 75, 68, 17, 52, nan, 71, 55, 10, 53, 51, 21, 35, 6, 67, 10, 34, nan, 24, 11, 42, 72, 74, 40, 63, 8, 57, 10, nan, 45, nan, 18, nan, 80, 6, 21, 22, 2, 51, 54, 80, 50, 63, 40, nan, 26, 43, 65, 7, 13, 54, 69, 12, nan, nan, 40, 44, nan, 78, 45, 55, 72, 6, 46, 43, 33, 24, 69, 77, 51, 52, 51, nan, 32, 22, 54, 53, 25, 61, 32, 8, nan, 75, 9, 22, nan, nan, 54, 32, 49, nan, 8, 59, 44, 14, 62, 61, 37, 60, 56, 12, 23, 50, 76, 5, 14, 46, nan, 58, 18, 53, 18, 39, 10, 1, 17, 36, 31, 42, 71, 61, 39, 27, 79, nan, 44, 76, nan, 26, 3, 26, 19, 64, 6, 41, 65, 76, 31, nan, 12, nan, 77, 8, 49, nan, nan, nan, 5, 40, 15, nan, 42, 14, 12, 75, 54, 47, 65, 9, 12]
EDIT
This is also possible, reusing the corrupt_values mask:
import numpy as np
total = 200
rand_numbers = np.random.randint(0, 100, total)
corrupt_values = rand_numbers > 80
result = [v if not bad else float('nan') for v, bad in zip(rand_numbers, corrupt_values)]
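One thing to keep in mind with the list version: NaN never compares equal to itself, so `v == float('nan')` cannot be used to find the replaced entries. A small sketch of checking the result with `math.isnan` follows; the variable `n_missing` is only an illustrative name.

```python
import math
import numpy as np

rand_numbers = np.random.randint(0, 100, 200)
result = [v if v <= 80 else float('nan') for v in rand_numbers]

# Count the replaced entries; math.isnan accepts NumPy integers as well as floats
n_missing = sum(1 for v in result if math.isnan(v))
print(n_missing, 'of', len(result), 'values were replaced')
```

The list mixes NumPy integers with Python floats, so converting it back with `np.array(result)` gives a float array if later numeric work is planned.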