Determine Prime Number in Python DataFrame
I have a dataframe like this:
Name Email Trx
0 John john.doe@gmail.com 30
1 Sarah sarah@gmail.com 7
2 Bob bob@yahoo.com 11
3 Chad chad@outlook.com 21
4 Karen karen@outlook.com 20
5 Dmitri dmitri@rocketmail.com 17
and I need to know whether each customer is eligible for a voucher. The criterion is: if Trx is a prime number, the customer is eligible; otherwise the customer is not eligible.
The dataframe should look like this:
Name Email Trx Voucher
0 John john.doe@gmail.com 30 not eligible
1 Sarah sarah@gmail.com 7 eligible
2 Bob bob@yahoo.com 11 eligible
3 Chad chad@outlook.com 21 not eligible
4 Karen karen@outlook.com 20 not eligible
5 Dmitri dmitri@rocketmail.com 17 eligible
I know how to determine whether a number is prime, but not how to do it in a dataframe. Thank you in advance.
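For anyone who wants to run the answers below, here is a minimal sketch that rebuilds the sample data. The values are copied from the table in the question; the variable name `df` is assumed because the answers use it.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "Name": ["John", "Sarah", "Bob", "Chad", "Karen", "Dmitri"],
    "Email": [
        "john.doe@gmail.com", "sarah@gmail.com", "bob@yahoo.com",
        "chad@outlook.com", "karen@outlook.com", "dmitri@rocketmail.com",
    ],
    "Trx": [30, 7, 11, 21, 20, 17],
})
print(df)
```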
I copied and pasted a function that checks whether a number is prime from here:
Python Prime number checker
Then I used .apply() to apply this function to every value in the 'Trx' column:
def isprime(n):
    '''check if integer n is a prime'''
    # make sure n is a positive integer
    n = abs(int(n))
    # 0 and 1 are not primes
    if n < 2:
        return False
    # 2 is the only even prime number
    if n == 2:
        return True
    # all other even numbers are not primes
    if not n & 1:
        return False
    # range starts with 3 and only needs to go up
    # the square root of n for all odd numbers
    for x in range(3, int(n**0.5) + 1, 2):
        if n % x == 0:
            return False
    return True

df['Voucher'] = df['Trx'].apply(isprime)
Resulting dataframe:
Name Email Trx Voucher
0 John john.doe@gmail.com 30 False
1 Sarah sarah@gmail.com 7 True
2 Bob bob@yahoo.com 11 True
3 Chad chad@outlook.com 21 False
4 Karen karen@outlook.com 20 False
5 Dmitri dmitri@rocketmail.com 17 True
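The result above holds True/False, while the question asked for the labels 'eligible' and 'not eligible'. One way to get there is to map the booleans onto the labels with Series.map. This is a sketch rather than part of the original answer: the isprime here is a condensed version of the function above, and the dataframe is cut down to the Trx column for brevity.

```python
import pandas as pd

def isprime(n):
    """Trial-division primality check, condensed from the function above."""
    n = abs(int(n))
    if n < 2:
        return False
    if n == 2:
        return True
    if n % 2 == 0:
        return False
    # n is prime if no odd x up to sqrt(n) divides it
    return all(n % x for x in range(3, int(n**0.5) + 1, 2))

# Only the Trx column is needed for this illustration
df = pd.DataFrame({"Trx": [30, 7, 11, 21, 20, 17]})

# Map the boolean result onto the labels the question asked for
labels = {True: "eligible", False: "not eligible"}
df["Voucher"] = df["Trx"].apply(isprime).map(labels)
print(df)
```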
Why not use SymPy's isprime() function?
from sympy import isprime

def is_prime(num):
    return "eligible" if isprime(num) else "not eligible"

df['Voucher'] = df['Trx'].apply(is_prime)
A slightly faster way is to build a dictionary of the prime numbers between the min and max of df.Trx, map that dictionary onto df.Trx, and then fill the resulting NaN values:
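def isPrime(n):
    if n == 2:
        return True
    # numbers below 2 and even numbers other than 2 are not prime
    if n < 2 or n % 2 == 0:
        return False
    # check odd divisors up to the square root of n
    for i in range(3, int(n**0.5) + 1, 2):
        if n % i == 0:
            return False
    return True

def get_primes_within(lower, upper):
    # map every prime in [lower, upper] to the label 'eligible'
    prime = {}
    for num in range(lower, upper + 1):
        if isPrime(num):
            prime[num] = 'eligible'
    return prime
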
prime_dict = get_primes_within(df.Trx.min(),df.Trx.max())
>>> print(prime_dict)
{7: 'eligible',
11: 'eligible',
13: 'eligible',
17: 'eligible',
19: 'eligible',
23: 'eligible',
29: 'eligible'}
df['Voucher'] = df.Trx.map(prime_dict).fillna('not eligible')
>>> print(df)
Name Email Trx Voucher
0 John john.doe@gmail.com 30 not eligible
1 Sarah sarah@gmail.com 7 eligible
2 Bob bob@yahoo.com 11 eligible
3 Chad chad@outlook.com 21 not eligible
4 Karen karen@outlook.com 20 not eligible
5 Dmitri dmitri@rocketmail.com 17 eligible
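The same precompute-then-look-up idea can also be sketched with a Sieve of Eratosthenes in NumPy, swapped in here for the trial-division loop. This is not part of the answer above: the helper name sieve is made up for illustration, and the sketch assumes Trx holds non-negative integers small enough that an array of size max(Trx) + 1 fits in memory.

```python
import numpy as np
import pandas as pd

def sieve(upper):
    """Boolean array where flags[k] is True when k is prime (0 <= k <= upper)."""
    flags = np.ones(upper + 1, dtype=bool)
    flags[:2] = False  # 0 and 1 are not prime
    for p in range(2, int(upper**0.5) + 1):
        if flags[p]:
            flags[p * p :: p] = False  # cross out multiples of p
    return flags

df = pd.DataFrame({"Trx": [30, 7, 11, 21, 20, 17]})

# Assumption: Trx contains non-negative integers only
flags = sieve(int(df["Trx"].max()))

# Look up each Trx value in the sieve and turn the booleans into labels
df["Voucher"] = np.where(flags[df["Trx"].to_numpy()], "eligible", "not eligible")
print(df)
```

Indexing the boolean array by the column values replaces the dictionary lookup and the fillna step, since every value gets either True or False.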