How can I generate a random number for each row based on a condition?
I'm new to Python and would appreciate your help.
I have a data frame with 2000 rows and 2 columns: Row and Pct. Basically, I want to create a third column that will be based on the following logic:
I hope I managed to explain myself :)
Thanks!
Edit: In response to your questions:
import pandas as pd

data = {
    'Pct': [0.8, 0.4, 0.3, 0.7, 0.3, 1, 0.23, 0.75, 0.93, 0.6],
    'Row': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
}
df = pd.DataFrame(data, columns=['Row', 'Pct'])
df
   Row   Pct
0    1  0.80
1    2  0.40
2    3  0.30
3    4  0.70
4    5  0.30
5    6  1.00
6    7  0.23
7    8  0.75
8    9  0.93
9   10  0.60
You can do something like this:
import numpy as np

def generate_random_values(row):
    pct_value = float(row['Pct'])
    # 1. Generate a random number between 0 and 1
    x = np.random.random()
    # 2. Initial value of the new column
    new_col = 0
    # 3. While x > pct_value, add 1 to new_col and draw a new random number
    while x > pct_value:
        new_col += 1
        x = np.random.random()
    # 4. Here x <= pct_value: add 1 to new_col and return it for the current row
    new_col += 1
    return new_col
And then:
df['new_column'] = df.apply(func=generate_random_values, axis=1)
print(df)
>>>
   Row   Pct  new_column
0    1  0.80           1
1    2  0.40           2
2    3  0.30           1
3    4  0.70           1
4    5  0.30           8
5    6  1.00           1
6    7  0.23           1
7    8  0.75           1
8    9  0.93           1
9   10  0.60           2
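As a side note, the loop in generate_random_values counts how many uniform draws it takes until one lands at or below Pct, which is how a geometric random variable with success probability Pct is defined. Assuming every Pct lies in (0, 1], a vectorized sketch with NumPy's Generator.geometric could stand in for the row-wise apply. The seed and the column name new_column_vec are illustrative choices, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Row': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Pct': [0.8, 0.4, 0.3, 0.7, 0.3, 1, 0.23, 0.75, 0.93, 0.6],
})

# Seeded generator so the run is reproducible (the seed value is arbitrary)
rng = np.random.default_rng(seed=0)

# One geometric draw per row: number of trials up to and including the
# first success, where each trial succeeds with probability Pct
df['new_column_vec'] = rng.geometric(p=df['Pct'].to_numpy())
print(df)
```

The values will differ from the loop's output run to run, but they should follow the same distribution, and a row with Pct equal to 1 always gets 1.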
It might also be a good idea to check the 'Pct' column against a minimum threshold before running generate_random_values: a value of 0 or less means the while loop will effectively never terminate, and very small values can take a long time.
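One way such a check might look is sketched below. The helper name check_pct and the default threshold of 0.01 are hypothetical, chosen only for illustration; pick a threshold that suits your data.

```python
import pandas as pd

def check_pct(df, min_pct=0.01):
    """Raise ValueError if any Pct is missing or outside [min_pct, 1]."""
    # between() is inclusive on both ends and is False for NaN
    valid = df['Pct'].between(min_pct, 1)
    if not valid.all():
        bad_rows = df.loc[~valid, 'Row'].tolist()
        raise ValueError(f"Pct out of range in rows: {bad_rows}")
    return df

sample = pd.DataFrame({'Row': [1, 2], 'Pct': [0.8, 0.0]})
try:
    check_pct(sample)
except ValueError as err:
    print(err)
```

Calling check_pct(df) before df.apply(...) stops early with a clear error instead of hanging on a row whose Pct is too small.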