Convert R function to Python

I have two variables holding columns from a data frame:

x = table_1[' Profit ']
y = table_1['diff_date']

where x is

0      820.0
1      306.0
2      139.0
3      105.0
4      140.0
5      149.0
6       96.0
7       80.0
8      124.0
9      102.0
10      72.0
11      54.0
12      66.0
13     124.0
14      64.0
15      93.0
16      58.0
17      59.0
18      62.0
19      65.0
20      74.0
21      67.0
22      80.0
23      91.0
24      81.0
25      56.0
26      43.0

and y is

0       0
1       1
2       2
3       3
4       4
5       5
6       6
7       7
8       8
9       9
10     10
11     11
12     12
13     13
14     14
15     15
16     16
17     17
18     18
19     19
20     20
21     21
22     22
23     23
24     24
25     25
26     26

I have a function in R that I'm trying to convert to Python. I've done most of it, except for one small expression from the R code.

The function in R is

my_sum <- function(x, y){
  a <- NULL
  for (i in 1:max(y)) {
    a[i] <- sum(x[which(y == (i-1))])
  }
  a[1] <- a[1] - 7000 
  a[2] <- a[2] + 900 
  return(cumsum(a)) 
} 

I want to convert this function to Python. What I have so far is

 def my_sum(x,y):
    a = 0
    for i in range (1,max(y)):
       a[i] = sum(x[np.where (y == (i-1))])
                
    a[1] = a[1] - 7000
    a[2] = a[2] + 900
    return(np.cumsum(a))

What I'm not sure about is how to convert sum(x[which(y == (i-1))]) to Python. I have read that np.where can be used, so I tried sum(x[np.where (y == (i-1))]), but it throws this error:

ValueError: Can only tuple-index with a MultiIndex

I'm not sure where the issue is in my code.

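You need to define a before you use it. Also note that Python indexes from 0, so R's a[1] and a[2] become a[0] and a[1], and range(max(y)) covers the same groups as R's 1:max(y) with i-1:

import numpy as np

# NumPy arrays here; for pandas Series, convert with .to_numpy() first
x = np.array([820.0, 306.0, 139.0, 105.0, 140.0])
y = np.arange(len(x))

def my_sum(x, y):
    n = max(y)  # R's 1:max(y) produces max(y) sums, for y == 0 .. max(y)-1
    a = np.zeros(n)
    for i in range(n):
        a[i] = x[np.where(y == i)].sum()  # R's a[i+1]

    a[0] -= 7000  # R's a[1]
    a[1] += 900   # R's a[2]
    return np.cumsum(a)

s = my_sum(x, y)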
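
The same loop can be written as a list comprehension:

def my_sum(x, y):
    # element i is the sum of x where y == i (R's a[i+1])
    a = [x[np.where(y == i)].sum() for i in range(max(y))]

    a[0] -= 7000  # R's a[1]
    a[1] += 900   # R's a[2]
    return np.cumsum(a)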
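
As for the error in the question: with a single argument, np.where returns a tuple of index arrays, and a pandas Series does not accept a tuple as a key unless it has a MultiIndex. A boolean mask avoids the tuple entirely. The following is only a sketch under that reading; the name my_sum_masked is made up for illustration, and np.asarray is used so the same function accepts lists, NumPy arrays, or Series.

```python
import numpy as np

def my_sum_masked(x, y):
    """Port of the R my_sum using boolean masks (hypothetical helper name)."""
    # np.asarray accepts lists, NumPy arrays and pandas Series alike
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    n = int(y.max())  # R's 1:max(y) produces max(y) sums
    # element k is the sum of x where y == k (R's a[k + 1])
    a = np.array([x[y == k].sum() for k in range(n)])
    a[0] -= 7000  # R's a[1]
    a[1] += 900   # R's a[2]
    return np.cumsum(a)

# first five rows of the question's data
result = my_sum_masked([820.0, 306.0, 139.0, 105.0, 140.0], [0, 1, 2, 3, 4])
print(result)
```

Like the R loop, this leaves out the group where y equals max(y), and it needs max(y) to be at least 2 for the two adjustments to be valid.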

I'm not quite sure what you are trying to accomplish, although it seems you are doing a grouped sum.

In R you could do:

my_sum1 <- function(x, y){
  a <- unname(tapply(x, y, sum))
  a[1:2] <- a[1:2] + c(-7000, 900)
  cumsum(a)
}

In Python you could do:

import numpy as np
def my_sum1(x,y):
    a = np.array([(x[y == i]).sum() for i in np.unique(y)])
    a[0:2] = a[0:2] + np.r_[-7000, 900]
    return a.cumsum()
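
If y holds non-negative integers, as it does in the question, the grouped sum could also be sketched with np.bincount and its weights argument. This is an alternative rather than what the answer above uses, and my_sum_bincount is a made-up name. One difference to be aware of: bincount returns 0 for any integer missing from y, whereas the np.unique version skips missing groups.

```python
import numpy as np

def my_sum_bincount(x, y):
    """Grouped sum via np.bincount (hypothetical helper name)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=int)      # bincount needs non-negative integers
    a = np.bincount(y, weights=x)     # a[k] is the sum of x where y == k
    a[:2] += [-7000, 900]             # same adjustments as a[1:2] in R
    return a.cumsum()

# small illustration: two rows share the group y == 0
print(my_sum_bincount([820.0, 306.0, 139.0, 105.0, 140.0, 10.0],
                      [0, 1, 2, 3, 4, 0]))
```

Like tapply, this includes the last group (y equal to max(y)), so its result has one more element than the original loop-based my_sum.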
