Converting an R function to Python

Question

I have two variables taken from a data frame:

x = table_1[' Profit ']
y = table_1['diff_date']

where x is

0      820.0
1      306.0
2      139.0
3      105.0
4      140.0
5      149.0
6       96.0
7       80.0
8      124.0
9      102.0
10      72.0
11      54.0
12      66.0
13     124.0
14      64.0
15      93.0
16      58.0
17      59.0
18      62.0
19      65.0
20      74.0
21      67.0
22      80.0
23      91.0
24      81.0
25      56.0
26      43.0

and y is

I have a function in R that I'm trying to convert to Python. I'm done with most of the task except for one small condition in R.

The function in R is

my_sum <- function(x, y){
  a <- NULL
  for (i in 1:max(y)) {
    a[i] <- sum(x[which(y == (i-1))])
  }
  a[1] <- a[1] - 7000 
  a[2] <- a[2] + 900 
  return(cumsum(a)) 
}

I want to convert this function to Python. What I have done so far is:

 def my_sum(x,y):
    a = 0
    for i in range (1,max(y)):
       a[i] = sum(x[np.where (y == (i-1))])
                
    a[1] = a[1] - 7000
    a[2] = a[2] + 900
    return(np.cumsum(a))

What I'm not sure about is how to convert sum(x[which(y == (i-1))]) to Python. I have read that np.where can be used, so I tried sum(x[np.where (y == (i-1))]), but it throws this error:

ValueError: Can only tuple-index with a MultiIndex

I'm not sure where the issue is in my code.

Answer 1

You need to define a before you use it:

import numpy as np

x = np.array([820.0, 306.0, 139.0, 105.0, 140.0])
y = np.arange(len(x))

def my_sum(x, y):
    n = int(max(y))
    a = np.zeros(n)                # define a before assigning into it
    for i in range(n):             # R's 1:max(y) becomes 0..max(y)-1
        a[i] = x[y == i].sum()     # R: sum(x[which(y == (i-1))])

    a[0] -= 7000                   # R's a[1]; Python indexes from 0
    a[1] += 900                    # R's a[2]
    return np.cumsum(a)

s = my_sum(x, y)
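
The ValueError in the question most likely comes from indexing a pandas Series with the tuple that np.where returns when it is called with a single argument. The following is a minimal sketch, assuming x and y are pandas Series as in the question; the sample values are made up for illustration and the exact exception text may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the two table_1 columns
x = pd.Series([820.0, 306.0, 139.0, 105.0])
y = pd.Series([0, 0, 1, 2])

# With a single argument, np.where returns a tuple of index arrays
idx = np.where(y == 0)
is_tuple = isinstance(idx, tuple)

# x[idx] would pass that tuple to Series.__getitem__, which is what fails.
# Option 1: index with the boolean mask directly
total_mask = x[y == 0].sum()

# Option 2: unpack the tuple and index by position
total_iloc = x.iloc[idx[0]].sum()
```

Both options give the same group sum; the boolean mask is shorter and works the same way for NumPy arrays.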

Answer 2

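import numpy as np

def my_sum(x, y):
    # one group sum per value 0..max(y)-1, matching R's 1:max(y) loop
    a = [x[y == i].sum() for i in range(int(max(y)))]

    a[0] -= 7000   # R's a[1]; Python indexes from 0
    a[1] += 900    # R's a[2]
    return np.cumsum(a)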
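
The per-value sums can also be computed without a Python-level loop. This is a sketch only, assuming y contains non-negative integers and max(y) is at least 2; my_sum_bincount is a hypothetical name, and the sample data is made up.

```python
import numpy as np

def my_sum_bincount(x, y):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=int)
    n = int(y.max())
    # bincount with weights sums x for each integer value 0..max(y);
    # keep only the first max(y) slots, like R's 1:max(y) loop
    a = np.bincount(y, weights=x)[:n]
    a[0] -= 7000
    a[1] += 900
    return np.cumsum(a)

# Made-up sample: no row has y == 2, so that slot sums to 0
result = my_sum_bincount([820.0, 306.0, 139.0, 105.0, 140.0], [0, 0, 1, 3, 3])
```

np.asarray also accepts pandas Series, so the columns from the question could be passed in directly under the same assumptions.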

Answer 3

I am not quite sure what you are trying to accomplish, although it seems you are doing a grouped sum.

In R you could do:

my_sum1 <- function(x, y){
  a <- unname(tapply(x, y, sum))
  a[1:2] <- a[1:2] + c(-7000, 900)
  cumsum(a)
}

In Python you could do:

import numpy as np
def my_sum1(x,y):
    a = np.array([(x[y == i]).sum() for i in np.unique(y)])
    a[0:2] = a[0:2] + np.r_[-7000, 900]
    return a.cumsum()
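
One caveat: a grouped sum over the distinct values of y is not always identical to the loop in the question. The loop produces one slot for every integer from 0 to max(y)-1 (empty groups sum to 0, and the y == max(y) group is left out), while grouping produces one slot per value that actually occurs. The sketch below compares the two on made-up data; loop_port and grouped are hypothetical names.

```python
import numpy as np

def loop_port(x, y):
    # Direct port of the R loop: one slot per integer 0..max(y)-1
    n = int(np.max(y))
    a = np.array([x[y == i].sum() for i in range(n)], dtype=float)
    a[:2] += [-7000, 900]
    return a.cumsum()

def grouped(x, y):
    # Grouped-sum approach: one slot per distinct value of y
    a = np.array([x[y == i].sum() for i in np.unique(y)], dtype=float)
    a[:2] += [-7000, 900]
    return a.cumsum()

x = np.array([820.0, 306.0, 139.0, 105.0, 140.0])  # made-up sample
y = np.array([0, 0, 1, 3, 3])                      # note: no 2

res_loop = loop_port(x, y)      # third slot is the empty y == 2 group
res_grouped = grouped(x, y)     # third slot is the y == 3 group
```

Which behavior is wanted depends on the data, so it may be worth checking whether y has gaps before choosing one.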


Answer 1: 0, accepted, 2020-08-14 13:17:25
Answer 2: 0, 2020-08-14 13:21:26
Answer 3: 0, 2020-08-14 13:49:05
