
Python: Calculate area under the curve

I have a pandas DataFrame with two columns, 'x' and 'y':

import pandas as pd
import matplotlib.pyplot as plt

tmp = pd.DataFrame()
tmp['x'] = [1, 2, 5, 9, 12, 14]
tmp['y'] = [0, 1, -2, 2, -1, 1]
tmp.plot(x='x', y='y')

ax = plt.gca()
ax.set_aspect('equal')
ax.grid(True, which='both')

ax.axhline(y=0, color='k')
ax.axvline(x=0, color='k')
plt.show()

Here is the plot:

[image: line plot of the data above]

I want to get the areas of the triangles above y=0 and below y=0 separately.

Is there an easy way to do this?

I've tried to use integration, but it seems I'm doing something wrong:

from scipy import integrate

pos = tmp[tmp['y'] >= 0]
neg = tmp[tmp['y'] <= 0]
print(integrate.trapezoid(pos['y'], pos['x']))  # gives 18.5 instead of the desired 5.5
print(integrate.trapezoid(neg['y'], neg['x']))  # gives -14.5 instead of the desired 5
# this seems to calculate correctly, but the positive and negative areas cancel out, so the total is small
print(integrate.trapezoid(tmp['y'], tmp['x']))  # gives 0.5 instead of the desired 10.5

This is a smaller, simpler example to show what I want to do; my actual data is quite large. I think I can get the desired results by adding the corresponding (x, y) points where y = 0, but I'm wondering whether there are existing functions that do a similar job.

I assume that by area you mean the area between the x axis and the curve, since an open curve has no area of its own. I am also assuming that x is always increasing.
If you check the pos and neg dataframes, they describe completely different shapes from what you wanted, because filtering by sign joins points that are not adjacent in the original data. To calculate the area, you need to compute the areas of the figures above and below the x axis separately, i.e. identify all the x-axis intersections and integrate between consecutive intersections.
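To see why the filtered frames describe a different shape, here is a small check using the question's data. It only lists which x values survive the `y >= 0` filter and how far apart they are; the variable name `gaps` is introduced here for illustration.

```python
import pandas as pd

tmp = pd.DataFrame({"x": [1, 2, 5, 9, 12, 14], "y": [0, 1, -2, 2, -1, 1]})

# Keep only the non-negative points, as in the question.
pos = tmp[tmp["y"] >= 0]
print(pos["x"].tolist())  # [1, 2, 9, 14]

# Distance between consecutive surviving x values.
gaps = pos["x"].diff().dropna().tolist()
print(gaps)  # [1.0, 7.0, 5.0]
```

The trapezoid rule then joins (2, 1) directly to (9, 2) and (9, 2) directly to (14, 1), so the negative parts in between are replaced by large positive trapezoids, which is where 18.5 comes from.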

I have written some general code; you would still have to handle the edge cases where the signal does not start or end at an intercept.

import pandas as pd
from scipy import integrate

tmp['change'] = tmp['y'] * tmp['y'].shift(1) < 0  # True where the sign of y differs from the previous point
tmp['slope'] = (tmp['y'] - tmp['y'].shift(1)) / (tmp['x'] - tmp['x'].shift(1))  # slope of the segment ending at each point
tmp['intersection'] = -tmp['y'] / tmp['slope'] + tmp['x']  # x where that segment crosses y=0
intersections = tmp.loc[tmp['change'], ['intersection']]  # keep only the segments where the sign of y changed

intersections = intersections.rename(columns={'intersection': 'x'})
intersections['y'] = 0

tmp = tmp[['x', 'y']]
tmp = pd.concat([tmp, intersections])  # DataFrame.append was removed in pandas 2.0
tmp = tmp.sort_values(by='x')
tmp = tmp.reset_index(drop=True)

crossing = tmp[tmp['y'] == 0].index  # rows where y is zero; areas are taken between consecutive ones
area = 0
for i in range(len(crossing) - 1):
    segment = tmp[crossing[i]:crossing[i + 1] + 1]
    area_tmp = integrate.trapezoid(segment['y'], segment['x'])  # same function as in the question; trapz is gone from newer SciPy releases
    area += abs(area_tmp)
    # print(area_tmp)
print(area)

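This gives an answer of 10 rather than 10.5: the last triangle (from x=13 to x=14, area 0.5) is missed because the data does not end on a crossing, so you still need to add that edge case.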
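One way to avoid the start and end edge cases altogether is to walk the segments one at a time and split only those that change sign. The following is a minimal sketch of that idea, not a library function: `split_areas` is a hypothetical helper name, and it assumes x is strictly increasing and the curve is piecewise linear between the given points.

```python
def split_areas(x, y):
    """Return (area above y=0, area below y=0) for a piecewise-linear curve.

    Hypothetical helper; assumes x is strictly increasing.
    """
    x = [float(v) for v in x]
    y = [float(v) for v in y]
    above = 0.0
    below = 0.0
    for x0, y0, x1, y1 in zip(x[:-1], y[:-1], x[1:], y[1:]):
        if y0 * y1 < 0:
            # Sign change: split the segment at its zero crossing.
            xc = x0 - y0 * (x1 - x0) / (y1 - y0)
            parts = [(x0, y0, xc, 0.0), (xc, 0.0, x1, y1)]
        else:
            parts = [(x0, y0, x1, y1)]
        for a, ya, b, yb in parts:
            piece = 0.5 * (ya + yb) * (b - a)  # trapezoid rule on one piece
            if piece > 0:
                above += piece
            else:
                below -= piece  # store the area below the axis as a positive number
    return above, below


print(split_areas([1, 2, 5, 9, 12, 14], [0, 1, -2, 2, -1, 1]))  # (5.5, 5.0)
```

With a DataFrame this could be called as `split_areas(tmp['x'], tmp['y'])`. For the question's data it returns 5.5 above and 5.0 below, 10.5 in total, including the final triangle. A plain Python loop may be slow on very large data, so a vectorised version might be preferable there.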

PS: unable to comment on the question
