Python loop with if condition on pandas dataframe gives me incomplete result or KeyError
Given a dataframe:
import pandas as pd

d = {'A': [2, 1, 4, 5, 7, 8, 7, 5], 'B': [5, 7, 7, 6, 10, 9, 12, 10]}
testdf = pd.DataFrame(data=d)
   A   B
0  2   5
1  1   7
2  4   7
3  5   6
4  7  10
5  8   9
6  7  12
7  5  10
I'm comparing each row with the previous one. I expect to append 'INSIDE' to an array if A[i] > A[i-1] and B[i] < B[i-1], and to append 'BROKEN' otherwise.
array = []
for i in range(1, len(testdf)):
    if testdf.A[i] > testdf.A[i-1]:
        if testdf.B[i] < testdf.B[i-1]:
            array.append('INSIDE')
        else:
            array.append('BROKEN')
The result is:
['BROKEN', 'INSIDE', 'BROKEN', 'INSIDE']
But I expect:
['BROKEN', 'BROKEN', 'INSIDE', 'BROKEN', 'INSIDE', 'BROKEN', 'BROKEN']
I tried different variations of the starting point of the loop, for example
for i in range(len(testdf)-1):
but that only raises a KeyError.
How can I improve the code so that it runs as expected?
To get the expected output, add an else branch to the outer if as well:
array = []
for i in range(1, len(testdf)):
    if testdf.A[i] > testdf.A[i-1]:
        if testdf.B[i] < testdf.B[i-1]:
            array.append('INSIDE')
        else:
            array.append('BROKEN')
    else:
        array.append('BROKEN')
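As for the KeyError in the question: with range(len(testdf)-1) the loop starts at i = 0, so testdf.A[i-1] becomes testdf.A[-1]. On a Series with the default integer index, square brackets look up labels rather than positions, and there is no label -1. A small check, assuming the same testdf as in the question (the variable raised is only there for illustration):

```python
import pandas as pd

d = {'A': [2, 1, 4, 5, 7, 8, 7, 5], 'B': [5, 7, 7, 6, 10, 9, 12, 10]}
testdf = pd.DataFrame(data=d)

# The default index has the labels 0..7, so the label -1 does not exist.
try:
    testdf.A[-1]
    raised = False
except KeyError:
    raised = True

print(raised)
```

Starting the loop at 1, as in the code above, keeps i-1 inside the existing labels.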
Non-loop solution: this also tests the first value, so the result has the same length as the original. If you need the same output as the loop, remove the first value by indexing with [1:]:
import numpy as np

mask = testdf['A'].gt(testdf['A'].shift()) & testdf['B'].lt(testdf['B'].shift())
out = np.where(mask, 'INSIDE', 'BROKEN').tolist()
print(out)
['BROKEN', 'BROKEN', 'BROKEN', 'INSIDE', 'BROKEN', 'INSIDE', 'BROKEN', 'BROKEN']
out1 = np.where(mask, 'INSIDE', 'BROKEN')[1:].tolist()
print(out1)
['BROKEN', 'BROKEN', 'INSIDE', 'BROKEN', 'INSIDE', 'BROKEN', 'BROKEN']
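The first entry of the full-length result is always 'BROKEN' because shift() leaves a NaN in the first position, and any comparison with NaN evaluates to False. A quick way to see this, assuming the testdf from the question (shifted and first_cmp are illustrative names):

```python
import pandas as pd

d = {'A': [2, 1, 4, 5, 7, 8, 7, 5], 'B': [5, 7, 7, 6, 10, 9, 12, 10]}
testdf = pd.DataFrame(data=d)

# shift() moves every value down by one row; the first row has no predecessor.
shifted = testdf['A'].shift()
print(shifted.tolist())

# Comparing the first value of A with NaN gives False, so the mask starts with False.
first_cmp = testdf['A'].gt(shifted).iloc[0]
print(first_cmp)
```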
Here you go:
import numpy as np
import pandas as pd
d = {'A': [2, 1, 4, 5, 7, 8, 7, 5], 'B': [5, 7, 7, 6, 10, 9, 12, 10]}
testdf = pd.DataFrame(data=d)
mask1 = testdf.A > testdf.A.shift()
mask2 = testdf.B < testdf.B.shift()
res = np.where(mask1 & mask2, 'INSIDE', 'BROKEN')[1:]
print(res)
Output:
['BROKEN' 'BROKEN' 'INSIDE' 'BROKEN' 'INSIDE' 'BROKEN' 'BROKEN']
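The output is printed without commas because res is a NumPy array rather than a Python list. If a plain list is needed, as in the question, tolist() converts it. A short sketch under the same setup as above:

```python
import numpy as np
import pandas as pd

d = {'A': [2, 1, 4, 5, 7, 8, 7, 5], 'B': [5, 7, 7, 6, 10, 9, 12, 10]}
testdf = pd.DataFrame(data=d)

mask1 = testdf.A > testdf.A.shift()
mask2 = testdf.B < testdf.B.shift()
res = np.where(mask1 & mask2, 'INSIDE', 'BROKEN')[1:]

# np.where returns an ndarray; tolist() turns it into a regular list of strings.
print(type(res).__name__)
res_list = res.tolist()
print(res_list)
```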
You can also copy both columns into plain Python lists and compare each element with the previous one by position. 'INSIDE' is appended only where A is greater than the previous A and B is less than the previous B.
import pandas as pd

d = {'A': [2, 1, 4, 5, 7, 8, 7, 5], 'B': [5, 7, 7, 6, 10, 9, 12, 10]}
testdf = pd.DataFrame(data=d)
dataframearray = [[], []]
array = []
# Copy each column into a plain Python list
for number in d['A']:
    dataframearray[0].append(number)
for number in d['B']:
    dataframearray[1].append(number)
# Compare every row with the previous one, starting at the second row
x = 1
while x < len(dataframearray[0]):
    if dataframearray[0][x] > dataframearray[0][x-1] and dataframearray[1][x] < dataframearray[1][x-1]:
        array.append('INSIDE')
    else:
        array.append('BROKEN')
    x += 1
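The same position-based comparison can be written without an explicit counter by pairing each value with its predecessor via zip. This is only a sketch using the condition stated in the question (A rises and B falls); the names a, b, prev_a, cur_a, prev_b and cur_b are illustrative:

```python
d = {'A': [2, 1, 4, 5, 7, 8, 7, 5], 'B': [5, 7, 7, 6, 10, 9, 12, 10]}
a, b = d['A'], d['B']

# zip(a, a[1:]) yields (previous, current) pairs, so no index arithmetic is needed.
array = [
    'INSIDE' if cur_a > prev_a and cur_b < prev_b else 'BROKEN'
    for prev_a, cur_a, prev_b, cur_b in zip(a, a[1:], b, b[1:])
]
print(array)
```

Because the pairs are built from the lists directly, the loop cannot run past either end of the data.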
Hope this helps.