Pandas get column header name for column which contains bracket around value of another column
Consider this sample df:
import pandas as pd
sample_df = pd.DataFrame({'our_value':[14,24], 'opt1':['10 - 20','10 - 20'],'opt2':['21 - 30','21 - 30'],'opt3':['31 - 40','31 - 40'],'opt4':['41 - 50','41 - 50']})
   our_value     opt1     opt2     opt3     opt4
0         14  10 - 20  21 - 30  31 - 40  41 - 50
1         24  10 - 20  21 - 30  31 - 40  41 - 50
I am attempting to create a column that contains the name of the column header whose range includes the value in the 'our_value' column, that is, where 'our_value' lies between the two values in the respective column. So, the intended outcome is:
   our_value     opt1     opt2     opt3     opt4 our_opt
0         14  10 - 20  21 - 30  31 - 40  41 - 50    opt1
1         24  10 - 20  21 - 30  31 - 40  41 - 50    opt2
I tried various approaches, including a dictionary whose keys are the column headers and whose values are two-item lists holding each column's lower and upper bound, like this:
test_dict = {'opt1':[10, 20], 'opt2':[21,30]....}
Using map to apply that did not work.
These approaches, likewise, did not work:
for k,v in test_dict.items():
    df[k] = df['our_value'].map(lambda x: v[0] if x < v[1] else v[1])
    if df['our_value'].between(v[0], v[1]).all():
        df['our_opt'] = k
I am sure I am missing something fundamental at this point. But, like writing your own novel or article, I am unable to proofread this code and find what is missing. Thanks for taking a look.
Let us get the low and high bound of each range, then we can do:
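# Keep only the range columns (opt1 ... opt4)
s = sample_df.filter(like='opt')
# Split each 'low - high' string and convert both ends to integers
low = s.apply(lambda x : x.str.split(' - ').str[0].astype(int))
high = s.apply(lambda x : x.str.split(' - ').str[1].astype(int))
# Boolean mask of matching ranges; dot with the column names picks the matching header
sample_df['out'] = (low.le(sample_df['our_value'],axis=0) & high.ge(sample_df['our_value'],axis=0)).dot(s.columns)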
Out[63]:
0 opt1
1 opt2
dtype: object
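For comparison, here is a minimal sketch of the dictionary-of-bounds idea from the question, reworked so that it is evaluated per row. The `.all()` check in the question only passes when every row falls inside the same range, and the assignment then overwrites the whole column. The `bounds` dictionary and the `mask` name below are illustrative assumptions, not part of the original post, and the sketch assumes the bounds are the same for every row. It uses `idxmax` instead of `dot`, which is a swapped-in alternative rather than the method shown above.

```python
import pandas as pd

sample_df = pd.DataFrame({
    'our_value': [14, 24],
    'opt1': ['10 - 20', '10 - 20'],
    'opt2': ['21 - 30', '21 - 30'],
    'opt3': ['31 - 40', '31 - 40'],
    'opt4': ['41 - 50', '41 - 50'],
})

# Hypothetical bounds dictionary (column name -> [low, high]),
# assumed here to be identical for every row.
bounds = {'opt1': [10, 20], 'opt2': [21, 30], 'opt3': [31, 40], 'opt4': [41, 50]}

# One boolean column per option: True where our_value lies inside
# that option's range (between is inclusive on both ends by default).
mask = pd.DataFrame({
    name: sample_df['our_value'].between(low, high)
    for name, (low, high) in bounds.items()
})

# idxmax returns the first column holding True in each row; rows with
# no match at all are blanked out (NaN) by the where clause.
sample_df['our_opt'] = mask.idxmax(axis=1).where(mask.any(axis=1))

print(sample_df[['our_value', 'our_opt']])
```

If the ranges differ from row to row, the string-splitting approach above is the better fit, since it reads the bounds from each cell rather than from a fixed dictionary.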