Set Column Value Based on Calculate Condition from Each Row
I have an empty dataframe, created as follows:
columns_name = list(str(i) for i in range(10))
dfa = pd.DataFrame(columns=columns_name, index=['A', 'B', 'C', 'D'])
dfa['Count'] = [10, 6, 9, 4]
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Count |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 10 |
| B | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 6 |
| C | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 9 |
| D | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 4 |
I want to replace NaN values with a symbol, where the number of replaced cells in each row is the difference max(Count) - Count for the current row. So, the final result will look like this:
|   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Count |
|---|---|---|---|---|---|---|---|---|---|---|---|
| A | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 10 |
| B | NaN | NaN | NaN | NaN | NaN | NaN | - | - | - | - | 6 |
| C | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | - | 9 |
| D | NaN | NaN | NaN | NaN | - | - | - | - | - | - | 4 |
I am stuck at
dfa.at[dfa.index, [str(col) for col in list(range(dfa['Count'].max() - dfa['Count']))]] = '-'
and getting KeyError: 'Count'
Actually, this part of your code has an issue:
dfa.at[dfa.index, [str(col) for col in list(range(dfa['Count'].max() - dfa['Count']))]] = '-'
Try creating, on its own, the list that you are using inside the comprehension:
list(range(dfa['Count'].max() - dfa['Count']))
It will throw a TypeError.
If you look closely, you'll see that (dfa['Count'].max() - dfa['Count']) gives the following Series:
A 0
B 4
C 1
D 6
Since you are passing a Series to Python's range function, which expects a single integer, it throws the error.
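The failure described above can be reproduced in isolation. The following is a minimal sketch, not part of the original answer; the names `counts`, `diff`, `error_name` and `per_row` are introduced here only for illustration. It shows that `range` rejects a Series, and that it works once it receives one plain integer per row.

```python
import pandas as pd

# Same Count values as in the question (illustrative stand-in for dfa['Count'])
counts = pd.Series([10, 6, 9, 4], index=["A", "B", "C", "D"], name="Count")

# This is a Series with one value per row, not a single integer
diff = counts.max() - counts

# range() needs a single integer, so passing the whole Series fails
try:
    list(range(diff))
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

# Calling range() once per row, with a plain int, works
per_row = {label: list(range(int(n))) for label, n in diff.items()}
print(error_name)
print(per_row)
```

Note that `per_row` only illustrates how many cells each row needs; the solution that follows uses `range(Count, max)` instead so that the symbols land in the last columns of each row.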
One possible solution might be:
for index, cols in zip(dfa.index, [list(map(str, col)) for col in dfa.apply(lambda x: list(range(x['Count'], dfa['Count'].max())), axis=1).values]):
    dfa.loc[index, cols] = '-'
Output:
Out[315]:
     0    1    2    3    4    5    6    7    8    9  Count
A  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN     10
B  NaN  NaN  NaN  NaN  NaN  NaN    -    -    -    -      6
C  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN    -      9
D  NaN  NaN  NaN  NaN    -    -    -    -    -    -      4
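The same per-row idea can be written as a plain loop, which may be easier to read than the one-liner. This is only a sketch under the question's setup, not the answerer's code; `max_count`, `label` and `cols` are names chosen here for illustration, and the `if cols` guard is an extra precaution for rows that need no replacement.

```python
import pandas as pd

# Rebuild the dataframe from the question
columns_name = [str(i) for i in range(10)]
dfa = pd.DataFrame(columns=columns_name, index=["A", "B", "C", "D"])
dfa["Count"] = [10, 6, 9, 4]

max_count = int(dfa["Count"].max())

# For each row, fill the columns named Count .. max_count - 1 with '-'
for label, count in dfa["Count"].items():
    cols = [str(i) for i in range(int(count), max_count)]
    if cols:  # row A needs no replacement, so skip the empty selection
        dfa.loc[label, cols] = "-"

print(dfa)
```

The loop runs once per row, so for a small frame like this one the difference from a vectorized approach is unlikely to matter.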
Broadcasting is also an option:
import pandas as pd
import numpy as np
columns_name = list(str(i) for i in range(10))
dfa = pd.DataFrame(columns=columns_name, index=['A', 'B', 'C', 'D'])
dfa['Count'] = [10, 6, 9, 4]
# Broadcast based on column index (Excluding Count)
m = (
    dfa['Count'].to_numpy()[:, None] == np.arange(0, dfa.shape[1] - 1)
).cumsum(axis=1).astype(bool)
# Grab Columns To Update
non_count_columns = dfa.columns[dfa.columns != 'Count']
# Update based on mask
dfa[non_count_columns] = dfa[non_count_columns].mask(m, '-')
print(dfa)
Output:
     0    1    2    3    4    5    6    7    8    9  Count
A  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN     10
B  NaN  NaN  NaN  NaN  NaN  NaN    -    -    -    -      6
C  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN    -      9
D  NaN  NaN  NaN  NaN    -    -    -    -    -    -      4
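To see what the mask in the broadcasting answer contains, it may help to build it on bare NumPy arrays. The sketch below is illustrative only; `counts`, `positions`, `m_cumsum` and `m_compare` are names introduced here. It also shows a direct `>=` comparison, which is an alternative not taken from the original answer and which yields the same mask as long as the counts are non-negative.

```python
import numpy as np

counts = np.array([10, 6, 9, 4])   # the Count column
positions = np.arange(10)          # column positions 0..9

# Approach from the answer: mark the single cell where position == count,
# then carry that mark to the right with a cumulative sum.
m_cumsum = (counts[:, None] == positions).cumsum(axis=1).astype(bool)

# Alternative (assumption: counts are non-negative): compare directly.
m_compare = positions >= counts[:, None]

same = np.array_equal(m_cumsum, m_compare)
print(same)
print(m_compare.sum(axis=1))  # number of cells to replace in each row
```

In both versions `counts[:, None]` has shape (4, 1) and `positions` has shape (10,), so the comparison broadcasts to a (4, 10) boolean array with one row per dataframe row.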