(Python, DataFrame): Add a column and insert the nth smallest value in the row
How do I find the nth smallest number in each row of a DataFrame and add that value as an entry in a new column? (I would ultimately like to export the data.)
Example Data
Setup
import numpy as np
import pandas as pd
np.random.seed([3,14159])
df = pd.DataFrame(np.random.randint(10, size=(4, 5)), columns=list('ABCDE'))
   A  B  C  D  E
0  4  8  1  1  9
1  2  8  1  4  2
2  8  2  8  4  9
3  4  3  4  1  5
In all of the following solutions, I assume n = 3.
Solution 1 (function prt below)
Use np.partition to place the n smallest values of each row to the left of the partition point and the larger values to the right. Then take everything to the left of that point and find its max, which is the nth smallest value.
df.assign(nth=np.partition(df.values, 3, axis=1)[:, :3].max(1))
   A  B  C  D  E  nth
0  4  8  1  1  9    4
1  2  8  1  4  2    2
2  8  2  8  4  9    8
3  4  3  4  1  5    4
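As a minimal sketch of the same idea, the partitioned array can also be indexed directly: with kth = n - 1, np.partition puts the nth smallest value of each row at position n - 1, so no max is needed. The helper name nth_smallest is hypothetical and not part of the answer above, and the data is the example table written out literally. Unlike kth = n, this variant also accepts n equal to the number of columns.

```python
import numpy as np
import pandas as pd

# Same data as the example table, written out literally.
df = pd.DataFrame(
    [[4, 8, 1, 1, 9],
     [2, 8, 1, 4, 2],
     [8, 2, 8, 4, 9],
     [4, 3, 4, 1, 5]],
    columns=list('ABCDE'),
)

def nth_smallest(frame, n):
    """Return the nth smallest value (1-based) of each row as a 1-D array."""
    values = frame.to_numpy()
    # After partitioning with kth = n - 1, column n - 1 holds the value that
    # would sit there in a fully sorted row.
    return np.partition(values, n - 1, axis=1)[:, n - 1]

result = df.assign(nth=nth_smallest(df, 3))
print(result)
```

Passing n = 5 here returns the row maximum, which the kth = n form cannot do on a five-column frame because index 5 is out of bounds.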
Solution 2 (function srt below)
Using np.sort is more intuitive, but a full sort has a higher time complexity than a partition.
df.assign(nth=np.sort(df.values, axis=1)[:, 2])
   A  B  C  D  E  nth
0  4  8  1  1  9    4
1  2  8  1  4  2    2
2  8  2  8  4  9    8
3  4  3  4  1  5    4
Solution 3 (function rnk below)
Using pd.DataFrame.rank. This is a concise version, but it upcasts the result to float.
df.assign(nth=df.where(df.rank(1, method='first').eq(3)).stack().values)
   A  B  C  D  E  nth
0  4  8  1  1  9  4.0
1  2  8  1  4  2  2.0
2  8  2  8  4  9  8.0
3  4  3  4  1  5  4.0
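The method='first' argument matters because the rows contain ties. With the default method='average', tied values share a fractional rank, so a row may have no cell whose rank is exactly 3. A small sketch on the second row of the example data (the variable names are illustrative only):

```python
import pandas as pd

# Second row of the example data; the value 2 appears twice.
row = pd.DataFrame([[2, 8, 1, 4, 2]], columns=list('ABCDE'))

# method='first' breaks ties by order of appearance, so the ranks are 1..5.
first = row.rank(axis=1, method='first')

# The default method='average' gives both 2s the shared rank 2.5.
average = row.rank(axis=1)

print(first)
print(average)
```

With method='first', column E gets rank 3 and the lookup finds the value 2. With the default method, no cell in this row has rank 3.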
Solution 4 (function whr below)
Using np.where and pd.DataFrame.rank
i, j = np.where(df.rank(1, method='first') == 3)
df.assign(nth=df.values[i, j])
   A  B  C  D  E  nth
0  4  8  1  1  9    4
1  2  8  1  4  2    2
2  8  2  8  4  9    8
3  4  3  4  1  5    4
Timing
Notice that srt is the quickest for small numbers of columns, where it stays comparable to prt; for larger numbers of columns, the more efficient algorithm of prt kicks in.
from timeit import timeit

prt = lambda df, n: df.assign(nth=np.partition(df.values, n, axis=1)[:, :n].max(1))
srt = lambda df, n: df.assign(nth=np.sort(df.values, axis=1)[:, n - 1])
rnk = lambda df, n: df.assign(nth=df.where(df.rank(1, method='first').eq(n)).stack().values)

def whr(df, n):
    i, j = np.where(df.rank(1, method='first').values == n)
    return df.assign(nth=df.values[i, j])

# One row of timings per number of columns, one column per function
res = pd.DataFrame(
    index=[10, 30, 100, 300, 1000, 3000, 10000],
    columns='prt srt rnk whr'.split(),
    dtype=float
)

for i in res.index:
    num_rows = int(np.log(i))
    d = pd.DataFrame(np.random.rand(num_rows, i))
    for j in res.columns:
        stmt = '{}(d, 3)'.format(j)
        setp = 'from __main__ import d, {}'.format(j)
        res.at[i, j] = timeit(stmt, setp, number=100)

res.plot(loglog=True)
Here is a method that finds the nth smallest item in a list:
def find_nth_in_list(values, n):
    return sorted(values)[n - 1]
Usage:
values = [10, 5, 7, 9, 8, 4, 6, 2, 1, 3]
print(find_nth_in_list(values, 2))
Output:
2
You can pass the items of a row to this function as a list.
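For long lists and a small n, the standard-library heapq.nsmallest is one possible alternative to sorting the whole list. This is a swapped-in technique, not the method from the answer above, and find_nth_with_heap is a hypothetical name used only for this sketch.

```python
import heapq

def find_nth_with_heap(values, n):
    """Return the nth smallest item (1-based) without sorting the whole list."""
    # nsmallest returns the n smallest items in ascending order,
    # so the last of them is the nth smallest.
    return heapq.nsmallest(n, values)[-1]

values = [10, 5, 7, 9, 8, 4, 6, 2, 1, 3]
print(find_nth_with_heap(values, 2))
```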
EDIT
You can collect the rows with this function:
# Returns all rows as a list of lists
def find_rows(df):
    rows = []
    for index, data in df.iterrows():
        rows.append(data.tolist())
    return rows
Example usage:
rows = find_rows(df)  # all rows as a list
third_smallest = find_nth_in_list(rows[2], 3)  # 3rd row, 3rd smallest item
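To get from these two helpers to the new column the question asks for, one option is to apply the list function to every row and assign the results. The sketch below repeats both helpers so that it runs on its own. The literal data and the names nth_values and df_with_nth are assumptions made for illustration.

```python
import pandas as pd

def find_nth_in_list(values, n):
    # nth smallest item, 1-based
    return sorted(values)[n - 1]

def find_rows(df):
    # One plain Python list per DataFrame row.
    return [data.tolist() for _, data in df.iterrows()]

df = pd.DataFrame(
    [[4, 8, 1, 1, 9],
     [2, 8, 1, 4, 2],
     [8, 2, 8, 4, 9],
     [4, 3, 4, 1, 5]],
    columns=list('ABCDE'),
)

# Third smallest value of every row, stored as a new column.
nth_values = [find_nth_in_list(row, 3) for row in find_rows(df)]
df_with_nth = df.assign(nth=nth_values)
print(df_with_nth)
```

This loops in Python, so it is likely to be slower than the vectorized NumPy approaches on large frames.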
You can do this as follows, where nth is the zero-based position of the value to extract (nth = 1 selects the second smallest value):
df.assign(nth=df.apply(lambda x: np.partition(x, nth)[nth], axis='columns'))
Example:
In [72]: df = pd.DataFrame(np.random.rand(3, 3), index=list('abc'), columns=[1, 2, 3])
In [73]: df
Out[73]:
          1         2         3
a  0.436730  0.653242  0.843014
b  0.643496  0.854859  0.531652
c  0.831672  0.575336  0.517944
In [74]: df.assign(nth=df.apply(lambda x: np.partition(x, 1)[1], axis='columns'))
Out[74]:
          1         2         3       nth
a  0.436730  0.653242  0.843014  0.653242
b  0.643496  0.854859  0.531652  0.643496
c  0.831672  0.575336  0.517944  0.575336
Generate some random data:
dd = pd.DataFrame(data=np.random.rand(7, 3))
Find the minimum value per row using numpy:
dd['minPerRow'] = dd.apply(np.min, axis=1)
Export the results:
dd['minPerRow'].to_csv('file.csv')
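The snippet above exports only the row minimum. A minimal sketch of the same export step for the nth smallest value, assuming n = 3 and a two-row excerpt of the example data, could look like this. It returns the CSV text rather than writing a file so that it has no side effects.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[4, 8, 1, 1, 9], [2, 8, 1, 4, 2]], columns=list('ABCDE'))

n = 3  # 1-based position of the value to extract (assumed for this sketch)
out = df.assign(nth=np.sort(df.to_numpy(), axis=1)[:, n - 1])

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = out.to_csv()
print(csv_text)
```

Passing a file name, as in to_csv('file.csv'), writes the same content to disk.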