
Algorithm to scale large numbers by large factor and small numbers by small factor

Original post: 2018-01-24 21:22:26 · Tags: algorithm, math, transform, scale, venn-diagram

I'm looking for an algorithm that scales a large outlier down by a large factor and scales small numbers only a little (or even leaves them unchanged). We don't have to keep exact proportions, just the ordering: a large number should still be larger than a small one.

E.g. I have the set 10, 15, 200. Let's define min and max to be 0 and 100 respectively; the scaled values should fall within that range (min and max are not predefined and can be adjusted). With the algorithm we could scale them to 5, 6, 20.

Any ideas for a formula to scale such numbers?

My use case is data for Venn diagrams of 3 overlapping sets. I would like to preserve the fact that a large set is larger than a smaller set, but the large circle shouldn't be 20 times bigger than the small one.

1 Answer

You haven't given enough detail for a specific suggestion, but the general idea is that you want some significant magnitude reduction. In general, we handle this with one of the following:

square root (or another fractional root)
log (the base doesn't really matter; rescale as needed)
arctan (bounded: for non-negative input, (2/π)·arctan(x) stays in the range 0-1)
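
To make the three options concrete, here is a minimal Python sketch that applies each one to the set from the question. The function names, the use of log1p (so that 0 maps to 0), and the `midpoint` parameter of the arctan variant are illustrative assumptions, not part of the original answer.

```python
import math

def scale_sqrt(x):
    # Square root: compresses large values, keeps ordering.
    return math.sqrt(x)

def scale_log(x):
    # log1p(x) = log(1 + x), so an input of 0 maps to 0 (an assumption;
    # plain log works too if all inputs are >= 1).
    return math.log1p(x)

def scale_atan(x, midpoint=50.0):
    # Hypothetical tuning knob: an input equal to `midpoint` maps to 0.5.
    # The 2/pi factor normalises atan's [0, pi/2) output to [0, 1).
    return (2.0 / math.pi) * math.atan(x / midpoint)

values = [10, 15, 200]
for name, fn in [("sqrt", scale_sqrt), ("log", scale_log), ("atan", scale_atan)]:
    print(name, [round(fn(v), 3) for v in values])
```

All three are monotonic, so the large set stays larger than the small ones; they differ only in how hard they squash the outlier.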

Play with some of your unusual cases to see which you like. The example you posted is closest to the sqrt idea.
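
As a rough check of that claim, the sketch below takes square roots and then rescales linearly so the largest value lands on a chosen maximum. The helper name `sqrt_scale_to_max` and the target of 20 are assumptions picked to match the question's example.

```python
import math

def sqrt_scale_to_max(values, target_max):
    """Square-root each value, then rescale so the largest maps to target_max."""
    roots = [math.sqrt(v) for v in values]
    factor = target_max / max(roots)
    return [r * factor for r in roots]

scaled = sqrt_scale_to_max([10, 15, 200], target_max=20)
print([round(s, 2) for s in scaled])  # [4.47, 5.48, 20.0]
```

That is close to the 5, 6, 20 given in the question.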

UPDATE AFTER COMMENTS

If this is used to choose the radii of circles in a Venn diagram, then sqrt is, indeed, the natural choice to preserve the cognitive interpretation of size (from area). This goes for any 2D scaling, although note that doing this for a picture with shading (implied 3D) suggests that cube root would be the proper scale (ref: How to Lie with Statistics).
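
The reason sqrt fits here: a circle's area is π·r², so making area proportional to set size means r is proportional to the square root of the size. A small sketch, where `radius_for_area` and its `area_per_item` parameter are hypothetical names:

```python
import math

def radius_for_area(size, area_per_item=1.0):
    # Solve pi * r**2 = size * area_per_item for r.
    return math.sqrt(size * area_per_item / math.pi)

sizes = [10, 15, 200]
radii = [radius_for_area(s) for s in sizes]

# The largest set has 20x the area of the smallest,
# but only sqrt(20) (about 4.47x) the radius.
radius_ratio = radii[2] / radii[0]
print(round(radius_ratio, 2))  # 4.47
```

For shapes drawn with implied depth, the analogous choice would be `size ** (1 / 3)`.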

This is sometimes not practical when the inputs are of very different magnitudes. For instance, given (1, 1000, 1000000), you might want to use a higher root, or switch to log, just to keep the smallest shape tractable.
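
To see how much each option helps with such inputs, this sketch compares the largest-to-smallest ratio after each transform. The `spread` helper, the choice of a fourth root, and the `1 +` offset on the log (so an input of 1 does not map to 0) are assumptions made for illustration.

```python
import math

values = [1, 1_000, 1_000_000]

def spread(scaled):
    # Ratio of the largest scaled value to the smallest.
    return max(scaled) / min(scaled)

sqrt_scaled = [math.sqrt(v) for v in values]        # spread about 1000
fourth_root = [v ** 0.25 for v in values]           # spread about 31.6
log_scaled = [1 + math.log10(v) for v in values]    # spread about 7

for name, scaled in [("sqrt", sqrt_scaled),
                     ("4th root", fourth_root),
                     ("1 + log10", log_scaled)]:
    print(f"{name}: spread = {spread(scaled):.1f}")
```

With sqrt alone the smallest circle would still be 1000 times narrower than the largest; the higher root or the log brings the spread down to something that can be drawn.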