Implementing a Number System in Java: Mutable vs. Immutable

I am implementing classes for rational numbers, but the issues are essentially the same for complex numbers and for other classes intended for applications that perform a significant number of calculations on a given mathematical object.

In the libraries distributed with the JRE and in many third-party libraries, number classes are immutable. This has the advantage that "equals" and "hashCode" can be reliably implemented together as intended, which enables instances to be used as both keys and values in various collections. In fact, an instance must not change for as long as it serves as a key in a collection, or operations on the collection become unreliable. This is maintained much more robustly if the class prevents any operation that alters the internal state on which the hashCode method relies once an instance is created. The alternative leaves it to the developer and subsequent maintainers of the code to abide by the convention of removing instances from collections before modifying their state, and then adding the instances back to whichever collections they must belong to.
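The hazard described above can be sketched as follows. MutableRationalKey and LostKeyDemo are hypothetical names introduced only for this illustration; they are not part of my library. The sketch adds a mutable key to a HashSet, changes the state that hashCode depends on, and then asks the set whether it still contains that very same object.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mutable rational, used only to illustrate the hazard.
final class MutableRationalKey {
    long numerator;
    long denominator;

    MutableRationalKey(long numerator, long denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof MutableRationalKey)) {
            return false;
        }
        MutableRationalKey that = (MutableRationalKey) other;
        return numerator == that.numerator && denominator == that.denominator;
    }

    @Override
    public int hashCode() {
        // The hash depends on mutable state, which is the root of the problem.
        return Objects.hash(numerator, denominator);
    }
}

class LostKeyDemo {
    // Returns whether the set still finds the key after it was mutated in place.
    static boolean foundAfterMutation() {
        Set<MutableRationalKey> set = new HashSet<>();
        MutableRationalKey key = new MutableRationalKey(1L, 2L);
        set.add(key);
        key.numerator = 3L; // the hash changes while the key is stored in the set
        return set.contains(key);
    }

    public static void main(String[] args) {
        System.out.println("Found after mutation: " + foundAfterMutation());
    }
}
```

With these particular values the lookup fails even though the object is still physically in the set, because the set recorded the hash computed at insertion time.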

Yet, if the class design enforces immutability (within the limits of the language), mathematical expressions become burdened with excessive object allocation and subsequent garbage collection when performing even simple mathematical operations. Consider the following as an explicit example of what occurs repeatedly in complex computations:

Rational result = new Rational( 13L, 989L ).divide( new Rational( -250L, 768L ) );

The expression includes three allocations, two of which are quickly discarded. To avoid some of the overhead, classes typically preallocate commonly used "constants" and may even maintain a hash table of frequently used "numbers." Of course, such a hash table would likely be less performant than simply allocating all of the necessary immutable objects and relying on the Java compiler and JVM to manage the heap as efficiently as possible.
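A minimal sketch of the preallocated-constants idea, assuming a hypothetical CachedRational class with a valueOf factory (neither name comes from an existing library). The factory normalizes its arguments and returns the shared ZERO or ONE instance when it can, so only less common values are allocated. Overflow of long is ignored here.

```java
// Hypothetical immutable rational with a few preallocated constants.
final class CachedRational {
    static final CachedRational ZERO = new CachedRational(0L, 1L);
    static final CachedRational ONE = new CachedRational(1L, 1L);

    final long numerator;
    final long denominator;

    private CachedRational(long numerator, long denominator) {
        this.numerator = numerator;
        this.denominator = denominator;
    }

    // Returns a normalized value, reusing ZERO and ONE instead of allocating.
    static CachedRational valueOf(long numerator, long denominator) {
        if (denominator == 0L) {
            throw new ArithmeticException("zero denominator");
        }
        long g = gcd(Math.abs(numerator), Math.abs(denominator));
        long sign = denominator < 0L ? -1L : 1L; // keep the denominator positive
        long n = sign * numerator / g;
        long d = sign * denominator / g;
        if (n == 0L) {
            return ZERO;
        }
        if (n == 1L && d == 1L) {
            return ONE;
        }
        return new CachedRational(n, d);
    }

    private static long gcd(long a, long b) {
        while (b != 0L) {
            long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }
}
```

Because the constructor is private, every instance passes through valueOf, which is what makes reuse of the shared constants possible.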

The alternative is to create classes which support mutable instances. By implementing the methods of the classes in a fluent style, it is possible to evaluate concise expressions functionally similar to the above without allocating a third object to be returned from the "divide" method as the "result." Again, this is not particularly significant for this one expression. However, solving complex linear algebra problems by operating on matrices is a more realistic case in which mathematical objects are better processed as mutable objects than as immutable instances. And for matrices of rational numbers, a mutable rational number class would seem to be much more easily justified.
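A rough sketch of the fluent, mutable style, assuming a hypothetical FluentRational class (the name and accessors are illustrative only). The divide method updates the receiver in place and returns it, so the expression from the earlier example allocates two objects rather than three. Overflow of long is not handled.

```java
// Hypothetical mutable rational with fluent, in-place arithmetic.
final class FluentRational {
    private long numerator;
    private long denominator;

    FluentRational(long numerator, long denominator) {
        if (denominator == 0L) {
            throw new ArithmeticException("zero denominator");
        }
        this.numerator = numerator;
        this.denominator = denominator;
        normalize();
    }

    long numerator() {
        return numerator;
    }

    long denominator() {
        return denominator;
    }

    // Divides this instance by that, in place, and returns this for chaining.
    FluentRational divide(FluentRational that) {
        if (that.numerator == 0L) {
            throw new ArithmeticException("division by zero");
        }
        // Compute both parts before assigning so that x.divide(x) still works.
        long n = this.numerator * that.denominator;
        long d = this.denominator * that.numerator;
        this.numerator = n;
        this.denominator = d;
        normalize();
        return this;
    }

    // Reduces to lowest terms and keeps the denominator positive.
    private void normalize() {
        long a = Math.abs(numerator);
        long b = Math.abs(denominator);
        while (b != 0L) {
            long t = a % b;
            a = b;
            b = t;
        }
        long g = denominator < 0L ? -a : a; // a is the gcd, at least 1
        numerator /= g;
        denominator /= g;
    }
}
```

Usage mirrors the immutable version: new FluentRational( 13L, 989L ).divide( new FluentRational( -250L, 768L ) ) leaves the quotient in the first object. The price is that the left operand is destroyed, which is exactly the aliasing risk that immutable classes avoid.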

With all that, I have two related questions:

  1. Is there anything about the Sun/Oracle Java compiler, JIT, or JVM which would conclusively recommend immutable rational or complex number classes over mutable classes?

  2. If not, how should "hashCode" be handled when implementing mutable classes? I am inclined to "fail fast" by throwing an UnsupportedOperationException rather than providing either an implementation prone to misuse and unnecessary debugging sessions, or one which is robust even when the state of mutable objects changes but which essentially turns hash tables into linked lists.

Test Code:

For those wondering whether immutable numbers matter when performing calculations roughly similar to those I need to implement:

import java.util.Arrays;

public class MutableOrImmutable
{
    private int[] pseudomatrix = { 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
                                   0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
                                   0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
                                   0, 0, 0, 1, 0, 0, 0, 0, 0, 0,
                                   0, 0, 0, 0, 1, 0, 0, 0, 0, 0,
                                   0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
                                   0, 0, 0, 0, 0, 0, 1, 0, 0, 0,
                                   0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
                                   0, 0, 0, 0, 0, 0, 0, 0, 1, 0,
                                   0, 0, 0, 0, 0, 0, 0, 0, 0, 1,
                                   1, 2, 0, 0, 0, 0, 0, 0, 0, 0,
                                   0, 0, 3, 4, 0, 0, 0, 0, 0, 0,
                                   0, 0, 0, 0, 5, 5, 0, 0, 0, 0,
                                   0, 0, 0, 0, 0, 0, 4, 3, 0, 0,
                                   0, 0, 0, 0, 0, 0, 0, 0, 2, 1 };

    private int[] scalars = { 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };

    private static final int ITERATIONS = 500;

    private void testMutablePrimitives()
    {
        int[] matrix = Arrays.copyOf( pseudomatrix, pseudomatrix.length );

        long startTime = System.currentTimeMillis();

        for ( int iteration = 0 ; iteration < ITERATIONS ; ++iteration )
        {
            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] *= scalar;
                }
            }

            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] /= scalar;
                }
            }
        }

        long stopTime    = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println( "Elapsed time for mutable primitives: " + elapsedTime );

        assert Arrays.equals( matrix, pseudomatrix ) : "The matrices are not equal.";
    }

    private void testImmutableIntegers()
    {
        // Integers are autoboxed and autounboxed within this method.

        Integer[] matrix = new Integer[ pseudomatrix.length ];

        for ( int index = 0 ; index < pseudomatrix.length ; ++index )
        {
            matrix[ index ] = pseudomatrix[ index ];
        }

        long startTime = System.currentTimeMillis();

        for ( int iteration = 0 ; iteration < ITERATIONS ; ++iteration )
        {
            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] = matrix[ index ] * scalar;
                }
            }

            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] = matrix[ index ] / scalar;
                }
            }
        }

        long stopTime    = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println( "Elapsed time for immutable integers: " + elapsedTime );

        for ( int index = 0 ; index < matrix.length ; ++index )
        {
            if ( matrix[ index ] != pseudomatrix[ index ] )
            {
                // When properly implemented, this message should never be printed.

                System.out.println( "The matrices are not equal." );

                break;
            }
        }
    }

    private static class PseudoRational
    {
        private int value;

        public PseudoRational( int value )
        {
            this.value = value;
        }

        public PseudoRational multiply( PseudoRational that )
        {
            return new PseudoRational( this.value * that.value );
        }

        public PseudoRational divide( PseudoRational that )
        {
            return new PseudoRational( this.value / that.value );
        }
    }

    private void testImmutablePseudoRationals()
    {
        PseudoRational[] matrix = new PseudoRational[ pseudomatrix.length ];

        for ( int index = 0 ; index < pseudomatrix.length ; ++index )
        {
            matrix[ index ] = new PseudoRational( pseudomatrix[ index ] );
        }

        long startTime = System.currentTimeMillis();

        for ( int iteration = 0 ; iteration < ITERATIONS ; ++iteration )
        {
            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] = matrix[ index ].multiply( new PseudoRational( scalar ) );
                }
            }

            for ( int scalar : scalars )
            {
                for ( int index = 0 ; index < matrix.length ; ++index )
                {
                    matrix[ index ] = matrix[ index ].divide( new PseudoRational( scalar ) );
                }
            }
        }

        long stopTime    = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println( "Elapsed time for immutable pseudo-rational numbers: " + elapsedTime );

        for ( int index = 0 ; index < matrix.length ; ++index )
        {
            if ( matrix[ index ].value != pseudomatrix[ index ] )
            {
                // When properly implemented, this message should never be printed.

                System.out.println( "The matrices are not equal." );

                break;
            }
        }
    }

    private static class PseudoRationalVariable
    {
        private int value;

        public PseudoRationalVariable( int value )
        {
            this.value = value;
        }

        public void multiply( PseudoRationalVariable that )
        {
            this.value *= that.value;
        }

        public void divide( PseudoRationalVariable that )
        {
            this.value /= that.value;
        }
    }

    private void testMutablePseudoRationalVariables()
    {
        PseudoRationalVariable[] matrix = new PseudoRationalVariable[ pseudomatrix.length ];

        for ( int index = 0 ; index < pseudomatrix.length ; ++index )
        {
            matrix[ index ] = new PseudoRationalVariable( pseudomatrix[ index ] );
        }

        long startTime = System.currentTimeMillis();

        for ( int iteration = 0 ; iteration < ITERATIONS ; ++iteration )
        {
            for ( int scalar : scalars )
            {
                for ( PseudoRationalVariable variable : matrix )
                {
                    variable.multiply( new PseudoRationalVariable( scalar ) );
                }
            }

            for ( int scalar : scalars )
            {
                for ( PseudoRationalVariable variable : matrix )
                {
                    variable.divide( new PseudoRationalVariable( scalar ) );
                }
            }
        }

        long stopTime    = System.currentTimeMillis();
        long elapsedTime = stopTime - startTime;

        System.out.println( "Elapsed time for mutable pseudo-rational variables: " + elapsedTime );

        for ( int index = 0 ; index < matrix.length ; ++index )
        {
            if ( matrix[ index ].value != pseudomatrix[ index ] )
            {
                // When properly implemented, this message should never be printed.

                System.out.println( "The matrices are not equal." );

                break;
            }
        }
    }

    public static void main( String [ ] args )
    {
        MutableOrImmutable object = new MutableOrImmutable();

        object.testMutablePrimitives();
        object.testImmutableIntegers();
        object.testImmutablePseudoRationals();
        object.testMutablePseudoRationalVariables();
    }
}

Footnote:

The core problem with mutable vs. immutable classes is the -- highly questionable -- "hashCode" method on Object:

The general contract of hashCode is:

  • Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

But once an object is added to a collection that depends on its hash code, and that hash code is derived from the internal state used to determine "equality," the object is no longer properly hashed into the collection when its state changes. Yes, the burden is on the programmer to ensure that mutable objects are not improperly stored in collections, but the burden is even greater on the maintenance programmer unless improper use of a mutable class is prevented in the first place. This is why I believe the right "answer" for "hashCode" on mutable objects is to always throw an UnsupportedOperationException while still implementing "equals" to determine object equality -- think of matrices which you want to compare for equality, but would never think to add to a Set. However, there may be an argument that throwing an exception is a violation of the above "contract" with dire consequences of its own. In that case, hashing all instances of a mutable class to the same value may be the "correct" way to maintain the contract despite the very poor nature of the implementation. Is returning a constant value -- perhaps generated from hashing the class name -- recommended over throwing an exception?
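The two options weighed above can be compared in a small sketch. FailFastValue, ConstantHashValue, and HashPolicyDemo are hypothetical names used only for illustration. The first class throws from hashCode, so any attempt to store it in a hash-based collection fails immediately. The second returns a constant, which satisfies the contract and survives mutation, at the cost of placing every instance in the same bucket.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical mutable value that refuses to be hashed.
final class FailFastValue {
    int value;

    FailFastValue(int value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof FailFastValue && ((FailFastValue) other).value == value;
    }

    @Override
    public int hashCode() {
        throw new UnsupportedOperationException("FailFastValue is mutable and must not be hashed");
    }
}

// Hypothetical mutable value whose hash never depends on its state.
final class ConstantHashValue {
    int value;

    ConstantHashValue(int value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof ConstantHashValue && ((ConstantHashValue) other).value == value;
    }

    @Override
    public int hashCode() {
        return 17; // constant: contract-compliant, but every instance collides
    }
}

class HashPolicyDemo {
    // True if adding a FailFastValue to a HashSet is rejected immediately.
    static boolean addThrows() {
        Set<FailFastValue> set = new HashSet<>();
        try {
            set.add(new FailFastValue(1));
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    // True if a ConstantHashValue is still found after being mutated in the set.
    static boolean constantHashSurvivesMutation() {
        Set<ConstantHashValue> set = new HashSet<>();
        ConstantHashValue stored = new ConstantHashValue(1);
        set.add(stored);
        stored.value = 2;
        return set.contains(new ConstantHashValue(2));
    }
}
```

Note that the constant hash keeps lookups working after mutation, but it does not stop a set from ending up with two elements that have become equal to each other.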

You have written: "mathematical expressions become burdened with excessive object allocation and subsequent garbage collection when performing even simple mathematical operations" and "The expression includes three allocations -- two of which are quickly discarded".

Modern garbage collectors are actually optimised for this pattern of allocation, so your (implicit) assumption that allocation and subsequent garbage collection are expensive is wrong.

For example, see this white paper: http://www.oracle.com/technetwork/java/whitepaper-135217.html#garbage (under "Generational Copying Collection"), which states:

"... First, because new objects are allocated contiguously in stack-like fashion in the object nursery, allocation becomes extremely fast, since it merely involves updating a single pointer and performing a single check for nursery overflow. Secondly, by the time the nursery overflows, most of the objects in the nursery are already dead, allowing the garbage collector to simply move the few surviving objects elsewhere, and avoid doing any reclamation work for dead objects in the nursery."

Thus I would suggest that the answer to your real question is that you should use immutable objects, because the perceived costs are not really costs at all, but the perceived benefits (e.g. simplicity, code readability) are real benefits.

One pattern which can be useful is to define an abstract type or interface for a "readable" thing, and then have both mutable and immutable forms of it. This pattern can be particularly nice if the base or interface type includes AsMutable, AsNewMutable, and AsImmutable methods which can be overridden in suitable fashion in a derived object. Such an approach allows one to achieve the benefits of mutability when they are desired, and yet also receive the benefits of using an immutable type. Code which wants to persist a value but not mutate it would have to use "defensive copying" if it worked with a mutable type, but could instead use AsImmutable if it receives a "readable" thing. If the thing happens to be mutable, it would make a copy, but if it's immutable, no copy would be necessary.
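A minimal sketch of this pattern in Java, using the lower-camel-case spellings asMutable, asNewMutable, and asImmutable. ReadableNumber, ImmutableNumber, MutableNumber, and SnapshotHolder are hypothetical names, and a single long stands in for whatever state the real type would carry.

```java
// Hypothetical "readable" view shared by the mutable and immutable forms.
interface ReadableNumber {
    long value();

    ImmutableNumber asImmutable();   // returns this when already immutable

    MutableNumber asMutable();       // returns this when already mutable

    MutableNumber asNewMutable();    // always returns a fresh mutable copy
}

final class ImmutableNumber implements ReadableNumber {
    private final long value;

    ImmutableNumber(long value) {
        this.value = value;
    }

    @Override public long value() { return value; }

    @Override public ImmutableNumber asImmutable() { return this; }

    @Override public MutableNumber asMutable() { return new MutableNumber(value); }

    @Override public MutableNumber asNewMutable() { return new MutableNumber(value); }
}

final class MutableNumber implements ReadableNumber {
    private long value;

    MutableNumber(long value) {
        this.value = value;
    }

    void set(long newValue) {
        this.value = newValue;
    }

    @Override public long value() { return value; }

    @Override public ImmutableNumber asImmutable() { return new ImmutableNumber(value); }

    @Override public MutableNumber asMutable() { return this; }

    @Override public MutableNumber asNewMutable() { return new MutableNumber(value); }
}

// Code that wants to persist a value without mutating it.
final class SnapshotHolder {
    private final ImmutableNumber kept;

    SnapshotHolder(ReadableNumber source) {
        // Copies only if the source is mutable; an immutable source is shared.
        this.kept = source.asImmutable();
    }

    long kept() {
        return kept.value();
    }
}
```

SnapshotHolder never needs to know which form it was given: the defensive copy happens inside asImmutable only when it is actually required.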

Incidentally, if one is designing an immutable type with relatively few fields other than a reference to a large object that holds the actual data, and if things of the type will frequently be compared for equality, it may be helpful to have each instance hold a unique sequence number as well as a reference to the oldest instance, if any, to which it is known to be equal (or null if no older instance is known to exist). When comparing two instances for equality, determine the oldest instance which is known to match each (follow the chain of oldest known instances until the reference is null). If both instances are known to match the same instance, they are equal. If not, but they turn out to be equal, then whichever "oldest instance" is the younger of the two should regard the other as an older instance to which it is equal. The approach will yield accelerated comparisons much as interning would, but without the use of a separate interning dictionary, and without having to hash values.
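One possible reading of this scheme, as a single-threaded sketch. LinkedEqualityBlob, sameContents, and the deepComparisons counter are hypothetical names; the counter exists only to make the saved work visible, and the int array stands in for the large data object.

```java
import java.util.Arrays;

// Hypothetical logically immutable wrapper around a large array.
final class LinkedEqualityBlob {
    static int deepComparisons = 0;   // instrumentation for the sketch only
    private static long nextSequence = 0L;

    private final long sequence = nextSequence++; // creation order
    private final int[] data;
    private LinkedEqualityBlob older;  // an older instance known to be equal, or null

    LinkedEqualityBlob(int[] data) {
        this.data = data.clone();
    }

    // Follows the chain of older equal instances to its oldest known end.
    private LinkedEqualityBlob oldestKnown() {
        LinkedEqualityBlob current = this;
        while (current.older != null) {
            current = current.older;
        }
        return current;
    }

    boolean sameContents(LinkedEqualityBlob that) {
        LinkedEqualityBlob a = this.oldestKnown();
        LinkedEqualityBlob b = that.oldestKnown();
        if (a == b) {
            return true; // already known to be equal, no data comparison needed
        }
        deepComparisons++;
        if (!Arrays.equals(a.data, b.data)) {
            return false;
        }
        // Record the result: the younger representative points at the older one.
        if (a.sequence < b.sequence) {
            b.older = a;
        } else {
            a.older = b;
        }
        return true;
    }
}
```

Once two instances have been found equal, later comparisons between any members of the same chain reduce to following references. A real implementation would also need equals, hashCode, and some thought about thread safety, all of which this sketch leaves out.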

Currently, I am implementing the rational numbers with immutable objects. This allows heavy reuse of ZERO and ONE objects which occur frequently in the computations I need to perform. However, the Matrix class which is implemented with rational number elements is mutable -- going so far as to use null internally for "virtual" zeros. With a more pressing need to handle "small" rational numbers and arbitrary-precision "big" rational numbers seamlessly, the immutable implementation is acceptable for now. When I have time, I will profile the library of problems I have available for that purpose to determine whether mutable objects or a larger set of "common" immutable objects will win the day.

Of course, if I end up needing to implement "equals" to test Matrix equality, I will be back to the same issue with Matrix "hashCode", even though the method is highly unlikely ever to be needed. Which takes me back again to the rather useless complaint that "hashCode" -- and probably "equals" as well -- should never have been part of the java.lang.Object contract in the first place...
