
Achieving synchronized addAll to a list in Java

Updated the question; please check the second part of the question.

I need to build up a master list of book IDs. I have multiple threaded tasks, each of which produces a subset of book IDs. As soon as each task completes, I need to add its results to the master list of book IDs. Hence I am planning to pass an instance of the aggregator class below to all of my tasks and have them call the updateBookIds() method. To make it thread-safe, I have put the addAll call in a synchronized block.

Can anyone tell me whether this is the same as using a synchronized list? Can I just say Collections.newSynchronizedList and call addAll to that list from all thread tasks? Please clarify.

public class SynchronizedBookIdsAggregator {
    private List<String> bookIds;

    public SynchronizedBookIdsAggregator(){
        bookIds = new ArrayList<String>();
    }

    public void updateBookIds(List<String> ids){
        synchronized (this) {
            bookIds.addAll(ids);
        }
    }

    public List<String> getBookIds() {
        return bookIds;
    }

    public void setBookIds(List<String> bookIds) {
        this.bookIds = bookIds;
    }
}

Thanks, Harish

Second Approach

So after the discussion below, I am currently planning to go with the following approach. Please let me know if I am doing anything wrong here:

public class BooksManager{
    private static Logger logger = LoggerFactory.getLogger();

    private List<String> fetchMasterListOfBookIds(){    
        List<String> masterBookIds = Collections.synchronizedList(new ArrayList<String>());
        List<String> libraryCodes = getAllLibraries();

        ExecutorService libraryBookIdsExecutor = Executors.newFixedThreadPool(BookManagerConstants.LIBRARY_BOOK_IDS_EXECUTOR_POOL_SIZE);
        for(String libraryCode : libraryCodes){
            LibraryBookIdsCollectionTask libraryTask = new LibraryBookIdsCollectionTask(libraryCode, masterBookIds);
            libraryBookIdsExecutor.execute(libraryTask);
        }
        libraryBookIdsExecutor.shutdown();

        //Now the fetching of master list is complete.
        //So I will just continue my processing of the master list

    }
}

public class LibraryBookIdsCollectionTask implements Runnable {
    private String libraryCode;
    private List<String> masterBookIds;

    public LibraryBookIdsCollectionTask(String libraryCode,List<String> masterBookIds){
        this.libraryCode = libraryCode;
        this.masterBookIds = masterBookIds;
    }

    public void run(){
        List<String> bookids = new ArrayList<String>();//TODO get this list from iconnect call
        synchronized (masterBookIds) {
            masterBookIds.addAll(bookids);
        }
    }
}

Thanks, Harish

Can I just say Collections.newSynchronizedList and call addAll to that list from all thread tasks?

If you're referring to Collections.synchronizedList, then yes, that would work fine. That will give you an object that implements the List interface where the individual method calls from that interface are synchronized, including addAll.
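
As a minimal sketch of that usage (the class name, method name, and fake book IDs below are made up for illustration), several threads can each call addAll once on a list wrapped by Collections.synchronizedList:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class SynchronizedListAddAllDemo {

    // Starts threadCount threads that each call addAll once on a shared
    // synchronized list, waits for all of them, and returns the list.
    static List<String> collect(int threadCount) {
        final List<String> master = Collections.synchronizedList(new ArrayList<String>());
        List<Thread> threads = new ArrayList<Thread>();
        for (int i = 0; i < threadCount; i++) {
            // Hypothetical subset of IDs produced by one task.
            final List<String> subset = Arrays.asList("book-" + i + "-a", "book-" + i + "-b");
            Thread t = new Thread(() -> master.addAll(subset));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join(); // wait for the worker before reading the list
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return master;
    }

    public static void main(String[] args) {
        System.out.println(collect(4).size()); // 4 threads x 2 IDs each
    }
}
```

Because each addAll call runs entirely under the wrapper's lock, the two IDs from one task always end up next to each other, although the order of the tasks themselves is not fixed.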

Consider sticking with what you have, though, since it's arguably a cleaner design. If you pass the raw List to your tasks, then they get access to all of the methods on that interface, whereas all they really need to know is that there's an addAll method. Using your SynchronizedBookIdsAggregator keeps your tasks decoupled from a design dependence on the List interface, and removes the temptation for them to call something other than addAll.

In cases like this, I tend to look for a Sink interface of some sort, but there never seems to be one around when I need it...
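
One way to get such a sink is to declare it yourself. The sketch below is only an illustration: BookIdSink, SinkTask, and SinkDemo are hypothetical names, not part of the JDK or of the question's code. The tasks receive only the narrow interface, and a method reference adapts a synchronized list to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical narrow interface: a task can add IDs and nothing else.
interface BookIdSink {
    void addAll(Collection<String> ids);
}

// A task that only knows about the sink, not about List.
class SinkTask implements Runnable {
    private final BookIdSink sink;
    private final List<String> ids;

    SinkTask(BookIdSink sink, List<String> ids) {
        this.sink = sink;
        this.ids = ids;
    }

    @Override
    public void run() {
        sink.addAll(ids);
    }
}

class SinkDemo {
    static List<String> runTasks() {
        List<String> master = Collections.synchronizedList(new ArrayList<String>());
        // The method reference exposes only addAll of the synchronized list.
        BookIdSink sink = master::addAll;
        // Run inline here for brevity; a real program would hand the
        // tasks to an executor instead.
        new SinkTask(sink, Arrays.asList("id-1", "id-2")).run();
        new SinkTask(sink, Arrays.asList("id-3")).run();
        return master;
    }
}
```

The code that owns the master list keeps the full List reference, while the tasks cannot call remove, clear, or an iterator by accident.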

The code you have implemented does not create a synchronization point for someone who accesses the list via getBookIds(), which means they could see inconsistent data. Furthermore, someone who has retrieved the list via getBookIds() must perform external synchronization before accessing the list. Your question also doesn't show how you are actually using the SynchronizedBookIdsAggregator class, which leaves us without enough information to fully answer your question.

Below is a safer version of the class:

public class SynchronizedBookIdsAggregator {
    private List<String> bookIds;

    public SynchronizedBookIdsAggregator() {
        bookIds = new ArrayList<String>();
    }

    public void updateBookIds(List<String> ids){
        synchronized (this) {
            bookIds.addAll(ids);
        }
    }

    public List<String> getBookIds() {
        // synchronized here for memory visibility of the bookIds field
        synchronized(this) {
            return bookIds;
        }
    }

    public void setBookIds(List<String> bookIds) {
        // synchronized here for memory visibility of the bookIds field
        synchronized(this) {
            this.bookIds = bookIds;
        }
    }
}

As alluded to earlier, the above code still has a potential problem when some thread accesses the ArrayList after it has been retrieved by getBookIds(). Since the ArrayList itself is not synchronized, any access to it after retrieving it should be synchronized on the chosen guard object:

public class SomeOtherClass {
    public void run() {
        SynchronizedBookIdsAggregator aggregator = getAggregator();
        List<String> bookIds = aggregator.getBookIds();
        // Access to the bookIds list must happen while synchronized on the
        // chosen guard object -- in this case, aggregator
        synchronized(aggregator) {
            // ... work with the bookIds list here ...
        }
    }
}

I can imagine using Collections.synchronizedList as part of the design of this aggregator, but it is not a panacea. Concurrency design really requires an understanding of the underlying concerns, more than "picking the right tool / collection for the job" (although the latter is not unimportant).

Another potential option to look at is CopyOnWriteArrayList.
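
As a rough sketch of what CopyOnWriteArrayList offers (the class and method names below are made up for illustration): its iterators work on a snapshot of the backing array, so the list can be modified during iteration without external locking. The cost is that every write copies the whole array, so it tends to fit lists that are read far more often than they are written.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

class CopyOnWriteDemo {

    // Returns {number of elements the loop visited, final size of the list}.
    static int[] iterateWhileAdding() {
        List<String> ids = new CopyOnWriteArrayList<String>(Arrays.asList("a", "b"));
        int seen = 0;
        for (String id : ids) {
            // The iterator keeps reading the snapshot taken when the loop began.
            // A plain ArrayList would throw ConcurrentModificationException here.
            ids.addAll(Arrays.asList(id + "-x", id + "-y"));
            seen++;
        }
        return new int[] { seen, ids.size() };
    }
}
```

Here the loop visits only the two original elements, while the list itself grows to six.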


As skaffman alluded to, it might be better not to allow direct access to the bookIds list at all (e.g., remove the getter and setter). If you enforce that all access to the list must go through methods written in SynchronizedBookIdsAggregator, then SynchronizedBookIdsAggregator can enforce all concurrency control of the list. As my answer above indicates, allowing consumers of the aggregator to use a "getter" to get the list creates a problem for the user of that list: to write correct code they must know the synchronization strategy / guard object, and furthermore they must use that knowledge to synchronize externally and correctly.
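
A minimal sketch of that idea, assuming callers only need to add IDs and later read a copy (the class name and the snapshot() and size() methods are hypothetical additions, not part of the question's code):

```java
import java.util.ArrayList;
import java.util.List;

class EncapsulatedBookIdsAggregator {
    // The list never escapes this class, so "this" is the only guard needed.
    private final List<String> bookIds = new ArrayList<String>();

    public synchronized void updateBookIds(List<String> ids) {
        bookIds.addAll(ids);
    }

    // Returns a copy; callers can use it freely without any locking.
    public synchronized List<String> snapshot() {
        return new ArrayList<String>(bookIds);
    }

    public synchronized int size() {
        return bookIds.size();
    }
}
```

Copying on every read costs time proportional to the list size, which is usually acceptable when the list is read once after all tasks finish.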


Regarding your second approach: what you have shown looks technically correct (good!).

But presumably you are going to read from masterBookIds at some point, too? And you don't show or describe that part of the program! So when you start thinking about when and how you are going to read masterBookIds (i.e. the return value of fetchMasterListOfBookIds()), just remember to consider concurrency concerns there too! :)

If you make sure all tasks/worker threads have finished before you start reading masterBookIds, you shouldn't have to do anything special.

But, at least in the code you have shown, you aren't ensuring that.

Note that libraryBookIdsExecutor.shutdown() returns immediately. So if you start using the masterBookIds list immediately after fetchMasterListOfBookIds() returns, you will be reading masterBookIds while your worker threads are actively writing data to it, and this entails some extra considerations.

Maybe this is what you want -- maybe you want to read the collection while it is being written to, to show realtime results or something. But then you must synchronize properly on the collection if you want to iterate over it while it is being written to.
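
For a list returned by Collections.synchronizedList, that means holding the list's own lock for the whole traversal, since iteration is not synchronized by the wrapper. A small sketch (the class name, method name, and prefix filter are made up for illustration):

```java
import java.util.List;

class SynchronizedIterationDemo {

    // Counts the IDs starting with the given prefix. The argument is assumed
    // to be a list returned by Collections.synchronizedList.
    static int countWithPrefix(List<String> synchronizedIds, String prefix) {
        int count = 0;
        // Writers calling add/addAll block until this loop is done.
        synchronized (synchronizedIds) {
            for (String id : synchronizedIds) {
                if (id.startsWith(prefix)) {
                    count++;
                }
            }
        }
        return count;
    }
}
```

Keep the work inside the block short, because every worker thread that wants to add IDs waits while the lock is held.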

If you would just like to make sure all writes to masterBookIds by worker threads have completed before fetchMasterListOfBookIds() returns, you could use ExecutorService.awaitTermination (in combination with .shutdown(), which you are already calling).
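
A sketch of how that could look, under some assumptions: the pool size of 4, the one-minute timeout, and the fake per-library IDs are placeholders, and the method takes the library codes as a parameter only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class AwaitTerminationDemo {

    static List<String> fetchMasterListOfBookIds(List<String> libraryCodes) {
        final List<String> masterBookIds = Collections.synchronizedList(new ArrayList<String>());
        ExecutorService executor = Executors.newFixedThreadPool(4); // arbitrary pool size
        for (final String code : libraryCodes) {
            // Stand-in for the real lookup: each library yields two fake IDs.
            executor.execute(() -> masterBookIds.addAll(Arrays.asList(code + "-1", code + "-2")));
        }
        executor.shutdown(); // stops accepting new tasks; returns immediately
        try {
            // Block until all submitted tasks have finished or the timeout expires.
            if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
                executor.shutdownNow();
                throw new IllegalStateException("book ID tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for tasks", e);
        }
        return masterBookIds; // no worker is still writing at this point
    }
}
```

awaitTermination returns false on timeout instead of throwing, so the return value needs to be checked; how to react to a timeout is a design decision for the caller.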

Collections.SynchronizedList (which is the wrapper type you'd get) would synchronize almost every method on either itself or a mutex object you pass to the constructor (or Collections.synchronizedList(...)). Thus it would basically be the same as your approach.

All the methods called using the wrapper returned by Collections.synchronizedList() will be synchronized. This means that the addAll method of a normal List, when called through this wrapper, will behave something like this:

public synchronized boolean addAll(Collection<? extends E> c)

So, every method call on the list (using the returned reference and not the original reference) will be synchronized.

However, there is no synchronization between different method calls. Consider the following code snippet:

 List<String> l = Collections.synchronizedList(new ArrayList<String>());
 l.add("Hello");
 l.add("World");

When multiple threads run this same code against a shared list, it is quite possible that after Thread A has added "Hello", Thread B runs and adds both "Hello" and "World" to the list before Thread A resumes. The list would then contain ["Hello", "Hello", "World", "World"] instead of the expected ["Hello", "World", "Hello", "World"]. This is just an example to show that the list is not thread-safe across separate method calls. If we want the above code to produce the desired result, it should be placed inside a synchronized block that locks on the list (or on this).
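
A sketch of that fix, locking on the list itself (the class and method names are made up for illustration). The wrapper returned by Collections.synchronizedList documents that it synchronizes on itself, so the outer block uses the same lock as the individual add calls:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CompoundAddDemo {

    // Adds the pair as one atomic unit: no other thread can add in between.
    static void addPair(List<String> synchronizedList) {
        synchronized (synchronizedList) {
            synchronizedList.add("Hello");
            synchronizedList.add("World");
        }
    }

    static List<String> runTwoThreads() {
        final List<String> list = Collections.synchronizedList(new ArrayList<String>());
        Thread a = new Thread(() -> addPair(list));
        Thread b = new Thread(() -> addPair(list));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return list; // always [Hello, World, Hello, World]
    }
}
```

Whichever thread gets the lock first adds both of its elements before the other can add any, so the pairs never interleave.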

However, with your design there is only one method call, so it is the same as using Collections.synchronizedList().

Moreover, as Mike Clark rightly pointed out, you should also synchronize getBookIds() and setBookIds(). Synchronizing on the List itself would be clearer, since it amounts to locking the list before operating on it and unlocking it afterwards, so that nothing in between can use the List.
