
What are the requirements for a web application to work in a cluster environment?

Original post 2011-05-06 15:47:48 5 4 java/ java-ee-6/ cluster-computing

I need to check whether an existing web application is ready to be deployed in a clustered environment.
Cluster: several Linux boxes. Traffic is routed by a load balancer that uses a simple round-robin algorithm with sticky sessions.
Application: a (hopefully) stateless Java web application that retrieves content from the back office and formats it appropriately.

I have access to the source code. What should I check in the code to be sure that it will run in the cluster?

Check that nothing holding application state is cached in memory or on the file system.
...Something else?

4 answers

If you're using EJBs (which is recommended if you access a DB), then here is a list of restrictions:

http://java.sun.com/blueprints/qanda/ejb_tier/restrictions.html

I guess similar restrictions apply to the web application.

The easiest way to check the application is to start by running it on 2 servers with the same data, so that at startup both are in the same state. Let's assume that, for a user to complete an operation, the browser makes 2 consecutive HTTP requests to your web app. What you need to do is hit web server 1 with the first call and web server 2 with the second call, then try the other way around, then send both requests to the same web server. If you get the same result each time, you very likely have a cluster-ready application. (It doesn't mean the app IS ready to cluster, as there might be object state etc. that it stores in memory and that is not easy to spot from the front end, but it gives you a higher probability that it MIGHT BE ok to run in a cluster.)
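The routing test described above can be sketched as a small harness. This is only an illustration under stated assumptions: `ClusterNode`, `RoutingCheck` and `ToyCartNode` are hypothetical names, and the toy node keeps its data in a Java list instead of answering real HTTP requests. In a real check, `handle` would send the request to one specific server, bypassing the load balancer.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical abstraction: send a request to one specific server, return the response body. */
interface ClusterNode {
    String handle(String request);
}

class RoutingCheck {
    /**
     * Runs a two-request operation through every routing (1->1, 1->2, 2->1, 2->2),
     * each time against a freshly started pair of nodes, and records the final response.
     */
    static Map<String, String> finalResponses(Supplier<ClusterNode[]> freshPair,
                                              String first, String second) {
        Map<String, String> results = new LinkedHashMap<>();
        for (int i = 0; i < 2; i++) {
            for (int j = 0; j < 2; j++) {
                ClusterNode[] nodes = freshPair.get();
                nodes[i].handle(first);
                results.put((i + 1) + "->" + (j + 1), nodes[j].handle(second));
            }
        }
        return results;
    }

    /** True when every routing produced the same final response. */
    static boolean sameEverywhere(Map<String, String> results) {
        return new HashSet<>(results.values()).size() == 1;
    }

    public static void main(String[] args) {
        System.out.println(finalResponses(ToyCartNode::isolatedPair, "add:book", "list"));
        System.out.println(finalResponses(ToyCartNode::sharedPair, "add:book", "list"));
    }
}

/** Toy node: "add:<item>" stores an item, any other request lists the stored items. */
class ToyCartNode implements ClusterNode {
    private final List<String> store;

    ToyCartNode(List<String> store) {
        this.store = store;
    }

    @Override
    public String handle(String request) {
        if (request.startsWith("add:")) {
            store.add(request.substring(4));
            return "ok";
        }
        return String.join(",", store);
    }

    /** Two nodes that each keep their state in their own memory. */
    static ClusterNode[] isolatedPair() {
        return new ClusterNode[] {
            new ToyCartNode(new ArrayList<>()), new ToyCartNode(new ArrayList<>())
        };
    }

    /** Two nodes backed by one store, standing in for a shared database. */
    static ClusterNode[] sharedPair() {
        List<String> shared = new ArrayList<>();
        return new ClusterNode[] { new ToyCartNode(shared), new ToyCartNode(shared) };
    }
}
```

With the isolated pair, the routings that cross servers return an empty list while the same-server routings return the item, which is the symptom the test is meant to expose. As the answer says, identical results across all routings raise confidence but do not prove the application is cluster-ready.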

If it's truly "stateless", there would be no problem: you could make any request of any server at any time and everything would just work. Most things aren't quite that easy, so any sort of state would either have to be streamed to and from the page as it moves between client and server, or be stored on the back end, with some sort of token passed back and forth in order to retrieve it from whatever shared data store you're using for that. If they are using the HttpSession, then anything that is retrieved from the session, if modified, needs to be set back into the session with session.setAttribute(key, value). Setting the attribute acts as a signal that whatever is being stored in the session needs to be replicated to the redundant servers. Make sure anything stored in the session implements, and actually is, Serializable. Some servers will allow you to store non-serializable objects (I'm looking at you, WebLogic), but will then throw an exception when they try to replicate the object. I've had many a coworker complain that having to set stuff back into the session should be redundant, and perhaps it should be, but this is just the way things work.
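To make the setAttribute point concrete, here is a minimal simulation. It does not use the servlet API: `ToyReplicatedSession` and `ToyCart` are hypothetical classes, and the assumption is that replication is triggered only by `setAttribute` and works by Java serialization. Real containers differ in the details, so treat this as a sketch of the mechanism rather than any server's actual behaviour.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a replicated session; not the servlet HttpSession. */
class ToyReplicatedSession {
    private final Map<String, Object> local = new HashMap<>();    // this node's memory
    private final Map<String, byte[]> replica = new HashMap<>();  // bytes sent to the other node

    Object getAttribute(String key) {
        return local.get(key);
    }

    /** Stores the value locally, then replicates a serialized snapshot of it. */
    void setAttribute(String key, Object value) {
        local.put(key, value);
        replica.put(key, serialize(value));
    }

    /** What the other node would see for this key after a failover. */
    Object readOnOtherNode(String key) {
        byte[] bytes = replica.get(key);
        if (bytes == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    private static byte[] serialize(Object value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value); // fails for objects that are not Serializable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}

/** Hypothetical session attribute. */
class ToyCart implements Serializable {
    private static final long serialVersionUID = 1L;
    final List<String> items = new ArrayList<>();
}

class SessionReplicationDemo {
    public static void main(String[] args) {
        ToyReplicatedSession session = new ToyReplicatedSession();
        session.setAttribute("cart", new ToyCart());

        ToyCart cart = (ToyCart) session.getAttribute("cart");
        cart.items.add("book"); // modified in place, not set back yet
        System.out.println(((ToyCart) session.readOnOtherNode("cart")).items);

        session.setAttribute("cart", cart); // signals the change, replica is refreshed
        System.out.println(((ToyCart) session.readOnOtherNode("cart")).items);
    }
}
```

In this model the in-place modification is invisible to the other node until the cart is set back, and storing an object that is not Serializable succeeds locally but fails at replication time, which matches the two pitfalls described above.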

Having state is not a big problem if done properly. Anyway, all applications have state. Even when serving a more or less static file, the file content associated with a URL is indeed part of the state.

The problem is how this state is propagated and shared.

State inside the user session is a no-brainer. Use a session replication mechanism (slower, but no session loss on node crash) or a load balancer with sticky sessions, and your problem is solved.

All other shared state is indeed a problem. In particular, even cache state must be shared and perfectly coherent; otherwise a refresh of the same page could produce a different result at random, depending on which web server, and thus which cache, you hit.

You can still cache data using a shared cache (like ehcache), or by falling back to sticky sessions.
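The cache coherence problem can be reproduced in a few lines. This is a toy model under stated assumptions: `CachingNode` and `CacheCoherenceDemo` are hypothetical names, the "back office" is an in-memory map, and the "shared cache" is simply one map used by both nodes, standing in for a real distributed cache whose configuration is not shown here.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical web node that caches back-office content by key. */
class CachingNode {
    private final Map<String, String> cache;
    private final Function<String, String> backOffice;

    CachingNode(Map<String, String> cache, Function<String, String> backOffice) {
        this.cache = cache;
        this.backOffice = backOffice;
    }

    /** Returns the cached content, loading it from the back office on a cache miss. */
    String render(String key) {
        return cache.computeIfAbsent(key, backOffice);
    }
}

class CacheCoherenceDemo {
    /**
     * Node A caches the content, the back office then changes it, and the
     * same page is requested once from each node. Returns both answers.
     */
    static List<String> answers(boolean sharedCache) {
        Map<String, String> backOfficeData = new ConcurrentHashMap<>();
        backOfficeData.put("headline", "v1");

        Map<String, String> cacheA = new ConcurrentHashMap<>();
        Map<String, String> cacheB = sharedCache ? cacheA : new ConcurrentHashMap<>();
        CachingNode a = new CachingNode(cacheA, backOfficeData::get);
        CachingNode b = new CachingNode(cacheB, backOfficeData::get);

        a.render("headline");                 // node A caches v1
        backOfficeData.put("headline", "v2"); // content changes in the back office
        return List.of(a.render("headline"), b.render("headline"));
    }

    public static void main(String[] args) {
        System.out.println("per-node caches: " + answers(false));
        System.out.println("shared cache:    " + answers(true));
    }
}
```

With per-node caches the two nodes disagree about the same page, so a refresh can flip between versions depending on where the load balancer sends it. With one shared cache both nodes give the same answer, even though that answer may be stale until the entry expires or is invalidated.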

I guess it is pretty difficult to be sure that the application will indeed work in a clustered environment, because a singleton in some obscure service, a static member somewhere, or really anything can potentially produce strange results. You can certainly validate the general architecture, but you'll need to try it for real and run some validation tests before going into production.
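Since the question mentions having the source code, one cheap aid for the review is to list static fields that could hold per-node state. The sketch below is a heuristic only, with hypothetical names (`StaticStateScan`, `ExampleContentService`): it flags static fields that are either non-final or declared as a Collection or Map. It will not find a singleton that keeps its state in instance fields, so it complements the validation tests rather than replacing them.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class StaticStateScan {
    /** Names of static fields that are reassignable or declared as a Collection/Map. */
    static Set<String> suspiciousStaticFields(Class<?> type) {
        Set<String> names = new TreeSet<>();
        for (Field field : type.getDeclaredFields()) {
            int mods = field.getModifiers();
            if (!Modifier.isStatic(mods) || field.isSynthetic()) {
                continue; // instance fields and compiler-generated fields are ignored
            }
            boolean reassignable = !Modifier.isFinal(mods);
            boolean mutableContainer = Collection.class.isAssignableFrom(field.getType())
                    || Map.class.isAssignableFrom(field.getType());
            if (reassignable || mutableContainer) {
                names.add(field.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(suspiciousStaticFields(ExampleContentService.class));
    }
}

/** Hypothetical service used only to exercise the scan. */
class ExampleContentService {
    static int requestCount;                                       // reassignable counter
    static final Map<String, String> PAGE_CACHE = new HashMap<>(); // final but mutable
    static final String NAME = "content";                          // harmless constant
    String lastTitle;                                              // instance field, not scanned
}
```

Each flagged field still needs a human decision: a static counter used only for local metrics may be fine, while a static page cache is exactly the kind of hidden state that makes two nodes disagree.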