Camel: file consumer component "bites off more than it can chew", pipeline dies with out-of-memory error

Question

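I have a route defined in Camel that goes something like this: an HTTP request comes in (a POST in the simplified code below) and a file is created in the file system. A file consumer picks it up, fetches data from external web services, and sends the resulting message by POST to other web services.

Simplified code below: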

    // Update request goes on queue:
    from("restlet:http://localhost:9191/update?restletMethod=post")
        .routeId("Update via POST")
        // [...some magic that defines a directory and file name based on request headers...]
        .to("file://cameldest/queue?allowNullBody=true&fileExist=Ignore");

    // Update gets processed
    from("file://cameldest/queue?delay=500&recursive=true&maxDepth=2&sortBy=file:parent;file:modified&preMove=inprogress&delete=true")
        .routeId("Update main route")
        .streamCaching() // otherwise stuff can't be sent to multiple endpoints
        // [...enrich message from some web service using http4 component...]
        .multicast()
            .stopOnException()
            .to("direct:sendUpdate", "direct:dependencyCheck", "direct:saveXML")
        .end();

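The three endpoints in the multicast simply POST the resulting message to other web services.

This all works fine when the queue (i.e. the file directory cameldest) is fairly empty. Files are created in cameldest/<subdir>, picked up by the file consumer and moved into cameldest/<subdir>/inprogress, and the message is sent to the three outgoing POSTs without problems.

However, once the incoming requests pile up to about 300,000 files, progress slows down and eventually the pipeline fails with out-of-memory errors (GC overhead limit exceeded).

By increasing logging I can see that the file consumer polling basically never runs, because it appears to take responsibility for all the files it sees on each poll, waits for them to finish processing, and only then starts another poll round. Apart from (I assume) causing the resource bottleneck, this also interferes with my sorting requirements: once the queue is jammed with thousands of messages waiting to be processed, new messages that would naively be sorted higher up are, if they get picked up at all, still waiting behind those that have already been "started".

Now, I have tried the maxMessagesPerPoll and eagerMaxMessagesPerPoll options. They seem to alleviate the problem at first, but after a number of poll rounds I still end up with thousands of files in "started" limbo.

The only thing that sort of worked was making the bottleneck of delay and maxMessages... so narrow that processing would, on average, finish faster than the file polling cycle.

Clearly, that is not what I want. I would like my pipeline to process files as fast as possible, but not faster. I was expecting the file consumer to wait while the route is busy.

Am I making an obvious mistake?

(I am running a slightly older Camel 2.14.0 on a Redhat 7 machine with XFS, in case that is part of the problem.)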

Answer 1

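Try setting maxMessagesPerPoll to a low value on the from file endpoint, so that each poll picks up at most X files. That also limits the total number of in-flight messages in your Camel application.

You can find more information about that option in the Camel documentation for the file component.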
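To make concrete what a per-poll limit means, here is a minimal plain-JDK sketch of a poll that stops scanning as soon as it has collected its batch. It is not Camel's implementation; the class and method names (`BoundedPoll`, `poll`) are hypothetical. As I understand the file component documentation, this corresponds to combining `maxMessagesPerPoll` with `eagerMaxMessagesPerPoll=true`, where the limit applies during scanning; with the eager option off, all files are scanned and sorted first and the limit is applied afterwards.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Illustration only: a poll that stops scanning once maxPerPoll regular files are collected. */
class BoundedPoll {

    static List<Path> poll(Path dir, int maxPerPoll) throws IOException {
        List<Path> batch = new ArrayList<>();
        // Iterate the directory instead of listing it in full; stop at the limit.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (batch.size() >= maxPerPoll) {
                    break;
                }
                if (Files.isRegularFile(entry)) {
                    batch.add(entry);
                }
            }
        }
        return batch;
    }
}
```

Note the trade-off: a batch cut off during scanning can only be sorted within itself, so an eager limit and a global sort order work against each other.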

Answer 2

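Unless you really need to save the data as files, I would propose an alternative solution.

From your restlet consumer, send each request to a message queuing application such as activemq, rabbitmq or something similar. You will quickly end up with a lot of messages on that queue, but that is fine.

Then replace your file consumer with a queue consumer. It will take some time, but each message should be processed separately and sent wherever you need it. I have tested rabbitmq with about 500,000 messages and it worked fine. This should also reduce the load on the consumer.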
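The hand-off described here can be sketched with an in-memory queue standing in for the broker. This is only an illustration of the pattern (producer enqueues, consumer takes one message at a time at its own pace); a real broker such as activemq or rabbitmq would hold the backlog outside the JVM. The class name `QueueHandoff` and the stop marker are hypothetical.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Illustration only: an in-memory queue standing in for a message broker. */
class QueueHandoff {
    private static final String STOP = "__stop__"; // hypothetical end-of-input marker

    /** Enqueues the given number of messages and returns how many the consumer processed. */
    static int run(int messages) throws InterruptedException {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        AtomicInteger processed = new AtomicInteger();

        // The consumer takes exactly one message at a time; it never scans the backlog.
        Thread consumer = new Thread(() -> {
            try {
                for (String msg = queue.take(); !STOP.equals(msg); msg = queue.take()) {
                    processed.incrementAndGet(); // business logic would go here
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        // The producer (the restlet route in the question) only enqueues and returns.
        for (int i = 0; i < messages; i++) {
            queue.put("update-" + i);
        }
        queue.put(STOP);
        consumer.join();
        return processed.get();
    }
}
```

The consumer's cost per message does not depend on how many messages are waiting, which is the property the file consumer with sorting lacks.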

Answer 3

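The short answer is that there is no answer: the sortBy option of Camel's file component is simply too memory-inefficient to accommodate my use case:

Uniqueness: I don't want to put a file on the queue if it is already there.
Priority: files flagged as high priority should be processed first.
Performance: having a few hundred thousand files, or even a few million, should be no problem.
FIFO: (bonus) the oldest files (within a priority) should be picked up first.

The problem appears to be, if I read the source code and the documentation correctly, that all file details are held in memory to perform the sorting, no matter whether the built-in language or a custom pluggable sorter is used. The file component always creates a list of objects containing all the details, and that apparently causes an insane amount of garbage collection overhead when many files are polled often.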

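I got my use case to work, mostly, without having to resort to a database or to writing a custom component, using the following steps:

Move from one file consumer on the parent directory cameldest/queue, which recursively sorts the files in the subdirectories (cameldest/queue/high/ before cameldest/queue/low/), to two consumers, one per directory, with no sorting at all.
Set up only the consumer on cameldest/queue/high/ to process files through my actual business logic.
Set up the consumer on cameldest/queue/low/ to simply promote files from "low" to "high" (copying them over, i.e. .to("file://cameldest/queue/high");).
Crucially, in order to promote from "low" to "high" only when "high" is not busy, attach a route policy to "high" that throttles the other route, i.e. suspends "low" while there are any messages in flight in "high".
Additionally, I added a ThrottlingInflightRoutePolicy to "high" to prevent it from having too many exchanges in flight at once.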
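The gating rule in the steps above can be modelled in a few lines of plain Java. This is a toy model with hypothetical names (`PromotionGate`, `onHighStart`, `onHighDone`), not the Camel policy itself, and it assumes the in-flight counter is decremented before the gate is re-evaluated. Like the policy in this answer, it re-evaluates only when an exchange on "high" completes.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model: "low" may promote files only while "high" has nothing in flight. */
class PromotionGate {
    private final AtomicInteger highInflight = new AtomicInteger();
    private volatile boolean lowSuspended = false;

    /** Called when "high" picks up an exchange. */
    void onHighStart() {
        highInflight.incrementAndGet();
    }

    /** Called when an exchange on "high" completes; this is where the gate is re-evaluated. */
    void onHighDone() {
        int size = highInflight.decrementAndGet();
        lowSuspended = size > 0; // suspend "low" while "high" is still busy, resume at zero
    }

    boolean lowMayPromote() {
        return !lowSuspended;
    }
}
```

With two exchanges started on "high", the first completion suspends "low" and the second one resumes it.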

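Think of it as check-in at the airport, where economy passengers are invited over to the business class lane when it is empty.

This worked like a charm under load: even with hundreds of thousands of files queued in "low", new messages (files) dropped directly into "high" were processed within seconds.

The only requirement this solution does not cover is ordering: there is no guarantee that older files are picked up first; they are picked up in effectively random order. One could imagine a scenario where a steady stream of incoming files results in one particular file X always being unlucky and never being picked up. The chance of that happening, though, is very low.

Possible improvement: currently, the threshold for allowing or suspending the promotion of files from "low" to "high" is 0 messages in flight in "high". On the one hand, this guarantees that files dropped into "high" are processed before another promotion from "low" is performed; on the other hand, it leads to a bit of a stop-start pattern, especially in a multi-threaded scenario. It is not a real problem, though; the performance as it stands was impressive.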
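One way the stop-start pattern might be softened, which this answer did not implement, is a non-zero threshold with a gap between the suspend level and the resume level. The sketch below shows only the decision logic in plain Java; `HysteresisGate` and its parameters are hypothetical names, and wiring it into a route policy is left out.

```java
/** Sketch: suspend "low" above one in-flight level, resume it only at or below a lower one. */
class HysteresisGate {
    private final int suspendAbove;
    private final int resumeAtOrBelow;
    private boolean suspended;

    HysteresisGate(int suspendAbove, int resumeAtOrBelow) {
        if (resumeAtOrBelow > suspendAbove) {
            throw new IllegalArgumentException("resume level must not exceed suspend level");
        }
        this.suspendAbove = suspendAbove;
        this.resumeAtOrBelow = resumeAtOrBelow;
    }

    /** Feeds the current in-flight count of "high"; returns whether "low" should be suspended. */
    synchronized boolean update(int highInflight) {
        if (highInflight > suspendAbove) {
            suspended = true;
        } else if (highInflight <= resumeAtOrBelow) {
            suspended = false;
        }
        // Between the two levels the previous state is kept, which avoids rapid flapping.
        return suspended;
    }
}
```

`new HysteresisGate(0, 0)` reproduces the current behaviour. Any higher levels give up the guarantee that a file dropped into "high" is processed before the next promotion from "low".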

Source:

My route definitions:

    ThrottlingInflightRoutePolicy trp = new ThrottlingInflightRoutePolicy();
    trp.setMaxInflightExchanges(50);

    SuspendOtherRoutePolicy sorp = new SuspendOtherRoutePolicy("lowPriority");

    from("file://cameldest/queue/low?delay=500&maxMessagesPerPoll=25&preMove=inprogress&delete=true")
    .routeId("lowPriority")
    .log("Copying over to high priority: ${in.headers."+Exchange.FILE_PATH+"}")
    .to("file://cameldest/queue/high");

    from("file://cameldest/queue/high?delay=500&maxMessagesPerPoll=25&preMove=inprogress&delete=true")
    .routeId("highPriority")
    .routePolicy(trp)
    .routePolicy(sorp)
    .threads(20)
    .log("Before: ${in.headers."+Exchange.FILE_PATH+"}")
    .delay(2000) // This is where business logic would happen
    .log("After: ${in.headers."+Exchange.FILE_PATH+"}")
    .stop();

My SuspendOtherRoutePolicy, loosely modelled on ThrottlingInflightRoutePolicy:

    import java.util.concurrent.locks.Lock;
    import java.util.concurrent.locks.ReentrantLock;

    import org.apache.camel.CamelContext;
    import org.apache.camel.CamelContextAware;
    import org.apache.camel.Consumer;
    import org.apache.camel.Exchange;
    import org.apache.camel.Route;
    import org.apache.camel.impl.RoutePolicySupport; // package as of Camel 2.x

    /** Suspends the consumer of another route while this route has exchanges in flight. */
    public class SuspendOtherRoutePolicy extends RoutePolicySupport implements CamelContextAware {

        private CamelContext camelContext;
        private final Lock lock = new ReentrantLock();
        private final String otherRouteId;

        public SuspendOtherRoutePolicy(String otherRouteId) {
            super();
            this.otherRouteId = otherRouteId;
        }

        @Override
        public CamelContext getCamelContext() {
            return camelContext;
        }

        @Override
        public void setCamelContext(CamelContext context) {
            camelContext = context;
        }

        @Override
        public void onStart(Route route) {
            super.onStart(route);
            // Fail fast if the route to be suspended does not exist
            if (camelContext.getRoute(otherRouteId) == null) {
                throw new IllegalArgumentException("There is no route with the id '" + otherRouteId + "'");
            }
        }

        @Override
        public void onExchangeDone(Route route, Exchange exchange) {
            Route otherRoute = camelContext.getRoute(otherRouteId);
            throttle(route, otherRoute, exchange);
        }

        protected void throttle(Route route, Route otherRoute, Exchange exchange) {
            // this works the best when this logic is executed when the exchange is done
            Consumer consumer = otherRoute.getConsumer();

            // Suspend the other route while this route still has exchanges in flight
            int size = getSize(route, exchange);
            if (size > 0) {
                lock.lock(); // acquire before try, so unlock only runs if the lock is held
                try {
                    stopConsumer(size, consumer);
                } catch (Exception e) {
                    handleException(e);
                } finally {
                    lock.unlock();
                }
            }

            // reload size in case a race condition with too many at once being invoked
            // so we need to ensure that we read the most current size and start the consumer if we are already too low
            size = getSize(route, exchange);
            if (size == 0) {
                lock.lock();
                try {
                    startConsumer(size, consumer);
                } catch (Exception e) {
                    handleException(e);
                } finally {
                    lock.unlock();
                }
            }
        }

        // Number of exchanges currently in flight on the given route
        private int getSize(Route route, Exchange exchange) {
            return exchange.getContext().getInflightRepository().size(route.getId());
        }

        private void startConsumer(int size, Consumer consumer) throws Exception {
            boolean started = super.startConsumer(consumer);
            if (started) {
                log.info("Resuming the other consumer " + consumer);
            }
        }

        private void stopConsumer(int size, Consumer consumer) throws Exception {
            boolean stopped = super.stopConsumer(consumer);
            if (stopped) {
                log.info("Suspending the other consumer " + consumer);
            }
        }
    }

Answer 1: score 2, posted 2017-02-19 08:02:56
Answer 2: score 0, posted 2017-02-20 08:07:55
Answer 3: score 0, accepted, posted 2017-02-24 10:34:36
