
Parse method of JsonParser drastically slows down my code

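I'm working on a project that should extract data from JSON files (containing information about Polish envoys) and perform some calculations using only that data.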

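The code executes correctly, but one method drastically slows everything down. I'm not the best at describing things, so let me just show my Jsonreader class:
(gist link)
(the method is used on lines 17, 43, and 50). The code looks a bit messy, but it works fine, except for the fragments that use the jsonparser.parse method. Each envoy takes about 2 seconds, which is unacceptable. I have to change those few lines, but I don't know how. I was thinking about mapping the JSON file to objects and then working with those, but I'm not sure whether that is a good option.
(Sorry for my bad grammar.)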

How can I check whether the problem is in the getContent method?

You can verify this indirectly: check your service API's performance in the network tab of your web browser's debugger, or time a simple wget, for example time wget YOUR_URL.

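I agree with Andreas, who doubts that the parse method is the root of the evil. It really isn't. If you look closely at your gist, you'll see that the parse method accepts a delegated reader that actually uses an underlying input stream "connected" to the remote host. I/O is usually a very time-consuming operation, and network I/O especially so. Besides, establishing an HTTP connection is an expensive thing here. On my machine I ended up with the following average timings: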

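  • making an HTTP request: ~1.50..2.00s for the first request, ~0.50..1.00s for subsequent requests;
  • reading the data: ~0.80s (whether it is a dumb read until the end or JSON parsing does not really matter, Gson is really very fast; you can also profile the performance in a browser with the network debugger, or with time wget URL if you use a Unix terminal).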
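To illustrate why the time only appears to be spent in parse, here is a minimal, network-free sketch. SlowReader is a hypothetical stand-in for a reader backed by a remote connection: it sleeps before every read to imitate latency. The countChars method stands in for any consumer that pulls from a reader, such as a JSON parser. None of these names come from the gist or from Gson, and the delay value is arbitrary.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for a reader backed by a slow network stream.
final class SlowReader extends Reader {

    private final Reader delegate;
    private final long delayMillis;

    SlowReader(final Reader delegate, final long delayMillis) {
        this.delegate = delegate;
        this.delayMillis = delayMillis;
    }

    @Override
    public int read(final char[] buffer, final int offset, final int length) throws IOException {
        try {
            Thread.sleep(delayMillis); // imitate network latency on every read
        } catch (final InterruptedException ex) {
            Thread.currentThread().interrupt();
            throw new IOException(ex);
        }
        // Hand out at most 16 chars per call, like small chunks arriving over the wire
        return delegate.read(buffer, offset, Math.min(length, 16));
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }
}

final class IoVersusParsingDemo {

    // Stand-in for a parser: it only pulls characters from the reader until the end.
    static int countChars(final Reader reader) throws IOException {
        final char[] buffer = new char[1024];
        int total = 0;
        int read;
        while ((read = reader.read(buffer, 0, buffer.length)) != -1) {
            total += read;
        }
        return total;
    }

    // Returns the elapsed milliseconds for consuming the given reader.
    static long timeConsume(final Reader reader) throws IOException {
        final long start = System.nanoTime();
        countChars(reader);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(final String... args) throws IOException {
        final String json = "{\"id\":1,\"name\":\"example\",\"values\":[1,2,3,4,5,6,7,8,9,10]}";
        final long inMemory = timeConsume(new StringReader(json));
        final long slowSource = timeConsume(new SlowReader(new StringReader(json), 20));
        System.out.printf("in-memory: %dms, slow source: %dms%n", inMemory, slowSource);
    }
}
```

The consuming code is identical in both measurements; only the source of the characters differs, so any difference in the timings comes from waiting on the source rather than from the consumer.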

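Another point suggested by Andreas is to use multiple threads to run independent tasks in parallel. This can speed things up, but unfortunately it won't make a dramatic difference, because access to your service is not that fast.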

Executing SingleThreadedDemo...
Executing SingleThreadedDemo took 1063935ms         = ~17:43
Executing MultiThreadedDemo...
Executing MultiThreadedDemo took 353044ms           = ~5:53

Running the demos later gave the following results (about 3 times faster; I don't know the real reason for the earlier slowdown):

Executing SingleThreadedDemo...
Executing SingleThreadedDemo took 382249ms          = ~6:22
Executing MultiThreadedDemo...
Executing MultiThreadedDemo took 130502ms           = ~2:11
Executing MultiThreadedDemo...
Executing MultiThreadedDemo took 110119ms           = ~1:50
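The effect of running blocking requests in parallel can be reproduced without the real service. The sketch below is only a simulation under stated assumptions: each "request" is a Thread.sleep, and BlockingTasksSketch, runSimulatedRequests, and the latency values are made up for illustration. Because such tasks mostly wait rather than compute, a pool larger than the number of CPU cores may help; whether the remote service tolerates that many concurrent requests is a separate question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockingTasksSketch {

    // Runs taskCount simulated requests (each one just sleeps) on a pool of the given size
    // and returns the elapsed wall-clock time in milliseconds.
    static long runSimulatedRequests(final int taskCount, final long latencyMillis, final int poolSize)
            throws Exception {
        final ExecutorService executorService = Executors.newFixedThreadPool(poolSize);
        try {
            final List<Callable<Integer>> tasks = new ArrayList<>();
            for (int i = 0; i < taskCount; i++) {
                final int id = i;
                tasks.add(() -> {
                    Thread.sleep(latencyMillis); // stands in for waiting on the remote host
                    return id;
                });
            }
            final long start = System.nanoTime();
            // invokeAll blocks until every task has completed
            for (final Future<Integer> future : executorService.invokeAll(tasks)) {
                future.get(); // propagates a task failure, if any
            }
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            executorService.shutdown();
        }
    }

    public static void main(final String... args) throws Exception {
        System.out.printf("1 thread:  %dms%n", runSimulatedRequests(8, 50, 1));
        System.out.printf("8 threads: %dms%n", runSimulatedRequests(8, 50, 8));
    }
}
```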

AbstractDemo.java

The class below violates some good OOP design principles, but to keep the total number of classes down, everything lives here.

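import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.function.Function;

import com.google.gson.Gson;
import com.google.gson.JsonElement;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;

import static java.lang.System.currentTimeMillis;
import static java.lang.System.err;
import static java.lang.System.out;

abstract class AbstractDemo
        implements Callable<List<EnvoyData>> {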

    // Gson is thread-safe
    private static final Gson gson = new Gson();

    // JsonParser is thread-safe: https://groups.google.com/forum/#!topic/google-gson/u6hq2OVpszc
    private static final JsonParser jsonParser = new JsonParser();

    interface IPointsAndYearbooksConsumer {

        void acceptPointsAndYearbooks(SerializedDataPoints points, SerializedDataYears yearbooks);

    }

    interface ITripsConsumer {

        void acceptTrips(SerializedDataTrips trips);

    }

    AbstractDemo() {
    }

    protected abstract List<EnvoyData> doCall()
            throws Exception;

    // This implementation measures time (in milliseconds) taken for each demo call
    @Override
    public final List<EnvoyData> call()
            throws Exception {
        final String name = getClass().getSimpleName();
        final long start = currentTimeMillis();
        try {
            out.printf("Executing %s...\n", name);
            final List<EnvoyData> result = doCall();
            out.printf("Executing %s took %dms\n", name, currentTimeMillis() - start);
            return result;
        } catch ( final Exception ex ) {
            err.printf("Executing %s took %dms\n", name, currentTimeMillis() - start);
            throw ex;
        }
    }

    // A generic method that encapsulates pagination and lets you iterate over the service pages in a for-each style
    static Iterable<JsonElement> jsonRequestsAt(final URL startUrl, final Function<? super JsonObject, URL> nextLinkExtrator, final JsonParser jsonParser) {
        return () -> new Iterator<JsonElement>() {
            private URL nextUrl = startUrl;

            @Override
            public boolean hasNext() {
                return nextUrl != null;
            }

            @Override
            public JsonElement next() {
                if ( nextUrl == null ) {
                    throw new NoSuchElementException();
                }
                try ( final Reader reader = readFrom(nextUrl) ) {
                    final JsonElement root = jsonParser.parse(reader);
                    nextUrl = nextLinkExtrator.apply(root.getAsJsonObject());
                    return root;
                } catch ( final IOException ex ) {
                    throw new RuntimeException(ex);
                }
            }
        };
    }

    // Just a helper method to iterate over the start response
    static Iterable<JsonElement> getAfterwords()
            throws MalformedURLException {
        return jsonRequestsAt(
                afterwordsUrl(),
                root -> {
                    try {
                        final JsonElement next = root.get("Links").getAsJsonObject().get("next");
                        return next != null ? new URL(next.getAsString()) : null;
                    } catch ( final MalformedURLException ex ) {
                        throw new RuntimeException(ex);
                    }
                },
                jsonParser
        );
    }

    // Just extract points and yearbooks.
    // You can return a custom data holder class, but this one uses consuming-style passing the results via its parameter consumer
    static void extractPointsAndYearbooks(final Reader reader, final IPointsAndYearbooksConsumer consumer) {
        final JsonObject expensesJsonObject = jsonParser.parse(reader)
                .getAsJsonObject()
                .get("layers")
                .getAsJsonObject()
                .get("wydatki")
                .getAsJsonObject();
        final SerializedDataPoints points = gson.fromJson(expensesJsonObject.get("punkty").getAsJsonArray(), SerializedDataPoints.class);
        final SerializedDataYears yearbooks = gson.fromJson(expensesJsonObject.get("roczniki").getAsJsonArray(), SerializedDataYears.class);
        consumer.acceptPointsAndYearbooks(points, yearbooks);
    }

    // The same as above but for another type of response
    static void extractTrips(final Reader reader, final ITripsConsumer consumer) {
        final JsonElement tripsJsonElement = jsonParser.parse(reader)
                .getAsJsonObject()
                .get("layers")
                .getAsJsonObject()
                .get("wyjazdy");
        final SerializedDataTrips trips = tripsJsonElement.isJsonArray()
                ? gson.fromJson(tripsJsonElement.getAsJsonArray(), SerializedDataTrips.class)
                : null;
        consumer.acceptTrips(trips);
    }

    // It might be a constant field, but the next methods are dynamic (parameter-dependent), so let them all be similar
    // Checked exceptions are not that evil, and let the call-site decide what to do with them
    static URL afterwordsUrl()
            throws MalformedURLException {
        return new URL("https://api-v3.mojepanstwo.pl/dane/poslowie.json");
    }

    // The same as above
    static URL afterwordsUrl(final int page)
            throws MalformedURLException {
        return new URL("https://api-v3.mojepanstwo.pl/dane/poslowie.json?_type=objects&page=" + page);
    }

    // The same as above
    static URL tripsUrl(final int envoyId)
            throws MalformedURLException {
        return new URL("https://api-v3.mojepanstwo.pl/dane/poslowie/" + envoyId + ".json?layers[]=wyjazdy");
    }

    // The same as above
    static URL expensesUrl(final int envoyId)
            throws MalformedURLException {
        return new URL("https://api-v3.mojepanstwo.pl/dane/poslowie/" + envoyId + ".json?layers[]=wydatki");
    }

    // Since jsonParser is encapsulated
    static JsonElement parseJsonElement(final Reader reader) {
        return jsonParser.parse(reader);
    }

    // A helper method to return a reader for the given URL
    static Reader readFrom(final URL url)
            throws IOException {
        final HttpURLConnection request = (HttpURLConnection) url.openConnection();
        request.connect();
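        // Decode explicitly as UTF-8 instead of relying on the platform default charset
        return new BufferedReader(new InputStreamReader((InputStream) request.getContent(), java.nio.charset.StandardCharsets.UTF_8));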
    }

    // Waits for all futures used in multi-threaded demo
    // Not sure how good this method is since I'm not an expert in concurrent programming unfortunately
    static void waitForAllFutures(final Iterable<? extends Future<?>> futures)
            throws ExecutionException, InterruptedException {
        final Iterator<? extends Future<?>> iterator = futures.iterator();
        while ( iterator.hasNext() ) {
            final Future<?> future = iterator.next();
            future.get();
            iterator.remove();
        }
    }

}
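The pagination idea behind jsonRequestsAt can be tried out in isolation. The following is a minimal sketch that uses only the JDK: the Page class, the pagesFrom and allItems methods, and the in-memory map that plays the role of the remote service are all hypothetical and are not part of the code above. Each call to next() loads exactly one page and remembers the key of the following one, so pages are fetched lazily.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.function.Function;

final class PaginationSketch {

    // Hypothetical page type: some items plus the key of the next page (null on the last page).
    static final class Page {
        final List<String> items;
        final String nextKey;

        Page(final List<String> items, final String nextKey) {
            this.items = items;
            this.nextKey = nextKey;
        }
    }

    // Lazily walks the pages: every next() call loads one page and stores where to go afterwards.
    static Iterable<Page> pagesFrom(final String startKey, final Function<String, Page> loader) {
        return () -> new Iterator<Page>() {
            private String nextKey = startKey;

            @Override
            public boolean hasNext() {
                return nextKey != null;
            }

            @Override
            public Page next() {
                if (nextKey == null) {
                    throw new NoSuchElementException();
                }
                final Page page = loader.apply(nextKey);
                nextKey = page.nextKey;
                return page;
            }
        };
    }

    // Collects the items of all pages, in page order.
    static List<String> allItems(final String startKey, final Function<String, Page> loader) {
        final List<String> result = new ArrayList<>();
        for (final Page page : pagesFrom(startKey, loader)) {
            result.addAll(page.items);
        }
        return result;
    }

    public static void main(final String... args) {
        // An in-memory "service" with three linked pages
        final Map<String, Page> service = new HashMap<>();
        service.put("page1", new Page(Arrays.asList("a", "b"), "page2"));
        service.put("page2", new Page(Arrays.asList("c", "d"), "page3"));
        service.put("page3", new Page(Arrays.asList("e"), null));
        System.out.println(allItems("page1", service::get));
    }
}
```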

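SingleThreadedDemo.java

The simplest demo. All data is pulled in a single thread, so it tends to be the slowest demo here. It is completely thread-safe, has no fields, and can be declared as a singleton.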

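import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;

final class SingleThreadedDemo
        extends AbstractDemo {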

    private static final Callable<List<EnvoyData>> singleThreadedDemo = new SingleThreadedDemo();

    private SingleThreadedDemo() {
    }

    static Callable<List<EnvoyData>> getSingleThreadedDemo() {
        return singleThreadedDemo;
    }

    @Override
    protected List<EnvoyData> doCall()
            throws IOException {
        final List<EnvoyData> envoys = new ArrayList<>();
        for ( final JsonElement afterwordJsonElement : getAfterwords() ) {
            final JsonArray dataObjectArray = afterwordJsonElement.getAsJsonObject().get("Dataobject").getAsJsonArray();
            for ( final JsonElement dataObjectElement : (Iterable<JsonElement>) dataObjectArray::iterator ) {
                final int envoyId = dataObjectElement.getAsJsonObject().get("id").getAsInt();
                try ( final Reader expensesReader = readFrom(expensesUrl(envoyId)) ) {
                    extractPointsAndYearbooks(expensesReader, (points, yearbooks) -> {
                        // ... consume points and yearbooks here
                    });
                }
                try ( final Reader tripsReader = readFrom(tripsUrl(envoyId)) ) {
                    extractTrips(tripsReader, trips -> {
                        // ... consume trips here
                    });
                }
            }
        }
        return envoys;
    }

}

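MultiThreadedDemo.java

Unfortunately, I'm really weak at Java concurrency, and these multi-threaded demos can probably be improved significantly. This semi-multi-threaded demo splits the work in two:

  • a single thread that iterates over the pages;
  • multiple threads that fetch the points, yearbooks, and trips data.

Also note that this demo (and the other multi-threaded demo below) is not fail-safe: if any exception occurs in a submitted task, the executor service's background threads will not terminate properly. So you might want to make it fail-safe and robust yourself.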

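import java.io.Reader;
import java.net.MalformedURLException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;

import static java.util.Collections.synchronizedList;

final class MultiThreadedDemo
        extends AbstractDemo {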

    private final ExecutorService executorService;

    private MultiThreadedDemo(final ExecutorService executorService) {
        this.executorService = executorService;
    }

    static Callable<List<EnvoyData>> getMultiThreadedDemo(final ExecutorService executorService) {
        return new MultiThreadedDemo(executorService);
    }

    @Override
    protected List<EnvoyData> doCall()
            throws InterruptedException, ExecutionException, MalformedURLException {
        final List<EnvoyData> envoys = synchronizedList(new ArrayList<>());
        final Collection<Future<?>> futures = new ConcurrentLinkedQueue<>();
        for ( final JsonElement afterwordJsonElement : getAfterwords() ) {
            final JsonArray dataObjectArray = afterwordJsonElement.getAsJsonObject().get("Dataobject").getAsJsonArray();
            for ( final JsonElement dataObjectElement : (Iterable<JsonElement>) dataObjectArray::iterator ) {
                final int envoyId = dataObjectElement.getAsJsonObject().get("id").getAsJsonPrimitive().getAsInt();
                submitExtractPointsAndYearbooks(futures, envoyId);
                submitExtractTrips(futures, envoyId);
            }
        }
        waitForAllFutures(futures);
        return envoys;
    }

    private void submitExtractPointsAndYearbooks(final Collection<? super Future<?>> futures, final int envoyId) {
        futures.add(executorService.submit(() -> {
            try ( final Reader expensesReader = readFrom(expensesUrl(envoyId)) ) {
                extractPointsAndYearbooks(expensesReader, (points, yearbooks) -> {
                    // ... consume points and yearbooks here
                });
                return null;
            }
        }));
    }

    private void submitExtractTrips(final Collection<? super Future<?>> futures, final int envoyId) {
        futures.add(executorService.submit(() -> {
            try ( final Reader tripsReader = readFrom(tripsUrl(envoyId)) ) {
                extractTrips(tripsReader, trips -> {
                    // ... consume trips here
                });
                return null;
            }
        }));
    }

}
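As mentioned before the class, the multi-threaded demos are not fail-safe. One possible way to make them more robust is sketched below, using only java.util.concurrent. It is not the only option and not necessarily the best one; FailSafeExecutorSketch, runAll, and the 10-second timeout are assumptions made for this example. The idea is that the pool is created and shut down inside one try/finally, so its threads are released whether the tasks succeed or fail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class FailSafeExecutorSketch {

    // Runs all tasks on a fresh pool and always shuts the pool down, even if a task fails.
    static <T> List<T> runAll(final List<Callable<T>> tasks, final int threads)
            throws InterruptedException, ExecutionException {
        final ExecutorService executorService = Executors.newFixedThreadPool(threads);
        try {
            final List<Future<T>> futures = new ArrayList<>();
            for (final Callable<T> task : tasks) {
                futures.add(executorService.submit(task));
            }
            final List<T> results = new ArrayList<>();
            for (final Future<T> future : futures) {
                results.add(future.get()); // rethrows a task failure as ExecutionException
            }
            return results;
        } finally {
            // shutdownNow() interrupts running tasks and drops queued ones.
            // On the success path every task has already completed, so nothing is lost there.
            executorService.shutdownNow();
            executorService.awaitTermination(10, TimeUnit.SECONDS);
        }
    }
}
```

With this shape, the first failing task aborts the whole run: the remaining tasks are cancelled instead of continuing to hit the service in the background.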

MultiThreadedEstimatedPagesDemo.java

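This is an enhanced version of the previous demo: it also submits the iteration over the service pages to the executor service. To do that, the number of pages has to be known in advance, and knowing the page count lets the https://...poslowie.json?...page=... URLs be processed in parallel. Note that if more than one page is found, the loop over the remaining pages starts from page 2, so page 1 is not requested twice.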

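import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

import com.google.gson.JsonArray;
import com.google.gson.JsonElement;
import com.google.gson.JsonIOException;
import com.google.gson.JsonObject;
import com.google.gson.JsonSyntaxException;

import static java.lang.Math.ceil;
import static java.util.Collections.synchronizedList;

final class MultiThreadedEstimatedPagesDemo
        extends AbstractDemo {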

    private final ExecutorService executorService;

    private MultiThreadedEstimatedPagesDemo(final ExecutorService executorService) {
        this.executorService = executorService;
    }

    static Callable<List<EnvoyData>> getMultiThreadedEstimatedPagesDemo(final ExecutorService executorService) {
        return new MultiThreadedEstimatedPagesDemo(executorService);
    }

    @Override
    protected List<EnvoyData> doCall()
            throws IOException, JsonIOException, JsonSyntaxException, InterruptedException, ExecutionException {
        final List<EnvoyData> envoys = synchronizedList(new ArrayList<>());
        final JsonObject page1RootJsonObject;
        final int totalPages;
        try ( final Reader page1Reader = readFrom(afterwordsUrl()) ) {
            page1RootJsonObject = parseJsonElement(page1Reader).getAsJsonObject();
            totalPages = estimateTotalPages(page1RootJsonObject);
        }
        final Collection<Future<?>> futures = new ConcurrentLinkedQueue<>();
        futures.add(executorService.submit(() -> {
            final JsonArray dataObjectArray = page1RootJsonObject.getAsJsonObject().get("Dataobject").getAsJsonArray();
            for ( final JsonElement dataObjectElement : (Iterable<JsonElement>) dataObjectArray::iterator ) {
                final int envoyId = dataObjectElement.getAsJsonObject().get("id").getAsInt();
                submitExtractPointsAndYearbooks(futures, envoyId);
                submitExtractTrips(futures, envoyId);
            }
            return null;
        }));
        for ( int page = 2; page <= totalPages; page++ ) {
            final int finalPage = page;
            futures.add(executorService.submit(() -> {
                try ( final Reader reader = readFrom(afterwordsUrl(finalPage)) ) {
                    final JsonElement afterwordJsonElement = parseJsonElement(reader);
                    final JsonArray dataObjectArray = afterwordJsonElement.getAsJsonObject().get("Dataobject").getAsJsonArray();
                    for ( final JsonElement dataObjectElement : (Iterable<JsonElement>) dataObjectArray::iterator ) {
                        final int envoyId = dataObjectElement.getAsJsonObject().get("id").getAsInt();
                        submitExtractPointsAndYearbooks(futures, envoyId);
                        submitExtractTrips(futures, envoyId);
                    }
                }
                return null;
            }));
        }
        waitForAllFutures(futures);
        return envoys;
    }

    private static int estimateTotalPages(final JsonObject rootJsonObject) {
        final int elementsPerPage = rootJsonObject.get("Dataobject").getAsJsonArray().size();
        final int totalElements = rootJsonObject.get("Count").getAsInt();
        return (int) ceil((double) totalElements / elementsPerPage);
    }

    private void submitExtractPointsAndYearbooks(final Collection<? super Future<?>> futures, final int envoyId) {
        futures.add(executorService.submit(() -> {
            try ( final Reader expensesReader = readFrom(expensesUrl(envoyId)) ) {
                extractPointsAndYearbooks(expensesReader, (points, yearbooks) -> {
                    // ... consume points and yearbooks here
                });
                return null;
            }
        }));
    }

    private void submitExtractTrips(final Collection<? super Future<?>> futures, final int envoyId) {
        futures.add(executorService.submit(() -> {
            try ( final Reader tripsReader = readFrom(tripsUrl(envoyId)) ) {
                extractTrips(tripsReader, trips -> {
                    // ... consume trips here
                });
                return null;
            }
        }));
    }

}
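The estimate in estimateTotalPages is a ceiling division: the total element count divided by the size of the first page, rounded up. It therefore assumes that the first page is a full-sized page. A minimal integer-only sketch of the same arithmetic follows; PageMath and totalPages are hypothetical names, and the counts in the usage note are made-up numbers, not figures from the real service. Unlike the floating-point version, it rejects a page size of zero explicitly, since dividing by zero as a double yields Infinity or NaN, which the cast to int would turn into Integer.MAX_VALUE or 0.

```java
final class PageMath {

    // Ceiling division with integers only: the number of pages needed for totalElements items.
    static int totalPages(final int totalElements, final int elementsPerPage) {
        if (elementsPerPage <= 0) {
            throw new IllegalArgumentException("elementsPerPage must be positive");
        }
        return (totalElements + elementsPerPage - 1) / elementsPerPage;
    }
}
```

For example, with a hypothetical 460 elements and 50 elements per page, totalPages(460, 50) evaluates (460 + 49) / 50 = 509 / 50, which is 10 in integer division, so pages 2 through 10 would be requested in parallel after page 1.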

Test.java

And the demo runner itself:

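import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;

import static java.lang.Runtime.getRuntime;
import static java.util.concurrent.Executors.newFixedThreadPool;

public final class Test {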

    private Test() {
    }

    public static void main(final String... args)
            throws Exception {
        runSingleThreadedDemo();
        runMultiThreadedDemo();
        runMultiThreadedEstimatedPagesDemo();
    }

    private static void runSingleThreadedDemo()
            throws Exception {
        final Callable<?> singleThreadedDemo = SingleThreadedDemo.getSingleThreadedDemo();
        singleThreadedDemo.call();
    }

    private static void runMultiThreadedDemo()
            throws Exception {
        final ExecutorService executorService = newFixedThreadPool(getRuntime().availableProcessors());
        try {
            final Callable<?> demo = MultiThreadedDemo.getMultiThreadedDemo(executorService);
            demo.call();
        } finally {
            // Shut the pool down even if the demo fails, so idle pool threads do not keep the JVM alive
            executorService.shutdown();
        }
    }

    private static void runMultiThreadedEstimatedPagesDemo()
            throws Exception {
        final ExecutorService executorService = newFixedThreadPool(getRuntime().availableProcessors());
        try {
            final Callable<?> demo = MultiThreadedEstimatedPagesDemo.getMultiThreadedEstimatedPagesDemo(executorService);
            demo.call();
        } finally {
            // Shut the pool down even if the demo fails, so idle pool threads do not keep the JVM alive
            executorService.shutdown();
        }
    }

}
