JavaScript is taking too long to extract a unique array out of a (multiline) string
I have a huge data set that is loaded into a pre tag, as shown below.
00:00:00 INFO SERVER-SYSTEM - Cmd Line Arg: sysName = SERVER
00:00:01 INFO SERVER-SYSTEM - Cmd Line Arg: resultsDirName = github
00:00:02 INFO SERVER-SYSTEM - Cmd Line Arg: Device4Branch = //github/server_manager01/test1
00:00:02 FAIL SERVER-SYSTEM - Cmd Line Arg: testCase = server_manager01
00:00:03 INFO SERVER-SYSTEM - Cmd Line Arg: timestamp_style = RELATIVE
00:00:04 INFO SERVER-SYSTEM - Cmd Line Arg: token = 36
00:00:04 FAIL SERVER-SYSTEM - Cmd Line Arg: Campaign = True
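There will be around 30,000+ lines, and I want to store the unique words in an array. Below is the code that gets the data from the div containing this pre tag and stores the unique, space-separated words in an array.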
pre_data = document.getElementById("data_div").innerHTML.split('\n');
var words = [];
var reg = new RegExp("\\S*", "ig");
for (x = 0; x < pre_data.length; x++) {
    words = words.concat(pre_data[x].match(reg));
}
// Remove the empty strings that the \S* pattern also matches
filtered_data = words.filter(function (el) {
    return el != '';
});
// A Set keeps only unique values
unique_data = Array.from(new Set(filtered_data));
But with 30,000+ lines this takes 10+ seconds. What would be a more efficient way to get the result faster?
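I'm not sure whether this is what you're looking for, but this is how I would do it. The w => w callback passed to filter drops the empty strings left over by split(' ').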
PS: You don't have to create 3 variables like I did; you can chain the array functions and create a single variable if you prefer.
var linesArray = document.getElementById('my-pre').innerHTML.split('\n');
var words = linesArray.reduce((acc, cur) => {
  return acc.concat(cur.split(' ').filter(w => w));
}, []);
var uniqueWords = words.filter((value, index, self) => self.indexOf(value) === index);
console.log(uniqueWords);
<pre id="my-pre">
Hello Stackoverflow
I'm having some issues with extracting a unique array out of my multiline string
this is a test to see if the the array is unique enough
this is a test to see if the the array is unique enough
this is a test to see if the the array is unique enough
</pre>
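For very large inputs, a likely source of the slowdown is that concat builds a new array on every iteration, and indexOf rescans the array for every word. As a rough sketch of an alternative (not part of the original answer), the whole text can be split on whitespace once and fed into a Set. The helper name uniqueWords and the sample string are assumptions made for illustration; the commented browser line assumes the data_div id from the question and reads textContent rather than innerHTML so that no HTML markup or escaped entities end up in the words.

```javascript
// Hypothetical helper, not from the original post: collects the unique
// whitespace-separated words of a multiline string in a single pass.
function uniqueWords(text) {
  const seen = new Set();
  // split(/\s+/) breaks on any run of spaces, tabs, or newlines at once,
  // so no per-line loop or repeated concat is needed.
  for (const word of text.split(/\s+/)) {
    // Leading or trailing whitespace produces an empty string; skip it.
    if (word !== '') seen.add(word);
  }
  // A Set preserves insertion order, so words appear in first-seen order.
  return Array.from(seen);
}

// In the browser (element id assumed from the question):
// const unique_data = uniqueWords(document.getElementById('data_div').textContent);

// Small sample built from two of the log lines in the question.
const sample =
  '00:00:00 INFO SERVER-SYSTEM - Cmd Line Arg: sysName = SERVER\n' +
  '00:00:02 FAIL SERVER-SYSTEM - Cmd Line Arg: testCase = server_manager01\n';
console.log(uniqueWords(sample));
```

Each word is inserted into the Set once, so the work should grow roughly linearly with the size of the text instead of quadratically.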