Splitting a string into a multidimensional array

I have a list of strings. I want to check whether each string contains a specific word and, if it does, split the string into its words and add them to an associative array.

myString = ['RT @Arsenal: Waiting for the international', 'We’re hungry for revenge @_nachomonreal on Saturday\'s match and aiming for a strong finish']

wordtoFind = ['@Arsenal']       

I want to loop through wordtoFind and, if a word is in myString, split that string into individual words and create an object like this:

newWord = {'@Arsenal': [{RT: 1}, {Waiting: 1}, {for: 1}, {the: 1}, {international: 1}]}

for(z=0; z <wordtoFind.length; z++){
  for ( i = 0 ; i < myString.length; i++) {
    if (myString[i].indexOf(wordtoFind[z].key) > -1){
      myString[i].split(" ")
    }
  }
}

Something like this should work; it also counts the number of occurrences of each word in a sentence. JavaScript does not have associative arrays like PHP, for instance. It only has objects and numerically indexed arrays:

var myString = ['RT @Arsenal: Waiting for the international', 'We’re hungry for revenge @_nachomonreal on Saturday\'s match and aiming for a strong finish'];

var wordtoFind = ['@Arsenal'];

var result = {};

for(var i = 0, l = wordtoFind.length; i < l; i++) {

    for(var ii = 0, ll = myString.length; ii < ll; ii++) {
        if(myString[ii].indexOf(wordtoFind[i]) !== -1) {
            var split = myString[ii].split(' ');
            var resultpart = {};
            for(var iii = 0, lll = split.length; iii < lll; iii++) {
                if(split[iii] !== wordtoFind[i]) {
                    if(!resultpart.hasOwnProperty(split[iii])) {
                      resultpart[split[iii]] = 0;
                    }
                    resultpart[split[iii]]++;
                }
            }
            result[wordtoFind[i]] = resultpart;
        }
    }
}

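console.log(result);
// '@Arsenal:' (with the colon) is counted too, because it is not strictly equal to '@Arsenal'
//{"@Arsenal":{"RT":1,"@Arsenal:":1,"Waiting":1,"for":1,"the":1,"international":1}}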
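Since the answer points out that JavaScript has no PHP-style associative arrays, here is a minimal sketch of the same per-sentence counting done with a `Map` (available from ES2015 on) instead of a plain object. The helper name `countWords` is hypothetical and not part of the original answer.

```javascript
// Hypothetical helper: count the words of one sentence in a Map,
// skipping the search word itself.
function countWords(sentence, skipWord) {
    var counts = new Map();
    sentence.split(' ').forEach(function (word) {
        if (word === '' || word === skipWord) {
            return; // ignore empty tokens and the search word
        }
        counts.set(word, (counts.get(word) || 0) + 1);
    });
    return counts;
}

var counts = countWords('RT @Arsenal: Waiting for the international', '@Arsenal');
console.log(counts.get('for'));       // 1
console.log(counts.has('@Arsenal:')); // true: the trailing colon makes it a different token
```

A `Map` has no inherited keys, so words such as `constructor` need no `hasOwnProperty` guard.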

This method makes use of the forEach function and callbacks. The containsWord function keeps a plain for loop for now to cut down on callbacks; this can obviously be changed.

var myString = [
    'RT @Arsenal: Waiting for the international',
    'We’re hungry for revenge @_nachomonreal on Saturday\'s match and aiming for a strong finish',
    '@Arsenal: one two three four two four three four three four'
];

var wordtoFind = ['@Arsenal'];

// define the preprocessor that is used before the equality check
function preprocessor(word) {
    return word.replace(':', '');
}

function findOccurences(array, search, callback, preprocessor) {
    var result = {};
    var count = 0;
    // calculate the maximum iterations
    var max = search.length * array.length;
    // iterate the search strings that should be matched
    search.forEach(function(needle) {
        // iterate the array of strings that should be searched in
        array.forEach(function(haystack) {
            if (containsWord(haystack, needle, preprocessor)) {
                var words = haystack.split(' ');
                // iterate every word to count the occurences and write them to the result
                words.forEach(function(word) {
                    countOccurence(result, needle, word);
                })
            }
            count++;
            // once every iteration finished, call the callback
            if (count == max) {
                callback && callback(result);
            }
        });
    });
}

function containsWord(haystack, needle, preprocessor) {
    var words = haystack.split(' ');
    for (var i = 0; i < words.length; i++) {
        var word = words[i];
        // preprocess a word before it's compared
        if (preprocessor) {
            word = preprocessor(word);
        }
        // if it matches return true
        if (word === needle) {
            return true;
        }
    }
    return false;
}

function countOccurence(result, key, word) {
    // add a plain object (used as a word -> count map) if it doesn't exist yet
    if (!result.hasOwnProperty(key)) {
        result[key] = {};
    }
    var entry = result[key];
    // set the count to 0 if it doesn't exist yet
    if (!entry.hasOwnProperty(word)) {
        entry[word] = 0;
    }
    entry[word]++;
}

// call our function to find the occurrences
findOccurences(myString, wordtoFind, function(result) {
    // do something with the result
    console.log(result);
}, preprocessor);

// output:
/*
 { '@Arsenal':
   { RT: 1,
    '@Arsenal:': 2,
    Waiting: 1,
    for: 1,
    the: 1,
    international: 1,
    one: 1,
    two: 2,
    three: 3,
    four: 4 } }
 */

Feel free to ask any questions if the answer needs clarification.

I hope this fits your needs.
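Because `forEach` runs synchronously, the callback and the iteration counter are not strictly required. The following is a rough sketch of a return-based variant, not the answer's own code: the names `findOccurrencesSync` and `stripTrailingPunctuation` are hypothetical, and it assumes that words should be normalised before both matching and counting.

```javascript
// Hypothetical preprocessor: drop trailing punctuation such as ':' or ','.
function stripTrailingPunctuation(word) {
    return word.replace(/[:;,.!?]+$/, '');
}

// Hypothetical synchronous variant: returns the result instead of using a callback.
function findOccurrencesSync(sentences, searchWords, preprocess) {
    var hasOwn = Object.prototype.hasOwnProperty;
    var result = {};
    searchWords.forEach(function (needle) {
        sentences.forEach(function (sentence) {
            // normalise every word once, before matching and counting
            var words = sentence.split(' ').map(function (word) {
                return preprocess ? preprocess(word) : word;
            });
            if (words.indexOf(needle) === -1) {
                return; // this sentence does not contain the search word
            }
            if (!hasOwn.call(result, needle)) {
                result[needle] = {};
            }
            var counts = result[needle];
            words.forEach(function (word) {
                if (word === '' || word === needle) {
                    return; // skip empty tokens and the search word itself
                }
                if (!hasOwn.call(counts, word)) {
                    counts[word] = 0;
                }
                counts[word]++;
            });
        });
    });
    return result;
}

var counted = findOccurrencesSync(
    ['RT @Arsenal: Waiting for the international', '@Arsenal: one two two'],
    ['@Arsenal'],
    stripTrailingPunctuation
);
console.log(JSON.stringify(counted));
// {"@Arsenal":{"RT":1,"Waiting":1,"for":1,"the":1,"international":1,"one":1,"two":2}}
```

Normalising before counting means `@Arsenal:` no longer shows up as its own entry, which may or may not be what you want.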

You're on the right track. You just need to store the split string in the associative array variable.

var assocArr = [];
for (var z = 0; z < wordtoFind.length; z++) {
     for (var i = 0; i < myString.length; i++) {
         if (myString[i].indexOf(wordtoFind[z]) > -1){

             myString[i].split(" ").forEach(function(word){
                 assocArr.push(word);
             });

         }
     }
}

I think the key problem that got you stuck is the data structure. A more suitable structure would look something like this:

{
    '@Arsenal': [
        {RT: 1, Waiting: 1, for: 1, the: 1, international: 1},
        {xxx: 1, yyy: 1, zzz: 3}, // several strings in 'myString' may contain the same '@Arsenal'
        {slkj: 1, sldjfl: 2, lsdkjf: 1} // maybe more
    ],
    someOtherWord: [
        {},
        {},
        ....
    ]
}

And the code:

var result = {};

//This function will return an object like {RT:1, Waiting:1, for:1, the:1, international:1}.
function calculateCount(string, key) {
    var wordCounts = {};
    string.split(" ").forEach(function (word) {
        if (word !== key) {
            if (wordCounts[word] === undefined) wordCounts[word] = 1;
            else wordCounts[word]++;
        }
    });
    return wordCounts;
}

//For each 'word to find' and each string that contain the 'word to find', push in that returned object {RT:1, Waiting:1, for:1, the:1, international:1}.
wordtoFind.forEach(function (word) {
    var current = result[word] = [];
    myString.forEach(function (str) {
        if (str.indexOf(word) > -1) {
            current.push(
                calculateCount(str, word)
            );
        }
    });
});
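To see what this structure looks like in practice, here is a self-contained run of the same idea. The second input sentence is a made-up example, and `calculateCount` is restated here only so the snippet runs on its own.

```javascript
var myString = [
    'RT @Arsenal: Waiting for the international',
    '@Arsenal: one two two' // hypothetical extra sentence for illustration
];
var wordtoFind = ['@Arsenal'];

// Count every word of one sentence, except the search word itself.
function calculateCount(string, key) {
    var wordCounts = {};
    string.split(' ').forEach(function (word) {
        if (word !== key) {
            if (wordCounts[word] === undefined) wordCounts[word] = 1;
            else wordCounts[word]++;
        }
    });
    return wordCounts;
}

// One array per search word, one count object per matching sentence.
var result = {};
wordtoFind.forEach(function (word) {
    var current = result[word] = [];
    myString.forEach(function (str) {
        if (str.indexOf(word) > -1) {
            current.push(calculateCount(str, word));
        }
    });
});

console.log(JSON.stringify(result));
// {"@Arsenal":[{"RT":1,"@Arsenal:":1,"Waiting":1,"for":1,"the":1,"international":1},{"@Arsenal:":1,"one":1,"two":2}]}
```

Note that `@Arsenal:` still appears in each count object, because the token with the trailing colon is not strictly equal to the search word; strip punctuation first if that is unwanted.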
