
.push is not a function in web crawler

I am writing a Node.js web crawler class and have run into the following error: this.textInvertedIndex[word].push is not a function. On further inspection, I realised that for some reason this.textInvertedIndex[word] held a native function, function Object() { [native code] }, rather than an array. For the first few iterations everything seemed fine: console logging this.textInvertedIndex showed an object of arrays. Then this error suddenly occurred. Is there any part of the code where I am implicitly rewriting textInvertedIndex?

Here is the relevant class:

function Crawler(queue, maxIndexSize) {
  this.queue = queue;
  this.maxIndexSize = maxIndexSize;

  this.findChunks = () => {
    let currentChunk;
    let minimumDistance = Infinity;

    for (let i = 1; i <= this.maxIndexSize; i++) {
      if (this.maxIndexSize % i === 0) {
        const newDistance = Math.abs(i - 30);

        if (newDistance < minimumDistance) {
          minimumDistance = newDistance;
          currentChunk = i;
        } else {
          return currentChunk
        };
      };
    };
  };

  this.chunks = this.findChunks();
  this.chunkSize = this.maxIndexSize / this.chunks;
  this.totalWordOccurances = {};
  this.imageInvertedIndex = {};
  this.textInvertedIndex = {};
  this.images = [];
  this.sites = [];
  this.seen = {};

  this.write = (url, html) => {
    const documentId = this.sites.length;
    const website = new Website(url, html);
    const title = website.title();
    const content = website.content(title);
    const words = content.filter(item => typeof item !== "object");
    const wordsLength = words.length;
    const query = new Query(words);
    const individualWords = query.individualize(words);

    this.seen[url] = true;

    this.sites.push({
      url,
      title,
      description: website.description()
    });

    for (const word of individualWords) {
      const normalizedTf = query.count(word) / wordsLength;
      const textInvertedIndexEntry = {
        documentId,
        normalizedTf
      };

      if (this.textInvertedIndex[word]) {
        this.textInvertedIndex[word].push(textInvertedIndexEntry);
      } else {
        this.textInvertedIndex[word] = [textInvertedIndexEntry];
      };

      if (this.totalWordOccurances[word]) {
        this.totalWordOccurances[word] += 1;
      } else {
        this.totalWordOccurances[word] = 1;
      };
    };

    for (let i = 0; i < content.length; i++) {
      const item = content[i];

      if (typeof item === "object") {
        const imageId = this.images.length;

        this.images.push(item);

        for (const word of individualWords) {
          const imageScore = getImageScore(i, word, content);
          const imageInvertedIndexEntry = {
            imageId,
            imageScore
          };

          if (this.imageInvertedIndex[word]) {
            this.imageInvertedIndex[word].push(imageInvertedIndexEntry);
          } else {
            this.imageInvertedIndex[word] = [imageInvertedIndexEntry];
          };
        };
      };
    };
  };

  this.crawl = async () => {
    while (this.sites.length !== this.maxIndexSize) {
      let nextQueue = [];
      const websitesUnfiltered = await Promise.all(this.queue.map((url) => {
        const website = new Website(url);

        return website.request();
      }));
      const websitesToAdd = this.maxIndexSize - this.sites.length;
      let websites = websitesUnfiltered.filter(message => message !== "Failure")
                                       .slice(0, websitesToAdd);
      
      for (const site of websites) {
        const url = site.url;
        const htmlCode = site.htmlCode;
        const website = new Website(url, htmlCode);

        this.write(url, htmlCode);

        nextQueue = nextQueue.concat(website.urls());
      };

      nextQueue = new Query(nextQueue.filter(url => !this.seen[url]))
                                      .individualize();
      this.queue = nextQueue;
    };
  };
};

It is called like this:

const crawler = new Crawler(["https://stanford.edu/"], 25000000);
crawler.crawl();

this.textInvertedIndex = {}; defines a plain Object, and a plain Object has no push method. You can change it to an array by defining it as this.textInvertedIndex = [];. Otherwise, you can add key/value entries to the object as it is currently defined, like this: this.textInvertedIndex[key] = value;
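A minimal sketch of that distinction, assuming the index is meant to stay an object keyed by word with an array of entries per key (as the question describes). The variable names and the sample key are illustrative only and do not come from the crawler:

```javascript
// A plain object has no push method of its own or on its prototype.
const sampleIndex = {};
const objectPushType = typeof sampleIndex.push; // "undefined"

// Key/value assignment is how entries are added to an object.
sampleIndex["stanford"] = [];

// The value stored under the key is an array, so push is available on it.
sampleIndex["stanford"].push({ documentId: 0, normalizedTf: 0.5 });

// An array, by contrast, has push directly, but it is indexed by position,
// not by word.
const sampleList = [];
sampleList.push("stanford");
```

With this layout, push is only ever valid on sampleIndex[key], and only when that value really is an array.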

It turns out the problem was the key used in this.textInvertedIndex[word]: word was constructor. Every plain object inherits a constructor property from Object.prototype, so this.textInvertedIndex["constructor"] returns the built-in Object function. That value is truthy, so the if check passes and the code calls .push on a function instead of creating a new array. To solve this problem, convert all object keys to uppercase, so that constructor becomes CONSTRUCTOR. The built-in property names all contain lowercase letters, so uppercase keys never collide with them.
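As a rough sketch of what goes wrong, and of two common alternatives to upper-casing the keys: the snippet below first reproduces the inherited lookup, then stores entries in a Map, and finally in a prototype-less object guarded by an own-property check. The helper names addToMapIndex and addToObjectIndex are hypothetical and are not part of the original crawler.

```javascript
// Reproduction: every plain object inherits "constructor" from Object.prototype.
const plainIndex = {};
const inherited = plainIndex["constructor"]; // the built-in Object function
const inheritedIsTruthy = Boolean(inherited); // true, so the push branch runs
const inheritedPushType = typeof inherited.push; // "undefined"

// Option A (hypothetical helper): a Map only contains keys that were set.
function addToMapIndex(index, word, entry) {
  const postings = index.get(word);
  if (postings) {
    postings.push(entry);
  } else {
    index.set(word, [entry]);
  }
}

// Option B (hypothetical helper): keep an object, but test own properties only.
function addToObjectIndex(index, word, entry) {
  if (Object.prototype.hasOwnProperty.call(index, word)) {
    index[word].push(entry);
  } else {
    index[word] = [entry];
  }
}

const mapIndex = new Map();
addToMapIndex(mapIndex, "constructor", { documentId: 0, normalizedTf: 0.5 });
addToMapIndex(mapIndex, "constructor", { documentId: 1, normalizedTf: 0.25 });

// Object.create(null) has no prototype, so nothing is inherited at all.
const objectIndex = Object.create(null);
addToObjectIndex(objectIndex, "constructor", { documentId: 0, normalizedTf: 0.5 });
addToObjectIndex(objectIndex, "toString", { documentId: 0, normalizedTf: 0.1 });
```

Either option keeps the original words as keys, which may matter if the index is later queried with lowercase terms.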


 