Using Promise.all() to fetch a list of urls with await statements

tl;dr - if you have to filter the promises (say, for errored ones), don't use async functions.

I'm trying to fetch a list of urls with async and parse them. The problem is that if there's an error with one of the urls while I'm fetching (say, for some reason the API endpoint doesn't exist), the program crashes during parsing with the obvious error:

UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): TypeError: ext is not iterable

I've tried checking whether res.json() is undefined, but obviously that's not it, as it complains about the entire 'ext' array of promises not being iterable.

async function fetchAll() {
  let data
  let ext
  try {
    data = await Promise.all(urls.map(url=>fetch(url)))
  } catch (err) {
    console.log(err)
  }
  try {
    ext = await Promise.all(data.map(res => {
      if (res.json()==! 'undefined') { return res.json()}
    }))
  } catch (err) {
    console.log(err)
  }
  for (let item of ext) {
    console.log(ext)
  }
}

Question 1:

How do I fix the above so it won't crash on an invalid address?

Question 2:

My next step is to write the extracted data to the database. Assuming a data size of 2-5mgb of content, is my approach of using Promise.all() memory efficient? Or would it be more memory efficient (and otherwise better) to write a for loop which handles each fetch, writes to the database in the same iteration, and only then handles the next fetch?

Your code has several fundamental problems. We should address those in order, and the first is that you're not passing in any URLs!

async function fetchAll(urls) {
  let data
  let ext
  try {
    data = await Promise.all(urls.map(url=>fetch(url)))
  } catch (err) {
    console.log(err)
  }
  try {
    ext = await Promise.all(data.map(res => {
      if (res.json()==! 'undefined') { return res.json()}
    }))
  } catch (err) {
    console.log(err)
  }
  for (let item of ext) {
    console.log(ext)
  }
}

Second, you have several try/catch blocks around DEPENDENT DATA. They should all be in a single try/catch block:

async function fetchAll(urls) {
  try {
    let data = await Promise.all(urls.map(url=>fetch(url)))
    let ext = await Promise.all(data.map(res => {
      // also fixed the ==! 'undefined'
      if (res.json() !== undefined) { return res.json()}
    }))
    for (let item of ext) {
      console.log(item)
    }
  } catch (err) {
    console.log(err)
  }
}

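Next is the problem that res.json() always returns a promise (one that resolves to the parsed body, if the body is valid JSON), so comparing its return value with undefined tells you nothing. A response body can also only be read once, so the second res.json() call on the same response fails: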

if (res.json() !== undefined) { return res.json()}

This is not how you should be using the .json() method. It will fail if there is no parsable JSON. You should be putting a .catch on it:

async function fetchAll(urls) {
  try {
    let data = await Promise.all(urls.map(url => fetch(url).catch(err => err)))
    let ext = await Promise.all(data.map(res => res.json ? res.json().catch(err => err) : res))
    for (let item of ext) {
      console.log(item)
    }
  } catch (err) {
    console.log(err)
  }
}

Now when it cannot fetch a URL or parse the JSON, you'll get the error back as a value and it will cascade down without throwing. Your try/catch block will now ONLY catch something if a different error happens.

Of course this means we're putting an error handler on each promise and cascading the error, but that's not exactly a bad thing, as it allows ALL of the fetches to happen and lets you distinguish which fetches failed. That is a lot better than just having a generic handler for all fetches and not knowing which one failed.

But now we have it in a form where we can see that there are some further optimizations that can be made to the code:

async function fetchAll(urls) {
  try {
    let ext = await Promise.all(
      urls.map(url => fetch(url)
        .then(r => r.json())
        .catch(error => ({ error, url }))
      )
    )
    for (let item of ext) {
      console.log(item)
    }
  } catch (err) {
    console.log(err)
  }
}

Now, with a much smaller footprint, better error handling, and readable, maintainable code, we can decide what we eventually want to return. The function can live anywhere and be reused, and all it takes is a single array of simple GET URLs.

The next step is to do something with the results, so we probably want to return the array, which will be wrapped in a promise. Realistically we want any remaining error to bubble up, since we've already handled each fetch error, so we should also remove the try/catch. At that point making it async no longer helps, and actively harms. Eventually we get a small function that groups all URL resolutions, or errors, with their respective URLs, which we can easily filter over, map over, and chain!

function fetchAll(urls) {
  return Promise.all(
    urls.map(url => fetch(url)
      .then(r => r.json())
      .then(data => ({ data, url }))
      .catch(error => ({ error, url }))
    )
  )
}

Now we get back an array of similar objects, each with the url it fetched and either a data or an error field! This makes chaining and inspecting SUPER easy.
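
As a rough sketch of what that chaining could look like, the snippet below splits the results into successes and failures. The `fakeFetch` stub, the `partition` helper and the example URLs are made up for illustration so the snippet runs without a network, and `fetchAll` takes the fetch implementation as a second parameter only for that reason.

```javascript
// Same shape as the fetchAll above; fetchImpl is an added, hypothetical
// parameter so the example can run against a stub instead of the network.
function fetchAll(urls, fetchImpl = fetch) {
  return Promise.all(
    urls.map(url => fetchImpl(url)
      .then(r => r.json())
      .then(data => ({ data, url }))
      .catch(error => ({ error, url }))
    )
  )
}

// Hypothetical stand-in for fetch: URLs containing "bad" reject.
const fakeFetch = url => url.includes('bad')
  ? Promise.reject(new Error(`cannot reach ${url}`))
  : Promise.resolve({ json: () => Promise.resolve({ from: url }) })

// Split the results by which field is present.
function partition(results) {
  return {
    ok: results.filter(r => !('error' in r)),
    failed: results.filter(r => 'error' in r),
  }
}

fetchAll(['https://example.com/a', 'https://example.com/bad'], fakeFetch)
  .then(partition)
  .then(({ ok, failed }) => {
    ok.forEach(r => console.log('fetched', r.url, r.data))
    failed.forEach(r => console.log('failed', r.url, r.error.message))
  })
```

Because every element carries its own url, the failed entries could be retried or logged without having to track array indexes.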

Instead of fetch(url) on line 5, make your own function, customFetch, which calls fetch but maybe returns null, or an error object, instead of throwing.

Something like:

async function customFetch(url) {
    try {
       let result = await fetch(url);
       if (result.json) return await result.json();
    }
    catch(e) {return e}
}
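
One thing to keep in mind with this approach: `fetch` only rejects on network-level failures. An HTTP error status such as 404 still resolves, with `response.ok` set to `false`. A hedged variant of the same idea that also treats those responses as failures might look like the following. The name `safeFetchJson`, the returned `{ ok, ... }` shape, the `fetchImpl` parameter and the `stubFetch` helper are all illustrative assumptions, not part of the answer above.

```javascript
// Never rejects: resolves to { ok: true, url, data } or { ok: false, url, error }.
// fetchImpl is a hypothetical parameter that makes the function easy to stub.
async function safeFetchJson(url, fetchImpl = fetch) {
  try {
    const response = await fetchImpl(url)
    if (!response.ok) {
      // fetch resolved, but the server answered with an error status
      throw new Error(`HTTP ${response.status} for ${url}`)
    }
    return { ok: true, url, data: await response.json() }
  } catch (error) {
    return { ok: false, url, error }
  }
}

// Stub used only to make the sketch runnable offline.
const stubFetch = async url => url.endsWith('/missing')
  ? { ok: false, status: 404, json: async () => { throw new SyntaxError('not JSON') } }
  : { ok: true, status: 200, json: async () => ({ hello: 'world' }) }

Promise.all(
  ['https://example.com/api', 'https://example.com/missing']
    .map(url => safeFetchJson(url, stubFetch))
).then(results => {
  for (const r of results) {
    console.log(r.url, r.ok ? r.data : r.error.message)
  }
})
```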

Regarding question 1, please refer to this:

Handling errors in Promise.all

Promise.all is all or nothing. It resolves once all promises in the array resolve, or rejects as soon as one of them rejects. In other words, it either resolves with an array of all resolved values, or rejects with a single error.
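
In runtimes that support it (it was standardized in ES2020), `Promise.allSettled()` is one way around the all-or-nothing behaviour: it waits for every promise and reports each outcome instead of rejecting on the first failure. A minimal illustration with hand-made promises, where `tasks`, `viaAll` and `viaAllSettled` are names chosen only for this sketch:

```javascript
const tasks = [
  Promise.resolve('first'),
  Promise.reject(new Error('second failed')),
  Promise.resolve('third'),
]

// Promise.all rejects as soon as any input rejects...
const viaAll = Promise.all(tasks).catch(err => `rejected: ${err.message}`)

// ...whereas Promise.allSettled always fulfills, with one record per input.
const viaAllSettled = Promise.allSettled(tasks).then(outcomes =>
  outcomes.map(o => o.status === 'fulfilled' ? o.value : `failed: ${o.reason.message}`)
)

viaAll.then(console.log)        // logs: rejected: second failed
viaAllSettled.then(console.log) // logs: [ 'first', 'failed: second failed', 'third' ]
```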

if (res.json()==! 'undefined')

This makes no sense whatsoever, and res.json() is an asynchronous function anyway. Remove that condition and just return res.json():

try {
    ext = await Promise.all(data.map(res => res.json()))
} catch (err) {
    console.log(err)
}

Whether or not your approach is "best" or "memory efficient" is up for debate. Ask another question for that.

You are getting a TypeError: ext is not iterable because ext is still undefined when you caught an error and did not assign an array to it. Trying to loop over it will then throw an exception that you do not catch.
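
The failure mode can be reproduced in isolation. In the sketch below the rejected promise is only a stand-in for a failed fetch, and the second try/catch exists purely to show which error the loop throws:

```javascript
async function demo() {
  let ext
  try {
    // Stand-in for a Promise.all over fetches where one request fails.
    ext = await Promise.all([Promise.reject(new Error('fetch failed'))])
  } catch (err) {
    console.log('caught:', err.message)
  }
  // ext was never assigned, so iterating over it throws a TypeError.
  try {
    for (const item of ext) console.log(item)
  } catch (err) {
    console.log(err instanceof TypeError, err.message)
    return err
  }
}

demo()
```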

I guess you're looking for:

async function fetchAll() {
  try {
    const data = await Promise.all(urls.map(url => fetch(url)));
    const ext = await Promise.all(data.map(res => res.json()));
    for (let item of ext) {
      console.log(item)
    }
  } catch (err) {
    console.log(err)
  }
}

You can have fetch and json not fail by catching the error and returning a special Fail object that you will filter out later:

function Fail(reason){this.reason=reason;};
const isFail = o => (o&&o.constructor)===Fail;
const isNotFail = o => !isFail(o);
const fetchAll = () =>
  Promise.all(
    urls.map(
      url=>
        fetch(url)
       .then(response=>response.json())
       .catch(error=>new Fail([url,error]))
    )
  );

//how to use:
fetchAll()
.then(
  results=>{
    const successes = results.filter(isNotFail);
    const fails = results.filter(isFail);

    fails.forEach(
      e=>console.log(`failed url:${e.reason[0]}, error:`,e.reason[1])
    )
  }
)

As for question 2:

Depending on how many urls you have, you may want to throttle your requests, and if the urls come from a large file (gigabytes) you can use a stream combined with the throttle.
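
A minimal sketch of such throttling, assuming a fixed cap on in-flight requests is what is wanted. `mapWithLimit`, `limit`, the `worker` callback and `fakeWorker` are hypothetical names; in the scenario from the question the callback would fetch one URL and write it to the database before the next one is picked up.

```javascript
// Runs worker(item) for every item with at most `limit` calls in flight.
// Results keep input order; a failing item is recorded, not thrown.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length)
  let next = 0

  async function runner() {
    // Each runner pulls the next unclaimed index until none are left.
    while (next < items.length) {
      const i = next++
      try {
        results[i] = { item: items[i], value: await worker(items[i]) }
      } catch (error) {
        results[i] = { item: items[i], error }
      }
    }
  }

  const runners = Array.from({ length: Math.min(limit, items.length) }, runner)
  await Promise.all(runners)
  return results
}

// Demo with a fake worker so it runs offline; a real one might be
// url => fetch(url).then(r => r.json()).then(saveToDb)  (saveToDb is hypothetical).
const fakeWorker = async n => {
  await new Promise(resolve => setTimeout(resolve, 10))
  if (n === 3) throw new Error('boom')
  return n * 2
}

mapWithLimit([1, 2, 3, 4, 5], 2, fakeWorker).then(results => console.log(results))
```

With a limit of 1 this degenerates into the sequential for loop described in the question, so the same helper covers both ends of the memory/throughput trade-off.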

async function fetchAll(url) {
    return Promise.all(
      url.map(
        async (n) => fetch(n).then(r => r.json())
      )
    );
 }

 fetchAll([...])
   .then(d => console.log(d))
   .catch(e => console.error(e));

Will this work for you?

If you don't depend on every resource being a success, I would go back to basics and skip async/await.

I would process each fetch individually so I could catch the error for just the one that fails:

function fetchAll() {
  const result = []
  const que = urls.map(url => 
    fetch(url)
    .then(res => res.json())
    .then(item => {
      result.push(item)
    })
    .catch(err => {
      // couldn't fetch the resource, or the
      // response was not a JSON response
    })
  )

  return Promise.all(que).then(() => result)
}

Something good @TKoL said:

Promise.all errors whenever one of the internal promises errors, so whatever advice anyone gives you here, it will boil down to this: make sure that you wrap the promises in an error handler before passing them to Promise.all.
