Fetch so many things, at once
Node has the fetch API, which lets us make an HTTP request and get some information from a server. We can use it to make REST calls, get the HTML content of a webpage (if we are using Node for scraping), and much more.
This article is valid for any function that returns a promise.
An example of such a call looks like this:
```js
fetch('/url')
  .then((res) => res.json())
  .then((data) => console.log(data));
```
The Async way
We could do the same thing, using async and await.
```js
const result = await fetch('/url');
const data = await result.json();

console.log(data);

// Or, a one-liner
// const data = await (await fetch('/url')).json(); 😉
```
I have so many things to fetch!
Okay, fine. We can do that with a classic for loop. The order is preserved: each request starts only after the previous one has finished, so we fetch the URLs one after the other, sequentially.
```js
const urls = [...];
for (const url of urls) {
  const result = await fetch(url);
  const data = await result.json();

  console.log(data);
}
```
But what if the order does not matter? We can fetch them all at once. Yes, all at once, using the Promise API. After all, fetch returns a promise, and that's why we await it until it resolves.
The Promise API has the method Promise.all(), which takes an array of promises and can itself be awaited: it resolves once all of them have resolved, with an array of their results.
```js
const urls = [...];
const promises = urls.map((url) => fetch(url));

// Promise.all resolves to an array of responses, one per URL
const responses = await Promise.all(promises);

for (const response of responses) {
  const data = await response.json();
  console.log(data);
}
```
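One detail worth seeing in action: Promise.all() hands back results in the order of the input array, not in the order the promises happen to settle. The sketch below assumes a hypothetical fakeFetch helper (a timer standing in for a real request, not part of any library) so that it runs without a network.

```js
// fakeFetch is a hypothetical stand-in for fetch: it resolves with
// the given value after `ms` milliseconds, so no network is needed.
const fakeFetch = (value, ms) =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

async function fetchInOrder() {
  const settled = []; // records the order in which the promises finish

  const promises = [
    fakeFetch('slow', 30),
    fakeFetch('fast', 10),
  ].map((p) =>
    p.then((value) => {
      settled.push(value);
      return value;
    })
  );

  const results = await Promise.all(promises);
  return { results, settled };
}

fetchInOrder().then(({ results, settled }) => {
  console.log(results); // [ 'slow', 'fast' ] (input order)
  console.log(settled); // [ 'fast', 'slow' ] (completion order)
});
```

So "the order does not matter" applies to when each request finishes; the results can still be matched back to their URLs by index.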
This will save us a lot of time. Imagine we want to parse many webpages, around 100, and each webpage takes 2 seconds to be fetched and scraped for the information we need. If we fetch them one after the other, it will take around 200 seconds, which is over 3 minutes. But if we fetch them all at once, the requests overlap instead of queuing up, and the whole batch can finish in under a minute.
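To get a feel for the difference, here is a minimal sketch that times both approaches. It assumes a hypothetical fakeFetch helper that waits 50 ms instead of making a real request, and the URLs are purely illustrative; real numbers will depend on the network and the server.

```js
// fakeFetch is a hypothetical stand-in for fetch: it resolves with
// a small object after `ms` milliseconds.
const fakeFetch = (url, ms = 50) =>
  new Promise((resolve) => setTimeout(() => resolve({ url }), ms));

const urls = ['/a', '/b', '/c', '/d', '/e']; // illustrative URLs

// One after the other: total time is roughly the sum of all delays.
async function fetchSequentially(list) {
  const start = Date.now();
  for (const url of list) {
    await fakeFetch(url);
  }
  return Date.now() - start;
}

// All at once: total time is roughly the longest single delay.
async function fetchConcurrently(list) {
  const start = Date.now();
  await Promise.all(list.map((url) => fakeFetch(url)));
  return Date.now() - start;
}

async function compare() {
  const sequentialMs = await fetchSequentially(urls);
  const concurrentMs = await fetchConcurrently(urls);
  return { sequentialMs, concurrentMs };
}

compare().then(({ sequentialMs, concurrentMs }) => {
  console.log(`sequential: ~${sequentialMs} ms`);
  console.log(`concurrent: ~${concurrentMs} ms`);
});
```

With five simulated requests of 50 ms each, the sequential run should land around 250 ms and the concurrent one close to 50 ms.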
Like, really SO MANY!
What if we have over 10000 URLs to fetch? If we do the same thing as above, we will most probably not make it: firing that many requests at once tends to end in a weird "socket hang up" error. What can we do about it?
There is a Node package called Bluebird, which has its own Promise API that works the same way. It has a method called map, which takes the array, a mapper function, and an extra options argument where we can set concurrency.
Promise.map(urls, (url) => fetch(url), { concurrency: 100 });
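As we can infer from the line, this keeps at most 100 requests running at a time and starts the next one only when an earlier one finishes. This takes a significant load off the machine, since it never has to juggle thousands of open connections at once.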
```js
const Promise = require('bluebird').Promise;
const urls = [...];

// At most 100 fetches are in flight at any time
const responses = await Promise.map(
  urls,
  (url) => fetch(url),
  { concurrency: 100 }
);

for (const response of responses) {
  const data = await response.json();
  console.log(data);
}
```
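If pulling in a package is not an option, the idea behind a concurrency limit can be sketched by hand. The mapWithConcurrency helper below is hypothetical and is not how Bluebird implements it; it only illustrates the technique of running a fixed number of "workers" that each pick up the next URL as soon as they are free. The fakeFetch function and the URLs are likewise made up so the example runs without a network.

```js
// mapWithConcurrency is a hypothetical helper, not Bluebird's code.
// It runs `mapper` over `items` with at most `limit` calls in flight.
async function mapWithConcurrency(items, mapper, limit) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to pick up

  // Each worker keeps taking the next unclaimed item until none remain.
  async function worker() {
    while (next < items.length) {
      const index = next++;
      results[index] = await mapper(items[index], index);
    }
  }

  const workerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}

// Demo with a timer instead of a real request.
let inFlight = 0;
let maxInFlight = 0;
const fakeFetch = async (url) => {
  inFlight++;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 10));
  inFlight--;
  return `data from ${url}`;
};

const urls = Array.from({ length: 10 }, (_, i) => `/page/${i}`);

const demo = mapWithConcurrency(urls, fakeFetch, 3).then((results) => {
  console.log(results.length); // 10
  console.log(maxInFlight); // never more than 3
  return results;
});
```

Results are stored by index, so they come back in the same order as the input even though the requests finish at different times.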
Thanks for making it till the end.
Keep on Hacking! ✌
By Aravind Balla, a Javascript Developer building things to solve problems faced by him & his friends. You should hit him up on Twitter!