
Puppeteer - Update Content Of A Scraped Website

I want to create a CLI script using Node.js to scrape the content of a sports results website that does not provide an API. I know how to manage the content scraping, but I have a d

Solution 1:

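Here is a simplified example. The script opens https://time.is/ and logs the time to the console every time the site's clock element changes. It relies on two pieces: page.exposeFunction(), which makes a Node.js function callable from inside the page, and MutationObserver, which runs a callback in the page whenever the observed DOM node changes.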

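import puppeteer from 'puppeteer';

// Top-level await requires an ES module (.mjs file or "type": "module").
const browser = await puppeteer.launch();

try {
  // Reuse the blank tab that the browser opens on launch.
  const [page] = await browser.pages();

  await page.goto('https://time.is/');
  // Make the Node.js function updateTime callable in the page as window.updateTime.
  await page.exposeFunction('updateTime', updateTime);

  await page.evaluate(() => {
    const clock = document.querySelector('#clock0_bg');
    if (!clock) throw new Error('Clock element #clock0_bg not found');

    // Watch the element and its descendants for any kind of change.
    const config = { subtree: true, childList: true, attributes: true, characterData: true };
    const callback = () => { window.updateTime(clock.innerText); };
    const observer = new MutationObserver(callback);
    observer.observe(clock, config);
  });
  // The browser is deliberately left open so the observer keeps reporting changes.
} catch (err) {
  console.error(err);
  // On failure, close the browser so the script does not hang.
  await browser.close();
}

// Runs in Node.js each time the page calls window.updateTime().
function updateTime(time) {
  console.log(time);
}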
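
Because the observer above also watches attributes and descendant nodes, the callback can fire several times while the visible text stays the same, so the same value may be logged more than once. One possible way to filter that out on the Node.js side is sketched below. The helper name onlyOnChange and the sample time strings are assumptions made for illustration; they are not part of Puppeteer or of the original answer.

```javascript
// Hypothetical helper (not a Puppeteer API): wraps a handler so that it
// only runs when the reported value differs from the previous one.
function onlyOnChange(handler) {
  let last; // last value passed through; undefined until the first call
  return (value) => {
    if (value === last) return; // same text as before, ignore it
    last = value;
    handler(value);
  };
}

// Usage sketch: collect values in an array instead of logging each one,
// so the effect of the filter is easy to see.
const seen = [];
const dedupedUpdateTime = onlyOnChange((time) => seen.push(time));

dedupedUpdateTime('12:00:01');
dedupedUpdateTime('12:00:01'); // duplicate, e.g. from an attribute-only mutation
dedupedUpdateTime('12:00:02');

console.log(seen); // [ '12:00:01', '12:00:02' ]
```

In the script above, the wrapped function could be passed to page.exposeFunction('updateTime', ...) in place of updateTime, with console.log as the handler. The comparison is a plain string equality check, which should be enough for a clock or a score line but would need adjusting for structured data.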
