What is a page tracker?

A page tracker is a utility that empowers developers to detect and monitor the content of any web page. Use cases range from ensuring that the deployed web application loads only the intended content throughout its lifecycle to tracking changes in arbitrary web content when the application lacks native tracking capabilities. In the event of a change, whether it's caused by a broken deployment or a legitimate content modification, the tracker promptly notifies the user.

NOTE

Currently, Secutils.dev doesn't support tracking content for web pages protected by application firewalls (WAF) or any form of CAPTCHA. If you require tracking content for such pages, please comment on #secutils/34 to discuss your use case.

On this page, you can find guides on creating and using page trackers.

note

The Content extractor script is essentially a Playwright scenario that allows you to extract almost anything from the web page as long as it doesn't exceed 1MB in size. For instance, you can include text, links, images, or even JSON.

Create a page tracker

In this guide, you'll create a simple page tracker for the top post on Hacker News:

Navigate to Web Scraping → Page trackers and click Track page button
Configure a new tracker with the following values:

Name

Hacker News Top Post

Frequency

Manually

Content extractor

export async function execute(page) {
  // Navigate to the Hacker News homepage.
  await page.goto('https://news.ycombinator.com/');

  // Get the link to the top post.
  const titleLink = page.locator('css=.titleline a').first();

  // Return the title and link of the top post formatted as markdown.
  return `[${(await titleLink.textContent()).trim()}](${await titleLink.getAttribute('href')})`;
};

Click the Save button to save the tracker
Once the tracker is set up, it will appear in the trackers grid
Expand the tracker's row and click the Update button to run it for the first time

After a few seconds, the tracker will fetch the content of the top post on Hacker News and display it below the tracker's row. The content includes only the title of the post. However, as noted at the beginning of this guide, the content extractor script allows you to return almost anything, even the entire HTML of the post.

Watch the video demo below to see all the steps mentioned earlier in action:

Detect changes with a page tracker

In this guide, you'll create a page tracker and test it with changing content:

Navigate to Web Scraping → Page trackers and click Track page button
Configure a new tracker with the following values:

Name

World Clock

Frequency

Hourly

Content extractor

export async function execute(page) {
  // Navigate to the Berlin world clock page.
  await page.goto('https://www.timeanddate.com/worldclock/germany/berlin');

  // Wait for the time element to be visible and get its value.
  const time = await page.locator('css=#qlook #ct').textContent();

  // Return the time formatted as markdown with a link to the world clock page.
  return `Berlin time is [**${time}**](https://www.timeanddate.com/worldclock/germany/berlin)`;
};

Click the Save button to save the tracker
Once the tracker is set up, it will appear in the trackers grid with bell and timer icons, indicating that the tracker is configured to regularly check content and send notifications when changes are detected
Expand the tracker's row and click the Update button to make the first snapshot of the web page content
After a few seconds, the tracker will fetch the current Berlin time and render a nice markdown with a link to a word clock website:

EXAMPLE

Berlin time is 01:02:03

With this configuration, the tracker will check the content of the web page every hour and notify you if any changes are detected.

Watch the video demo below to see all the steps mentioned earlier in action:

Track web page resources

You can also use page tracker utility to detect and track resources of any web page. This functionality falls under the category of synthetic monitoring tools and helps ensure that the deployed application loads only the intended web resources (JavaScript and CSS) during its lifetime. If any unintended changes occur, which could result from a broken deployment or malicious activity, the tracker will promptly notify developers or IT personnel about the detected anomalies.

Additionally, security researchers who focus on discovering potential vulnerabilities in third-party web applications can use page trackers to be notified when the application's resources change. This allows them to identify if the application has been upgraded, providing an opportunity to re-examine it and potentially discover new vulnerabilities.

EXAMPLE

Extracting all page resources isn't as straightforward as it might seem, so it's recommended to use the utilities provided by Secutils.dev, as demonstrated in the examples in the following sections. Utilities return CSS and JS resource descriptors with the following interfaces:

/**
 * Describes external or inline resource.
 */
interface WebPageResource {
  /**
   * Resource type, either 'script' or 'stylesheet'.
   */
  type: 'script' | 'stylesheet';

  /**
   * The URL resource is loaded from.
   */
  url?: string;

  /**
   * Resource content descriptor (size and digest), if available.
   */
  content: WebPageResourceContent;
}

/**
 * Describes resource content.
 */
interface WebPageResourceContent {
  /**
   * Resource content data.
   */
  data: { raw: string } | { tlsh: string } | { sha1: string };

  /**
   * Describes resource content data, it can either be the raw content data or a hash such as Trend Micro Locality
   * Sensitive Hash or simple SHA-1.
   */
  size: number;
}

In this guide, you'll create a simple page tracker to track resources of the Hacker News:

Navigate to Web Scraping → Page trackers and click Track page button
Configure a new tracker with the following values:

Name

Hacker News (resources)

Content extractor

export async function execute(page, { previousContent }) {
  // Load built-in utilities for tracking resources.
  const { resources: utils } = await import(`data:text/javascript,${encodeURIComponent(
    await (await fetch('https://secutils.dev/retrack/utilities.js')).text()
  )}`);

  // Start tracking resources.
  utils.startTracking(page);

  // Navigate to the target page.
  await page.goto('https://news.ycombinator.com');
  await page.waitForTimeout(1000);

  // Stop tracking and return resources.
  const resources = await utils.stopTracking(page);

  // Format resources as a table, 
  // showing diff status if previous content is available.
  return utils.formatAsTable(
    previousContent
      ? utils.setDiffStatus(previousContent.original.source, resources)
      : resources
  );
};

Click the Save button to save the tracker
Once the tracker is set up, it will appear in the trackers grid
Expand the tracker's row and click the Update button to make the first snapshot of the web page resources

It's hard to believe, but as of the time of writing, Hacker News continues to rely on just a single script and stylesheet!

Watch the video demo below to see all the steps mentioned earlier in action:

Filter web page resources

In this guide, you will create a page tracker for the GitHub home page and learn how to track only specific resources:

Navigate to Web Scraping → Page trackers and click Track page button
Configure a new tracker with the following values:

Name

GitHub

Content extractor

export async function execute(page, { previousContent }) {
  // Load built-in utilities for tracking resources.
  const { resources: utils } = await import(`data:text/javascript,${encodeURIComponent(
    await (await fetch('https://secutils.dev/retrack/utilities.js')).text()
  )}`);

  // Start tracking resources.
  utils.startTracking(page);

  // Navigate to the target page.
  await page.goto('https://github.com');
  await page.waitForTimeout(1000);

  // Stop tracking and return resources.
  const resources = await utils.stopTracking(page);

  // Format resources as a table, 
  // showing diff status if previous content is available.
  return utils.formatAsTable(
    previousContent
      ? utils.setDiffStatus(previousContent.original.source, resources)
      : resources
  );
};

Click the Save button to save the tracker
Once the tracker is set up, it will appear in the trackers grid
Expand the tracker's row and click the Update button to make the first snapshot of the web page resources
Once the tracker has fetched the resources, they will appear in the resources grid. You'll notice that there are nearly 100 resources used for the GitHub home page! In the case of large and complex pages like this one, it's recommended to have multiple separate trackers, e.g. one per logical functionality domain, to avoid overwhelming the developer with too many resources and consequently changes they might need to track. Let's say we're only interested in "vendored" resources.
To filter out all resources that are not "vendored", we'll adjust content extractor script. Click the pencil icon next to the tracker's name to edit the tracker and update the following properties:

Content extractor

export async function execute(page, { previousContent }) {
  // Load built-in utilities for tracking resources.
  const { resources: utils } = await import(`data:text/javascript,${encodeURIComponent(
    await (await fetch('https://secutils.dev/retrack/utilities.js')).text()
  )}`);

  // Start tracking resources.
  utils.startTracking(page);

  // Navigate to the target page.
  await page.goto('https://github.com');
  await page.waitForTimeout(1000);

  // Stop tracking and return resources.
  const allResources = await utils.stopTracking(page);

  // Filter out all resources that are not "vendored".
  const resources = {
    scripts: allResources.scripts.filter((resource) => resource.url?.includes('vendors')),
    styles: allResources.styles.filter((resource) => resource.url?.includes('vendors')),
  };

  // Format resources as a table,
  // showing diff status if previous content is available.
  return utils.formatAsTable(
    previousContent
      ? utils.setDiffStatus(previousContent.original.source, resources)
      : resources
  );
};

Now, click the Save button to save the tracker.
Click the Update button to re-fetch web page resources. Once the tracker has re-fetched resources, only about half of the previously extracted resources will appear in the resources grid.

Watch the video demo below to see all the steps mentioned earlier in action:

Detect changes in web page resources

In this guide, you will create a page tracker and test it using a custom HTML responder:

First, navigate to Webhooks → Responders and click Create responder button
Configure a few responders with the following values to emulate JavaScript files that we will track changes for across revisions:

This JavaScript will remain unchanged across revisions:

Name	`no-changes.js`
Path	`/no-changes.js`
Headers	`Content-Type: application/javascript; charset=utf-8`
Body	`document.body.insertAdjacentHTML( 'beforeend', 'Source: no-changes.js<br>' );`

This JavaScript will change across revisions:

Name	`changed.js`
Path	`/changed.js`
Headers	`Content-Type: application/javascript; charset=utf-8`
Body	`document.body.insertAdjacentHTML( 'beforeend', 'Source: changed.js, Changed: no<br>' );`

This JavaScript will be removed across revisions:

Name	`removed.js`
Path	`/removed.js`
Headers	`Content-Type: application/javascript; charset=utf-8`
Body	`document.body.insertAdjacentHTML( 'beforeend', 'Source: removed.js<br>' );`

This JavaScript will be added in a new revision:

Name	`added.js`
Path	`/added.js`
Headers	`Content-Type: application/javascript; charset=utf-8`
Body	`document.body.insertAdjacentHTML( 'beforeend', 'Source: added.js<br>' );`

Now, configure a new responder with the following values to respond with a simple HTML page that references previously created JavaScript responders (except for added.js):

Name	`track-me.html`
Path	`/track-me.html`
Headers	`Content-Type: text/html; charset=utf-8`
Body	`<!DOCTYPE html> <html lang="en"> <head> <title>Evaluate resources tracker</title> <script type="text/javascript" src="./no-changes.js" defer></script> <script type="text/javascript" src="./changed.js" defer></script> <script type="text/javascript" src="./removed.js" defer></script> </head> <body></body> </html>`

Click the Save button to save the responder
Once the responder is set up, it will appear in the responders grid along with its unique URL
Click on the responder's URL and make sure that it renders the following content:

Source: no-changes.js
Source: changed.js, Changed: no
Source: removed.js

Now, navigate to Web Scraping → Page trackers and click Track page button
Configure a new tracker for track-me.html responder with the following values:

Name	`Demo`
URL	`https://[YOUR UNIQUE ID].webhooks.secutils.dev/track-me.html`
Frequency	`Daily`
Notifications	`☑`
Content extractor	export async function execute(page, { previousContent }) { // Load built-in utilities for tracking resources. const { resources: utils } = await import(`data:text/javascript,${encodeURIComponent( await (await fetch('https://secutils.dev/retrack/utilities.js')).text() )}`); // Start tracking resources. utils.startTracking(page); // Navigate to the target page // Replace `[YOUR UNIQUE ID]` with your unique handle!. await page.goto('https://[YOUR UNIQUE ID].webhooks.secutils.dev/track-me.html'); await page.waitForTimeout(1000); // Stop tracking, and return resources. const resources = await utils.stopTracking(page); // Format resources as a table, // showing diff status if previous content is available. return utils.formatAsTable( previousContent ? utils.setDiffStatus(previousContent.original.source, resources) : resources ); };

TIP

Configured tracker will fetch the resources of the track-me.html responder once a day and notify you if any changes are detected. You can change the frequency and notification settings to suit your needs.

Click the Save button to save the tracker
Once the tracker is set up, it will appear in the trackers grid
Expand the tracker's row and click the Update button to make the first snapshot of the web page resources
Once the tracker has fetched the resources, they will appear in the resources grid:

Source	Diff	Type	Size
`https://[YOUR UNIQUE ID].webhooks.secutils.dev/no-change.js`	-	Script	81
`https://[YOUR UNIQUE ID].webhooks.secutils.dev/changed.js`	-	Script	91
`https://[YOUR UNIQUE ID].webhooks.secutils.dev/removed.js`	-	Script	78

Now, navigate to Webhooks → Responders and edit track-me.html responder to reference added.js responder, and remove reference to removed.js:

<!DOCTYPE html>
<html lang="en">
<head>
  <title>Evaluate resources tracker</title>
  <script type="text/javascript" src="./no-changes.js" defer></script>
  <script type="text/javascript" src="./changed.js" defer></script>
- <script type="text/javascript" src="./removed.js" defer></script>
+ <script type="text/javascript" src="./added.js" defer></script>
</head>
<body></body>
</html>

Next, change the body of the changed.js responder to something like this:

document.body.insertAdjacentHTML(
  'beforeend',
- 'Source: changed.js, Changed: no<br>'
+ 'Source: changed.js, Changed: yes<br>'
);

Finally, navigate to Web Scraping → Page trackers and expand the Demo tracker's row
Click Update button to fetch the next revision of the web page resources
Once the tracker has fetched updated resources, they will appear in the resources grid together with the diff status:

Source	Diff	Type	Size
`https://[YOUR UNIQUE ID].webhooks.secutils.dev/no-change.js`	-	Script	81
`https://[YOUR UNIQUE ID].webhooks.secutils.dev/changed.js`	Changed	Script	91
`https://[YOUR UNIQUE ID].webhooks.secutils.dev/added.js`	Added	Script	76
`https://[YOUR UNIQUE ID].webhooks.secutils.dev/removed.js`	Removed	Script	78

Annex: Content extractor script examples

In this section, you can find examples of content extractor scripts that extract various content from web pages. Essentially, the script defines a function with the following signature:

/**
 * Content extractor script that extracts content from a web page.
 * @param page - The Playwright Page object representing the web page.
 * See more details at https://playwright.dev/docs/api/class-page.
 * @param context.previousContent - The context extracted during 
 * the previous execution, if available.
 * @returns {Promise<unknown>} - The extracted content to be tracked.
 */
export async function execute(
  page: Page,
  context: { previousContent?: { original: unknown } }
)

Track markdown-style content

The script can return any valid markdown-style content that Secutils.dev will happily render in preview mode.

export async function execute() {
  return `
    ## Text
    ### h3 Heading
    #### h4 Heading
    
    **This is bold text**
    
    *This is italic text*
    
    ~~Strikethrough~~
    
    ## Lists
    
    * Item 1
    * Item 2
      * Item 2a
    
    ## Code
    
    \`\`\` js
    const foo = (bar) => {
      return bar++;
    };
    
    console.log(foo(5));
    \`\`\`
    
    ## Tables
    
    | Option   | Description   |
    | -------- | ------------- |
    | Option#1 | Description#1 |
    | Option#2 | Description#2 |
    
    ## Links
    
    [Link Text](https://secutils.dev)
    
    ## Emojies
    
    :wink: :cry: :laughing: :yum:
  `;
}

Track API response

You can use page tracker to track API responses as well (until dedicated API tracker utility is released). For instance, you can track the response of the JSONPlaceholder API:

NOTE

Ensure that the web page from which you're making a fetch request allows cross-origin requests. Otherwise, you'll get an error.

export async function execute() {
  const {url, method, headers, body} = {
    url: 'https://jsonplaceholder.typicode.com/posts',
    method: 'POST',
    headers: {'Content-Type': 'application/json; charset=UTF-8'},
    body: JSON.stringify({title: 'foo', body: 'bar', userId: 1}),
  };
  const response = await fetch(url, {method, headers, body});
  return {
    status: response.status,
    headers: Object.fromEntries(response.headers.entries()),
    body: (await response.text()) ?? '',
  };
}

Use previous content

In the content extract script, you can use the context.previousContent.original property to access the content extracted during the previous execution:

export async function execute(page, { previousContent }) {
  // Update counter based on the previous content.
  return (previousContent?.original ?? 0) + 1;
}

Use external content extractor script

Sometimes, your content extractor script can become large and complicated, making it hard to edit in the Secutils.dev UI. In such cases, you can develop and deploy the script separately in any development environment you prefer. Once the script is deployed, you can just use URL as the script content :

// This code assumes your script exports a function named `execute` function.
https://secutils-dev.github.io/secutils-sandbox/content-extractor-scripts/markdown-table.js

You can find more examples of content extractor scripts at the Secutils.dev Sandbox repository.

Annex: Custom cron schedules

NOTE

Custom cron schedules are available only for Pro subscription users.

In this section, you can learn more about the supported cron expression syntax used to configure custom tracking schedules. A cron expression is a string consisting of six or seven subexpressions that describe individual details of the schedule. These subexpressions, separated by white space, can contain any of the allowed values with various combinations of the allowed characters for that subexpression:

Subexpression	Mandatory	Allowed values	Allowed special characters
`Seconds`	Yes	0-59	* / , -
`Minutes`	Yes	0-59	* / , -
`Hours`	Yes	0-23	* / , -
`Day of month`	Yes	1-31	* / , - ?
`Month`	Yes	0-11 or JAN-DEC	* / , -
`Day of week`	Yes	1-7 or SUN-SAT	* / , - ?
`Year`	No	1970-2099	* / , -

Following the described cron syntax, you can create almost any schedule you want as long as the interval between two consecutive checks is longer than 10 minutes. Below are some examples of supported cron expressions:

Expression	Meaning
`0 0 12 * * ?`	Run at 12:00 (noon) every day
`0 15 10 ? * *`	Run at 10:15 every day
`0 15 10 * * ?`	Run at 10:15 every day
`0 15 10 * * ? *`	Run at 10:15 every day
`0 15 10 * * ? 2025`	Run at 10:15 every day during the year 2025
`0 0/10 14 * * ?`	Run every 10 minutes from 14:00 to 14:59, every day
`0 10,44 14 ? 3 WED`	Run at 14:10 and at 14:44 every Wednesday in March
`0 15 10 ? * MON-FRI`	Run at 10:15 from Monday to Friday
`0 11 15 8 10 ?`	Run every October 8 at 15:11

To assist you in creating custom cron schedules, Secutils.dev lists five upcoming scheduled times for the specified schedule:

Secutils.dev UI - Custom schedule

Create a page tracker​

Detect changes with a page tracker​

Track web page resources​

Filter web page resources​

Detect changes in web page resources​

Annex: Content extractor script examples​

Track markdown-style content​

Track API response​

Use previous content​

Use external content extractor script​

Annex: Custom cron schedules​

Create a page tracker

Detect changes with a page tracker

Track web page resources

Filter web page resources

Detect changes in web page resources

Annex: Content extractor script examples

Track markdown-style content

Track API response

Use previous content

Use external content extractor script

Annex: Custom cron schedules