Can Distill detect updates/changes in multi page websites?

I’m trying to detect changes/updates to forms uploaded by authorities on their websites. On some websites, the forms may span multiple pages. If a form on, say, page 2 is updated, will Distill be able to detect the change and inform me about it?

Hi @scsb

Firstly, welcome to Distill’s forums!
Yes, this should be feasible.

There are usually 2 different scenarios:

  1. If the 2nd page of the form has a unique URL, you can add it simply as a new monitor in your watchlist
  2. In the scenario that this is not the case i.e. the URL remains the same upon navigating to the page, then you can use Macros to record the steps required to reach the page & subsequently monitor the changes to the page.

Please feel free to reach out to us, if you have any additional questions.

Best Regards,
Surya

Thank you @surya, I shall give that a go.

Meanwhile, I’ve thought of another question which I hope you can help me with.

If the form names shown on the website remain unchanged after the authorities update a form in the background, and no date of update is indicated on the website, will Distill be able to detect that changes have been made to that particular form? In other words, can Distill detect changes that are not visible on the website?

Hi @scsb

Distill’s tracking is not restricted to just the text content. For instance, we can track changes to attributes like href.
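To illustrate the idea of tracking an attribute rather than visible text, here is a minimal Python sketch. It is not Distill's implementation; the two HTML snapshots, the file names, and the `LinkCollector` helper are all made up for illustration. It shows a case where the link text stays the same while the href changes.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (link text, href) pairs from an HTML snippet."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only record text while we are inside an <a> that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


def extract_links(html):
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


# Hypothetical snapshots: the visible form name is identical, only the file changed.
before = '<ul><li><a href="/forms/application-v1.pdf">Application form</a></li></ul>'
after = '<ul><li><a href="/forms/application-v2.pdf">Application form</a></li></ul>'

old_links = extract_links(before)
new_links = extract_links(after)

text_changed = [t for t, _ in old_links] != [t for t, _ in new_links]
href_changed = [h for _, h in old_links] != [h for _, h in new_links]
print("text changed:", text_changed)
print("href changed:", href_changed)
```

Note that this kind of comparison only helps when the file's URL changes. If a file is replaced at exactly the same URL, neither the text nor the href differs.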

Is this helpful? If not, can you elaborate on the exact problem that you are trying to address?

Best Regards,
Surya

Thank you @surya

I tried to create a macro earlier but was unable to do so, as there was an error message stating “Upgrade account to create and use macros in cloud”.

Can you please confirm whether the macro feature is only available to paying users?

Hi @scsb

Yes, macros are currently available to our paying customers. You can find all our plans here.

Best,
Surya

Hi Surya,

I am trying to track the changes to the forms that are available on How to Apply for Dust Disease Care | icare through the href. However, the href option is not available in the drop down (refer attached screenshot). In this instance, what would be the best way for me to track the changes?

Also I am trying to track the changes to the land registry forms on Forms Library - NSW Land Registry Services
I noticed that in order to track the forms, I will need to expand the relevant row that bears the form heading and select the relevant forms and only one row can be expanded at a time. Will Distill be able to track the changes to the forms if I expand a certain row, then click Select Element at the bottom left and then select the relevant form (and select href), then pause Select Element, move on to expand another row, then click Select Element again to select another form?

Your assistance in this regard would be greatly appreciated.

Thank you.

Hi @scsb

For the website: How to Apply for Dust Disease Care | icare
Please try expanding the selection as shown in the screenshot below; you should get the anchor tag & the corresponding href attribute:

In the case of Forms Library - NSW Land Registry Services, I believe if you select the forms which you’d like to track & the corresponding href attributes, it should work. Please feel free to reach out, if you have any challenges & we’ll be happy to help you out.

Best regards,
Surya

Hi Surya,

Thank you for your assistance. After expanding the selection in the iCare page, I managed to view and select the href attribute, however when I expand the webpage in the watchlist, I could see a few huge pdf images (refer image). Is there a way for me to get rid of those pdf images?

Thank you Surya.

Meanwhile, I have also tried to select the forms in Forms Library - NSW Land Registry Services. When I landed on the website, I sorted the forms by form number and selected the forms that I wish to track within each page. However, after I saved the webpage, I realised that the forms that appear in the webpage dropdown are different from the ones I selected (refer image below). Would you be able to advise how I can properly track the forms that I have selected?

Hi @scsb

Can you please export the JSON for the corresponding monitor & share it with me?
Please feel free to take a look at this video on how to export the JSON configuration.
I’d like to take a look at the configuration so that I can suggest the best way to address this challenge.

Cheers,
Surya

Hi @scsb

I believe that in this scenario you will have to use a macro if you need to track the sorted list of forms.
If you notice, when you click the sort button, the URL doesn’t change, and hence tracking on this page will have to be set up using a macro.

Best,
Surya

Hi Surya,

Thank you for your reply.

Here’s the JSON for the Dust Diseases Form Monitor:

{"client":{"local":1},"data":[{"name":"NSW Forms / Dust Diseases Forms (7/6)","uri":"https://www.icare.nsw.gov.au/injured-or-ill-people/work-related-dust-disease/make-an-application","config":"{\"selections\":[{\"frames\":[{\"index\":0,\"excludes\":[],\"includes\":[{\"type\":\"xpath\",\"expr\":\"(//main[@id='main']/section[contains(@class,'cm')]/ul//a[contains(@class,'link-item')])[1]\",\"fields\":[{\"name\":\"text\",\"type\":\"builtin\"},{\"type\":\"attribute\",\"name\":\"href\"}]},{\"type\":\"xpath\",\"expr\":\"(//main[@id='main']/section[contains(@class,'cm')]/ul//a[contains(@class,'link-item')])[2]\",\"fields\":[{\"name\":\"text\",\"type\":\"builtin\"},{\"type\":\"attribute\",\"name\":\"href\"}]},{\"type\":\"xpath\",\"expr\":\"(//main[@id='main']/section[contains(@class,'cm')]/ul//a[contains(@class,'link-item')])[3]\",\"fields\":[{\"name\":\"text\",\"type\":\"builtin\"},{\"type\":\"attribute\",\"name\":\"href\"}]},{\"type\":\"xpath\",\"expr\":\"(//main[@id='main']/section[contains(@class,'cm')]/ul//a[contains(@class,'link-item')])[4]\",\"fields\":[{\"name\":\"text\",\"type\":\"builtin\"},{\"type\":\"attribute\",\"name\":\"href\"}]},{\"type\":\"xpath\",\"expr\":\"(//main[@id='main']/section[contains(@class,'cm')]/ul//a[contains(@class,'link-item')])[6]\",\"fields\":[{\"name\":\"text\",\"type\":\"builtin\"},{\"type\":\"attribute\",\"name\":\"href\"}]},{\"type\":\"xpath\",\"expr\":\"(//main[@id='main']/section[contains(@class,'cm')]/ul//a[contains(@class,'link-item')])[7]\",\"fields\":[{\"name\":\"text\",\"type\":\"builtin\"},{\"type\":\"attribute\",\"name\":\"href\"}]},{\"type\":\"xpath\",\"expr\":\"(//main[@id='main']/section[contains(@class,'cm')]/ul//a[contains(@class,'link-item')])[5]\",\"fields\":[{\"name\":\"text\",\"type\":\"builtin\"},{\"type\":\"attribute\",\"name\":\"href\"}]}]}],\"dynamic\":true,\"delay\":2}],\"ignoreEmptyText\":true,\"includeStyle\":false,\"dataAttr\":\"text\"}","tags":null,"content_type":2,"state":40,"schedule":"{\"type\":\"INTERVAL\",\"params\":{\"interval\":84688}}","ts":"2023-06-07T07:57:34.855Z","datasource_id":null}]}

Hi Surya,

I set up a monitor a few days ago to track changes to a number of Local Court forms on https://www.localcourt.nsw.gov.au/forms-and-fees/forms.html#Criminal1. Earlier, I received an alert and noted that some of the forms I selected have changed to other forms. Aren’t the trackers set to the specific documents selected? I would be grateful if you could clarify the reason behind the change.

Thank you.

Hi Surya,

I’ve also received another alert for the District Court Forms. As seen below, I only selected 8 forms to track; however, another form has now been added (the one highlighted in green). I am not sure why this is the case.

Hi @scsb

This can potentially arise because the order in which the forms are displayed on the page may have changed.
You may have to modify the selector configuration to ensure you always track those specific links of interest.
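As a rough illustration of why a position-based selector can drift, here is a small Python sketch. The form names, URLs, and both helper functions are hypothetical and do not reflect how Distill evaluates selectors internally. It compares picking a link by its position with picking it by its own text after a new entry is inserted at the top of the list.

```python
# Hypothetical link lists as they might appear on the page on two different days.
# Form names and URLs are invented for illustration.
day1 = [
    ("Application notice", "/forms/app-notice.pdf"),
    ("Court attendance notice", "/forms/can.pdf"),
    ("Subpoena", "/forms/subpoena.pdf"),
]
# The site inserts a new form at the top, shifting every later position by one.
day2 = [("Annulment application", "/forms/annulment.pdf")] + day1


def select_by_position(links, position):
    """Mimic a selector like (//a)[position]; XPath positions are 1-based."""
    return links[position - 1]


def select_by_name(links, name):
    """Mimic a selector keyed on the link's own text."""
    for text, href in links:
        if name in text:
            return (text, href)
    return None


# The positional selector silently points at a different form on day 2.
print(select_by_position(day1, 2), "->", select_by_position(day2, 2))
# The text-based selector keeps following the same form.
print(select_by_name(day1, "Court attendance"), "->", select_by_name(day2, "Court attendance"))
```

A selector tied to something stable about the link itself (its text or a form code) tends to survive reordering, whereas an index such as `[2]` only describes where the link happened to be when it was selected.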

Cheers,
Surya

Hi Surya,

Thanks for your reply.

How should I modify the configuration to ensure that the specific links are tracked?

Hi Surya,

I hope you’ve been well.

As I have been working on other projects, I haven’t been able to update the webpage selection criteria for the websites in my watchlist so that specific links are tracked (so that the forms I want to track won’t become untracked just because the arrangement on the source website has changed). I recall from our previous conversation that you suggested I use the web app for better tracking selectors. I tried to do so today, but I note that the selectors are similar to the ones created through the Google Chrome extension (refer screenshot). Could I trouble you to share once again the steps for improving the selection criteria? Thank you.

Hi @scsb

The selectors are a configuration associated with the monitor.

In this case, the most likely scenario is that you originally created the monitor using the web app in the cloud and switched it to local monitoring (using the extension).

Is there a way to uniquely identify the rows that you’d like to track/monitor?
Assuming the form code number does not change, you can try using an XPath as follows:

//*[contains(text(),'form code')]//../td

The above XPath can be broken down as follows:
a. Select the element whose text contains the form code
b. The ‘//..’ steps up to the parent of that element (the table row, if the form code sits in a td)
c. The ‘/td’ selects the td elements among that parent’s children

Note that without an index, the above selector matches every td element in the table row. If you need only the 2nd td element, you’ll have to modify the XPath as follows:

//*[contains(text(),'form code')]//../td[2]

Please make sure to replace ‘form code’ with the actual code of each form you want to track.
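For readers who want to try the idea outside Distill, here is a minimal Python sketch of locating a table row by its form code and then picking a cell by position. The table, the form codes, and the `cells_for_form` helper are hypothetical. Because the standard library's ElementTree supports only a subset of XPath (it has no `contains()`), the matching is done in plain Python rather than with the XPath itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical table; form codes, titles and URLs are invented for illustration.
TABLE = """
<table>
  <tr><td>01T</td><td>Transfer</td><td><a href="/forms/01T.pdf">PDF</a></td></tr>
  <tr><td>05M</td><td>Mortgage</td><td><a href="/forms/05M.pdf">PDF</a></td></tr>
</table>
"""


def cells_for_form(root, form_code):
    """Return the <td> cells of the first row that has a cell containing form_code."""
    for row in root.iter("tr"):
        cells = row.findall("td")
        if any(form_code in (cell.text or "") for cell in cells):
            return cells
    return []


root = ET.fromstring(TABLE)
cells = cells_for_form(root, "05M")

# Text of the first two cells in the matching row.
print([cell.text for cell in cells[:2]])
# Python lists are 0-based, so cells[1] corresponds to td[2] in XPath.
print(cells[1].text)
# The href of the link in the third cell is what a monitor would watch.
print(cells[2].find("a").get("href"))
```

Keying the lookup on the form code means the same row is found even if rows above it are added, removed, or reordered.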
Hope that helps.

Cheers,
Surya

Hi Surya,

Thank you for your reply.

I tried selecting the elements and changing the selector type to XPath, and it seems to be selecting everything in the table. I believe I am supposed to replace the text ‘page-content’ with the actual form code, but I am not sure what the form code is in this instance. Could you please clarify further?