.halguru-webscraping.yaml

Configuration for a website crawler or scraper. It defines the starting URL, the maximum crawl level, the maximum number of pages to process, per-page configuration, and the URL prefixes that select which URLs are processed.

StartUrl: https://www.example.com
MaxLevel: 5
MaxPages: 5
Pages: []
UrlsStartWith: []

Properties

Name	Type	Required	Description
StartUrl	Url	✔️	The starting URL for the website.
MaxLevel	Integer	✔️	The maximum level (link depth) to process within the website.
MaxPages	Integer	✔️	The maximum number of pages to process for the website.
Pages	List	✔️	The collection of per-page configurations for the website.
UrlsStartWith	List	✔️	The collection of URL prefixes used to filter and process relevant URLs.
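
To illustrate how these properties might interact, here is a minimal Python sketch of a breadth-first crawl plan bounded by `MaxLevel`, `MaxPages`, and `UrlsStartWith`. It is not the tool's actual implementation. The function `plan_crawl`, the `get_links` callback, the choice to count the start page as level 0, and the rule that an empty `UrlsStartWith` list disables prefix filtering are all assumptions made for this example.

```python
from collections import deque
from typing import Callable, Iterable


def plan_crawl(config: dict, get_links: Callable[[str], Iterable[str]]) -> list[str]:
    """Return the URLs a breadth-first crawl would visit under the given limits.

    `get_links` is a hypothetical callback that returns the links found on a page.
    """
    prefixes = tuple(config["UrlsStartWith"])

    def allowed(url: str) -> bool:
        # Assumption: an empty prefix list means "no prefix filtering".
        return not prefixes or url.startswith(prefixes)

    start = config["StartUrl"]
    visited: list[str] = []
    seen = {start}
    queue = deque([(start, 0)])  # (url, level); the start page is level 0 here

    while queue and len(visited) < config["MaxPages"]:
        url, level = queue.popleft()
        visited.append(url)
        if level >= config["MaxLevel"]:
            continue  # do not follow links beyond the level limit
        for link in get_links(url):
            if link not in seen and allowed(link):
                seen.add(link)
                queue.append((link, level + 1))
    return visited
```

Passing the link lookup as a callback keeps the sketch free of network access, so the effect of each limit can be checked against a small in-memory link graph.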

Technical Information

Property	Value
Path	`.halguru-webscraping.yaml:`
Internal Root Type	`WebScrapingHalConfiguration`
File Extension	`.halguru-webscraping.yaml`
JSON Schema	halguru-webscraping-schema.json
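
The contents of the JSON schema are not shown here, so the following Python sketch only checks what the Properties table states: all five properties are required and each has the listed type. The function name `check_config`, the mapping of `Url` to an absolute http(s) string, and the error messages are assumptions for illustration. The sketch expects an already parsed configuration as a `dict`.

```python
from urllib.parse import urlparse

# Property names and types as listed in the Properties table.
REQUIRED = {
    "StartUrl": str,
    "MaxLevel": int,
    "MaxPages": int,
    "Pages": list,
    "UrlsStartWith": list,
}


def check_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for name, expected in REQUIRED.items():
        if name not in config:
            problems.append(f"missing required property: {name}")
            continue
        value = config[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or isinstance(value, bool):
            problems.append(f"{name} should be of type {expected.__name__}")

    start = config.get("StartUrl")
    if isinstance(start, str):
        parts = urlparse(start)
        # Assumption: a usable StartUrl is an absolute http(s) URL.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append("StartUrl should be an absolute http(s) URL")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a file at once.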

Last updated:		2026-01-26
Autogenerated:		Yes
AI powered:		Yes
Core version:		1.77.0