A simple setup for a breadth-first crawl of a given web domain. For each page it stores metadata, such as internal and external links, along with semantically split chunks and their embeddings, for consumption by RAG applications.
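The breadth-first crawl and the per-page link metadata described above could be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual implementation: the names `crawl`, `LinkParser`, `fetch`, and `max_pages` are assumptions made for the example, and the semantic chunking and embedding steps are left out because they depend on whichever model or library is chosen.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl restricted to the domain of start_url.

    fetch is a hypothetical callable mapping a URL to an HTML string,
    or None when the page cannot be retrieved.
    Returns {url: {"depth": int, "internal": [...], "external": [...]}}.
    """
    domain = urlparse(start_url).netloc
    queue = deque([(start_url, 0)])  # FIFO queue gives breadth-first order
    seen = {start_url}
    pages = {}

    while queue and len(pages) < max_pages:
        url, depth = queue.popleft()
        html = fetch(url)
        if html is None:
            continue

        parser = LinkParser()
        parser.feed(html)

        internal, external = [], []
        for href in parser.hrefs:
            # Resolve relative links and drop #fragments.
            link = urldefrag(urljoin(url, href)).url
            parsed = urlparse(link)
            if parsed.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, etc.
            bucket = internal if parsed.netloc == domain else external
            if link not in bucket:
                bucket.append(link)

        pages[url] = {"depth": depth, "internal": internal, "external": external}

        # Only internal links are followed; external ones are just recorded.
        for link in internal:
            if link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))

    return pages
```

Passing `fetch` in as a parameter keeps the traversal logic independent of the HTTP client, so the sketch can be exercised against an in-memory dictionary of pages before being pointed at a real site. A real crawler would also want to respect robots.txt and rate limits, which this sketch does not attempt.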


