Too lazy to explain this one. Just read the code.

But here's a quick rundown:

  • Spider is a struct that represents a worker. It can "crawl", "fetch_page", and "add_related_pages".
  • Index is the shared data structure that holds all the crawled data.
  • Page is a struct that represents a single crawled page.
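
For readers who want the shape before opening the code, here is a rough sketch of how the three pieces could fit together. It is not the code from this repo: Rust is assumed only because the rundown says "struct", the fields (`url`, `links`, `pages`, `seen`, `queue`) are made up for illustration, `crawl_site` is a hypothetical helper, and `fetch_page` reads from a fake in-memory `site` map instead of doing real HTTP.

```rust
use std::collections::{HashMap, HashSet, VecDeque};
use std::sync::{Arc, Mutex};
use std::thread;

/// A crawled page: its URL and the URLs it links to (assumed fields).
#[derive(Debug, Clone)]
struct Page {
    url: String,
    links: Vec<String>,
}

/// Shared state: pages crawled so far, URLs already seen, URLs still to visit.
#[derive(Default)]
struct Index {
    pages: HashMap<String, Page>,
    seen: HashSet<String>,
    queue: VecDeque<String>,
}

/// A worker. `site` is a hypothetical stand-in for the network (URL -> outgoing links).
struct Spider {
    index: Arc<Mutex<Index>>,
    site: Arc<HashMap<String, Vec<String>>>,
}

impl Spider {
    /// "Fetch" a page by looking it up in the fake site map.
    fn fetch_page(&self, url: &str) -> Option<Page> {
        self.site.get(url).map(|links| Page {
            url: url.to_string(),
            links: links.clone(),
        })
    }

    /// Queue every link on the page that no Spider has seen yet.
    fn add_related_pages(&self, page: &Page) {
        let mut index = self.index.lock().unwrap();
        for link in &page.links {
            if index.seen.insert(link.clone()) {
                index.queue.push_back(link.clone());
            }
        }
    }

    /// Pull URLs off the shared queue until it is empty.
    fn crawl(&self) {
        loop {
            // The lock is released at the end of this statement, before fetching.
            let next = self.index.lock().unwrap().queue.pop_front();
            let url = match next {
                Some(url) => url,
                None => break,
            };
            if let Some(page) = self.fetch_page(&url) {
                self.add_related_pages(&page);
                let mut index = self.index.lock().unwrap();
                index.pages.insert(page.url.clone(), page);
            }
        }
    }
}

/// Hypothetical driver: seed the Index, run `workers` Spiders, return crawled URLs sorted.
fn crawl_site(seed: &str, site: HashMap<String, Vec<String>>, workers: usize) -> Vec<String> {
    let mut start = Index::default();
    start.seen.insert(seed.to_string());
    start.queue.push_back(seed.to_string());
    let index = Arc::new(Mutex::new(start));
    let site = Arc::new(site);

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let spider = Spider {
                index: Arc::clone(&index),
                site: Arc::clone(&site),
            };
            thread::spawn(move || spider.crawl())
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }

    let mut urls: Vec<String> = index.lock().unwrap().pages.keys().cloned().collect();
    urls.sort();
    urls
}
```

In this sketch a Spider simply stops when it finds the queue empty, so an idle worker may exit early while another is still fetching. The crawl still finishes, because whichever Spider queued the last links keeps draining the queue. The real Spider presumably handles this through the states in its state diagram.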

This is the state diagram of a Spider:

[Image: Spider State Diagram]

In the UI you can see the graph of the pages that were crawled:

[Image: UI]