-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update mermaid graph of onigumo processing with a new approach #237
Conversation
You can play with graph by live service https://mermaid.live/ It is partially solving #232 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing the diagram causes a slight mismatch in the description above it: although “Onigumo is composed of three sequentially interconnected components” is still correct in some sense, it doesn’t mention the new Feeder part. I’d say something like “The crawling part of Onigumo is composed of three sequentially interconnected components”. We can do that in a follow-up pull request.
It would be possible to object that the diagram doesn’t reflect the actual architecture of the project at the moment, but it didn’t do so before the change either: there is no Parser (will be resolved by #89), no Operator, no Materializer, no flow from start to end… So I think we don’t need to wait with merging for the actual feeder to emerge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The diagrams are very nice! ❤️ I like how the Feeder, the starting point, sits almost in the upper-left corner and the Materialized, the finishing point, is located in the bottom-right corner, nicely flowing like a text on a page.
spider_parser(🕷️ PARSER) | ||
spider_operator(🕷️ OPERATOR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea of using the emoji to get around GitHub not supporting fontawesome icons in Mermaid diagrams. 💪🏻
README.md
Outdated
start([START]) --> onigumo_feeder[FEEDER] | ||
onigumo_feeder -- .raw --> Crawling | ||
onigumo_feeder -- .urls --> Crawling | ||
onigumo_feeder -- .json --> Crawling | ||
Crawling --> spider_materializer(🕷️ MATERIALIZER) | ||
spider_materializer --> done([END]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we only use a single space,
start([START]) --> onigumo_feeder[FEEDER] | |
onigumo_feeder -- .raw --> Crawling | |
onigumo_feeder -- .urls --> Crawling | |
onigumo_feeder -- .json --> Crawling | |
Crawling --> spider_materializer(🕷️ MATERIALIZER) | |
spider_materializer --> done([END]) | |
start([START]) --> onigumo_feeder[FEEDER] | |
onigumo_feeder -- .raw --> Crawling | |
onigumo_feeder -- .urls --> Crawling | |
onigumo_feeder -- .json --> Crawling | |
Crawling --> spider_materializer(🕷️ MATERIALIZER) | |
spider_materializer --> done([END]) |
or align the items to table columns?
start([START]) --> onigumo_feeder[FEEDER] | |
onigumo_feeder -- .raw --> Crawling | |
onigumo_feeder -- .urls --> Crawling | |
onigumo_feeder -- .json --> Crawling | |
Crawling --> spider_materializer(🕷️ MATERIALIZER) | |
spider_materializer --> done([END]) | |
start([START]) --> onigumo_feeder[FEEDER] | |
onigumo_feeder -- .raw --> Crawling | |
onigumo_feeder -- .urls --> Crawling | |
onigumo_feeder -- .json --> Crawling | |
Crawling --> spider_materializer(🕷️ MATERIALIZER) | |
spider_materializer --> done([END]) |
I don’t have preference (at least now), but I’d like it to be at least consistent.
README.md
Outdated
spider_operator -. "<hash>.urls" .-> onigumo_downloader | ||
onigumo_downloader -. "<hash>.raw" .-> spider_parser | ||
spider_parser -. "<hash>.json" .-> spider_operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spider_operator -. "<hash>.urls" .-> onigumo_downloader | |
onigumo_downloader -. "<hash>.raw" .-> spider_parser | |
spider_parser -. "<hash>.json" .-> spider_operator | |
onigumo_downloader -. "<hash>.raw" .-> spider_parser | |
spider_operator -. "<hash>.urls" .-> onigumo_downloader | |
spider_parser -. "<hash>.json" .-> spider_operator |
spider_operator -. "<hash>.urls" .-> onigumo_downloader | |
onigumo_downloader -. "<hash>.raw" .-> spider_parser | |
spider_parser -. "<hash>.json" .-> spider_operator | |
onigumo_downloader -. "<hash>.raw" .-> spider_parser | |
spider_operator. -. "<hash>.urls" .-> onigumo_downloader | |
spider_parser -. "<hash>.json" .-> spider_operator |
README.md
Outdated
onigumo_feeder -- .json --> spider_operator | ||
spider_operator ---> spider_materializer | ||
onigumo_feeder -- .urls --> onigumo_downloader | ||
onigumo_feeder -- .raw --> spider_parser |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
onigumo_feeder -- .json --> spider_operator | |
spider_operator ---> spider_materializer | |
onigumo_feeder -- .urls --> onigumo_downloader | |
onigumo_feeder -- .raw --> spider_parser | |
onigumo_feeder -- .json --> spider_operator | |
spider_operator ---> spider_materializer | |
onigumo_feeder -- .urls --> onigumo_downloader | |
onigumo_feeder -- .raw --> spider_parser |
onigumo_feeder -- .json --> spider_operator | |
spider_operator ---> spider_materializer | |
onigumo_feeder -- .urls --> onigumo_downloader | |
onigumo_feeder -- .raw --> spider_parser | |
onigumo_feeder -- .json --> spider_operator | |
spider_operator ---> spider_materializer | |
onigumo_feeder -- .urls --> onigumo_downloader | |
onigumo_feeder -- .raw --> spider_parser |
README.md
Outdated
spider_operator ---> spider_materializer | ||
onigumo_feeder -- .urls --> onigumo_downloader | ||
onigumo_feeder -- .raw --> spider_parser | ||
|
||
spider_parser -. "<hash>.json" .-> spider_operator | ||
onigumo_downloader -. "<hash>.raw" .-> spider_parser | ||
spider_operator -. "<hash>.urls" .-> onigumo_downloader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spider_operator ---> spider_materializer | |
onigumo_feeder -- .urls --> onigumo_downloader | |
onigumo_feeder -- .raw --> spider_parser | |
spider_parser -. "<hash>.json" .-> spider_operator | |
onigumo_downloader -. "<hash>.raw" .-> spider_parser | |
spider_operator -. "<hash>.urls" .-> onigumo_downloader | |
onigumo_feeder -- .urls --> onigumo_downloader | |
onigumo_feeder -- .raw --> spider_parser | |
spider_parser -. "<hash>.json" .-> spider_operator | |
onigumo_downloader -. "<hash>.raw" .-> spider_parser | |
spider_operator -. "<hash>.urls" .-> onigumo_downloader | |
spider_operator ---> spider_materializer |
Let’s push the spider_operator ---> spider_materializer
relation to the bottom, so the Spider subgraph arrows are coupled together
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed 🤝
README.md
Outdated
start([START]) --> onigumo_feeder[FEEDER] | ||
onigumo_feeder -- .raw --> Crawling | ||
onigumo_feeder -- .urls --> Crawling | ||
onigumo_feeder -- .json --> Crawling | ||
Crawling --> spider_materializer(🕷️ MATERIALIZER) | ||
spider_materializer --> done([END]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
start([START]) --> onigumo_feeder[FEEDER] | |
onigumo_feeder -- .raw --> Crawling | |
onigumo_feeder -- .urls --> Crawling | |
onigumo_feeder -- .json --> Crawling | |
Crawling --> spider_materializer(🕷️ MATERIALIZER) | |
spider_materializer --> done([END]) | |
start([START]) --> onigumo_feeder[FEEDER] | |
onigumo_feeder -- .raw --> Crawling | |
onigumo_feeder -- .urls --> Crawling | |
onigumo_feeder -- .json --> Crawling | |
Crawling --> spider_materializer(🕷️ MATERIALIZER) | |
spider_materializer --> done([END]) |
I‘d split this into smaller groups.
README.md
Outdated
onigumo_feeder -- .json --> spider_operator | ||
spider_operator ---> spider_materializer | ||
onigumo_feeder -- .urls --> onigumo_downloader | ||
onigumo_feeder -- .raw --> spider_parser | ||
|
||
spider_parser -. "<hash>.json" .-> spider_operator | ||
onigumo_downloader -. "<hash>.raw" .-> spider_parser | ||
spider_operator -. "<hash>.urls" .-> onigumo_downloader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
onigumo_feeder -- .json --> spider_operator | |
spider_operator ---> spider_materializer | |
onigumo_feeder -- .urls --> onigumo_downloader | |
onigumo_feeder -- .raw --> spider_parser | |
spider_parser -. "<hash>.json" .-> spider_operator | |
onigumo_downloader -. "<hash>.raw" .-> spider_parser | |
spider_operator -. "<hash>.urls" .-> onigumo_downloader | |
onigumo_feeder -- .json --> spider_operator | |
onigumo_feeder -- .urls --> onigumo_downloader | |
onigumo_feeder -- .raw --> spider_parser | |
spider_parser -. "<hash>.json" .-> spider_operator | |
onigumo_downloader -. "<hash>.raw" .-> spider_parser | |
spider_operator -. "<hash>.urls" .-> onigumo_downloader | |
spider_operator ---> spider_materializer |
See my comment from my previous review. I’d push the spider_operator --→ spider_materializer
relation to the bottom, so the Spider subgraph arrows are coupled together
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be worth updating the description “The flowchart below illustrates the flow of data between those parts:” to use a plural form “flowchart” now when there are two. But it’s not necessary: we may consider that to be one flowchart consisting of two parts. 😃
onigumo_downloader -. "<hash>.raw" .-> spider_parser | ||
spider_parser -. "<hash>.json" .-> spider_operator | ||
``` | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were thinking about adding some title to the second diagram. But I don’t know what it should say.
No description provided.