Firecrawl billing #15

Open
dmatora opened this issue Feb 6, 2025 · 7 comments

Comments

@dmatora
Contributor

dmatora commented Feb 6, 2025

Ollama support gets rid of LLM bills.
To get rid of Firecrawl billing, it must be replaced with something like Puppeteer.

@Demezy
Contributor

Demezy commented Feb 6, 2025

Possible. A corresponding feature request could be opened in the Firecrawl repo, since you can self-host it.

In that case, only an env var for FirecrawlApp is needed.
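
A minimal sketch of what that env var wiring might look like. The variable names `FIRECRAWL_KEY` and `FIRECRAWL_BASE_URL` and the helper `firecrawlConfigFromEnv` are hypothetical, and the `apiUrl` option is assumed to be what the Firecrawl JS SDK accepts for a custom endpoint, so check the SDK docs before relying on it.

```typescript
// Shape of the options object assumed to be accepted by FirecrawlApp.
interface FirecrawlConfig {
  apiKey: string;
  apiUrl?: string;
}

// Build the config from environment variables.
// FIRECRAWL_KEY and FIRECRAWL_BASE_URL are hypothetical names.
function firecrawlConfigFromEnv(
  env: Record<string, string | undefined>,
): FirecrawlConfig {
  const config: FirecrawlConfig = { apiKey: env.FIRECRAWL_KEY ?? "" };
  // Only override the endpoint when a self-hosted URL is provided;
  // otherwise the client keeps whatever default it ships with.
  if (env.FIRECRAWL_BASE_URL) {
    config.apiUrl = env.FIRECRAWL_BASE_URL;
  }
  return config;
}

// Usage (assumes the SDK constructor takes this options object):
// const firecrawl = new FirecrawlApp(firecrawlConfigFromEnv(process.env));
```

With this shape, pointing the app at a self-hosted instance would be a deployment setting rather than a code change.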

@dmatora
Contributor Author

dmatora commented Feb 6, 2025

@Demezy can self-hosted Firecrawl run without depending on other APIs?

@Demezy
Contributor

Demezy commented Feb 6, 2025

@dmatora afaik they use the official OpenAI endpoint. But the product is open source, so I guess, as with open-deep-research, only the base URL needs to be exposed. I didn't really dive into the docs, though, so feel free to correct me.

So my skepticism was about "remove Firecrawl in favor of Puppeteer". Firecrawl is already a wrapper around a browser, afaik. Nevertheless, it does a great job at scraping.

So, instead, I propose to address "add ollama support" to firecrawl, not open-deep-research

Upd: my previous message pointed to the wrong URL for the self-hosting guide; updated.

@dmatora
Contributor Author

dmatora commented Feb 7, 2025

Well, I tried running self-hosted Firecrawl, and it doesn't work unless you plug in scraping APIs that cost more than Firecrawl.

Firecrawl does use a browser, but only for scraping, not for search.
For search it uses a plain axios.get, which gets banned by Google instantly.

I tried replacing axios.get with a Playwright call and replacing the headless session with a headed one so that I could fill in the captcha and save cookies, but for some reason calling Playwright from Firecrawl doesn't work (doing a curl to Playwright does).
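
For illustration, here is a rough sketch of the swap described above: the search step takes a pluggable fetcher, so a plain HTTP call can be exchanged for a browser-backed one. The names `HtmlFetcher`, `buildSearchUrl`, and `searchHtml` are hypothetical, and the search URL format is an assumption. This is not Firecrawl's actual code.

```typescript
// A fetcher takes a URL and resolves to the page HTML.
type HtmlFetcher = (url: string) => Promise<string>;

// Build the search URL the same way regardless of transport.
// The URL format here is an assumption for illustration.
function buildSearchUrl(query: string): string {
  return `https://www.google.com/search?q=${encodeURIComponent(query)}`;
}

// The search step only depends on the fetcher interface,
// so an HTTP client and a real browser are interchangeable.
async function searchHtml(
  query: string,
  fetchHtml: HtmlFetcher,
): Promise<string> {
  return fetchHtml(buildSearchUrl(query));
}

// A browser-backed fetcher might look like this with Playwright
// (left as a comment so the sketch stays dependency-free):
//
//   const context = await chromium.launchPersistentContext("./profile", {
//     headless: false, // headed, so a captcha can be solved by hand
//   });
//   const browserFetcher: HtmlFetcher = async (url) => {
//     const page = await context.newPage();
//     await page.goto(url);
//     const html = await page.content();
//     await page.close();
//     return html;
//   };
```

A persistent context keeps cookies in the profile directory between runs, which is one way to reuse a session after a captcha has been solved manually.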

I've already spent much more time on this than I can afford, not sure when I'll be able to have another look at this.

@dmatora
Contributor Author

dmatora commented Feb 7, 2025

So, instead, I propose to address "add ollama support" to firecrawl, not open-deep-research

Firecrawl claims to be functional without OpenAI API keys, though that is open to dispute. Whether or not that's the case, and whether or not Ollama support gets merged into Firecrawl, that has nothing to do with open-deep-research's own need for Ollama support. That part is already working in my fork.

@baoduy

baoduy commented Feb 13, 2025

@Demezy can self-hosted Firecrawl run without depending on other APIs?

Yes, you can self-host without any dependencies. Here is a sample of my self-hosted setup on Docker.

Image

@dmatora
Contributor Author

dmatora commented Feb 13, 2025

@baoduy how did you get Google search working without an API dependency?
I tried, and it just didn't work.

3 participants