some questions #83

dpcdsg · 2025-02-17T07:11:32Z

Once the results have been generated, is https://bigcode-bigcodebench-evaluator.hf.space/ the only way to compute the scores?

terryyz · 2025-02-17T07:14:24Z

Yes, unless you just want to check the ground-truth pass rate. To evaluate any model, you need to run the generation first.

dpcdsg · 2025-02-17T07:19:47Z

I already have results generated by other models, and now I want to score them. Is the https://bigcode-bigcodebench-evaluator.hf.space/ you provided the only way to get the scores?

terryyz · 2025-02-17T07:25:28Z

Regarding the gradio (HF space) endpoint, please refer to the following note:

The gradio backend is hosted on the Hugging Face space by default. The default space can sometimes be slow, so we recommend using the gradio backend with a cloned bigcodebench-evaluator endpoint for faster evaluation. Alternatively, you can use the e2b sandbox for evaluation, which is also pretty slow on the default machine.

For other execution methods, please refer to ADVANCED_USAGE, where you can choose the execution backend from e2b, gradio, and local. e2b is typically slower than the others. local requires you to either use the provided docker image or manually configure the local environment.
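
To make the three choices concrete, here is a minimal Python sketch that assembles an evaluation command for already-generated samples. The helper name `build_evaluate_command` and the file name `samples.jsonl` are hypothetical. The flag names (`--execution`, `--split`, `--subset`, `--samples`, `--gradio_endpoint`) are assumptions based on my reading of the docs, so please check them against ADVANCED_USAGE before relying on them.

```python
import shlex

# Execution backends named in this thread.
EXECUTION_BACKENDS = ("gradio", "e2b", "local")


def build_evaluate_command(samples_path, execution="gradio", split="complete",
                           subset="full", gradio_endpoint=None):
    """Return an argv list for scoring pre-generated samples.

    Hypothetical helper: the flag names below are assumptions and should be
    verified against ADVANCED_USAGE.
    """
    if execution not in EXECUTION_BACKENDS:
        raise ValueError(f"execution must be one of {EXECUTION_BACKENDS}")
    cmd = [
        "bigcodebench.evaluate",
        "--execution", execution,
        "--split", split,
        "--subset", subset,
        "--samples", samples_path,
    ]
    if gradio_endpoint is not None:
        # A custom endpoint (for example a cloned evaluator space) only
        # makes sense together with the gradio backend.
        if execution != "gradio":
            raise ValueError("gradio_endpoint requires execution='gradio'")
        cmd += ["--gradio_endpoint", gradio_endpoint]
    return cmd


if __name__ == "__main__":
    # Score existing samples locally (needs the docker image or a manually
    # configured environment, as noted above).
    print(shlex.join(build_evaluate_command("samples.jsonl", execution="local")))
```

Passing `execution="gradio"` together with the URL of your own cloned space as `gradio_endpoint` would correspond to the faster gradio setup recommended in the note.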

dpcdsg · 2025-02-17T07:27:09Z

OK, thanks
