
host.promiseQueue Causing Cache Misses and Repeated Requests #972

Open
anrgct opened this issue Dec 31, 2024 · 11 comments
anrgct commented Dec 31, 2024

In v1.87.0, I found that host.promiseQueue causes cache misses, resulting in repeated requests.

// repomap-test.genai.mjs
script({
    title: "generate repomap for the repo",
    model: "openai:deepseek-chat",
    cache: "repomap-test",
    files: ["src/greeter.ts", "src/counting.py"], 
})

let prompts = [
    `summarize the file in one sentence.`
]

async function processFile(current_file, prompts) {
    let result = [];
    for (const prompt of prompts) {
        const { text } = await runPrompt((_) => {
            _.def("FILE", current_file);
            _.$`=============`;
            _.$`${prompt}`;
        }, { system: [] , cache: "repomap-test" });
        result.push(text);
    }
    console.log(result)
    return result
}


// this does not hit the "src/counting.py" cache
const queue = host.promiseQueue(2)
const summaries = await queue.mapAll(
    env.files,
    (file) => processFile(file, prompts)
)

// // this works fine
// for (const file of env.files) {
//     await processFile(file, prompts)
// }
Feedback:

The issue described is that host.promiseQueue causes cache misses and repeated requests. This likely happens because the queue does not respect the cache settings applied within each processFile call.

To resolve this, ensure that each call to runPrompt within processFile respects the cache options provided: if a file's result is already cached under "repomap-test", it should not be re-run in a separate queue task. Alternatively, check the cache manually before running the prompt and only proceed if no result is found.

Additionally, consider using host.promiseQueue options or passing additional context so that each task within the queue honors its own caching settings, preventing unnecessary reprocessing of files that are already cached.

AI-generated content by issue-reviewer may be incorrect

pelikhan commented Jan 3, 2025

Good one. I'll look into this next week.

pelikhan commented Jan 3, 2025

  • refactor cache logic outside completer
  • store promise for pending value computation

@pelikhan pelikhan self-assigned this Jan 5, 2025
pelikhan commented Jan 5, 2025

This works now:

const innerPrompt = `Generate 2 word poem. ${Math.random()}`
await Promise.all(
    Array(10)
        .fill(0)
        .map(async (_, i) => {
            await runPrompt(innerPrompt, { cache: "inner", label: `run-${i}` })
        })
)

anrgct commented Jan 6, 2025

I changed innerPrompt in the example to Generate 2 word poem, pulled the code, and ran the example, but the first of the ten requests is still made; it doesn't use the cache even when one exists. I looked at packages/sample/.genaiscript/cache/inner/db.jsonl, and the SHA of the new request is the same. Is this normal?

pelikhan commented Jan 6, 2025

The SHA is computed from the LLM request object and the LLM identifier, so for the same set of messages and provider you get a cache hit (label is ignored, and it is the only piece that differs between these inner requests).

The first request, which misses the cache, runs, and the promise of the chat execution is kept in the cache object. Any later request that hits the cache while that execution is still in progress shares the same promise.

I'll run your example to see.

Note: building a repository map seems to be a hot topic!

pelikhan commented Jan 6, 2025

(I've pushed v1.88.0 with the current changes)

pelikhan commented Jan 6, 2025

@anrgct I also added deepseek:deekseek-chat support.

pelikhan commented Jan 6, 2025

You were right, @anrgct. I fixed a race on the loading of the cache. It looks good on your sample now.

@pelikhan pelikhan closed this as completed Jan 7, 2025
anrgct commented Jan 7, 2025

"Added deepseek:deekseek-chat support" You're really attentive! Thank you!
"I fixed a race on the loading of the cache" Did you submit the code? I didn't see the bug disappear.

@pelikhan pelikhan reopened this Jan 7, 2025
pelikhan commented Jan 7, 2025

I am getting cache hits on every invocation now, at least on main. I will do a release shortly.
