Unable to generate long content using a.generation() #2416
Thanks for opening this issue @amuresia. Can you please provide logs for a request that generates fewer words than expected? If you don't have logging enabled for your AppSync API, you can do so with:

```ts
export const data = defineData({
  // existing ...
  logging: {
    fieldLogLevel: 'all',
  },
});
```

Generation routes have a 30 second timeout due to the AppSync resolver execution limit. Lambda functions aren't involved here; the requests go directly from AppSync to Bedrock. (source)
So something else is going on here. My hunch is that this is model-specific and that using a newer model like Claude 3.5 Haiku or Claude 3.5 Sonnet / Sonnet v2 would have better results. But the logs will help us confirm what's going on here. Thanks!
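For reference, switching models happens in the schema definition. A minimal sketch, assuming a hypothetical `generateEssay` route (the route name, prompt, and types below are placeholders, not from this issue):

```ts
// amplify/data/resource.ts (illustrative sketch)
import { a, defineData, type ClientSchema } from '@aws-amplify/backend';

const schema = a.schema({
  generateEssay: a.generation({
    // Swap in a newer model, per the suggestion above
    aiModel: a.ai.model('Claude 3.5 Sonnet'),
    systemPrompt: 'You are a helpful writing assistant.',
  })
    .arguments({ topic: a.string() })
    .returns(a.customType({ essay: a.string() }))
    .authorization((allow) => allow.authenticated()),
});

export type Schema = ClientSchema<typeof schema>;
export const data = defineData({ schema });
```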
Hi @atierian! Thanks for the prompt reply!

Logs
Thanks for that. There should be many more log statements generated for that request, including the request and response to/from Bedrock. If you haven't already, you'll need to redeploy once you enable logging via `defineData`. If you've already taken those steps, you may be filtering the additional log statements out.
Here are all the logs that get generated, @atierian. I hope this helps!

CloudWatch Logs Live tail
After changing to the suggested model, the request now times out after running for 30 seconds (the AppSync resolver execution limit).
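For context, generation routes are invoked from the client like this. A sketch, assuming the hypothetical `generateEssay` route from above; a resolver timeout would surface in the `errors` array of the response:

```ts
import { generateClient } from 'aws-amplify/api';
import type { Schema } from '../amplify/data/resource';

const client = generateClient<Schema>();

async function run() {
  // Invoke the generation route; AppSync resolves this directly against Bedrock
  const { data, errors } = await client.generations.generateEssay({
    topic: 'The history of distributed systems',
  });

  if (errors) {
    // A resolver execution timeout (~30s) would show up here
    console.error(errors);
  } else {
    console.log(data?.essay);
  }
}

run();
```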
Thanks for the logs. We'll take a look and follow up.
Environment information
Describe the bug
I am trying to generate content that is quite long. I never get responses that are 1000 words long using the prompt below. The longest response I got was around 500 words and the shortest was 42 words long. I notice that from the moment I hit enter, 3 seconds elapse and then I get the response from the LLM. Whatever is generated during those 3 seconds (presumably before the execution of a Lambda function times out?) is what I get.
This makes the `maxTokens` parameter in `inferenceConfiguration` a bit pointless IMHO, unless I am missing something.

Reproduction steps
Schema
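The schema attachment wasn't captured here. For illustration, a generation route that sets `maxTokens` would look roughly like this, reusing the placeholder `generateEssay` route from the sketch earlier in the thread (all values are illustrative):

```ts
// Inside a.schema({ ... }) in amplify/data/resource.ts
generateEssay: a.generation({
  aiModel: a.ai.model('Claude 3 Haiku'),
  systemPrompt: 'Write an essay of roughly 1000 words on the given topic.',
  inferenceConfiguration: {
    maxTokens: 4000, // illustrative; raise this if responses come back truncated
    temperature: 0.7,
    topP: 0.9,
  },
})
  .arguments({ topic: a.string() })
  .returns(a.customType({ essay: a.string() })),
```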