
Add support for Keyin API for itwin.js for AI usecase #7614

Open
khanaffan opened this issue Jan 27, 2025 · 6 comments

@khanaffan
Contributor

A KeyIn allows features to be invoked through textual commands in MicroStation. Although this mechanism is old, it can be a very effective way to bridge the complexity of the UI and the application to LLMs.

  • Offer a registry for KeyIns, including commands, their arguments, descriptions of the arguments and commands, along with example invocations.
  • Ensure all UI components in itwin.js register any commands they expose via the UI as KeyIns, so they can be advertised to LLMs.
  • Enable non-UI functionalities to be exposed via KeyIns as well, such as importing schemas or creating cubes.
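A registry along these lines might be sketched as follows (a sketch only; the names `KeyInRegistry`, `KeyInDefinition`, and `KeyInParameter` are hypothetical, not existing iTwin.js API):

```typescript
// Sketch of the proposed key-in registry. All names are hypothetical,
// not part of the current iTwin.js API.
interface KeyInParameter {
  name: string;
  type: "string" | "number" | "boolean";
  description: string;
}

interface KeyInDefinition {
  type: "function";
  name: string;
  description: string;
  parameters: KeyInParameter[];
  /** Example invocations that can be advertised to an LLM. */
  examples?: string[];
  handler: (...args: any[]) => Promise<void>;
}

class KeyInRegistry {
  private readonly _defs = new Map<string, KeyInDefinition>();

  /** Register a key-in; throws if the name is already taken. */
  public register(def: KeyInDefinition): void {
    if (this._defs.has(def.name))
      throw new Error(`Key-in "${def.name}" is already registered`);
    this._defs.set(def.name, def);
  }

  public find(name: string): KeyInDefinition | undefined {
    return this._defs.get(name);
  }

  /** All registered definitions, e.g. for building an LLM tool list. */
  public get all(): KeyInDefinition[] {
    return [...this._defs.values()];
  }
}
```

Because descriptions and example invocations live alongside the handler, the same entry can serve both human discovery and LLM tool advertising.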

Some advantages:

  • This straightforward infrastructure enables itwin.js apps to leverage LLMs efficiently without needing to write any code to invoke the feature.
  • It also facilitates the creation of compound Keyins for more complex tasks, allowing LLMs to handle them.
  • It supports cross-application functionality, effectively decoupling LLM/AI code from the application. This allows the AI team to concentrate on AI workflows rather than figuring out how to invoke individual application features.

The registry should expose key-ins in the same way an LLM exposes tools/functions: the tool and each of its parameters carry a description, so the LLM can fill in the parameters from user input.

import { Id64String } from "@itwin/core-bentley";

// "keyIns" is the proposed (hypothetical) registry, not an existing API.
keyIns.register({
  type: "function",
  name: "toggle_subject_by_id",
  description: "Toggle subject by id",
  parameters: [
    {
      name: "id",
      type: "string",
      description: "A hex string identifying the subject",
    },
  ],
  handler: async (id: Id64String) => {
    // toggle display of the subject with this id
  },
});
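To connect such a registration to LLM function calling, each registry entry could be mechanically converted into a tool schema. A sketch, assuming a hypothetical `KeyInDefinition` shape mirroring the registration above and an OpenAI-style tool format:

```typescript
// Sketch: deriving an OpenAI-style tool/function schema from a key-in
// definition. The KeyInDefinition shape here is hypothetical, mirroring
// the registration example in this issue.
interface KeyInParameter {
  name: string;
  type: string;
  description: string;
}

interface KeyInDefinition {
  type: "function";
  name: string;
  description: string;
  parameters: KeyInParameter[];
}

function toToolSchema(def: KeyInDefinition) {
  // Collect each parameter into a JSON-Schema-style properties object.
  const properties: Record<string, { type: string; description: string }> = {};
  for (const p of def.parameters)
    properties[p.name] = { type: p.type, description: p.description };

  return {
    type: "function" as const,
    function: {
      name: def.name,
      description: def.description,
      parameters: {
        type: "object" as const,
        properties,
        required: def.parameters.map((p) => p.name),
      },
    },
  };
}
```

Every registered key-in could then be mapped through `toToolSchema` to build the tool list passed to the model, with no per-feature AI code.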
@rajkanben

This key-ins approach would help in:

  • Copilot use cases, where an NLP request can be translated into key-in commands for execution
  • NLP-to-key-in parsing can be done with small AI models running in WebGPU or locally (no need for larger models with PTU)
  • Automating workflows using key-ins
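The parsing step can stay deliberately simple: a small model only needs to emit a key-in string, which the client then splits into a command and arguments. A minimal sketch (quoting rules are simplified here; real MicroStation key-in parsing is richer):

```typescript
// Sketch: split a key-in string (as a small local model might emit it)
// into a command name and positional arguments. Double-quoted arguments
// may contain spaces; everything else is whitespace-delimited.
function parseKeyIn(input: string): { command: string; args: string[] } {
  // Tokenize: either a double-quoted run or a run of non-whitespace.
  const tokens = input.trim().match(/"[^"]*"|\S+/g) ?? [];
  const [command = "", ...rest] = tokens;
  // Strip surrounding quotes from quoted arguments.
  const args = rest.map((t) => t.replace(/^"(.*)"$/, "$1"));
  return { command, args };
}
```

The resulting `command` would be looked up in the registry and its `args` passed to the registered handler.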

@pmconne
Member

pmconne commented Jan 27, 2025

Key-in commands are useful beyond LLM use cases; at one point early in the design of iTwin.js editing capabilities, some folks were pushing for everything to be based on commands so that everything could be scripted.

Today all Tools have a key-in as an entry point, but for most of them that just starts the tool, which then reacts to an often complex series of user inputs. Those user inputs (change a tool setting value, enter a data point, etc.) would need to be scriptable as well. For an LLM to understand those interactions, it would need more context than just the command description, such as the prompts we display to users (e.g., when creating a line string, after the first two points are entered we ask the user to subsequently enter more data points or click the reset button to accept). Alternatively, the series of inputs would have to produce a single command that captures the entire operation.

Either way, developing tools this way is a big ask for app developers - it's more complicated and takes more time - so the benefits would have to be obvious and relatively immediate. (MicroStation's support for scripting similarly varies across modules; many tasks can only be accomplished through user interaction, because nothing strictly enforces otherwise.)

@ColinKerr
Member

@wgoehrig I believe there is overlap between this request and ideas for editing.

I think @karolis-zukauskas was also interested in editing APIs written as a function library.

@jmoutte

jmoutte commented Feb 2, 2025

It is certainly desirable to make our APIs easy for LLMs to consume, to enable automation. I wonder whether key-ins are the right approach, though. If you allow me to simplify a bit, a key-in is just another, higher-level, API. The question then becomes: does the LLM really need that higher-level API, or can it work with the existing one?

Creating and, more importantly, maintaining a new API is costly. We should always think very carefully before we introduce new APIs and measure the return on investment. In a world where LLMs become pervasive, I am quite convinced that the use of key-ins by humans will decline in favor of interactions with LLMs that do the automation for them more efficiently. Shouldn't we instead remove key-ins and make LLMs proficient at using the lower-level APIs?

@ramanujam-raman

ramanujam-raman commented Feb 2, 2025

A couple of related notes -

@rajkanben

For data workflows, maybe we can also think of leveraging GraphQL as an intermediate layer. GraphQL can complement LLMs as a tool for agents to automate workflows.
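As an illustration only (the schema below is invented; no such GraphQL API exists in iTwin.js today), an agent tool might issue a query like:

```typescript
// Sketch: a GraphQL query an agent tool might issue through such an
// intermediate layer. The schema (elements, category filter, fields)
// is illustrative, not an existing iTwin.js API.
const elementsByCategoryQuery = `
  query ElementsByCategory($category: String!) {
    elements(filter: { category: $category }) {
      id
      userLabel
    }
  }
`;

// The agent supplies variables extracted from the user's request:
const variables = { category: "Walls" };
```

The query text and its typed variables play the same role as a tool schema: the model fills in values, and the layer validates them against the schema before execution.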
