Add OpenCtx annotations to the chat's context#4968
Add OpenCtx annotations to the chat's context#4968bevzzz wants to merge 8 commits intosourcegraph:mainfrom
Conversation
| type OpenCtxClient = Pick<Client<vscode.Range>, 'meta' | 'mentions' | 'items'> | ||
| // TODO(dyma): Signature for Controller['annotation'] doesn't make sense for all Cody clients, | ||
| // e.g. Cody CLI cannot directly access VSCode's APIs. The {uri: Uri, getText(): string} interface | ||
| // should be a common abstraction. | ||
| type OpenCtxClient = Pick<Controller, 'meta' | 'mentions' | 'items' | 'annotations'> |
There was a problem hiding this comment.
agreed. Why don't you keep using client though? It has an annotations call. Sure, that means we need to populate AnnotationsParams.content, but that should be fine?
There was a problem hiding this comment.
tl;dr: openCtx.client is a Controller, not a Client. I think exporting the Controller in terms of client/platform-agnostic interfaces in @openctx/controller is the way to go.
So the catch is that openCtx.client is actually a Controller and not a Client.
This doesn't jump in your face (typescript's not catching it) because of the vague typing here (there's a TODO to fix this):
Unlike the other methods which have the same signature in Client and Controller, the annotations call expects this {uri: Uri, getText()} interface and not AnnotationsParams.content.
vscode-lib feels like the wrong dependency to have in our lib/shared directory
same here; it does the job, but it's confusing.
My initial instinct would be this:
// lib/shared/.../api.ts
export interface OpenCtxClient extends Client<vscode.Range>, 'meta' | 'mentions' | 'items'> {
annotations(doc: FileHandle, opts?: ProviderMethodOptions)
}
/**
* FileHandle is an abstract interface every Cody client can have an adapter for.
* vscode: (await vscode.workspace.openTextDocument(uri) satisfies FileHandle
* cody-cli: some wrapper that exposes `getText()`
* etc
*/
export interface FileHandle {
uri: FileURI
getText(): string
}Done and dusted.
On the second thought though, I'd see vscode APIs abstracted away already in OpenCtx. Looking at Controller, I don't see why the definition's got to be so vscode-specific. My understanding is that vscode just happeded to be the primary target and it was skip the abstraction layers.
What I'd do is export Controller in @openctx/controller similarly to @openctx/client, hiding all vscode APIs behind OpenCtx facade interfaces.
There was a problem hiding this comment.
thanks for the very detailed response. In #4531 I made the setOpenCtxClient call set the controller instead since I wanted that extra behaviour. I see why that fails now since annotations is different unlike the other pre-existing calls we have in Cody.
I agree with your assesment we should probably have a controller package which defines this properly. Alternatively, should we change the type signature of controller to more closely match the client? IE it is kinda weird client is different than controller only for annotations.
cc annotations mastermind @sqs
There was a problem hiding this comment.
Tbh, the client-controller dichotomy is still somewhat unclear to me. While they seem to be interchangeable in some cases, I guess controller is supposed to be more aware of the environment in which Cody is running? Whereas client knows how to talk to the providers (HTTP vs. load the module)?
| } | ||
| annotations.push({ ...ann, title: ann.item.title, content: aiContent }) | ||
| }) | ||
| return { ...fileItem, annotations } |
There was a problem hiding this comment.
I'm not exactly sure where in the prompt construction this function is called. If it is just before we send of to the LLM, this is indeed the right time to ask for annotations (I believe it is).
However, I think this should be working differently. In my head the annotations are extra information and are not core to the file. It feels like these should be there own "context item" type which cody can then make decisions on if its worth including or not w.r.t. token budgets/etc. Additionally, the way cody presents the context included needs some sort of way to communicate to the user that context.
@sourcegraph/noodle what do you think?
There was a problem hiding this comment.
You're right, this function is called shortly before we pass the context items to the prompter (and then build the prompt and send it). It is also at this stage that we fetch items for each openctx-mention in the message.
It feels like these should be there own "context item" type which cody can then make decisions on if its worth including or not w.r.t. token budgets/etc
OK, I see. So, e.g., if the token budget is low, Cody may decide to exclude the annotations while still including the file content and that's not something it'd be able to do at the moment. Yes, sounds good
some sort of way to communicate to the user that context
Keeping in mind the above (that the two are logically separated), I was thinking of adding something like a badge in the corner of the mention chip. @file.go -> @file.go* and present the option to exclude annotations from the file in a right-click context menu.
WDYT? Or would you rather mention along with all the other context items next to the Cody's response?
There was a problem hiding this comment.
In my head the annotations are extra information and are not core to the file
Would still make sense to fetch annotations in resolveFileOrSymbolContentItem, right? We could have this function return Promise<ContextItemWithContext[]> with a single File-item and several Annotation-items, which would then be merged into the homogenous "context items" list.
There was a problem hiding this comment.
Yup, this would also match the newer APIs we have in openctx where a "mention" can become several items once resolved.
To be honest you have likely made much deeper modifications to this part of the code than I have so I don't have great taste yet to know what is good. Let me try rope in some others who have wrestled with context construction more.
There was a problem hiding this comment.
+1 for "with a single File-item and several Annotation-items" that is what we are doing with the mentioned items right now, and it allows us to skip context items if tokens are limited.
There was a problem hiding this comment.
I would add a new ContextItemFileAnnotation type and also update the prompt builder template to include the right explanatory prefix in the prompt for the annotation context item.
There was a problem hiding this comment.
add a new ContextItemFileAnnotation type
What about extending ContextItemOpenCtx? It already has mention? field; we could add annotation? to differentiate between "items" and "annotations" (item.annotation !== undefined)
update the prompt builder template to include the right explanatory prefix for the annotation context item
Something like what addAnnotations does, or would you suggest sth different?
Update: added kind: 'item' | 'annotation' discriminator to ContextItemOpenCtx. Both share all of the other required fields and I decided not to add a dedicated type. Updated existing call sites to specify kind: 'item'.
@thenamankumar let me know if let me know if that works for you.
There was a problem hiding this comment.
Question: Do you think that, rather than including annotations individually in "as-is" order, annotations from the same provider for the same file should be somehow rendered together (provided the token budget allows)?
Annotations from https://openctx.com/provider-sourcegraph for /foo/bar.js:
- [Annotation 1]
- [Annotation 2]
Annotations from https://openctx.com/provider-jira for /foo/bar.js:
- [Annotation 1]
Or do you think it wouldn't matter to Cody at the end of the day?
There was a problem hiding this comment.
I think the order and selection of particular context items to be added to the prompt should be handled by a re-ranker and we can solve that problem separately. For now, it would be best to return each annotation context item separately.
|
Also your example is awesome! Sorry, we need to get on clearing out all the open PRs/etc in the openctx repo. |
|
This is a SUPER cool PR, and I want this feature now! Thank you! @keegancsmith has been sharing this with other Cody devs here and we'll work with you on getting this merged. |
|
Let's do it, @sqs! If we need to sync later, you can always get a hold of me on Slack. I'll get on with implementing the suggestions from @keegancsmith in the meantime. |
1. Annotation's Item can have content for assistants consumption. We fetch annotations from all enabled OpenCtx providers and add them to the prompt's context. 2. Define OpenCtxClient in terms of @openctx/vscode-lib Controller. Afaik, VSCode is the only Cody client with OpenCtx at the moment.
Rather than attaching annotations to files, we resolve them to ContextItemOpenCtx and treat them as individual context items. As such, they can be rendered in their own section of the prompt or omitted from it altogether if the token budget is low. + Reuse range-limited file content for fetching annotations + Simplified how annotations are rendered to the prompt + Updated call sites of ContextItemOpenCtx to specify 'kind' + Unit tests
Annotations are certain to overlap with some context items like file-mentions and annotations for the same selection from a *different* provider. Annotations are not allowed to overlap with other annotations from the *same* provider. Those will be filtered out by isUniqueContextItem.
"@" is a metacharacter in TSDoc comments and it breaks the display formatting when the docstring is rendered to the hover-on card.
NOTE: on 'pnpm run check' ran locally 'tsc' is complaining that ContextItemOpenCtx is not generic, when it clearly is.
51e9c35 to
a44d6df
Compare
rangeContainsLines can replace isWithinRange in editor-context.
|
This PR is marked as stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed automatically in 5 days. |
Context
OpenCtx providers may return additional
AssistantInfoto be consumed by code AI. At the moment the VSCode extension does not include these annotations in Cody's prompt.Here we will fetch annotations from the enabled OpenCtx providers for each file that's added to the chat's context and provide them to Cody.
Check this out!
Haven't done any sanity checks on the suggestions yet, I expect not all of them are accurate. I want to see if Cody will find the optimizations described in the article I linked below. Secondary to the PR's goal though.
Setup
I have
pprofprovider configured, which is how Cody is able to access information about the functions' CPU time:Checkout sourcegraph/openctx#133 for the source code.
The Go code in the example comes from Profiling Go Programs.
Test plan
TBD
TODO / Open questions:
MentionComponentfortype === 'file'items?ContextItem.size. Users can't really control the amount of content a provider puts into an annotation. What would be a good strategy here? Should there be some threshold until which annotations are "for free"?