Skip to content
This repository was archived by the owner on Aug 1, 2025. It is now read-only.

Add OpenCtx annotations to the chat's context#4968

Closed
bevzzz wants to merge 8 commits intosourcegraph:mainfrom
bevzzz:feat/file-annotations
Closed

Add OpenCtx annotations to the chat's context#4968
bevzzz wants to merge 8 commits intosourcegraph:mainfrom
bevzzz:feat/file-annotations

Conversation

@bevzzz
Copy link
Copy Markdown
Contributor

@bevzzz bevzzz commented Jul 22, 2024

Context

OpenCtx providers may return additional AssistantInfo to be consumed by code AI. At the moment the VSCode extension does not include these annotations in Cody's prompt.

Here we will fetch annotations from the enabled OpenCtx providers for each file that's added to the chat's context and provide them to Cody.

Check this out!

Screenshot 2024-07-22 at 20 22 54

Haven't done any sanity checks on the suggestions yet, I expect not all of them are accurate. I want to see if Cody will find the optimizations described in the article I linked below. Secondary to the PR's goal though.

Setup

I have pprof provider configured, which is how Cody is able to access information about the functions' CPU time:

// .vscode/settings.json
"openctx.providers": {
	"file:///Users/d.solovei/src/github.com/bevzzz/openctx/provider/pprof/dist/bundle.js": {
		"reportGlob": "**/*.pprof",
		"rootDirectoryMarkers": ["go.mod"],
		"top": {
			"sort": "flat"
		}
	}
}

Checkout sourcegraph/openctx#133 for the source code.

The Go code in the example comes from Profiling Go Programs.

Test plan

TBD

TODO / Open questions:

  • Toggle for this feature. Context menu on the MentionComponent for type === 'file' items?
  • Reflect added token count in ContextItem.size. Users can't really control the amount of content a provider puts into an annotation. What would be a good strategy here? Should there be some threshold until which annotations are "for free"?
  • How well does the current template accommodate other providers?

Comment thread lib/shared/package.json
Comment thread lib/shared/src/context/openctx/api.ts Outdated
Comment on lines +4 to +6
type OpenCtxClient = Pick<Client<vscode.Range>, 'meta' | 'mentions' | 'items'>
// TODO(dyma): Signature for Controller['annotation'] doesn't make sense for all Cody clients,
// e.g. Cody CLI cannot directly access VSCode's APIs. The {uri: Uri, getText(): string} interface
// should be a common abstraction.
type OpenCtxClient = Pick<Controller, 'meta' | 'mentions' | 'items' | 'annotations'>
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed. Why don't you keep using client though? It has an annotations call. Sure, that means we need to populate AnnotationsParams.content, but that should be fine?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tl;dr: openCtx.client is a Controller, not a Client. I think exporting the Controller in terms of client/platform-agnostic interfaces in @openctx/controller is the way to go.


So the catch is that openCtx.client is actually a Controller and not a Client.

https://github.com/sourcegraph/cody/blob/3abd2251532b5c5e7bc0624fd96372e2a13f4f4f/vscode/src/context/openctx.ts#L27-L36

This doesn't jump in your face (typescript's not catching it) because of the vague typing here (there's a TODO to fix this):

https://github.com/sourcegraph/cody/blob/3abd2251532b5c5e7bc0624fd96372e2a13f4f4f/vscode/src/context/openctx.ts#L15

Unlike the other methods which have the same signature in Client and Controller, the annotations call expects this {uri: Uri, getText()} interface and not AnnotationsParams.content.


vscode-lib feels like the wrong dependency to have in our lib/shared directory

same here; it does the job, but it's confusing.

My initial instinct would be this:

// lib/shared/.../api.ts

export interface OpenCtxClient extends Client<vscode.Range>, 'meta' | 'mentions' | 'items'>  {
    annotations(doc: FileHandle, opts?: ProviderMethodOptions)
}

/** 
* FileHandle is an abstract interface every Cody client can have an adapter for.
* vscode: (await vscode.workspace.openTextDocument(uri) satisfies FileHandle
* cody-cli: some wrapper that exposes `getText()`
* etc
*/
export interface FileHandle {
    uri:  FileURI
    getText(): string
}

Done and dusted.

On the second thought though, I'd see vscode APIs abstracted away already in OpenCtx. Looking at Controller, I don't see why the definition's got to be so vscode-specific. My understanding is that vscode just happeded to be the primary target and it was skip the abstraction layers.

What I'd do is export Controller in @openctx/controller similarly to @openctx/client, hiding all vscode APIs behind OpenCtx facade interfaces.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the very detailed response. In #4531 I made the setOpenCtxClient call set the controller instead since I wanted that extra behaviour. I see why that fails now since annotations is different unlike the other pre-existing calls we have in Cody.

I agree with your assesment we should probably have a controller package which defines this properly. Alternatively, should we change the type signature of controller to more closely match the client? IE it is kinda weird client is different than controller only for annotations.

cc annotations mastermind @sqs

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tbh, the client-controller dichotomy is still somewhat unclear to me. While they seem to be interchangeable in some cases, I guess controller is supposed to be more aware of the environment in which Cody is running? Whereas client knows how to talk to the providers (HTTP vs. load the module)?

Comment thread vscode/src/editor/utils/editor-context.ts Outdated
Comment thread vscode/src/editor/utils/editor-context.ts Outdated
}
annotations.push({ ...ann, title: ann.item.title, content: aiContent })
})
return { ...fileItem, annotations }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not exactly sure where in the prompt construction this function is called. If it is just before we send of to the LLM, this is indeed the right time to ask for annotations (I believe it is).

However, I think this should be working differently. In my head the annotations are extra information and are not core to the file. It feels like these should be there own "context item" type which cody can then make decisions on if its worth including or not w.r.t. token budgets/etc. Additionally, the way cody presents the context included needs some sort of way to communicate to the user that context.

@sourcegraph/noodle what do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this function is called shortly before we pass the context items to the prompter (and then build the prompt and send it). It is also at this stage that we fetch items for each openctx-mention in the message.

It feels like these should be there own "context item" type which cody can then make decisions on if its worth including or not w.r.t. token budgets/etc

OK, I see. So, e.g., if the token budget is low, Cody may decide to exclude the annotations while still including the file content and that's not something it'd be able to do at the moment. Yes, sounds good

some sort of way to communicate to the user that context

Keeping in mind the above (that the two are logically separated), I was thinking of adding something like a badge in the corner of the mention chip. @file.go -> @file.go* and present the option to exclude annotations from the file in a right-click context menu.

WDYT? Or would you rather mention along with all the other context items next to the Cody's response?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my head the annotations are extra information and are not core to the file

Would still make sense to fetch annotations in resolveFileOrSymbolContentItem, right? We could have this function return Promise<ContextItemWithContext[]> with a single File-item and several Annotation-items, which would then be merged into the homogenous "context items" list.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, this would also match the newer APIs we have in openctx where a "mention" can become several items once resolved.

To be honest you have likely made much deeper modifications to this part of the code than I have so I don't have great taste yet to know what is good. Let me try rope in some others who have wrestled with context construction more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for "with a single File-item and several Annotation-items" that is what we are doing with the mentioned items right now, and it allows us to skip context items if tokens are limited.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would add a new ContextItemFileAnnotation type and also update the prompt builder template to include the right explanatory prefix in the prompt for the annotation context item.

Copy link
Copy Markdown
Contributor Author

@bevzzz bevzzz Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a new ContextItemFileAnnotation type

What about extending ContextItemOpenCtx? It already has mention? field; we could add annotation? to differentiate between "items" and "annotations" (item.annotation !== undefined)

update the prompt builder template to include the right explanatory prefix for the annotation context item

Something like what addAnnotations does, or would you suggest sth different?


Update: added kind: 'item' | 'annotation' discriminator to ContextItemOpenCtx. Both share all of the other required fields and I decided not to add a dedicated type. Updated existing call sites to specify kind: 'item'.

@thenamankumar let me know if let me know if that works for you.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Do you think that, rather than including annotations individually in "as-is" order, annotations from the same provider for the same file should be somehow rendered together (provided the token budget allows)?

Annotations from https://openctx.com/provider-sourcegraph for /foo/bar.js:
- [Annotation 1]
- [Annotation 2]
Annotations from https://openctx.com/provider-jira for /foo/bar.js:
- [Annotation 1]

Or do you think it wouldn't matter to Cody at the end of the day?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the order and selection of particular context items to be added to the prompt should be handled by a re-ranker and we can solve that problem separately. For now, it would be best to return each annotation context item separately.

@keegancsmith
Copy link
Copy Markdown
Member

Also your example is awesome! Sorry, we need to get on clearing out all the open PRs/etc in the openctx repo.

@sqs
Copy link
Copy Markdown
Member

sqs commented Jul 25, 2024

This is a SUPER cool PR, and I want this feature now! Thank you! @keegancsmith has been sharing this with other Cody devs here and we'll work with you on getting this merged.

@bevzzz
Copy link
Copy Markdown
Contributor Author

bevzzz commented Jul 25, 2024

Let's do it, @sqs! If we need to sync later, you can always get a hold of me on Slack.

I'll get on with implementing the suggestions from @keegancsmith in the meantime.

bevzzz added 6 commits July 29, 2024 09:54
1. Annotation's Item can have content for assistants consumption. We fetch
annotations from all enabled OpenCtx providers and add them to the prompt's context.

2. Define OpenCtxClient in terms of @openctx/vscode-lib Controller. Afaik, VSCode is the only Cody client with OpenCtx at the moment.
Rather than attaching annotations to files, we resolve them to
ContextItemOpenCtx and treat them as individual context items.
As such, they can be rendered in their own section of the prompt
or omitted from it altogether if the token budget is low.

+ Reuse range-limited file content for fetching annotations
+ Simplified how annotations are rendered to the prompt
+ Updated call sites of ContextItemOpenCtx to specify 'kind'
+ Unit tests
Annotations are certain to overlap with some context items like file-mentions
and annotations for the same selection from a *different* provider.

Annotations are not allowed to overlap with other annotations from the *same* provider. Those will
be filtered out by isUniqueContextItem.
"@" is a metacharacter in TSDoc comments and it breaks the display formatting when the docstring is rendered to the hover-on card.
NOTE: on 'pnpm run check' ran locally 'tsc' is complaining that ContextItemOpenCtx is not generic, when it clearly is.
@bevzzz bevzzz force-pushed the feat/file-annotations branch from 51e9c35 to a44d6df Compare July 29, 2024 07:59
bevzzz and others added 2 commits July 29, 2024 11:33
@bevzzz bevzzz marked this pull request as ready for review August 11, 2024 15:53
@github-actions
Copy link
Copy Markdown

This PR is marked as stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed automatically in 5 days.

@github-actions github-actions Bot added the Stale label Oct 11, 2024
@github-actions github-actions Bot closed this Oct 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants