[Prototype] Vision multimodal by jlamypoirier · Pull Request #369 · ServiceNow/Fast-LLM

jlamypoirier · 2025-09-26T03:31:26Z

✨ Description

An attempt at integrating multimodal vision models into main. Still a lot of work to do...

tscholak · 2025-10-04T13:16:19Z

fast_llm/layers/vision/config.py

+        hint=FieldHint.architecture,
+    )
+    # TODO: ====== Appropriate name?? ======
+    decoder: BlockSequenceConfig = Field(


tscholak · 2025-10-04T13:27:03Z

fast_llm/layers/vision/vision_encoder.py

+            peft=self._peft,
+        )
+        # TODO: ====== Appropriate name?? ======
+        self.decoder = self._config.decoder.get_layer(


tscholak · 2025-10-04T13:28:17Z

fast_llm/layers/vision/vision_encoder.py

+            peft=self._peft,
+        )
+        # TODO: ====== Hidden dim ======
+        self.adapter = self._config.adapter.get_layer(


I don't think we want to make the adapter part of the encoder, because the adapter's tensor shapes depend on the decoder. We also want to be able to mix and match existing pretrained encoders and decoders...

jlamypoirier · 2025-10-06

It's basically the same with every module: their shapes all need to match. I'm organizing the modules so that they manage their internal hidden shapes, while input and output shapes are managed by the parent modules (the hidden_dim argument), so in this case it makes sense to keep the adapter here.

The TODO refers to the MLP assuming matching input and output dimensions. That's an easy fix, but I haven't gotten to it yet.
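
For readers following the thread, here is a minimal sketch of the shape-ownership pattern described in the reply above, in plain Python that tracks shapes only (no tensors). It is not the PR's code: `MLPConfig`, `VisionEncoderConfig`, `encoder_dim`, `intermediate_dim` and `output_dim` are hypothetical names chosen for illustration. Only the idea of a `get_layer` call that receives a parent-supplied `hidden_dim` comes from the discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MLPConfig:
    # Internal width, owned by the module's own config (hypothetical field).
    intermediate_dim: int

    def get_layer(self, input_dim: int, output_dim: Optional[int] = None) -> "MLP":
        # Boundary shapes come from the caller. Defaulting output_dim to
        # input_dim mirrors the "matching dimensions" assumption from the TODO;
        # passing it explicitly is what an adapter needs.
        return MLP(self, input_dim, input_dim if output_dim is None else output_dim)


class MLP:
    def __init__(self, config: MLPConfig, input_dim: int, output_dim: int):
        self.input_dim = input_dim
        self.output_dim = output_dim
        # Weight shapes only, to keep the sketch dependency-free.
        self.weight_shapes = [
            (input_dim, config.intermediate_dim),
            (config.intermediate_dim, output_dim),
        ]


@dataclass
class VisionEncoderConfig:
    # Internal hidden size of the vision blocks (hypothetical field).
    encoder_dim: int
    adapter: MLPConfig

    def get_layer(self, hidden_dim: int) -> "VisionEncoder":
        return VisionEncoder(self, hidden_dim)


class VisionEncoder:
    def __init__(self, config: VisionEncoderConfig, hidden_dim: int):
        # hidden_dim is the output width requested by the parent module,
        # e.g. the hidden size of the language model consuming the features.
        self.adapter = config.adapter.get_layer(config.encoder_dim, hidden_dim)


# The same encoder config paired with two parents of different widths.
config = VisionEncoderConfig(encoder_dim=1024, adapter=MLPConfig(intermediate_dim=4096))
small = config.get_layer(hidden_dim=2048)
large = config.get_layer(hidden_dim=4096)
print(small.adapter.weight_shapes)  # [(1024, 4096), (4096, 2048)]
print(large.adapter.weight_shapes)  # [(1024, 4096), (4096, 4096)]
```

Under these assumptions the encoder config never stores the decoder's width, so one encoder config can be reused with different parents, and the adapter stays inside the encoder while its output shape is still decided from outside.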

jlamypoirier added 9 commits September 22, 2025 17:34

clean history

ecd1918

Vision multimodal

9114ce2

Drop varlen mamba

a44642c

cleanup

ddf2143

cleanup

8ee7d5e

cleanup

43ca913

cleanup

15405a1

stuff

a3dc89d

stuff

414f87e

jlamypoirier mentioned this pull request Sep 26, 2025

Base model interface review #370

Merged

jlamypoirier added 2 commits September 26, 2025 16:25

stuff

4a21360

Merge branch 'jlp/mlp_block' into jlp/vision_multimodal

2180ea5

jlamypoirier changed the base branch from main to jlp/mlp_block September 26, 2025 20:29

Embeddings

bb7c62d

jlamypoirier mentioned this pull request Sep 30, 2025

[Workspace] Dev branch merge attempt #367

Closed

Model interface

47b9a44

Base automatically changed from jlp/mlp_block to main October 3, 2025 23:18

jlamypoirier added 5 commits October 3, 2025 19:33

Merge branch 'main' into jlp/vision_multimodal

09b0215

Fix merge

f31a313

model

3d84972

cleanup

8f8ef19

language_model

6084122

tscholak reviewed Oct 4, 2025

jlamypoirier added 3 commits October 6, 2025 16:27

fixes

4a96980

fixes

7854138

Merge branch 'jlp/language_model_block' into jlp/vision_multimodal

0350e17

jlamypoirier changed the base branch from main to jlp/language_model_block October 6, 2025 21:11

Base automatically changed from jlp/language_model_block to main October 6, 2025 22:38

jlamypoirier added 2 commits October 14, 2025 22:52

Dataset interface

1a18929

misc

fd63846

jlamypoirier added 11 commits October 15, 2025 16:21

fix

2486caf

Language model sample

92e93e8

fix

d6f6944

fixes

5c802fa

test

95d1840

fixes

eafd9cb

cleanup

c56df69

misc

7f437e1

misc

dfd27f5

Merge branch 'main' into jlp/vision_multimodal

d937f58

Merge branch 'jlp/lm_sample' into jlp/vision_multimodal

11bbee2

jlamypoirier changed the base branch from main to jlp/lm_sample October 17, 2025 19:54

jlamypoirier changed the title ~~Vision multimodal~~ [Prototype] Vision multimodal Oct 30, 2025

Base automatically changed from jlp/lm_sample to main November 24, 2025 18:19

jlamypoirier closed this Nov 24, 2025

jlamypoirier deleted the jlp/vision_multimodal branch November 24, 2025 19:13

jlamypoirier wants to merge 34 commits into main from jlp/vision_multimodal
