17 changes: 17 additions & 0 deletions .github/workflows/tests.yml
@@ -18,6 +18,23 @@ env:
ONNXRUNTIME_NODE_INSTALL: skip

jobs:
knip:
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Install pnpm
uses: pnpm/action-setup@b906affcce14559ad1aafd4ab0e942779e9f58b1 # v4
- name: Use Node.js 24.10.0
uses: actions/setup-node@53b83947a5a98c8d113130e565377fae1a50d02f # v6
with:
node-version: "24.10.0"
cache: "pnpm"
- run: pnpm install --frozen-lockfile
- run: pnpm format:check
- run: pnpm knip

build:
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
43 changes: 42 additions & 1 deletion package.json
@@ -10,7 +10,8 @@
"test": "pnpm -r test",
"format": "prettier --write .",
"format:check": "prettier --check .",
"dev": "node scripts/dev.mjs"
"dev": "node scripts/dev.mjs",
"knip": "knip"
},
"repository": {
"type": "git",
@@ -23,9 +24,49 @@
},
"homepage": "https://github.com/huggingface/transformers.js#readme",
"devDependencies": {
"@types/node": "^24.10.9",
"knip": "6.15.0",
"prettier": "3.8.1",
"typescript": "5.9.3"
},
"knip": {
"tags": [
"-lintignore"
],
"ignoreBinaries": [
"doc-builder",
"python"
],
"workspaces": {
".": {
"entry": [
"scripts/*.mjs"
],
"project": [
"scripts/**/*.mjs"
]
},
"packages/transformers": {
"entry": [
"src/models/registry.js",
"tests/**/*.js",
"scripts/*.mjs",
"docs/scripts/**/*.js",
"docs/plugins/**/*.js"
],
"project": [
"src/**/*.{js,ts}",
"tests/**/*.js",
"scripts/**/*.mjs",
"docs/scripts/**/*.js",
"docs/plugins/**/*.js",
"!coverage/**",
"!dist/**",
"!types/**"
]
}
}
},
"prettier": {
"overrides": [
{
3 changes: 2 additions & 1 deletion packages/transformers/package.json
@@ -57,17 +57,18 @@
"dependencies": {
"@huggingface/jinja": "^0.5.6",
"@huggingface/tokenizers": "^0.1.3",
"onnxruntime-common": "1.24.3",

Collaborator

I remember someone mentioned this before, but since onnxruntime-node and onnxruntime-web depend on it, I don't think it's necessary to define here.

But I suppose that if a mismatch ever arises between, e.g., the Tensor from onnxruntime-node and the one from onnxruntime-web, it may cause some issues.

Lmk what you think!

Collaborator Author

I think we should keep this as a direct dependency because @huggingface/transformers imports onnxruntime-common directly. In my opinion, as soon as we import a package directly, we should declare it as a direct dependency rather than rely on it being provided by sub-dependencies.

The version mismatch concern is valid, but declaring it directly makes the resolved version explicit instead of depending on hoisting. If ORT node/web diverge further, we should probably revisit the ORT versions together rather than rely on a transitive copy.
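
To make the version-mismatch concern concrete, here is a minimal sketch (not part of this PR) that groups the onnxruntime-* entries of a dependency map by their major.minor version. `findOrtVersionDrift` is a hypothetical helper introduced only for illustration; the version strings are the ones pinned in this diff.

```javascript
// Versions copied from packages/transformers/package.json in this diff.
const dependencies = {
    "onnxruntime-common": "1.24.3",
    "onnxruntime-node": "1.24.3",
    "onnxruntime-web": "1.26.0-dev.20260416-b7804b056c",
};

// Hypothetical helper: returns a Map from "major.minor" to the ORT packages
// pinned on that version line. More than one key means the packages diverge.
function findOrtVersionDrift(deps) {
    const byMinor = new Map();
    for (const [name, version] of Object.entries(deps)) {
        if (!name.startsWith("onnxruntime-")) continue;
        // Keep only "major.minor", ignoring patch and pre-release suffixes.
        const minor = version.split(".").slice(0, 2).join(".");
        if (!byMinor.has(minor)) byMinor.set(minor, []);
        byMinor.get(minor).push(name);
    }
    return byMinor;
}

const drift = findOrtVersionDrift(dependencies);
for (const [minor, names] of drift) {
    console.log(`${minor}: ${names.join(", ")}`);
}
```

With the versions above, this reports two version lines: 1.24 for onnxruntime-common and onnxruntime-node, and 1.26 for onnxruntime-web. Declaring onnxruntime-common directly at least makes it explicit which of the two lines the library's own imports resolve to.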

"onnxruntime-node": "1.24.3",
"onnxruntime-web": "1.26.0-dev.20260416-b7804b056c",
"sharp": "^0.34.5"
},
"devDependencies": {
"@jest/globals": "30.2.0",
"@types/jest": "^30.0.0",
"@types/node": "^24.1.0",
"@webgpu/types": "^0.1.69",
"esbuild": "^0.27.2",
"jest": "^30.2.0",
"jest-environment-node": "^30.2.0",
"jsdoc-to-markdown": "^9.1.3",
Comment on lines +66 to 72

Collaborator

Can you explain the jest-environment-node -> @jest/globals package change?

Collaborator Author

This was one of the suggestions from knip: it flagged jest-environment-node as an unused dependency, so it was removed.

Collaborator Author

Oh, and about @jest/globals: knip found that we import @jest/globals in two places in the project without listing it in our dependencies. It never caused an issue, I think because it's still somewhere in the dependency tree, but as I said in the onnxruntime-common thread, whenever we import a package directly, we should also list it in our dependencies.
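
The rule described here (every directly imported package should be listed in package.json) can be sketched as a small check. This is a simplified illustration, not knip's actual implementation: `packageNameOf` and `findUnlistedImports` are hypothetical names, the regular expression only handles static `import` statements, and the sample source and manifest are made up for the example.

```javascript
// Hypothetical helper: map an import specifier to its npm package name,
// or null when the specifier is not an npm package.
function packageNameOf(specifier) {
    // Relative paths, absolute paths, and node: built-ins are not npm packages.
    if (specifier.startsWith(".") || specifier.startsWith("/") || specifier.startsWith("node:")) {
        return null;
    }
    const parts = specifier.split("/");
    // Scoped packages keep two segments ("@jest/globals"); others keep one.
    return specifier.startsWith("@") ? parts.slice(0, 2).join("/") : parts[0];
}

// Hypothetical helper: list imported packages that the manifest does not declare.
function findUnlistedImports(source, manifest) {
    const declared = new Set([
        ...Object.keys(manifest.dependencies ?? {}),
        ...Object.keys(manifest.devDependencies ?? {}),
    ]);
    const unlisted = new Set();
    // Matches `import ... from "x"` and bare `import "x"` statements only.
    const importPattern = /import\s+(?:[^'"]*?\s+from\s+)?["']([^"']+)["']/g;
    for (const match of source.matchAll(importPattern)) {
        const name = packageNameOf(match[1]);
        if (name !== null && !declared.has(name)) unlisted.add(name);
    }
    return [...unlisted];
}

// Illustrative inputs, not copied from the repository.
const source = `
import { jest } from "@jest/globals";
import * as ONNX from "onnxruntime-common";
import path from "node:path";
import { Tensor } from "./utils/tensor.js";
`;
const manifest = {
    dependencies: { "onnxruntime-node": "1.24.3" },
    devDependencies: { jest: "^30.2.0" },
};

const unlistedImports = findUnlistedImports(source, manifest);
console.log(unlistedImports);
```

For these inputs the check reports `@jest/globals` and `onnxruntime-common`: both resolve at runtime only because another package happens to pull them in, which is the situation the two dependency additions in this PR are meant to fix.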

"typescript": "5.9.3"
},
2 changes: 1 addition & 1 deletion packages/transformers/scripts/build/constants.mjs
@@ -1,7 +1,7 @@
import path from "node:path";
import { fileURLToPath } from "node:url";

export const DIST_FOLDER = "dist";
const DIST_FOLDER = "dist";
export const NODE_IGNORE_MODULES = ["onnxruntime-web"];
export const NODE_EXTERNAL_MODULES = [
"onnxruntime-common",
22 changes: 11 additions & 11 deletions packages/transformers/src/models/modeling_utils.js
@@ -86,7 +86,7 @@ export function boolTensor(value) {
return new Tensor('bool', [value], [1]);
}

export { getSessionsConfig, getTextOnlySessions, MODEL_TYPES } from './session_config.js';
export { MODEL_TYPES } from './session_config.js';

/**
* Runtime-only model type configuration (forward functions, generation flags).
@@ -1101,7 +1101,7 @@ export class PreTrainedModel extends Callable {
* @returns {Promise<Seq2SeqLMOutput>} Promise that resolves with the output of the seq2seq model.
* @private
*/
export async function seq2seq_forward(self, model_inputs) {
async function seq2seq_forward(self, model_inputs) {
let { encoder_outputs, input_ids, decoder_input_ids, decoder_attention_mask, ...other_decoder_inputs } =
model_inputs;
// Encode if needed
@@ -1164,7 +1164,7 @@ export async function encoder_forward(self, model_inputs) {
return await sessionRun(session, encoderFeeds);
}

export async function auto_encoder_forward(self, model_inputs) {
async function auto_encoder_forward(self, model_inputs) {
const encoded = await self.encode(model_inputs);
const decoded = await self.decode(encoded);
return decoded;
@@ -1219,7 +1219,7 @@ export function getPastKeyValues(decoderResults, pastKeyValues) {
* @param {Object} model_output The output of the model.
* @returns {{cross_attentions?: Tensor[]}} An object containing attentions.
*/
export function getAttentions(model_output) {
function getAttentions(model_output) {
const attentions = {};

for (const attnName of ['cross_attentions', 'encoder_attentions', 'decoder_attentions']) {
@@ -1376,7 +1376,7 @@ export async function decoder_forward(self, model_inputs, is_encoder_decoder = f
* @returns {Promise<Tensor>} The model's output tensor
* @private
*/
export async function generic_text_to_text_forward(
async function generic_text_to_text_forward(
self,
{
// Generic parameters:
@@ -1489,7 +1489,7 @@ export async function generic_text_to_text_forward(
* @returns {Promise<Tensor>} The model's output tensor.
* @private
*/
export async function audio_text_to_text_forward(self, params) {
async function audio_text_to_text_forward(self, params) {
return await generic_text_to_text_forward(self, {
...params,
modality_input_names: ['audio_values', 'input_features'],
@@ -1506,7 +1506,7 @@ export async function audio_text_to_text_forward(self, params) {
* @returns {Promise<Tensor>} The model's output tensor.
* @private
*/
export async function image_text_to_text_forward(self, params) {
async function image_text_to_text_forward(self, params) {
return await generic_text_to_text_forward(self, {
...params,
modality_input_names: ['pixel_values'],
@@ -1559,7 +1559,7 @@ export function cumsum_masked_fill(attention_mask, start_index = 0) {
* position_ids = position_ids[:, -input_ids.shape[1] :]
* ```
*/
export function create_position_ids(model_inputs, past_key_values = null, start_index = 0) {
function create_position_ids(model_inputs, past_key_values = null, start_index = 0) {
const { input_ids, inputs_embeds, attention_mask } = model_inputs;

const { data, dims } = cumsum_masked_fill(attention_mask, start_index);
@@ -1631,15 +1631,15 @@ export function encoder_decoder_prepare_inputs_for_generation(self, input_ids, m
};
}

export function multimodal_text_to_text_prepare_inputs_for_generation(self, ...args) {
function multimodal_text_to_text_prepare_inputs_for_generation(self, ...args) {
if (self.config.is_encoder_decoder) {
return encoder_decoder_prepare_inputs_for_generation(self, ...args);
} else {
return decoder_prepare_inputs_for_generation(self, ...args);
}
}

export function default_merge_input_ids_with_features({
function default_merge_input_ids_with_features({
modality_token_id,
inputs_embeds,
modality_features,
@@ -1714,7 +1714,7 @@ export function default_merge_input_ids_with_audio_features({
* @returns {Promise<Record<string, any>>} A Promise that resolves to a dictionary of configuration objects.
* @private
*/
export async function get_optional_configs(pretrained_model_name_or_path, names, options) {
async function get_optional_configs(pretrained_model_name_or_path, names, options) {
return Object.fromEntries(
await Promise.all(
Object.keys(names).map(async (name) => {
2 changes: 1 addition & 1 deletion packages/transformers/src/models/whisper/common_whisper.js
@@ -103,7 +103,7 @@ const WHISPER_LANGUAGES = [
// @ts-ignore
export const WHISPER_LANGUAGE_MAPPING = new Map(WHISPER_LANGUAGES);
// @ts-ignore
export const WHISPER_TO_LANGUAGE_CODE_MAPPING = new Map([
const WHISPER_TO_LANGUAGE_CODE_MAPPING = new Map([
...WHISPER_LANGUAGES.map(([k, v]) => [v, k]),
...[
['burmese', 'my'],
4 changes: 3 additions & 1 deletion packages/transformers/src/utils/audio.js
@@ -77,7 +77,9 @@ export async function load_audio(url, sampling_rate) {
/**
* @deprecated Use {@link load_audio} instead.
*/
export const read_audio = load_audio;
export async function read_audio(url, sampling_rate) {
return await load_audio(url, sampling_rate);
}

/**
* Helper function to generate windows that are special cases of the generalized cosine window.
4 changes: 1 addition & 3 deletions packages/transformers/src/utils/constants.js
@@ -1,8 +1,6 @@
export const GITHUB_ISSUE_URL = 'https://github.com/huggingface/transformers.js/issues/new/choose';

export const CONFIG_NAME = 'config.json';
export const FEATURE_EXTRACTOR_NAME = 'preprocessor_config.json';
export const IMAGE_PROCESSOR_NAME = FEATURE_EXTRACTOR_NAME;
export const IMAGE_PROCESSOR_NAME = 'preprocessor_config.json';
export const PROCESSOR_NAME = 'processor_config.json';
export const CHAT_TEMPLATE_NAME = 'chat_template.jinja';
export const GENERATION_CONFIG_NAME = 'generation_config.json';
65 changes: 0 additions & 65 deletions packages/transformers/src/utils/core.js
@@ -141,39 +141,6 @@ export class DefaultProgressCallback extends Callable {
}
}

/**
* Reverses the keys and values of an object.
*
* @param {Object} data The object to reverse.
* @returns {Object} The reversed object.
* @see https://ultimatecourses.com/blog/reverse-object-keys-and-values-in-javascript
*/
export function reverseDictionary(data) {
// https://ultimatecourses.com/blog/reverse-object-keys-and-values-in-javascript
return Object.fromEntries(Object.entries(data).map(([key, value]) => [value, key]));
}

/**
* Escapes regular expression special characters from a string by replacing them with their escaped counterparts.
*
* @param {string} string The string to escape.
* @returns {string} The escaped string.
*/
export function escapeRegExp(string) {
return string.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); // $& means the whole matched string
}

/**
* Check if a value is a typed array.
* @param {*} val The value to check.
* @returns {boolean} True if the value is a `TypedArray`, false otherwise.
*
* Adapted from https://stackoverflow.com/a/71091338/13989043
*/
export function isTypedArray(val) {
return val?.prototype?.__proto__?.constructor?.name === 'TypedArray';
}

/**
* Check if a value is an integer.
* @param {*} x The value to check.
@@ -208,26 +175,6 @@ export function calculateDimensions(arr) {
return dimensions;
}

/**
* Replicate python's .pop() method for objects.
* @param {Object} obj The object to pop from.
* @param {string} key The key to pop.
* @param {*} defaultValue The default value to return if the key does not exist.
* @returns {*} The value of the popped key.
* @throws {Error} If the key does not exist and no default value is provided.
*/
export function pop(obj, key, defaultValue = undefined) {
const value = obj[key];
if (value !== undefined) {
delete obj[key];
return value;
}
if (defaultValue === undefined) {
throw Error(`Key ${key} does not exist in object.`);
}
return defaultValue;
}

/**
* Efficiently merge arrays, creating a new copy.
* Adapted from https://stackoverflow.com/a/6768642/13989043
@@ -277,18 +224,6 @@ export function pick(o, props) {
);
}

/**
* Calculate the length of a string, taking multi-byte characters into account.
* This mimics the behavior of Python's `len` function.
* @param {string} s The string to calculate the length of.
* @returns {number} The length of the string.
*/
export function len(s) {
let length = 0;
for (const c of s) ++length;
return length;
}

/**
* Count the occurrences of a value in an array or string.
* This mimics the behavior of Python's `count` method.
4 changes: 2 additions & 2 deletions packages/transformers/src/utils/dtypes.js
@@ -49,8 +49,8 @@ export const DATA_TYPES = Object.freeze({
});
/** @typedef {keyof typeof DATA_TYPES} DataType */

export const DEFAULT_DEVICE_DTYPE = DATA_TYPES.fp32;
export const DEFAULT_DEVICE_DTYPE_MAPPING = Object.freeze({
const DEFAULT_DEVICE_DTYPE = DATA_TYPES.fp32;
const DEFAULT_DEVICE_DTYPE_MAPPING = Object.freeze({
// NOTE: If not specified, will default to fp32
[DEVICE_TYPES.wasm]: DATA_TYPES.q8,
});
4 changes: 2 additions & 2 deletions packages/transformers/src/utils/hub.js
@@ -195,7 +195,7 @@ export async function checkCachedResource(cache, localPath, proposedCacheKey) {
* @param {PretrainedOptions} [options] Options containing progress callback and context for progress updates.
* @returns {Promise<void>}
*/
export async function storeCachedResource(path_or_repo_id, filename, cache, cacheKey, response, result, options = {}) {
async function storeCachedResource(path_or_repo_id, filename, cache, cacheKey, response, result, options = {}) {
// Check again whether request is in cache. If not, we add the response to the cache
if ((await cache.match(cacheKey)) !== undefined) {
return;
@@ -249,7 +249,7 @@ export async function storeCachedResource(path_or_repo_id, filename, cache, cach
* @throws Will throw an error if the file is not found and `fatal` is true.
* @returns {Promise<string|Uint8Array|null>} A Promise that resolves with the file content as a Uint8Array if `return_path` is false, or the file path as a string if `return_path` is true.
*/
export async function loadResourceFile(
async function loadResourceFile(
path_or_repo_id,
filename,
fatal = true,
2 changes: 1 addition & 1 deletion packages/transformers/tests/tokenizers.test.js
@@ -8,7 +8,7 @@ describe("Tokenizers (model-specific)", () => {
describe(tokenizer_name, () => {
for (const model_id in TEST_CONFIG) {
describe(model_id, () => {
/** @type {import('../src/tokenizers.js').PreTrainedTokenizer} */
/** @type {import('../src/tokenization_utils.js').PreTrainedTokenizer} */
let tokenizer;
beforeAll(async () => {
tokenizer = await TOKENIZER_CLASS.from_pretrained(model_id);