feat(mobile-app): ✨ build processing screen #6
Conversation
Caution: Review failed. The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (5)
📝 Walkthrough

Replaces the AI engine V1 pipeline with a V2 async NMT-first producer/consumer pipeline, adds durable artifact uploads/events/DB helpers, introduces backend WebSocket/Redis wiring, extends media artifact/ETA fields and DTOs, and implements socket-first mobile UI and artifact retrieval endpoints.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Mobile as Mobile Client
    participant SocketGW as Backend Socket Gateway
    participant SocketSvc as Backend Socket Service
    participant Redis as Redis Pub/Sub
    participant AI as AI Engine Worker
    participant MinIO as MinIO
    participant DB as Postgres
    Mobile->>SocketGW: open WS (Bearer token)
    SocketGW->>SocketSvc: subscribe user/media rooms (on connect)
    AI->>Redis: PUBLISH media_updates (progress/chunk/batch/completed/failed)
    Redis->>SocketSvc: message(media_updates)
    SocketSvc->>DB: optional refresh artifact summary/status
    SocketSvc->>SocketGW: emit event to user_{userId} / media_{mediaId}
    SocketGW->>Mobile: socket event (media_progress / chunk_ready / batch_ready / completed / failed)
    Mobile->>Mobile: update local React Query caches and UI
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 14
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
apps/mobile-app/src/hooks/useMedia.ts (1)
92-105: ⚠️ Potential issue | 🟠 Major

Guard `confirmUpload` behind successful file reads and PUTs. This flow confirms the upload even if the local file fetch or the presigned PUT fails. That can create a media record for an object that was never stored.

Proposed fix:

```diff
 // Step 2 — Upload file blob directly to MinIO (bypasses our API server)
 const fileResponse = await fetch(file.uri);
+if (!fileResponse.ok) {
+  throw new Error("Failed to read local file for upload");
+}
 const fileBlob = await fileResponse.blob();
-await fetch(uploadUrl, {
+const uploadResponse = await fetch(uploadUrl, {
   method: "PUT",
   body: fileBlob,
   headers: { "Content-Type": file.mimeType },
 });
+if (!uploadResponse.ok) {
+  throw new Error("Failed to upload file to storage");
+}

 // Step 3 — Confirm upload → creates MediaItem + dispatches queue job
 const title = file.name.replace(/\.[^.]+$/, ""); // strip file extension for title
 return mediaService.confirmUpload({
   title,
-  objectKey: objectKey,
+  objectKey,
   processingMode: "TRANSCRIBE_TRANSLATE",
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/hooks/useMedia.ts` around lines 92 - 105: the code calls mediaService.confirmUpload regardless of whether the local fetch or the presigned PUT succeeded; ensure confirmUpload is only invoked after successful file read and a successful PUT response. Specifically, check fileResponse.ok after fetch(file.uri) and the PUT response.ok after await fetch(uploadUrl, { method: "PUT", ... }); throw or return early on failure (or propagate the error) so that mediaService.confirmUpload({ title, objectKey, processingMode: "TRANSCRIBE_TRANSLATE", ... }) is only executed when fileResponse and the PUT response are both successful; include the objectKey/title variables in the same scope so they are unchanged when calling confirmUpload.

apps/mobile-app/src/services/api.ts (1)
35-39: ⚠️ Potential issue | 🟡 Minor

Normalize loopback hosts even for `https` URLs. The `startsWith("https")` branch skips `normalizeApiBaseUrl()`, so `EXPO_PUBLIC_API_URL=https://localhost:3000/api` or `https://127.0.0.1:3000/api` still points at the emulator itself on Android. Now that this constant is exported for sockets too, the same bad host breaks both transports.

Suggested fix:

```diff
 export const API_BASE_URL = envApiBaseUrl
-  ? envApiBaseUrl.startsWith("https")
-    ? envApiBaseUrl
-    : normalizeApiBaseUrl(envApiBaseUrl)
+  ? normalizeApiBaseUrl(envApiBaseUrl)
   : DEFAULT_API_BASE_URL;
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/services/api.ts` around lines 35 - 39: the exported API_BASE_URL currently bypasses normalizeApiBaseUrl when envApiBaseUrl startsWith("https"), which leaves loopback hosts like https://localhost or https://127.0.0.1 un-normalized and breaks Android emulator/socket connectivity; update the logic so that when envApiBaseUrl is present you always call normalizeApiBaseUrl(envApiBaseUrl) (falling back to DEFAULT_API_BASE_URL only when envApiBaseUrl is absent) and keep using the same API_BASE_URL constant for sockets and HTTP so both transports get the normalized host.

apps/mobile-app/src/stores/media.store.ts (1)
46-60: ⚠️ Potential issue | 🟡 Minor

Update `refreshItem()` to persist step/ETA fields. This patch path still overwrites only `status`, `progress`, and `failReason`. Any code path that calls `getStatus()` will leave `currentStep` and `estimatedTimeRemaining` stale, so the new processing screen can show outdated pipeline data.

Suggested fix:

```diff
 set((state) => ({
   items: state.items.map((item) =>
     item.id === id
       ? {
           ...item,
           status: statusRes.status,
           progress: statusRes.progress,
+          currentStep: statusRes.currentStep,
+          estimatedTimeRemaining: statusRes.estimatedTimeRemaining,
           failReason: statusRes.failReason,
         }
       : item,
   ),
 }));
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/stores/media.store.ts` around lines 46 - 60, The refreshItem function updates only status, progress, and failReason from mediaService.getStatus; modify the state update inside refreshItem to also copy currentStep and estimatedTimeRemaining from the statusRes into the matched item so those pipeline fields are persisted (i.e., when mapping items in refreshItem, set currentStep: statusRes.currentStep and estimatedTimeRemaining: statusRes.estimatedTimeRemaining in addition to status/progress/failReason).
🧹 Nitpick comments (3)
apps/ai-engine/src/main.py (1)
334-340: Throttle chunk-level status pushes before they become control-plane churn.

Each `on_chunk` callback now does a Postgres update plus Redis publish, and `apps/backend-api/src/modules/socket/socket.service.ts:57-66` immediately turns that into another DB read and websocket emit. Coalescing these updates by time or meaningful progress delta is usually enough for a smooth UX without hammering the write/read fan-out path.

Also applies to: 416-422
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/main.py` around lines 334 - 340: the on_chunk flow is currently calling update_media_status (e.g., update_media_status(media_id, user_id=user_id, progress=progress, current_step="TRANSCRIBING", estimated_time_remaining=_eta(progress))) on every chunk, which triggers DB writes and Redis publishes; modify this to throttle/coalesce updates per media_id by either (a) emitting only when a configurable time interval has passed (debounce every N seconds) or (b) when progress changes by a meaningful delta (e.g., >= 1–5%), and fall back to a final forced update at completion; implement the throttle in the on_chunk handler (or wrap update_media_status) using a per-media_id last-update timestamp and last-progress value so intermediate chunk callbacks skip calling update_media_status until the time or delta threshold is reached.

apps/mobile-app/src/app/(app)/upload.tsx (1)
37-48: Trigger the next action from sheet-close completion, not a fixed 200ms delay.

These flows are now coupled to a magic duration and already drift from the `BottomSheet` close timing in `apps/mobile-app/src/components/BottomSheet.tsx:57-70`. If that animation changes, the YouTube modal or media-picker navigation can race the sheet and cause overlap/flicker.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/app/(app)/upload.tsx` around lines 37 - 48: the handlers handleSelectYouTube and handleSelectDevice rely on a hardcoded 200ms timeout; replace that with waiting for the BottomSheet close completion instead. Change onClose to return a promise or provide a completion callback (or expose a closeCompleted event in the BottomSheet component referenced in BottomSheet.tsx), then update handleSelectYouTube to await onClose() (or use onClose().then(...)) before calling setYtVisible(true), and update handleSelectDevice to await onClose() before router.push("/media-picker"); ensure the BottomSheet's close flow (close(), closeCompleted promise/event) is implemented so these handlers no longer use setTimeout.

apps/mobile-app/src/hooks/useSocketSync.ts (1)
35-40: Consider adding type safety to the cache update.

Using the `any` type loses type safety. Consider typing the callback parameter for better maintainability.

Suggested improvement:

```diff
+import type { MediaItem } from "@/types/media";
+
 // 2. Update the global media list so the Home library updates in background without fetching
-queryClient.setQueryData(mediaKeys.all, (oldData: any) => {
-  if (!oldData || !Array.isArray(oldData)) return oldData;
-  return oldData.map((item: any) =>
+queryClient.setQueryData<MediaItem[] | undefined>(mediaKeys.all, (oldData) => {
+  if (!oldData) return oldData;
+  return oldData.map((item) =>
     item.id === media.id ? { ...item, ...media } : item,
   );
 });
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/hooks/useSocketSync.ts` around lines 35 - 40, The cache update using queryClient.setQueryData(mediaKeys.all, ...) currently types the callback with any and loses type safety; update the callback signature to use the correct cached type (e.g., Array<MediaType> or MediaType[]), replace (oldData: any) with a properly typed parameter and adjust the map callback from (item: any) to (item: MediaType) so comparisons with media.id and the merged object { ...item, ...media } are type-checked; ensure MediaType defines id and other fields used and update imports/types where queryClient.setQueryData and mediaKeys.all are used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/ai-engine/src/main.py`:
- Around line 141-151: The Redis publish should be wrapped in its own try/except
so Redis failures are logged separately from DB errors: after the DB commit and
where redis_client.publish("media_updates", ...) is called, move that publish
into a nested try block and catch Exception as e (or pub_exc) and call
logger.error with a distinct message like "Redis publish failed for media
{media_id}, user {user_id}: {e}" while leaving the outer DB exception handling
intact; reference the redis_client.publish call, the logger.error for DB update,
and the media_id/user_id variables to locate where to add the new try/except.
In `@apps/backend-api/prisma/schema.prisma`:
- Around line 199-200: The currentStep column is defined as an unconstrained
String; change it to a Prisma enum and update DB migration and TypeScript DTOs
to use the closed PipelineStage set: add a Prisma enum PipelineStage with values
AUDIO_PREP, INSPECTING, VAD, TRANSCRIBING, MERGING, TRANSLATING, EXPORTING,
change the model field currentStep from String? to PipelineStage? (preserve
`@map`("current_step")), create and apply a migration that converts the existing
TEXT column to the enum type (migrate safely handling existing values),
regenerate the Prisma client, and update any DTO/type declarations so
currentStep is PipelineStage | null instead of string | null.
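Until the enum migration above lands, the DTO side can at least narrow raw strings against the closed set. A minimal sketch: the `PipelineStage` values mirror the enum that already exists in `apps/backend-api/src/common/constants/enums.ts`, while the `toPipelineStage` guard is a hypothetical helper, not existing code.

```typescript
// Mirrors apps/backend-api/src/common/constants/enums.ts
export enum PipelineStage {
  AUDIO_PREP = "AUDIO_PREP",
  INSPECTING = "INSPECTING",
  VAD = "VAD",
  TRANSCRIBING = "TRANSCRIBING",
  MERGING = "MERGING",
  TRANSLATING = "TRANSLATING",
  EXPORTING = "EXPORTING",
}

// Hypothetical guard: narrow a value read from the unconstrained TEXT column
// to the closed PipelineStage set, mapping anything unexpected to null.
export function toPipelineStage(raw: string | null): PipelineStage | null {
  if (raw !== null && (Object.values(PipelineStage) as string[]).includes(raw)) {
    return raw as PipelineStage;
  }
  return null;
}
```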
In `@apps/backend-api/src/modules/auth/auth.service.ts`:
- Around line 300-313: verifyAccessToken is verifying with JWT_ACCESS_SECRET
while tokens are signed with JWT_SECRET, causing all WebSocket auth to fail; fix
by making secrets consistent: either change verifyAccessToken to call
this.configService.getOrThrow('JWT_SECRET') so verification uses the same secret
as JwtModule, or update the token creation flow in generateTokens to explicitly
sign access tokens with JWT_ACCESS_SECRET (and ensure JwtModule/secret config
matches whichever choice). Update the JwtModule/secret config and any references
to JWT_SECRET/JWT_ACCESS_SECRET so JwtModule, generateTokens, and
verifyAccessToken all use the same secret symbol.
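The mismatch above comes down to sign and verify reading two different env keys. A tiny sketch of the single-source-of-truth option, assuming the `JWT_SECRET` choice from the comment; `ConfigLike` stands in for the relevant slice of NestJS's `ConfigService`, and `accessTokenSecret` is a hypothetical helper, not the app's actual code.

```typescript
// Stand-in for the slice of NestJS ConfigService this sketch needs.
type ConfigLike = { getOrThrow(key: string): string };

// One key, referenced by JwtModule registration, generateTokens, and
// verifyAccessToken alike, so signing and verification can never drift apart.
const ACCESS_TOKEN_SECRET_KEY = "JWT_SECRET";

export function accessTokenSecret(config: ConfigLike): string {
  return config.getOrThrow(ACCESS_TOKEN_SECRET_KEY);
}
```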
In `@apps/mobile-app/.gitignore`:
- Line 50: Add a short comment above the /stitch gitignore entry explaining what
the /stitch directory is, which tool or process creates it (e.g., "stitcher"
build step, local dev script, CI artifact, or third-party tool), and why it
should be ignored (generated artifacts, large binaries, or sensitive run-time
files), and if applicable mention the command that produces it (e.g., npm run
stitch or CI job name) so future maintainers know its origin and whether it
should be committed or recreated as part of the build/test workflow.
In `@apps/mobile-app/src/app/(app)/media-picker.tsx`:
- Around line 197-206: The CTA currently always calls requestPermission() even
when permissionResponse.canAskAgain is false; change the Pressable's onPress
handler and label logic so it checks permissionResponse.canAskAgain and if true
calls requestPermission(), otherwise opens the app settings (e.g., via
Linking.openSettings or a platform-specific Settings helper) and update the
button text accordingly (e.g., "Grant Permission" vs "Open Settings"); locate
the Pressable and its Text label in media-picker.tsx (the Pressable using
requestPermission and the permissionResponse.canAskAgain check) and implement
the conditional onPress and label change.
- Around line 75-85: The current fallback logic in media-picker.tsx sets
mimeType too coarsely (variables mimeType, asset, ext), causing
.mov/.webm/.aac/.flac/.ogg/etc. to be mislabeled; replace the simple mediaType
overrides with a small, explicit extension-to-MIME mapping (include mp4,
mov->video/quicktime, m4v, webm->video/webm, mp3->audio/mpeg, m4a->audio/mp4,
aac->audio/aac, flac->audio/flac, wav->audio/wav, ogg->audio/ogg,
opus->audio/opus, etc.), prefer using a MIME lookup on asset.filename/asset.uri
when available (or expo FileSystem/getAssetInfoAsync result) and only fall back
to application/octet-stream; update the code around the mimeType/ext logic to
consult this mapping so presign/content-type metadata are correct.
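The mapping described above can be sketched as a small lookup table with an octet-stream fallback. The exact MIME values and the `guessMimeType` helper are illustrative assumptions, not the app's current code.

```typescript
// Explicit extension-to-MIME table for the formats the picker accepts.
const EXT_TO_MIME: Record<string, string> = {
  mp4: "video/mp4",
  mov: "video/quicktime",
  m4v: "video/x-m4v",
  webm: "video/webm",
  mp3: "audio/mpeg",
  m4a: "audio/mp4",
  aac: "audio/aac",
  flac: "audio/flac",
  wav: "audio/wav",
  ogg: "audio/ogg",
  opus: "audio/opus",
};

// Prefer the filename/URI extension; only fall back to octet-stream when the
// extension is truly unknown.
export function guessMimeType(filenameOrUri: string): string {
  const ext = filenameOrUri.split(".").pop()?.toLowerCase() ?? "";
  return EXT_TO_MIME[ext] ?? "application/octet-stream";
}
```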
In `@apps/mobile-app/src/app/(app)/processing.tsx`:
- Around line 110-128: The component currently falls through to the normal
screen when media is undefined (so getStatusLabel maps to QUEUED); change the
rendering logic in processing.tsx to detect when the route param `id` is missing
or when `useMediaStatus(id ?? null)` returns no `media` and `isLoading` is
false, and render an explicit error/empty state (instead of the queued UI)
showing a helpful message and actions (e.g., back-to-library via `goToLibrary`)
— update the conditional that now only handles `isLoading && !media` to also
handle `!isLoading && !media` (or missing `id`) so the footer/actions and status
label aren’t incorrectly shown as QUEUED. Ensure this uses the same variables
(`media`, `isLoading`, `id`, `goToLibrary`, `goToPlayer`, and any
`getStatusLabel` usage) so the fix is localized.
- Line 110: The media status screen needs a low-frequency fallback refetch while
media is still active: update the component using useMediaStatus(id ?? null) to
trigger periodic refetches (e.g., via the hook's returned refetch function or
the underlying query's refetch) at a low interval when media?.status is not
"COMPLETED" or "FAILED" and not isLoading, and stop/clear the interval when
status reaches a terminal state or on unmount; this complements useSocketSync
(which invalidates cache on socket events) so the UI recovers if the socket
misses events.
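The fallback-polling idea above maps cleanly onto React Query's `refetchInterval` option, where returning `false` disables polling. A minimal sketch; the 15-second interval and the helper name are assumptions, not the app's code.

```typescript
// Statuses after which no further updates will arrive.
const TERMINAL_STATUSES = new Set(["COMPLETED", "FAILED"]);

// Shaped for React Query's refetchInterval: poll slowly while the job is
// active, stop once it reaches a terminal state (socket events stay the
// primary update path; this only covers missed events).
export function fallbackRefetchInterval(status: string | undefined): number | false {
  if (status === undefined || TERMINAL_STATUSES.has(status)) return false;
  return 15_000;
}
```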
In `@apps/mobile-app/src/components/media/PipelineStepper.tsx`:
- Around line 20-33: The stepper currently marks failed jobs as "completed";
update the StepState union to include "failed" and change getStepState so that
when status === "FAILED" it returns "completed" for stepIndex < activeIndex,
"failed" for stepIndex === activeIndex (the step that errored), and "pending"
for later steps; keep the existing logic for other statuses. Also update the
component rendering that consumes StepState (the PipelineStepper step icon/label
renderer) to treat "failed" with error styling instead of a success checkmark.
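The corrected mapping above can be sketched as a pure function over the step index and the active index (the index of `currentStep` within `PIPELINE_STEPS`). The FAILED branch follows the comment; the behavior of the other branches is an assumption about the existing logic.

```typescript
type StepState = "completed" | "active" | "failed" | "pending";

export function getStepState(
  stepIndex: number,
  activeIndex: number,
  status: string,
): StepState {
  if (status === "FAILED") {
    if (stepIndex < activeIndex) return "completed"; // steps that finished
    if (stepIndex === activeIndex) return "failed"; // the step that errored
    return "pending"; // steps never reached
  }
  if (status === "COMPLETED") return "completed";
  if (stepIndex < activeIndex) return "completed";
  if (stepIndex === activeIndex) return "active";
  return "pending";
}
```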
In `@apps/mobile-app/src/i18n/locales/vi/processing.json`:
- Around line 4-6: The translations for the keys "queued", "validating", and
"processing" in processing.json currently use the same string for "queued" and
"validating"; change the "validating" value to a distinct Vietnamese phrase that
reflects validation has started (e.g., "Đang xác thực..." or "Đang kiểm tra...")
so users can distinguish validation from queued and processing states; locate
the "queued", "validating", and "processing" keys in the file and replace only
the "validating" string with the new text.
In `@apps/mobile-app/src/services/socket.service.ts`:
- Around line 28-52: The socket is opened with a possibly stale access token
from getTokens and never refreshed, so on auth rejection it keeps retrying with
the same credential; update the connect flow in the SocketService (the code
around getTokens, this.getSocketUrl, this.socket creation and the
"connect_error" handler) to obtain a fresh token before connecting—reuse the
same refresh logic used in apps/mobile-app/src/services/api.ts (call the refresh
routine or refreshTokens helper prior to creating io(...)), and on
"connect_error" when the server rejects auth attempt, trigger a token refresh
and retry connecting (replace or update this.socket auth payload and reconnect)
instead of only logging the error.
- Around line 25-40: connect() currently only checks this.socket?.connected so
concurrent calls can create multiple sockets; add a guard to return early if a
connection is already in-flight or exists by introducing a boolean (e.g.,
this.connecting) and check if (this.socket || this.connecting) return at the top
of connect(); set this.connecting = true before calling io(...), clear it to
false on successful connection, on "connect" and on error/close handlers, and
ensure disconnect() clears both this.socket and this.connecting so subsequent
connect() calls work correctly.
- Around line 55-60: The concise arrow inside the "media_updated" handler is
returning listener(media) and triggering the lint rule; change the forEach
callback to a block-bodied function so it does not return a value — in the
media_updated event handler (the this.socket.on("media_updated", (media:
MediaItem) => { ... }) block) replace this.listeners.forEach((listener) =>
listener(media)) with a block-bodied callback such as
this.listeners.forEach((listener) => { listener(media); }) to satisfy linting.
In `@apps/mobile-app/src/types/media.ts`:
- Around line 25-26: The MediaItem type declares estimatedTimeRemaining as
number | null but the backend list API can omit this field (yielding undefined);
update the MediaItem declaration (and any related client DTO types) so
estimatedTimeRemaining is optional and accepts undefined as well (e.g.,
estimatedTimeRemaining?: number | null) and audit usages of
estimatedTimeRemaining in the codebase to safely handle undefined/null (look for
the MediaItem type and references to estimatedTimeRemaining and align with
MediaListItemDto from the backend).
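Once the field is optional as described above, consumers should treat `undefined` (field omitted by the list API) the same as `null` (idle/done). A small sketch; `formatEta` is a hypothetical helper, not existing app code.

```typescript
interface MediaItemLike {
  estimatedTimeRemaining?: number | null;
}

// Render an mm:ss ETA, or an empty string when no ETA applies.
export function formatEta(item: MediaItemLike): string {
  const eta = item.estimatedTimeRemaining;
  if (eta === undefined || eta === null) return "";
  const minutes = Math.floor(eta / 60);
  const seconds = eta % 60;
  return `${minutes}:${String(seconds).padStart(2, "0")}`;
}
```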
---
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8ea4ce7c-1708-49ea-b482-d8b2564b43e1
⛔ Files ignored due to path filters (2)
- `apps/backend-api/pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
- `apps/mobile-app/pnpm-lock.yaml` is excluded by `!**/pnpm-lock.yaml`
📒 Files selected for processing (39)
- apps/ai-engine/last_llm_response.txt
- apps/ai-engine/src/main.py
- apps/backend-api/package.json
- apps/backend-api/prisma/migrations/20260306075830_add_eta_and_step/migration.sql
- apps/backend-api/prisma/schema.prisma
- apps/backend-api/src/app.module.ts
- apps/backend-api/src/common/constants/enums.ts
- apps/backend-api/src/common/constants/index.ts
- apps/backend-api/src/modules/auth/auth.service.ts
- apps/backend-api/src/modules/media/dto/response.dto.ts
- apps/backend-api/src/modules/media/media.module.ts
- apps/backend-api/src/modules/media/media.service.ts
- apps/backend-api/src/modules/socket/socket.gateway.ts
- apps/backend-api/src/modules/socket/socket.module.ts
- apps/backend-api/src/modules/socket/socket.service.ts
- apps/mobile-app/.gitignore
- apps/mobile-app/app.json
- apps/mobile-app/package.json
- apps/mobile-app/src/app/(app)/_layout.tsx
- apps/mobile-app/src/app/(app)/media-picker.tsx
- apps/mobile-app/src/app/(app)/processing.tsx
- apps/mobile-app/src/app/(app)/upload.tsx
- apps/mobile-app/src/components/StatusBadge.tsx
- apps/mobile-app/src/components/index.ts
- apps/mobile-app/src/components/media/PipelineStepper.tsx
- apps/mobile-app/src/components/media/YouTubeModal.tsx
- apps/mobile-app/src/constants/pipeline.ts
- apps/mobile-app/src/hooks/useMedia.ts
- apps/mobile-app/src/hooks/useSocketSync.ts
- apps/mobile-app/src/i18n/i18n.ts
- apps/mobile-app/src/i18n/locales/en/common.json
- apps/mobile-app/src/i18n/locales/en/processing.json
- apps/mobile-app/src/i18n/locales/vi/common.json
- apps/mobile-app/src/i18n/locales/vi/processing.json
- apps/mobile-app/src/services/api.ts
- apps/mobile-app/src/services/media.services.ts
- apps/mobile-app/src/services/socket.service.ts
- apps/mobile-app/src/stores/media.store.ts
- apps/mobile-app/src/types/media.ts
💤 Files with no reviewable changes (3)
- apps/mobile-app/src/i18n/locales/en/common.json
- apps/mobile-app/src/i18n/locales/vi/common.json
- apps/ai-engine/last_llm_response.txt
```python
        # Publish update event to Redis for NestJS WebSockets
        if user_id:
            logger.debug(f"Publishing media_updated event for {media_id}")
            redis_client.publish("media_updates", json.dumps({
                "mediaId": media_id,
                "userId": user_id
            }))

    except Exception as e:
        logger.error(f"DB update failed for media {media_id}: {e}")
```
Separate Redis publish failures from database write failures.
The row is already committed before redis_client.publish(...) runs. If Redis is unavailable, this path logs a DB failure even though persistence succeeded, and the realtime update just disappears. Give publish its own try/except so the failure mode is accurate and debuggable.
Suggested fix:

```diff
     with conn:
         with conn.cursor() as cur:
             cur.execute(sql, values)
     conn.close()
-
-    # Publish update event to Redis for NestJS WebSockets
-    if user_id:
-        logger.debug(f"Publishing media_updated event for {media_id}")
-        redis_client.publish("media_updates", json.dumps({
-            "mediaId": media_id,
-            "userId": user_id
-        }))
+
+    if user_id:
+        try:
+            logger.debug(f"Publishing media_updated event for {media_id}")
+            redis_client.publish(
+                "media_updates",
+                json.dumps({"mediaId": media_id, "userId": user_id}),
+            )
+        except Exception as e:
+            logger.error(f"Redis publish failed for media {media_id}: {e}")
```

🧰 Tools
🪛 Ruff (0.15.5)
[warning] 150-150: Do not catch blind exception: Exception
(BLE001)
```prisma
currentStep            String? @map("current_step")             // Active pipeline stage: AUDIO_PREP, VAD, TRANSCRIBING, etc.
estimatedTimeRemaining Int?    @map("estimated_time_remaining") // Seconds remaining (ETA), null when idle/done
```
❓ Verification inconclusive
Script executed:

```shell
#!/bin/bash
rg -n 'enum PipelineStage|currentStep|current_step' \
  apps/backend-api/prisma/schema.prisma \
  apps/backend-api/src \
  apps/mobile-app/src
```

Repository: KMA-Final-Project/project
Exit code: 0
stdout:

```text
apps/backend-api/prisma/schema.prisma:199:  currentStep String? @map("current_step") // Active pipeline stage: AUDIO_PREP, VAD, TRANSCRIBING, etc.
apps/mobile-app/src/types/media.ts:25:  currentStep: string | null; // e.g. 'TRANSCRIBING', null when idle/completed
apps/mobile-app/src/types/media.ts:77:  currentStep: string | null;
apps/mobile-app/src/components/media/PipelineStepper.tsx:5: * Maps the AI engine's `currentStep` string constants to user-friendly labels.
apps/mobile-app/src/components/media/PipelineStepper.tsx:16:  currentStep: string | null;
apps/mobile-app/src/components/media/PipelineStepper.tsx:36:export function PipelineStepper({ currentStep, status }: PipelineStepperProps) {
apps/mobile-app/src/components/media/PipelineStepper.tsx:40:  const activeIndex = currentStep
apps/mobile-app/src/components/media/PipelineStepper.tsx:41:    ? PIPELINE_STEPS.findIndex((s) => s.key === currentStep)
apps/backend-api/src/modules/media/media.service.ts:260:      currentStep: true,
apps/backend-api/src/modules/media/media.service.ts:293:      currentStep: true,
apps/backend-api/src/common/constants/enums.ts:1:export enum PipelineStage {
apps/backend-api/src/modules/media/dto/response.dto.ts:142:  currentStep: string | null;
apps/backend-api/src/modules/media/dto/response.dto.ts:199:  currentStep: string | null;
apps/mobile-app/src/app/(app)/processing.tsx:282:        currentStep={isDone ? "EXPORTING" : (media?.currentStep ?? null)}
```
Script executed:

```shell
cat -n apps/backend-api/src/common/constants/enums.ts | head -30
```

Repository: KMA-Final-Project/project
Exit code: 0
stdout:

```text
 1  export enum PipelineStage {
 2    AUDIO_PREP = 'AUDIO_PREP',
 3    INSPECTING = 'INSPECTING',
 4    VAD = 'VAD',
 5    TRANSCRIBING = 'TRANSCRIBING',
 6    MERGING = 'MERGING',
 7    TRANSLATING = 'TRANSLATING',
 8    EXPORTING = 'EXPORTING',
 9  }
```
Script executed:

```shell
rg -n 'currentStep' apps/backend-api/src -A 3 -B 1 | head -50
```

Repository: KMA-Final-Project/project
Exit code: 0
stdout:

```text
apps/backend-api/src/modules/media/dto/response.dto.ts-141-  })
apps/backend-api/src/modules/media/dto/response.dto.ts:142:  currentStep: string | null;
apps/backend-api/src/modules/media/dto/response.dto.ts-143-
apps/backend-api/src/modules/media/dto/response.dto.ts-144-  @ApiProperty({
apps/backend-api/src/modules/media/dto/response.dto.ts-145-    description:
--
apps/backend-api/src/modules/media/dto/response.dto.ts-198-  })
apps/backend-api/src/modules/media/dto/response.dto.ts:199:  currentStep: string | null;
apps/backend-api/src/modules/media/dto/response.dto.ts-200-
apps/backend-api/src/modules/media/dto/response.dto.ts-201-  @ApiProperty()
apps/backend-api/src/modules/media/dto/response.dto.ts-202-  createdAt: Date;
--
apps/backend-api/src/modules/media/media.service.ts-259-      originType: true,
apps/backend-api/src/modules/media/media.service.ts:260:      currentStep: true,
apps/backend-api/src/modules/media/media.service.ts-261-      estimatedTimeRemaining: true,
apps/backend-api/src/modules/media/media.service.ts-262-      transcriptS3Key: true,
apps/backend-api/src/modules/media/media.service.ts-263-      subtitleS3Key: true,
--
apps/backend-api/src/modules/media/media.service.ts-292-      durationSeconds: true,
apps/backend-api/src/modules/media/media.service.ts:293:      currentStep: true,
apps/backend-api/src/modules/media/media.service.ts-294-      createdAt: true,
apps/backend-api/src/modules/media/media.service.ts-295-    },
apps/backend-api/src/modules/media/media.service.ts-296-    orderBy: { createdAt: 'desc' },
```
Script executed:

cat -n apps/backend-api/prisma/schema.prisma | grep -A 30 "enum\|MediaItem"

Repository: KMA-Final-Project/project
Exit code: 0
stdout:
18 enum Role {
19 USER
20 ADMIN
21 }
22
23 enum SubscriptionStatus {
24 ACTIVE
25 EXPIRED
26 CANCELLED
27 }
28
29 enum MediaOriginType {
30 LOCAL
31 YOUTUBE
32 }
33
34 enum MediaStatus {
35 QUEUED
36 VALIDATING
37 PROCESSING
38 COMPLETED
39 FAILED
40 }
41
42 enum ProcessingMode {
43 TRANSCRIBE
44 TRANSCRIBE_TRANSLATE
45 }
46
47 enum OtpType {
48 REGISTER
49 FORGOT_PASSWORD
50 }
51
52 // --- MODELS ---
53
54 model User {
55 id String `@id` `@default`(uuid())
56 email String `@unique`
57 fullName String `@map`("full_name")
58 passwordHash String `@map`("password_hash")
59 emailVerified Boolean `@default`(false) `@map`("email_verified")
60
61 // Subscription Info
62 currentSubscriptionId String? `@map`("current_subscription_id")
63 quotaUsageCurrentMonth Int `@default`(0) `@map`("quota_usage_current_month")
64 quotaUsageCurrentMonthSeconds Int `@default`(0) `@map`("quota_usage_current_month_seconds") // Track duration
65 lastQuotaResetDate DateTime `@default`(now()) `@map`("last_quota_reset_date")
66 role Role `@default`(USER)
67
68 // Relationships
69 currentSubscription Subscription? `@relation`("UserCurrentSub", fields: [currentSubscriptionId], references: [id])
70 subscriptions Subscription[] `@relation`("UserHistorySub")
71 usageHistories UsageHistory[]
72 mediaItems MediaItem[]
73 vocabularies UserVocabulary[]
74 otps Otp[]
75 refreshTokens RefreshToken[]
76
77 createdAt DateTime `@default`(now()) `@map`("created_at")
78 updatedAt DateTime `@updatedAt` `@map`("updated_at")
79
80 @@map("users")
81 }
82
83 model SubscriptionPlan {
84 id String `@id` // e.g. "FREE", "PRO_MONTHLY"
85 code String `@unique`
86 name String
87 description String? `@db.Text`
88
89 // Marketing & Display
90 features Json? `@map`("features") // ["No Ads", "High Quality", "Fast Processing", etc]
91 tierLevel Int `@default`(1)
92 isActive Boolean `@default`(true) `@map`("is_active")
93
94 variants PlanVariant[]
95
96 createdAt DateTime `@default`(now()) `@map`("created_at")
97 updatedAt DateTime `@updatedAt` `@map`("updated_at")
98
99 @@map("subscription_plans")
100 }
101
102 model PlanVariant {
--
127 enum BillingCycleType {
128 MONTHLY
129 SIX_MONTHS
130 YEARLY
131 LIFETIME
132 }
133
134 model Subscription {
135 id String `@id` `@default`(uuid())
136 userId String `@map`("user_id")
137
138 variantId String `@map`("variant_id")
139
140 startDate DateTime `@map`("start_date")
141 endDate DateTime `@map`("end_date")
142
143 status SubscriptionStatus `@default`(ACTIVE)
144
145 // Snapshot
146 priceSnapshot Decimal `@map`("price_snapshot") `@db.Decimal`(10, 2)
147 maxDurationPerFileSnapshot Int `@map`("max_duration_per_file_snapshot")
148 monthlyQuotaSecondsSnapshot Int `@map`("monthly_quota_seconds_snapshot")
149
150 // Relationships
151 user User `@relation`("UserHistorySub", fields: [userId], references: [id])
152 users User[] `@relation`("UserCurrentSub")
153 variant PlanVariant `@relation`(fields: [variantId], references: [id])
154
155 usageHistories UsageHistory[]
156
157 createdAt DateTime `@default`(now()) `@map`("created_at")
--
179 model MediaItem {
180 id String `@id` `@default`(uuid())
181 userId String `@map`("user_id")
182
183 title String
184 originType MediaOriginType `@map`("origin_type")
185 originUrl String? `@map`("origin_url") // Youtube URL or null if LOCAL
186
187 // Processing Configuration
188 processingMode ProcessingMode `@default`(TRANSCRIBE) `@map`("processing_mode")
189 sourceLanguage String? `@map`("source_language") // Auto-detected: "en", "zh", "vi"
190
191 // Storage Keys (MinIO/S3 Paths)
192 audioS3Key String `@map`("audio_s3_key")
193 transcriptS3Key String? `@map`("transcript_s3_key") // Transcription-only JSON path
194 subtitleS3Key String? `@map`("subtitle_s3_key") // Translated subtitle JSON path
195
196 durationSeconds Int `@default`(0) `@map`("duration_seconds")
197 status MediaStatus `@default`(QUEUED)
198 progress Float `@default`(0) `@map`("progress") // 0.0 to 1.0
199 currentStep String? `@map`("current_step") // Active pipeline stage: AUDIO_PREP, VAD, TRANSCRIBING, etc.
200 estimatedTimeRemaining Int? `@map`("estimated_time_remaining") // Seconds remaining (ETA), null when idle/done
201 failReason String? `@map`("fail_reason") // Human-readable error
202
203 // Logic flag
204 countedInQuota Boolean `@default`(false) `@map`("counted_in_quota")
205
206 // Relationships
207 user User `@relation`(fields: [userId], references: [id])
208 userVocabularies UserVocabulary[]
209
--
239 mediaItem MediaItem? `@relation`(fields: [mediaItemId], references: [id])
240
241 @@unique([userId, vocabularyId]) // Prevent duplicate saves per user
242 @@map("user_vocabularies")
243 }
244
245 model Otp {
246 id String `@id` `@default`(uuid())
247 email String // Always required for email delivery
248 userId String? `@map`("user_id") // Optional - null for REGISTER, set for FORGOT_PASSWORD
249 code String // 6-digit OTP code
250 type OtpType
251
252 expiresAt DateTime `@map`("expires_at")
253 verified Boolean `@default`(false)
254 createdAt DateTime `@default`(now()) `@map`("created_at")
255
256 user User? `@relation`(fields: [userId], references: [id])
257
258 @@index([email, type])
259 @@index([userId, type])
260 @@map("otps")
261 }
262
263 model RefreshToken {
264 id String `@id` `@default`(uuid())
265 token String `@unique`
266 userId String `@map`("user_id")
267 deviceInfo String? `@map`("device_info") // User Agent
268 ip String? // Client IP address
269 expiresAt DateTime `@map`("expires_at")
Script executed:

rg -n 'PipelineStage\|validate.*currentStep\|currentStep.*valid' apps/backend-api/src --type ts -B 2 -A 2

Repository: KMA-Final-Project/project
Exit code: 1
Script executed:

rg -n 'currentStep|PipelineStage' apps/backend-api/src/modules/media/media.service.ts -B 3 -A 5 | head -80

Repository: KMA-Final-Project/project
Exit code: 0
stdout:
257- durationSeconds: true,
258- failReason: true,
259- originType: true,
260: currentStep: true,
261- estimatedTimeRemaining: true,
262- transcriptS3Key: true,
263- subtitleS3Key: true,
264- createdAt: true,
265- },
--
290- originType: true,
291- originUrl: true,
292- durationSeconds: true,
293: currentStep: true,
294- createdAt: true,
295- },
296- orderBy: { createdAt: 'desc' },
297- });
298- }
Script executed:

find apps/backend-api -name "*migration*" -o -name "*prisma*" -type f | grep -E "\.(sql|ts)$" | head -20

Repository: KMA-Final-Project/project
Exit code: 0
stdout:
apps/backend-api/prisma.config.ts
apps/backend-api/prisma/migrations/20260201140422_init/migration.sql
apps/backend-api/prisma/migrations/20260204140449_add_otp_model/migration.sql
apps/backend-api/prisma/migrations/20260205162132_add_refresh_token/migration.sql
apps/backend-api/prisma/migrations/20260208142341_update_subcriptions/migration.sql
apps/backend-api/prisma/migrations/20260211192356_add_processing_fields/migration.sql
apps/backend-api/prisma/migrations/20260306075830_add_eta_and_step/migration.sql
apps/backend-api/src/prisma/prisma.module.ts
apps/backend-api/src/prisma/prisma.service.ts
Script executed:

rg -n 'currentStep|current_step' apps/backend-api/prisma -t sql -t ts -B 2 -A 2 2>/dev/null | head -50

Repository: KMA-Final-Project/project
Exit code: 0
stdout:
apps/backend-api/prisma/migrations/20260306075830_add_eta_and_step/migration.sql-1--- AlterTable
apps/backend-api/prisma/migrations/20260306075830_add_eta_and_step/migration.sql:2:ALTER TABLE "media_items" ADD COLUMN "current_step" TEXT,
apps/backend-api/prisma/migrations/20260306075830_add_eta_and_step/migration.sql-3-ADD COLUMN "estimated_time_remaining" INTEGER;
Persist currentStep as a Prisma enum, not free-form text.
The codebase defines PipelineStage as a closed set of values in TypeScript (AUDIO_PREP, INSPECTING, VAD, TRANSCRIBING, MERGING, TRANSLATING, EXPORTING), but the Prisma schema allows any string here. Without type safety at the database layer, invalid values can be stored permanently and sent to clients that only recognize the supported stages. Currently, the migration creates current_step as unconstrained TEXT with no validation preventing bad data.
Suggested fix
enum MediaStatus {
QUEUED
VALIDATING
PROCESSING
COMPLETED
FAILED
}
+
+enum PipelineStage {
+ AUDIO_PREP
+ INSPECTING
+ VAD
+ TRANSCRIBING
+ MERGING
+ TRANSLATING
+ EXPORTING
+}
@@
- currentStep String? `@map`("current_step") // Active pipeline stage: AUDIO_PREP, VAD, TRANSCRIBING, etc.
+ currentStep PipelineStage? `@map`("current_step") // Active pipeline stage

Update the corresponding migration to use the enum type, and consider updating the DTOs to type currentStep as PipelineStage | null instead of string | null to enforce the contract end-to-end.
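On the TypeScript side, a small narrowing helper can keep unexpected strings out of the DTO layer until the database enum lands. This is a sketch: toPipelineStage is a hypothetical helper name, and the stage list simply mirrors the closed set proposed above.

```typescript
// Sketch: validate a raw string (e.g. from the DB or a Redis payload) against
// the closed PipelineStage set before it reaches DTOs or clients.
const PIPELINE_STAGES = [
  "AUDIO_PREP",
  "INSPECTING",
  "VAD",
  "TRANSCRIBING",
  "MERGING",
  "TRANSLATING",
  "EXPORTING",
] as const;

type PipelineStage = (typeof PIPELINE_STAGES)[number];

function toPipelineStage(value: string | null): PipelineStage | null {
  return value !== null && (PIPELINE_STAGES as readonly string[]).includes(value)
    ? (value as PipelineStage)
    : null; // unknown/legacy values degrade to "no stage" instead of leaking through
}
```

Unknown values degrading to null means clients never see a stage label they cannot render.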
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/backend-api/prisma/schema.prisma` around lines 199 - 200, The
currentStep column is defined as an unconstrained String; change it to a Prisma
enum and update DB migration and TypeScript DTOs to use the closed PipelineStage
set: add a Prisma enum PipelineStage with values AUDIO_PREP, INSPECTING, VAD,
TRANSCRIBING, MERGING, TRANSLATING, EXPORTING, change the model field
currentStep from String? to PipelineStage? (preserve `@map`("current_step")),
create and apply a migration that converts the existing TEXT column to the enum
type (migrate safely handling existing values), regenerate the Prisma client,
and update any DTO/type declarations so currentStep is PipelineStage | null
instead of string | null.
/my-scripts
/stitch
(No newline at end of file)
Clarify the purpose of the /stitch directory.
The addition of /stitch to the gitignore file lacks context. It's unclear what this directory is for or how it relates to the PR's objectives (real-time media processing progress tracking). Please clarify:
- What generates or uses the /stitch directory?
- Why should it be excluded from version control?
- Is this related to a specific tool, build process, or development workflow?
Consider adding a comment above the entry to document its purpose for future maintainers.
📝 Suggested improvement with documentation
/my-scripts
+# stitch - [Add explanation here: e.g., "generated by XYZ tool" or "local development artifacts"]
/stitch

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/mobile-app/.gitignore` at line 50, Add a short comment above the /stitch
gitignore entry explaining what the /stitch directory is, which tool or process
creates it (e.g., "stitcher" build step, local dev script, CI artifact, or
third-party tool), and why it should be ignored (generated artifacts, large
binaries, or sensitive run-time files), and if applicable mention the command
that produces it (e.g., npm run stitch or CI job name) so future maintainers
know its origin and whether it should be committed or recreated as part of the
build/test workflow.
// Determine an appropriate mimeType fallback since getAssetInfoAsync doesn't always return it
let mimeType = "application/octet-stream";
if (asset.mediaType === "video") mimeType = "video/mp4";
if (asset.mediaType === "audio") mimeType = "audio/mpeg";

// Extension fallback
const ext = asset.filename.split(".").pop()?.toLowerCase();
if (ext === "mp3") mimeType = "audio/mpeg";
if (ext === "m4a") mimeType = "audio/mp4";
if (ext === "wav") mimeType = "audio/wav";
Broaden MIME detection before uploading device files.
Lines 77-85 collapse almost every video to video/mp4 and almost every audio file to audio/mpeg. Common library assets like .mov, .aac, .flac, or .ogg will be mislabeled here, which can break presign validation or store the wrong content type metadata.
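An extension-first lookup of the following shape avoids those coarse defaults. This is a sketch: the table is illustrative rather than exhaustive, and guessMimeType is a hypothetical helper name, not code from the PR.

```typescript
// Sketch: extension-first MIME lookup, with the old mediaType defaults kept
// only as a last resort. The mapping below is illustrative, not exhaustive.
const EXT_TO_MIME: Record<string, string> = {
  mp4: "video/mp4",
  m4v: "video/x-m4v",
  mov: "video/quicktime",
  webm: "video/webm",
  mp3: "audio/mpeg",
  m4a: "audio/mp4",
  aac: "audio/aac",
  flac: "audio/flac",
  wav: "audio/wav",
  ogg: "audio/ogg",
  opus: "audio/opus",
};

function guessMimeType(filename: string, mediaType?: "video" | "audio"): string {
  const ext = filename.split(".").pop()?.toLowerCase() ?? "";
  if (EXT_TO_MIME[ext]) return EXT_TO_MIME[ext]; // a specific extension wins
  if (mediaType === "video") return "video/mp4"; // coarse fallback, as before
  if (mediaType === "audio") return "audio/mpeg";
  return "application/octet-stream";
}
```

With this ordering, a .mov asset is labeled video/quicktime instead of being collapsed to video/mp4, while truly unknown extensions still fall back to the old behavior.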
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/mobile-app/src/app/`(app)/media-picker.tsx around lines 75 - 85, The
current fallback logic in media-picker.tsx sets mimeType too coarsely (variables
mimeType, asset, ext), causing .mov/.webm/.aac/.flac/.ogg/etc. to be mislabeled;
replace the simple mediaType overrides with a small, explicit extension-to-MIME
mapping (include mp4, mov->video/quicktime, m4v, webm->video/webm,
mp3->audio/mpeg, m4a->audio/mp4, aac->audio/aac, flac->audio/flac,
wav->audio/wav, ogg->audio/ogg, opus->audio/opus, etc.), prefer using a MIME
lookup on asset.filename/asset.uri when available (or expo
FileSystem/getAssetInfoAsync result) and only fall back to
application/octet-stream; update the code around the mimeType/ext logic to
consult this mapping so presign/content-type metadata are correct.
| "queued": "Đang chờ bắt đầu...", | ||
| "validating": "Đang chờ bắt đầu...", | ||
| "processing": "Đang tạo phụ đề...", |
Give validating its own copy.
queued and validating currently render the same text, so users never see that validation has actually started.
Suggested fix
"status": {
"queued": "Đang chờ bắt đầu...",
- "validating": "Đang chờ bắt đầu...",
+ "validating": "Đang kiểm tra tệp...",
"processing": "Đang tạo phụ đề...",📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "queued": "Đang chờ bắt đầu...", | |
| "validating": "Đang chờ bắt đầu...", | |
| "processing": "Đang tạo phụ đề...", | |
| "queued": "Đang chờ bắt đầu...", | |
| "validating": "Đang kiểm tra tệp...", | |
| "processing": "Đang tạo phụ đề...", |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/mobile-app/src/i18n/locales/vi/processing.json` around lines 4 - 6, The
translations for the keys "queued", "validating", and "processing" in
processing.json currently use the same string for "queued" and "validating";
change the "validating" value to a distinct Vietnamese phrase that reflects
validation has started (e.g., "Đang xác thực..." or "Đang kiểm tra...") so users
can distinguish validation from queued and processing states; locate the
"queued", "validating", and "processing" keys in the file and replace only the
"validating" string with the new text.
async connect() {
  if (this.socket?.connected) return;

  const tokens = await getTokens();
  if (!tokens?.accessToken) {
    console.warn("[Socket] Cannot connect without access token");
    return;
  }

  const socketUrl = this.getSocketUrl();

  this.socket = io(socketUrl, {
    auth: { token: `Bearer ${tokens.accessToken}` },
    reconnection: true,
    transports: ["websocket"],
  });
Guard connect() against duplicate in-flight connections.
Line 26 only checks this.socket?.connected, so a second connect() call before the first handshake finishes will create another client. The older socket then keeps running outside this singleton and every media_updated event can be delivered twice.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/mobile-app/src/services/socket.service.ts` around lines 25 - 40,
connect() currently only checks this.socket?.connected so concurrent calls can
create multiple sockets; add a guard to return early if a connection is already
in-flight or exists by introducing a boolean (e.g., this.connecting) and check
if (this.socket || this.connecting) return at the top of connect(); set
this.connecting = true before calling io(...), clear it to false on successful
connection, on "connect" and on error/close handlers, and ensure disconnect()
clears both this.socket and this.connecting so subsequent connect() calls work
correctly.
const tokens = await getTokens();
if (!tokens?.accessToken) {
  console.warn("[Socket] Cannot connect without access token");
  return;
}

const socketUrl = this.getSocketUrl();

this.socket = io(socketUrl, {
  auth: { token: `Bearer ${tokens.accessToken}` },
  reconnection: true,
  transports: ["websocket"],
});

this.socket.on("connect", () => {
  console.log("[Socket] Connected to server", this.socket?.id);
});

this.socket.on("disconnect", (reason) => {
  console.log("[Socket] Disconnected:", reason);
});

this.socket.on("connect_error", (error) => {
  console.error("[Socket] Connection error:", error.message);
});
Don't open the socket with a stale access token.
This path reads the stored token directly, but unlike apps/mobile-app/src/services/api.ts:126-142 it never refreshes it. Once apps/backend-api/src/modules/socket/socket.gateway.ts:33-66 rejects an expired token, this service just logs connect_error and keeps retrying with the same bad credential, so live processing updates silently stop.
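The refresh decision itself can live in a tiny pure helper so the connect path and the connect_error handler share it. This is a sketch; shouldRefresh and TokenState are hypothetical names, and the actual refresh routine would be the one api.ts already uses.

```typescript
// Sketch: decide whether to refresh before opening the socket.
// `expiresAt` is an epoch-ms expiry; the skew absorbs clock drift plus the
// time the WebSocket handshake itself takes.
type TokenState = { accessToken: string; expiresAt: number };

function shouldRefresh(
  tokens: TokenState | null,
  nowMs: number,
  skewMs = 30_000,
): boolean {
  return tokens === null || tokens.expiresAt - skewMs <= nowMs;
}
```

connect() would invoke the existing refresh routine whenever shouldRefresh(...) is true before calling io(...), and a connect_error that indicates an auth rejection would trigger the same refresh before the next retry instead of looping on the stale credential.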
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/mobile-app/src/services/socket.service.ts` around lines 28 - 52, The
socket is opened with a possibly stale access token from getTokens and never
refreshed, so on auth rejection it keeps retrying with the same credential;
update the connect flow in the SocketService (the code around getTokens,
this.getSocketUrl, this.socket creation and the "connect_error" handler) to
obtain a fresh token before connecting—reuse the same refresh logic used in
apps/mobile-app/src/services/api.ts (call the refresh routine or refreshTokens
helper prior to creating io(...)), and on "connect_error" when the server
rejects auth attempt, trigger a token refresh and retry connecting (replace or
update this.socket auth payload and reconnect) instead of only logging the
error.
this.socket.on("media_updated", (media: MediaItem) => {
  console.log(
    `[Socket] Media ${media.id} updated to status ${media.status}`,
  );
  this.listeners.forEach((listener) => listener(media));
});
Use a block-bodied callback here.
Biome is right on Line 59: the concise arrow returns listener(media), which trips lint/suspicious/useIterableCallbackReturn and can keep this file failing lint.
♻️ Proposed fix
- this.listeners.forEach((listener) => listener(media));
+ this.listeners.forEach((listener) => {
+ listener(media);
+ });

📝 Committable suggestion
‼️ IMPORTANT: Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

this.socket.on("media_updated", (media: MediaItem) => {
  console.log(
    `[Socket] Media ${media.id} updated to status ${media.status}`,
  );
  this.listeners.forEach((listener) => {
    listener(media);
  });
});
🧰 Tools
🪛 Biome (2.4.6)
[error] 59-59: This callback passed to forEach() iterable method should not return a value.
(lint/suspicious/useIterableCallbackReturn)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/mobile-app/src/services/socket.service.ts` around lines 55 - 60, The
concise arrow inside the "media_updated" handler is returning listener(media)
and triggering the lint rule; change the forEach callback to a block-bodied
function so it does not return a value — in the media_updated event handler (the
this.socket.on("media_updated", (media: MediaItem) => { ... }) block) replace
this.listeners.forEach((listener) => listener(media)) with a block-bodied
callback such as this.listeners.forEach((listener) => { listener(media); }) to
satisfy linting.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Phase A (B1) — Backend presigned GET URL infrastructure:
- MinioService: add `bucketProcessed` field + `generatePresignedGetUrl()`
- MediaService: add `getProcessedFileUrl()` method
- MediaController: add `GET :id/download-url` endpoint
- response.dto.ts: add `DownloadUrlResponseDto`
Phase B (A1+A2) — AI Engine presigned URLs + rich Redis payloads:
- config.py: add `MINIO_PUBLIC_ENDPOINT` setting
- minio_client.py: add `get_presigned_url()`, update `upload_chunk()`,
`upload_translated_batch()`, and `upload_final_result()` to return
`(s3_key, presigned_url)` tuples
- main.py: remove legacy `{mediaId, userId}` Redis publish from
`update_media_status()`; add typed publishers (publish_progress,
publish_chunk_ready, publish_batch_ready, publish_completed,
publish_failed); wire them throughout both pipeline modes
Phase C (A3) — SocketService refactored to thin event router:
- socket.types.ts: new file with MediaEvent discriminated union
- socket.service.ts: switch on event.type, emit separated event names
(media_progress, media_chunk_ready, media_batch_ready, media_completed,
media_failed); remove DB query on every tick; remove MediaService dep
- socket.module.ts: remove MediaModule import
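A discriminated union makes this kind of thin router straightforward to keep exhaustive. The sketch below is illustrative only: the field names beyond type and mediaId are assumptions, not the real MediaEvent schema from socket.types.ts.

```typescript
// Sketch of a discriminated union for the separated socket events.
// Payload fields are assumed for illustration, not taken from the PR.
type MediaEvent =
  | { type: "progress"; mediaId: string; progress: number }
  | { type: "chunk_ready"; mediaId: string; url: string }
  | { type: "batch_ready"; mediaId: string; url: string }
  | { type: "completed"; mediaId: string }
  | { type: "failed"; mediaId: string; reason: string };

// Thin router: map each event variant to its socket event name.
// No DB query per tick — the payload already carries everything to emit.
function eventName(event: MediaEvent): string {
  switch (event.type) {
    case "progress":
      return "media_progress";
    case "chunk_ready":
      return "media_chunk_ready";
    case "batch_ready":
      return "media_batch_ready";
    case "completed":
      return "media_completed";
    case "failed":
      return "media_failed";
  }
}
```

Because the switch covers every union member, adding a new event type causes a compile error at the router until it is handled, which keeps gateway and engine event names in lockstep.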
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…'source_lang' in system prompt building
Co-authored-by: sondoannam <80660666+sondoannam@users.noreply.github.com>
Co-authored-by: sondoannam <80660666+sondoannam@users.noreply.github.com>
…ructions-file Add repository-level Copilot onboarding instructions for the monorepo
…processing medias
- apps/ai-engine/tests/test_streaming_contracts.py - apps/ai-engine/src/minio_client.py - apps/ai-engine/src/schemas.py - apps/ai-engine/ARCHITECTURE_CONTEXT.md - .gsd/milestones/M001/slices/S02/tasks/T01-SUMMARY.md - .gsd/milestones/M001/slices/S02/S02-PLAN.md - .gsd/STATE.md
- apps/ai-engine/src/schemas.py - apps/ai-engine/src/async_pipeline.py - apps/mobile-app/src/types/subtitle.ts - apps/ai-engine/tests/test_streaming_contracts.py
- apps/mobile-app/src/types/subtitle.ts - apps/mobile-app/src/hooks/useProcessingSubtitles.ts - apps/ai-engine/ARCHITECTURE_CONTEXT.md
- apps/ai-engine/src/scripts/test_v2_pipeline.py - apps/ai-engine/tests/test_first_batch_streaming.py - apps/ai-engine/ARCHITECTURE_CONTEXT.md - .gsd/KNOWLEDGE.md - .gsd/DECISIONS.md
- apps/ai-engine/tests/test_event_discipline.py - .gsd/milestones/M001/slices/S03/tasks/T01-PLAN.md
- apps/ai-engine/src/scripts/test_v2_pipeline.py - .gsd/KNOWLEDGE.md
feat(pipeline): ✨ rebuild AI Engine pipeline
…line feat(pipeline): ✨ implement two-tier subtitle streaming
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
apps/mobile-app/src/hooks/useMedia.ts (1)
157-170: ⚠️ Potential issue | 🔴 Critical

Add response validation to both fetch calls before proceeding to confirmUpload().

The code lacks HTTP status checks on both fetch operations. fetch() resolves on all HTTP status codes; only network errors throw. Without validating response.ok, a failed local file read (404/403) or failed MinIO PUT will still proceed to confirmUpload(), creating a media record for a file that was never stored.

🛠️ Suggested guard

  // Step 2 — Upload file blob directly to MinIO (bypasses our API server)
  const fileResponse = await fetch(file.uri);
+ if (!fileResponse.ok) {
+   throw new Error(`Failed to read local file: ${fileResponse.status}`);
+ }
  const fileBlob = await fileResponse.blob();

- await fetch(uploadUrl, {
+ const uploadResponse = await fetch(uploadUrl, {
    method: "PUT",
    body: fileBlob,
    headers: { "Content-Type": file.mimeType },
  });
+ if (!uploadResponse.ok) {
+   throw new Error(`Upload failed: ${uploadResponse.status}`);
+ }

  // Step 3 — Confirm upload → creates MediaItem + dispatches queue job
  const title = file.name.replace(/\.[^.]+$/, ""); // strip file extension for title
  return mediaService.confirmUpload({

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/hooks/useMedia.ts` around lines 157 - 170, Add HTTP response validation for the two fetch calls before calling mediaService.confirmUpload: after the initial fetch(file.uri) check fileResponse.ok (and handle/throw/log a descriptive error referencing file.uri or file.name) and similarly check the PUT response from fetch(uploadUrl, { method: "PUT", ... }) for ok before proceeding; if either response is not ok, abort and do not call confirmUpload (the code around fileResponse, fileBlob, uploadUrl, and mediaService.confirmUpload should perform these guards and surface the error).
♻️ Duplicate comments (2)
apps/mobile-app/src/app/(app)/processing.tsx (1)
73-75: ⚠️ Potential issue | 🟠 Major

Handle missing media before rendering the queued state.

If id is absent or the status query finishes without media, this component falls through with status === undefined and shows the normal queued UI with the wrong footer/actions. Return an explicit empty/error state for !id || (!isLoading && !media) instead of using the processing layout.

Also applies to: 115-121
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/app/`(app)/processing.tsx around lines 73 - 75, When rendering the processing UI, guard against missing id or missing media by returning an explicit empty/error state when id is falsy or when the status query finished without media (i.e., !id || (!isLoading && !media)); update the logic around useMediaStatus and useMediaArtifacts so you don't fall through to the queued UI when status is undefined, and apply the same guard to the other processing render branch that currently shows the queued footer/actions (the second block that uses status to decide queued vs processing). Ensure you check id and media before using status and render a clear empty/error component instead of the processing layout.apps/mobile-app/src/hooks/useMedia.ts (1)
60-65: ⚠️ Potential issue | 🟠 Major

Keep a recovery path for missed socket updates.

staleTime: Infinity plus all refetch triggers disabled makes useMediaStatus() and useMediaArtifacts() effectively socket-only. If the app backgrounds, reconnects, or misses an invalidation, the processing screen can freeze on stale data. Keep a low-frequency refetch for non-terminal items, or at least re-enable reconnect/focus refetch.

Also applies to: 85-100
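One shape for that recovery path is to derive the query options from the item's status, so terminal media stay cached forever while in-flight media keep a slow polling safety net. The sketch below is illustrative; socketFirstOptions and the specific intervals are assumptions, not the hook's real API.

```typescript
// Sketch: socket-first caching with a low-frequency fallback for live items.
type MediaStatus = "QUEUED" | "VALIDATING" | "PROCESSING" | "COMPLETED" | "FAILED";

function socketFirstOptions(status: MediaStatus | undefined) {
  const terminal = status === "COMPLETED" || status === "FAILED";
  return {
    // Terminal items never go stale; live items recover within ~30s of a missed event.
    staleTime: terminal ? Infinity : 30_000,
    refetchInterval: terminal ? (false as const) : 60_000,
    refetchOnReconnect: !terminal,
    refetchOnWindowFocus: !terminal,
  };
}
```

Sockets still drive the fast path; the interval only matters when an invalidation was missed, so the extra request load is one fetch per live item per minute at worst.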
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/hooks/useMedia.ts` around lines 60 - 65, socketFirstQueryOptions currently disables all refetching (staleTime: Infinity and all refetch triggers false), making useMediaStatus() and useMediaArtifacts() rely solely on sockets and risk freezing on missed updates; modify socketFirstQueryOptions to allow a low-frequency recovery refetch (e.g., set staleTime to a finite value like 30_000 ms) or re-enable refetchOnReconnect/refetchOnWindowFocus (or both), and if possible make the shorter staleTime or re-enable triggers conditional for non-terminal items only so terminal items remain stable; apply the same change pattern where socketFirstQueryOptions is reused (also referenced in the other block around useMediaStatus/useMediaArtifacts).
🧹 Nitpick comments (26)
apps/mobile-app/src/hooks/useLanguagePreference.ts (1)
57-69: Unnecessary i18n in dependency array.

The effect body only calls hydrateLanguagePreference(), which uses the module-level i18nInstance, not the i18n from useTranslation(). Having i18n in the dependency array could cause the effect to re-run unnecessarily if the i18n instance reference changes.

♻️ Proposed fix

  useEffect(() => {
    let isMounted = true;
    hydrateLanguagePreference().finally(() => {
      if (isMounted) {
        setIsLoading(false);
      }
    });
    return () => {
      isMounted = false;
    };
- }, [i18n]);
+ }, []);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/hooks/useLanguagePreference.ts` around lines 57 - 69, The effect in useEffect currently lists i18n in its dependency array but the effect only calls hydrateLanguagePreference() (which relies on the module-level i18nInstance) and therefore should not re-run when the useTranslation() i18n ref changes; remove i18n from the dependency array and use an empty array [] so hydrateLanguagePreference() runs once on mount (keep the isMounted cleanup logic as-is), referencing the useEffect, hydrateLanguagePreference, and i18n symbols to locate the change.apps/backend-api/src/modules/socket/socket.types.ts (1)
74-76: isRecord returns true for arrays.

typeof [] === 'object' and [] !== null, so arrays pass this check. While not a bug (subsequent field checks will fail for arrays), consider tightening if you want explicit rejection.

♻️ Optional: exclude arrays explicitly

  function isRecord(value: unknown): value is UnknownRecord {
-   return typeof value === 'object' && value !== null;
+   return typeof value === 'object' && value !== null && !Array.isArray(value);
  }

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend-api/src/modules/socket/socket.types.ts` around lines 74 - 76, isRecord currently returns true for arrays because typeof [] === 'object'; update the isRecord type guard to explicitly reject arrays by adding an Array.isArray check (e.g., return typeof value === 'object' && value !== null && !Array.isArray(value)) so that isRecord(value: unknown): value is UnknownRecord only passes for plain objects; adjust any callers relying on arrays being accepted if needed.apps/mobile-app/src/hooks/useThrottle.ts (1)
57-57: Consider warning about bypassFn stability in JSDoc.

Including bypassFn in the dependency array is correct, but if callers pass an inline function, it will cause the effect to re-run on every render (potentially resetting the throttle state). The JSDoc should advise callers to memoize bypassFn with useCallback, or you can wrap it internally with useRef.

📝 Update JSDoc to warn about stability

  * @param interval Minimum ms between updates (e.g. 1500)
  * @param bypassFn Optional predicate — when true the value is emitted
- *                 immediately regardless of throttle (e.g. terminal progress)
+ *                 immediately regardless of throttle (e.g. terminal progress).
+ *                 Should be memoized with useCallback to avoid resetting throttle state.
  */
* `@param` interval Minimum ms between updates (e.g. 1500) * `@param` bypassFn Optional predicate — when true the value is emitted - * immediately regardless of throttle (e.g. terminal progress) + * immediately regardless of throttle (e.g. terminal progress). + * Should be memoized with useCallback to avoid resetting throttle state. */🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/hooks/useThrottle.ts` at line 57, The JSDoc for useThrottle should warn that the bypassFn dependency must be stable to avoid re-running the effect on every render; update the comment for the useThrottle hook (and mention bypassFn) to advise callers to memoize bypassFn with useCallback or that the hook could internally stabilize it (e.g., store it in a ref) to prevent unintended throttle resets. Include a short example suggestion: "memoize bypassFn with useCallback" and note the current dependency array ([value, interval, bypassFn]) relies on bypassFn stability.apps/backend-api/src/modules/media/workers/media.processor.ts (1)
117-123: Consider documenting or validating the `targetLanguage` default behavior.

The `targetLanguage` field is passed directly from the job payload without validation or a default value here. According to `queue.types.ts`, it "defaults to 'vi' if omitted", but this default appears to be applied downstream in the AI engine rather than here.

This is acceptable if the AI engine handles the default, but consider adding a brief comment clarifying where the default is applied to avoid confusion during future maintenance.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend-api/src/modules/media/workers/media.processor.ts` around lines 117 - 123, The aiPayload creation is passing job.data.targetLanguage directly without validation or local default; either validate/normalize it here or add a clear inline comment stating that the "targetLanguage" default ('vi') is applied downstream in the AI engine/queue types; update the AiProcessingJobPayload population around aiPayload (and reference job.data.targetLanguage) to either set a local default/validation or add the comment indicating where the default originates (queue.types.ts / AI engine) so future maintainers know why it's omitted here.apps/ai-engine/src/config.py (1)
119-122: Consider validating `NMT_MODEL_DIR` exists at startup.

`NMT_MODEL_DIR` is typed as `Path` with a default of `"temp/models/nllb-200-3.3B-ct2"`, but unlike `TEMP_DIR` and `OUTPUT_DIR`, it's not created by `create_dirs()`. If this directory doesn't exist (e.g., model not downloaded), the NMT translator will fail at runtime.

Consider either:
- Adding validation at startup to check if the model directory exists
- Documenting the model download step prominently in the instructions
- Adding a clear error message in the translator initialization if the model is missing
💡 Option: Add validation in `create_dirs` or a separate check

```python
def create_dirs(self):
    """Initialize directories"""
    os.makedirs(self.TEMP_DIR, exist_ok=True)
    os.makedirs(self.OUTPUT_DIR, exist_ok=True)

def validate_nmt_model(self):
    """Check NMT model directory exists"""
    if not self.NMT_MODEL_DIR.exists():
        from loguru import logger

        logger.warning(
            f"NMT model directory not found: {self.NMT_MODEL_DIR}. "
            "Translation will fail until the model is downloaded."
        )
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/config.py` around lines 119 - 122, Add startup validation for the NMT model directory so runtime failures are avoided: in config.py update create_dirs() or add a new validate_nmt_model() method to check Path NMT_MODEL_DIR.exists() and, if missing, log a clear error/warning (using the project logger) and either raise a ConfigError or return a failure state so translator initialization (where the NMT translator is constructed) can surface a user-friendly message; reference NMT_MODEL_DIR, create_dirs(), and the translator initialization code path to ensure the check runs before the translator is instantiated.apps/mobile-app/src/app/(app)/player.tsx (1)
11-89: Clarify the purpose of this placeholder screen.

Both the back button (lines 23-28) and the primary action button (lines 73-78) navigate to the same destination (`ROUTES.PROCESSING`). This makes the screen effectively a redirect/informational screen.

If this is intentional (e.g., a temporary placeholder for a future player implementation), consider adding a code comment explaining its purpose. Otherwise, users might be confused by a screen that immediately redirects them.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/app/`(app)/player.tsx around lines 11 - 89, PlayerPlaceholderScreen currently has both the back Pressable and the primary action Pressable calling router.replace to ROUTES.PROCESSING, making the screen behave like an immediate redirect or purely informational; if this is intentional (a temporary stub), add a clear code comment above the PlayerPlaceholderScreen function explaining it's a placeholder for a future player and why both controls return to ROUTES.PROCESSING, otherwise update one or both handlers (the two onPress callbacks calling router.replace with ROUTES.PROCESSING) to perform the correct distinct actions (e.g., navigate back, open player, or show a toast) so the UI behavior matches user expectations.apps/mobile-app/src/components/media/UploadModals.tsx (3)
52-55: Remove commented-out navigation code.

The commented `router.replace(ROUTES.HOME)` should be removed if the current behavior (just closing the modal) is intentional.

Suggested fix

```diff
 /** Closes the YouTube modal and goes back to library */
 const handleCloseYT = () => {
   setYtVisible(false);
-  // router.replace(ROUTES.HOME);
 };
```
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/components/media/UploadModals.tsx` around lines 52 - 55, The commented-out navigation line inside handleCloseYT should be removed to avoid dead code; edit the handleCloseYT function (which currently calls setYtVisible(false)) and delete the commented `router.replace(ROUTES.HOME)` so the handler only closes the modal without unused/commented navigation code.
25-26: Remove commented-out code.

Leftover commented code (the `sheetVisible` state) should be removed if no longer needed.

Suggested fix

```diff
- // const [sheetVisible, setSheetVisible] = useState(true);
  const [ytVisible, setYtVisible] = useState(false);
```
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/components/media/UploadModals.tsx` around lines 25 - 26, Remove the leftover commented-out state declaration for sheetVisible to keep the component clean: delete the line "// const [sheetVisible, setSheetVisible] = useState(true);" in UploadModals.tsx so only the active state declarations like "const [ytVisible, setYtVisible] = useState(false);" remain; no other changes are required.
64-67: Avoid `as any` type assertion.

The `as any` cast bypasses type checking for route params. Consider defining proper route param types for type safety.

Suggested approach

Define route params in your route types:

```typescript
// In your route types file
export type ProcessingRouteParams = {
  id: string;
};
```

Then use typed navigation:

```diff
 router.replace({
   pathname: ROUTES.PROCESSING,
   params: { id: newItem.id },
-} as any);
+});
```

This requires ensuring your route configuration properly types the `ROUTES.PROCESSING` route params.
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/components/media/UploadModals.tsx` around lines 64 - 67, Remove the unsafe "as any" on the router.replace call and instead declare and use a proper typed route params type for the processing route (e.g., declare a ProcessingRouteParams = { id: string } in your route types), update your route configuration to associate ROUTES.PROCESSING with that params type, and then call router.replace with the correctly typed params object (router.replace({ pathname: ROUTES.PROCESSING, params: { id: newItem.id } })) so TypeScript enforces that newItem.id matches the expected string id type; keep references to ROUTES.PROCESSING, router.replace and newItem.id when applying the change.apps/ai-engine/src/core/smart_aligner.py (3)
329-334: Remove redundant `import re`.

The `re` module is already imported at the module level (line 16). This local import is unnecessary.

Suggested fix

```diff
 def _split_cjk_words(self, sentence: Sentence):
     """Pillar 3: Split CJK words into characters."""
-    import re
-
     new_words = []
     cjk_pattern = re.compile(r"[\u4e00-\u9fff]")
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/core/smart_aligner.py` around lines 329 - 334, The local import of the regular expression module inside _split_cjk_words is redundant; remove the line "import re" from the _split_cjk_words method so the function uses the module-level re already imported, leaving the cjk_pattern = re.compile(...) and rest of the function unchanged.
154-156: Unused variable `sr` from `librosa.load`.

The sample rate is returned but never used. Use an underscore prefix to indicate intentional discard.

Suggested fix

```diff
-audio_full, sr = librosa.load(str(path), sr=16000)
+audio_full, _sr = librosa.load(str(path), sr=16000)
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/core/smart_aligner.py` around lines 154 - 156, The call to librosa.load returns a sample rate into variable sr which is never used; replace the unused variable name sr with _sr (or prefix it with an underscore) where librosa.load is called so intent to discard is clear — locate the librosa.load invocation in smart_aligner.py (the line assigning audio_full, sr) and change the second assigned variable to _sr.
236-244: Variable shadowing: `sent` reused in nested loop.

The inner loop reuses `sent` from the outer `for sent in res_sentences` loop. While functionally correct, this shadows the outer variable and reduces readability.

Suggested fix for clarity

```diff
 if split_sents:
     previous_text_context += " " + split_sents[-1].text
     detected_lang = transcription_result["info"].language
     self._add_phonemes(split_sents, detected_lang)
-    for sent in split_sents:
-        sent.detected_lang = detected_lang
+    for s in split_sents:
+        s.detected_lang = detected_lang
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/core/smart_aligner.py` around lines 236 - 244, The inner loop reuses the variable name "sent" and shadows the outer "sent" from the surrounding loop; to fix, rename the inner-loop iteration variable (e.g., to "sub_sent" or "split_sent") in the block that assigns detected_lang and sets attributes (the for sent in split_sents loop) and update references inside that loop accordingly, leaving the outer loop variable intact; this preserves readability while keeping logic in self._add_phonemes(split_sents, detected_lang), the assignments sent.detected_lang, sentences.extend(split_sents), and pending_chunk.extend(split_sents unchanged.apps/ai-engine/src/events.py (1)
46-47: Broad exception catching is intentional but could be more specific.

The static analysis flags `except Exception` as too broad (Ruff BLE001). While the defensive approach is reasonable for fire-and-forget event publishing (preventing pipeline crashes), consider catching `(redis.RedisError, TypeError, ValueError)` to be more explicit about expected failures (Redis errors and JSON serialization issues).

♻️ Optional: More specific exception handling

```diff
-    except Exception as e:
+    except (redis.RedisError, TypeError, ValueError) as e:
         logger.error(f"Failed to publish progress event for {media_id}: {e}")
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/events.py` around lines 46 - 47, Replace the broad "except Exception as e" in the publish/progress event publishing block with a targeted exception tuple like "except (redis.RedisError, TypeError, ValueError) as e" to satisfy the linter and explicitly handle Redis and JSON/serialization errors; also ensure "redis" is imported at top of the module and keep the existing logger.error call (referencing logger and media_id) unchanged so the error details are still logged.apps/ai-engine/src/scripts/test_v2_pipeline.py (1)
786-799: Clarify why module reload is necessary after patching.

The pattern of `importlib.reload(async_mod)` after patching `db_mod` and `events_mod` is necessary because the `async_pipeline` module imports these at load time. Consider adding a brief comment explaining this for maintainers.

```diff
 events_mod.publish_batch_ready = _fake_publish_batch_ready

+# Reload async_mod so it picks up the patched publish_* functions
+# (it imports them at module load time)
 async_mod = importlib.import_module("src.async_pipeline")
 async_mod = importlib.reload(async_mod)
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/scripts/test_v2_pipeline.py` around lines 786 - 799, Add a brief inline comment above the importlib.reload(async_mod) call explaining that reload is necessary because src.async_pipeline (referenced here as async_mod) imports db_mod and events_mod at module load time, so after patching those modules we must reload async_mod to pick up the patched objects; mention importlib.reload(async_mod), db_mod, and events_mod by name so future maintainers understand the dependency and ordering.apps/backend-api/src/modules/socket/socket.gateway.ts (2)
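For maintainers unfamiliar with why the reload matters at all, here is a self-contained sketch of the underlying `from ... import` binding semantics (the temp modules `events_demo` and `pipeline_demo` are invented stand-ins for the project's `events`/`async_pipeline` modules):

```python
import importlib
import pathlib
import sys
import tempfile

# Create a producer module and a consumer that binds its function at load time.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "events_demo.py").write_text("def publish():\n    return 'real'\n")
(tmp / "pipeline_demo.py").write_text(
    "from events_demo import publish\n\ndef run():\n    return publish()\n"
)
sys.path.insert(0, str(tmp))

import events_demo
import pipeline_demo

assert pipeline_demo.run() == "real"

# Patch the producer; the consumer still holds the binding it captured
# at import time via `from events_demo import publish`.
events_demo.publish = lambda: "fake"
assert pipeline_demo.run() == "real"

# Reloading re-executes the consumer's top level, rebinding `publish`
# to the patched function — the same reason the test reloads
# src.async_pipeline after patching.
pipeline_demo = importlib.reload(pipeline_demo)
assert pipeline_demo.run() == "fake"
```

Without the reload, the patched `publish_*` functions would never be seen by the already-imported pipeline module.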
131-147: Verify whether unauthenticated sockets should be allowed to leave rooms.

The `media_leave` handler does not check `getAuthenticatedUserId(client)` before allowing the socket to leave. While leaving a room is generally safe, an unauthenticated socket shouldn't have been able to join a media room in the first place. Consider adding an auth check for consistency with `media_join`.

🛡️ Optional: Add authentication check for consistency

```diff
 @SubscribeMessage('media_leave')
 async handleLeaveMedia(
   @ConnectedSocket() client: Socket,
   @MessageBody() payload: MediaRoomPayload,
 ): Promise<RoomSubscriptionAck> {
+  const userId = this.getAuthenticatedUserId(client);
+  if (!userId) {
+    return { ok: false, error: 'Unauthenticated socket' };
+  }
+
   const mediaId = payload.mediaId?.trim();
   if (!mediaId) {
     return { ok: false, error: 'mediaId is required' };
   }
```
Verify each finding against the current code and only fix it if needed. In `@apps/backend-api/src/modules/socket/socket.gateway.ts` around lines 131 - 147, The media_leave handler (handleLeaveMedia) currently lets any socket leave a room without verifying authentication; mirror the media_join behavior by calling getAuthenticatedUserId(client) at the start of handleLeaveMedia and return { ok: false, error: 'unauthenticated' } (or the existing error shape) if no user id is present, then proceed to compute room via getMediaRoom(mediaId), await client.leave(roomName), log and return; this keeps join/leave auth consistent and uses the same error handling flow as media_join.
26-32: Consider restricting CORS origin for production.

Setting `origin: '*'` allows connections from any domain. For a production deployment, this should be restricted to your trusted frontend domains to prevent unauthorized WebSocket connections.

🔒 Proposed configuration using environment variable

```diff
 @WebSocketGateway({
   cors: {
-    origin: '*',
+    origin: process.env.CORS_ORIGINS?.split(',') || '*',
   },
   pingInterval: 10000,
   pingTimeout: 5000,
 })
```
Verify each finding against the current code and only fix it if needed. In `@apps/backend-api/src/modules/socket/socket.gateway.ts` around lines 26 - 32, The WebSocketGateway decorator currently sets cors.origin: '*' which is insecure for production; update the WebSocketGateway configuration in socket.gateway.ts (the `@WebSocketGateway`({ ... }) block) to read allowed origins from an environment variable (e.g., process.env.ALLOWED_WS_ORIGINS) and use that value (or a parsed array / whitelist check or function that returns true only for matching origins) instead of '*', falling back to a safe default for development; ensure the cors.origin field is set to the parsed whitelist or origin-check function so only trusted frontend domains can connect.apps/ai-engine/tests/test_first_batch_streaming.py (1)
147-152: Use `ClassVar` annotation or instance-level tracking for mutable class attribute.

The class-level `instances: list["FakeMinioClient"] = []` is a mutable default shared across all instances. While the test at line 248 clears it, this pattern is fragile and can cause test pollution if tests run in unexpected order or in parallel.

🛠️ Proposed fix using ClassVar annotation

```diff
+from typing import ClassVar
+
 class FakeMinioClient:
-    instances: list["FakeMinioClient"] = []
+    instances: ClassVar[list["FakeMinioClient"]] = []

     def __init__(self) -> None:
         self.uploads: list[dict] = []
         FakeMinioClient.instances.append(self)
```

The `ClassVar` annotation makes the intent explicit and satisfies Ruff's RUF012 rule. Alternatively, consider using a pytest fixture to manage instance tracking per test.
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/tests/test_first_batch_streaming.py` around lines 147 - 152, The class attribute instances on FakeMinioClient is a mutable list shared across instances which can cause test pollution; update the declaration to explicitly annotate it as a ClassVar (e.g., instances: ClassVar[list["FakeMinioClient"]]) to indicate module-level/stateful intent and satisfy RUF012, or alternatively remove the shared class list and manage per-test instance tracking (e.g., make instances an instance attribute or use a pytest fixture) while keeping references to FakeMinioClient and its instances attribute so the tests still clear or inspect it appropriately.apps/ai-engine/src/schemas.py (1)
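To make the hazard concrete, a minimal sketch (using a hypothetical `FakeClient` in place of the test's `FakeMinioClient`) showing that `ClassVar` only documents intent — at runtime the list is still one object shared by every instance, which is why tests must clear it:

```python
from typing import ClassVar

class FakeClient:
    # ClassVar documents that this list is deliberately class-level
    # (and satisfies Ruff's RUF012); at runtime it is still one shared list.
    instances: ClassVar[list["FakeClient"]] = []

    def __init__(self) -> None:
        FakeClient.instances.append(self)

a = FakeClient()
b = FakeClient()
assert len(FakeClient.instances) == 2  # every instance lands in one list
assert a.instances is b.instances      # shared, not per-instance

# Tests must clear it between runs, or earlier tests pollute later ones:
FakeClient.instances.clear()
assert FakeClient.instances == []
```

A pytest fixture that clears (or replaces) the list around each test removes the ordering dependency entirely.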
116-165: Consider consolidating `ContextAnalysisResult` and `ContextAnalysis`.

These two models serve similar purposes (context analysis output) with slightly different field structures. Having both may cause confusion for maintainers.

If `ContextAnalysisResult` is for backward compatibility with existing LLM prompts, consider adding a comment explaining when to use each. Otherwise, consider deprecating one in favor of the other.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/schemas.py` around lines 116 - 165, ContextAnalysisResult and ContextAnalysis overlap and should be consolidated or clarified: either deprecate ContextAnalysisResult (add a deprecation comment/DeprecationWarning in its docstring) and provide a conversion helper (e.g., ContextAnalysis.from_result or ContextAnalysisResult.to_context_analysis) that maps detected_style→detected_style, summary→summary, keywords→keywords, and detected_pronouns→language_specific["pronouns"]; or keep both but add a clear comment above ContextAnalysisResult explaining it exists only for backward compatibility with LLM prompts and show the explicit field mapping to ContextAnalysis so maintainers know when to use each.checkpoint.md (1)
447-458: Add language specifier to fenced code block.

The MinIO storage convention code block should have a language specifier for proper rendering.

📝 Proposed fix

````diff
-```
+```text
 processed/{mediaId}/
 ├── chunks/          # Tier 1: raw transcription (from SmartAligner)
````
Verify each finding against the current code and only fix it if needed. In `@checkpoint.md` around lines 447 - 458, Update the fenced code block showing the MinIO storage convention to include a language specifier by changing the opening fence from ``` to ```text so it renders properly; locate the block that begins with the tree starting "processed/{mediaId}/" (contains "chunks/", "translated_batches/", "final.json") and add "text" after the opening backticks.apps/mobile-app/src/services/socket.service.ts (2)
100-105: `disconnect()` should clear connection state completely.

If the duplicate-connection guard fix is applied, ensure `disconnect()` also resets the `connecting` flag.

♻️ Proposed fix

```diff
 disconnect() {
   if (this.socket) {
     this.socket.disconnect();
     this.socket = null;
   }
+  this.connecting = false;
 }
```
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/services/socket.service.ts` around lines 100 - 105, The disconnect() method currently nulls this.socket but doesn't reset the connection state; update disconnect() (in the same class where this.socket and the connecting flag are defined) to also set this.connecting = false (and clear any related connection state such as pending reconnection timers or listeners if present) so the duplicate-connection guard can work correctly after a disconnect.
79-97: Consider block-bodied callbacks to avoid lint warnings.

The concise arrow `(cb) => cb(event)` returns a value, which can trigger the `useIterableCallbackReturn` lint rule. Using block-bodied callbacks is more explicit.

♻️ Proposed fix

```diff
 this.socket.on("media_progress", (event: MediaProgressEvent) => {
-  this.progressListeners.forEach((cb) => cb(event));
+  this.progressListeners.forEach((cb) => { cb(event); });
 });

 this.socket.on("media_chunk_ready", (event: MediaChunkReadyEvent) => {
-  this.chunkReadyListeners.forEach((cb) => cb(event));
+  this.chunkReadyListeners.forEach((cb) => { cb(event); });
 });

 // ... apply same pattern to other handlers
```
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/services/socket.service.ts` around lines 79 - 97, The inline arrow callbacks in the socket.on handlers (e.g., the handlers for "media_progress", "media_chunk_ready", "media_batch_ready", "media_completed", "media_failed") return the result of cb(event) which triggers the lint rule; change each listener iteration from a concise arrow (cb => cb(event)) to a block-bodied callback that calls cb(event) without returning a value (e.g., (cb) => { cb(event); }) for progressListeners, chunkReadyListeners, batchReadyListeners, completedListeners, and failedListeners to satisfy useIterableCallbackReturn.apps/ai-engine/src/core/llm_provider.py (1)
147-147: Prefer `next(iter(...))` over a single-element slice.

More efficient for extracting the first key.

♻️ Proposed fix

```diff
-first_key = list(parsed[0].keys())[0]
+first_key = next(iter(parsed[0].keys()))
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/core/llm_provider.py` at line 147, Replace the single-element list allocation used to get the first key (first_key = list(parsed[0].keys())[0]) with an iterator-based access to avoid building a new list; e.g., use next(iter(parsed[0].keys())) or simply next(iter(parsed[0])) in the same scope where parsed is used (e.g., inside the function or method that contains first_key) so the first key is extracted without unnecessary allocation.apps/ai-engine/src/scripts/eval_nllb.py (2)
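For reference, a tiny standalone demonstration of the two idioms (the `parsed` payload shape here is invented for illustration, not the provider's actual response):

```python
parsed = [{"translation": "xin chào", "score": 0.9}]

# Builds a full list of keys just to index the first element:
first_key_slow = list(parsed[0].keys())[0]

# Pulls a single key from the dict's key iterator — no intermediate list:
first_key_fast = next(iter(parsed[0]))

# Dicts preserve insertion order (Python 3.7+), so both agree:
assert first_key_slow == first_key_fast == "translation"
```

For a one-key dict the difference is negligible, but the iterator form stays O(1) regardless of dict size and raises `StopIteration` (rather than `IndexError`) on an empty dict.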
88-94: Hardcoded debug batch path may not exist on other environments.

The path references a specific UUID that likely only exists on your development machine. While `load_real_sentences` handles the missing file gracefully with a warning, consider making this configurable via CLI argument or environment variable for broader usability.
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/scripts/eval_nllb.py` around lines 88 - 94, BATCH_JSON_PATH is hardcoded to a single debug UUID path; make this configurable by adding a CLI argument (e.g., --batch-json) or reading an environment variable (e.g., BATCH_JSON_PATH) and falling back to the current default value so other environments can supply their own file; update the code that defines BATCH_JSON_PATH and the code path that calls load_real_sentences to use the provided value (or default) and keep the existing warning behavior if the file is missing.
226-226: Consider adding `strict=True` to `zip()` for defensive consistency.

While `translate_batch` already asserts 1:1 mapping, adding `strict=True` makes the contract explicit at the call site and aligns with modern Python best practices.

♻️ Proposed fix

```diff
-for idx, (src, tgt) in enumerate(zip(sentences, translations), start=1):
+for idx, (src, tgt) in enumerate(zip(sentences, translations, strict=True), start=1):
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/src/scripts/eval_nllb.py` at line 226, The for-loop that iterates over zip(sentences, translations) in the evaluation loop should use zip(..., strict=True) to enforce the 1:1 mapping at the call site (even though translate_batch asserts it); update the loop in the function handling evaluation (the enumerate over sentences and translations) to call zip(sentences, translations, strict=True) so mismatched lengths raise immediately and make the contract explicit.apps/ai-engine/tests/test_streaming_contracts.py (1)
398-402: Consider adding `strict=True` to the final `zip()` for defensive testing.

While the test already validates `len(matched) == 3`, adding `strict=True` would make the test fail immediately if the segment counts diverge, providing a clearer error message.

♻️ Proposed fix

```diff
-    for batch_seg, matched_seg in zip(batch.segments, matched):
+    for batch_seg, matched_seg in zip(batch.segments, matched, strict=True):
         assert batch_seg.segment_index == matched_seg.segment_index
```
Verify each finding against the current code and only fix it if needed. In `@apps/ai-engine/tests/test_streaming_contracts.py` around lines 398 - 402, Add the strict=True parameter to the zip used in the loop that compares batch.segments and matched so the loop becomes zip(batch.segments, matched, strict=True); this enforces equal lengths and will fail fast if counts diverge—update the for loop that currently reads for batch_seg, matched_seg in zip(batch.segments, matched): to use strict=True on the zip call.apps/mobile-app/src/types/media.ts (1)
33-54: Extract a shared artifact base interface.

These three DTOs repeat `objectKey`, `url`, `size`, and `lastModified`. Pulling those into one base shape will reduce drift if the backend contract changes later.

♻️ Suggested refactor

```diff
+interface MediaArtifactBase {
+  objectKey: string;
+  url: string;
+  size: number;
+  lastModified: string | null;
+}
+
-export interface MediaChunkArtifact {
+export interface MediaChunkArtifact extends MediaArtifactBase {
   chunkIndex: number;
-  objectKey: string;
-  url: string;
-  size: number;
-  lastModified: string | null;
 }

-export interface MediaTranslatedBatchArtifact {
+export interface MediaTranslatedBatchArtifact extends MediaArtifactBase {
   batchIndex: number;
-  objectKey: string;
-  url: string;
-  size: number;
-  lastModified: string | null;
 }

-export interface MediaFinalArtifact {
-  objectKey: string;
-  url: string;
-  size: number;
-  lastModified: string | null;
-}
+export interface MediaFinalArtifact extends MediaArtifactBase {}
```
Verify each finding against the current code and only fix it if needed. In `@apps/mobile-app/src/types/media.ts` around lines 33 - 54, Extract the repeated fields into a new base interface (e.g., MediaArtifactBase) containing objectKey, url, size, and lastModified, and have MediaChunkArtifact, MediaTranslatedBatchArtifact, and MediaFinalArtifact extend that base; update MediaChunkArtifact to only declare chunkIndex, MediaTranslatedBatchArtifact to only declare batchIndex, and remove the duplicated fields from those three interfaces so they inherit the shared shape.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/ai-engine/src/async_pipeline.py`:
- Around line 390-431: The loop that applies translations uses zip(sentences,
final_translations) which can silently drop items if the lengths differ; update
the loop to use zip(..., strict=True) so it raises a ValueError immediately on
length mismatch between sentences and final_translations (reference: the for
loop pairing sentences and final_translations, and variables sentences,
final_translations, and zip), and ensure any surrounding try/except (if present)
lets that error surface or is handled where appropriate so mismatches are caught
early.
In `@apps/ai-engine/src/core/llm_provider.py`:
- Line 60: The assignment to user_msg uses an unnecessary f-string prefix:
change the line that sets user_msg (user_msg = f"Text Samples:\n" +
"\n".join(text_samples)) to use a plain string literal without the f prefix so
it becomes a normal concatenation (i.e., remove the leading "f" from the string
literal assigned to user_msg).
- Line 178: The generate signature uses a default None for system_prompt but
lacks an explicit Optional annotation; update the function signature for
generate in llm_provider.py to annotate system_prompt as Optional[str] (or str |
None for Python 3.10+) so the type reflects the None default, e.g., change def
generate(self, prompt: str, system_prompt: str = None) -> str to use the
explicit Optional/union type and adjust any imports (typing.Optional) if needed.
In `@apps/ai-engine/src/core/nmt_translator.py`:
- Around line 141-153: The code currently falls back to passing raw
source_lang/target_lang strings to NLLB_LANG_MAP, which can silently allow
invalid Flores-200 tokens; update NMTTranslator to normalize and validate
language codes before mapping: implement normalization of source_lang and
target_lang (e.g., lowercasing and stripping regional variants like splitting on
'-' or '_', mapping "en-US" -> "en"), then lookup normalized codes in
NLLB_LANG_MAP (referencing NLLB_LANG_MAP, source_lang, target_lang, src_code,
tgt_code) and if a normalized code is not present raise a clear exception (e.g.,
ValueError) or return a descriptive error instead of passing the raw token to
the model so failures are explicit and early.
In `@apps/ai-engine/src/core/prompts.py`:
- Around line 124-132: The refinement prompt has a contradiction between the
“preserve proper nouns and key terms” rule and the strict ban on source-language
characters; update the prompt template so rule 6 allows unchanged proper
nouns/brand names and key terms when they appear verbatim and use the same
script as the target language while still forbidding other source-language
content. Concretely, modify the Rules block (the prompt string that contains
"Rules:" and the {pronoun_rule} placeholder) to change rule 6 into a conditional
allowance like “6. Output must be PURE target language — except unchanged proper
nouns, brand names, and key terms that appear verbatim in the same script as the
target language; all other source-language characters are prohibited.” Ensure
the change references the existing preservation rule (rule 4) so the two rules
are consistent and update any tests or examples that assert the previous
absolute ban.
In `@apps/ai-engine/src/db.py`:
- Around line 108-119: The try/except in update_media_status is swallowing DB
write failures (the block using _get_db_pool(), pool.getconn(), cur.execute(sql,
values), media_id and logger) so callers may broadcast terminal events despite
Postgres not being updated; change this block to either implement a small retry
loop with backoff around cur.execute(sql, values) (and still putconn in the
finally) or, if retries fail, re-raise the exception after logging so the caller
sees the failure instead of continuing; ensure update_media_status surfaces
errors (raises) so the worker can avoid emitting COMPLETED/FAILED when the DB
write failed.
In `@apps/backend-api/scripts/sync-schema.ts`:
- Around line 5-9: The path resolution for sourcePath and targetPath currently
uses process.cwd(), which is fragile; change the script to compute a scriptDir
anchored to the file's location (e.g., via import.meta.url or
fileURLToPath(import.meta.url) + path.dirname) and then build sourcePath and
targetPath relative to that scriptDir instead of process.cwd(), updating any
references to sourcePath/targetPath to use the new anchored paths.
In `@apps/backend-api/src/modules/media/media-artifact-summary.ts`:
- Around line 91-96: Replace the manual guards that compute
safeChunkIndex/safeBatchIndex with the existing toNonNegativeInteger() helper to
avoid propagating NaN/Infinity into summaries: wherever chunkIndex or batchIndex
is currently normalized via Math.trunc/Math.max (e.g., the computation of
safeChunkIndex and subsequent uses updating chunkCount and latestChunkIndex, and
the analogous block around lines 105-113 for batchIndex), call
toNonNegativeInteger(chunkIndex) / toNonNegativeInteger(batchIndex) and use
those results to compute chunkCount, latestChunkIndex, batchCount, and
latestBatchIndex so non-finite inputs are converted to 0 and cannot corrupt the
summary object.
In `@apps/backend-api/src/modules/socket/socket.service.ts`:
- Around line 59-68: handleMediaUpdate currently calls JSON.parse(message) which
can throw a SyntaxError; wrap the parse in a try/catch inside handleMediaUpdate,
and on catch call this.logger.warn (or error) with a descriptive message
including the redacted payload via redactMediaPayload(message or parsed attempt)
and the error details, then return early so parseMediaEvent is not called;
ensure references to parseMediaEvent and redactMediaPayload remain unchanged.
In `@apps/mobile-app/src/components/media/ProcessingProgressRing.tsx`:
- Around line 30-36: withAlpha currently only appends an alpha hex to
7-character hex strings and will produce invalid colors for other formats;
update withAlpha (used by ProcessingProgressRing) to safely handle formats:
detect and expand 4-char hex (`#RGB` -> `#RRGGBB`) then append/replace alpha for 6-
or 8-char hex, preserve rgb()/rgba()/hsl() strings (or convert using a color
library), and return the original color unchanged if unsupported; alternatively
replace the helper with a small color utility (e.g., tinycolor2 or polished) to
parse any CSS color, set the desired alpha, and return a valid CSS color string.
---
Outside diff comments:
In `@apps/mobile-app/src/hooks/useMedia.ts`:
- Around line 157-170: Add HTTP response validation for the two fetch calls
before calling mediaService.confirmUpload: after the initial fetch(file.uri)
check fileResponse.ok (and handle/throw/log a descriptive error referencing
file.uri or file.name) and similarly check the PUT response from
fetch(uploadUrl, { method: "PUT", ... }) for ok before proceeding; if either
response is not ok, abort and do not call confirmUpload (the code around
fileResponse, fileBlob, uploadUrl, and mediaService.confirmUpload should perform
these guards and surface the error).
---
Duplicate comments:
In `@apps/mobile-app/src/app/(app)/processing.tsx`:
- Around line 73-75: When rendering the processing UI, guard against missing id
or missing media by returning an explicit empty/error state when id is falsy or
when the status query finished without media (i.e., !id || (!isLoading &&
!media)); update the logic around useMediaStatus and useMediaArtifacts so you
don't fall through to the queued UI when status is undefined, and apply the same
guard to the other processing render branch that currently shows the queued
footer/actions (the second block that uses status to decide queued vs
processing). Ensure you check id and media before using status and render a
clear empty/error component instead of the processing layout.
In `@apps/mobile-app/src/hooks/useMedia.ts`:
- Around line 60-65: socketFirstQueryOptions currently disables all refetching
(staleTime: Infinity and all refetch triggers false), making useMediaStatus()
and useMediaArtifacts() rely solely on sockets and risk freezing on missed
updates; modify socketFirstQueryOptions to allow a low-frequency recovery
refetch (e.g., set staleTime to a finite value like 30_000 ms) or re-enable
refetchOnReconnect/refetchOnWindowFocus (or both), and if possible make the
shorter staleTime or re-enable triggers conditional for non-terminal items only
so terminal items remain stable; apply the same change pattern where
socketFirstQueryOptions is reused (also referenced in the other block around
useMediaStatus/useMediaArtifacts).
---
Nitpick comments:
In `@apps/ai-engine/src/config.py`:
- Around line 119-122: Add startup validation for the NMT model directory so
runtime failures are avoided: in config.py update create_dirs() or add a new
validate_nmt_model() method to check Path NMT_MODEL_DIR.exists() and, if
missing, log a clear error/warning (using the project logger) and either raise a
ConfigError or return a failure state so translator initialization (where the
NMT translator is constructed) can surface a user-friendly message; reference
NMT_MODEL_DIR, create_dirs(), and the translator initialization code path to
ensure the check runs before the translator is instantiated.
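One way the suggested startup check could look. This is a sketch under assumptions: `ConfigError`, `validate_nmt_model`, and the message wording are invented here; only `NMT_MODEL_DIR` and `create_dirs()` come from the finding.

```python
from pathlib import Path


class ConfigError(RuntimeError):
    """Raised when required model assets are missing at startup."""


def validate_nmt_model(model_dir):
    # Run at startup (e.g. alongside create_dirs()) rather than letting
    # translator construction fail mid-job with an opaque loader error.
    path = Path(model_dir)
    if not path.is_dir():
        raise ConfigError(
            f"NMT model directory not found: {path}; "
            "download the model or fix NMT_MODEL_DIR before starting the worker"
        )
    return path
```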
In `@apps/ai-engine/src/core/llm_provider.py`:
- Line 147: Replace the single-element list allocation used to get the first key
(first_key = list(parsed[0].keys())[0]) with an iterator-based access to avoid
building a new list; e.g., use next(iter(parsed[0].keys())) or simply
next(iter(parsed[0])) in the same scope where parsed is used (e.g., inside the
function or method that contains first_key) so the first key is extracted
without unnecessary allocation.
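The allocation difference is easy to show in isolation (the sample dict below is invented for illustration; the real `parsed` comes from the LLM's JSON output):

```python
parsed = [{"translation": "xin chào", "score": 0.9}]

# list(...) materializes every key just to read the first one:
first_key_via_list = list(parsed[0].keys())[0]

# next(iter(...)) reads the first key without building a list:
first_key = next(iter(parsed[0]))

assert first_key == first_key_via_list == "translation"
```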
In `@apps/ai-engine/src/core/smart_aligner.py`:
- Around line 329-334: The local import of the regular expression module inside
_split_cjk_words is redundant; remove the line "import re" from the
_split_cjk_words method so the function uses the module-level re already
imported, leaving the cjk_pattern = re.compile(...) and rest of the function
unchanged.
- Around line 154-156: The call to librosa.load returns a sample rate into
variable sr which is never used; replace the unused variable name sr with _sr
(or prefix it with an underscore) where librosa.load is called so intent to
discard is clear — locate the librosa.load invocation in smart_aligner.py (the
line assigning audio_full, sr) and change the second assigned variable to _sr.
- Around line 236-244: The inner loop reuses the variable name "sent" and
shadows the outer "sent" from the surrounding loop; to fix, rename the
inner-loop iteration variable (e.g., to "sub_sent" or "split_sent") in the block
that assigns detected_lang and sets attributes (the for sent in split_sents
loop) and update references inside that loop accordingly, leaving the outer loop
variable intact; this preserves readability while keeping logic in
self._add_phonemes(split_sents, detected_lang), the assignments
sent.detected_lang, sentences.extend(split_sents), and
pending_chunk.extend(split_sents unchanged.
In `@apps/ai-engine/src/events.py`:
- Around line 46-47: Replace the broad "except Exception as e" in the
publish/progress event publishing block with a targeted exception tuple like
"except (redis.RedisError, TypeError, ValueError) as e" to satisfy the linter
and explicitly handle Redis and JSON/serialization errors; also ensure "redis"
is imported at top of the module and keep the existing logger.error call
(referencing logger and media_id) unchanged so the error details are still
logged.
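A runnable sketch of the targeted-exception shape. `publish_event` and the stand-in `RedisError` class are illustrative (defined locally so the sketch runs without a Redis client); real code would import `redis` and catch `redis.RedisError`.

```python
import json
import logging

logger = logging.getLogger(__name__)


class RedisError(Exception):
    """Stand-in for redis.RedisError so this sketch runs without the client."""


def publish_event(client, channel, payload, media_id):
    """Publish a JSON event, catching only the expected failure modes:
    Redis transport errors and JSON serialization errors."""
    try:
        client.publish(channel, json.dumps(payload))
        return True
    except (RedisError, TypeError, ValueError) as e:
        logger.error("Failed to publish event for media %s: %s", media_id, e)
        return False
```

Anything outside that tuple (a genuine bug) still propagates, which is what the linter rule is pushing for.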
In `@apps/ai-engine/src/schemas.py`:
- Around line 116-165: ContextAnalysisResult and ContextAnalysis overlap and
should be consolidated or clarified: either deprecate ContextAnalysisResult (add
a deprecation comment/DeprecationWarning in its docstring) and provide a
conversion helper (e.g., ContextAnalysis.from_result or
ContextAnalysisResult.to_context_analysis) that maps
detected_style→detected_style, summary→summary, keywords→keywords, and
detected_pronouns→language_specific["pronouns"]; or keep both but add a clear
comment above ContextAnalysisResult explaining it exists only for backward
compatibility with LLM prompts and show the explicit field mapping to
ContextAnalysis so maintainers know when to use each.
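If the conversion-helper route is taken, the mapping could be sketched like this. The field types here are simplified stand-ins (plain dataclasses, not the actual schema classes), but the field-to-field mapping follows the finding: `detected_pronouns` lands under `language_specific["pronouns"]`.

```python
from dataclasses import dataclass, field


@dataclass
class ContextAnalysisResult:  # legacy shape, kept for LLM prompt compatibility
    detected_style: str
    summary: str
    keywords: list
    detected_pronouns: dict


@dataclass
class ContextAnalysis:
    detected_style: str
    summary: str
    keywords: list
    language_specific: dict = field(default_factory=dict)

    @classmethod
    def from_result(cls, r: ContextAnalysisResult) -> "ContextAnalysis":
        # Explicit mapping so maintainers can see exactly how the two relate.
        return cls(
            detected_style=r.detected_style,
            summary=r.summary,
            keywords=list(r.keywords),
            language_specific={"pronouns": r.detected_pronouns},
        )
```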
In `@apps/ai-engine/src/scripts/eval_nllb.py`:
- Around line 88-94: BATCH_JSON_PATH is hardcoded to a single debug UUID path;
make this configurable by adding a CLI argument (e.g., --batch-json) or reading
an environment variable (e.g., BATCH_JSON_PATH) and falling back to the current
default value so other environments can supply their own file; update the code
that defines BATCH_JSON_PATH and the code path that calls load_real_sentences to
use the provided value (or default) and keep the existing warning behavior if
the file is missing.
- Line 226: The for-loop that iterates over zip(sentences, translations) in the
evaluation loop should use zip(..., strict=True) to enforce the 1:1 mapping at
the call site (even though translate_batch asserts it); update the loop in the
function handling evaluation (the enumerate over sentences and translations) to
call zip(sentences, translations, strict=True) so mismatched lengths raise
immediately and make the contract explicit.
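The flag-then-env-then-default precedence for the batch path could be wired up roughly like this; `resolve_batch_json_path` and the default path are illustrative names, not the script's actual API.

```python
import argparse
import os

# Stand-in for the current hardcoded debug path:
DEFAULT_BATCH_JSON = "/tmp/debug/batch.json"


def resolve_batch_json_path(argv=None, environ=None):
    """CLI flag wins, then the BATCH_JSON_PATH env var, then the default."""
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch-json", default=None)
    args, _ = parser.parse_known_args(argv)
    return args.batch_json or environ.get("BATCH_JSON_PATH") or DEFAULT_BATCH_JSON
```

The existing warning when the file is missing stays as-is; only where the path comes from changes.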
In `@apps/ai-engine/src/scripts/test_v2_pipeline.py`:
- Around line 786-799: Add a brief inline comment above the
importlib.reload(async_mod) call explaining that reload is necessary because
src.async_pipeline (referenced here as async_mod) imports db_mod and events_mod
at module load time, so after patching those modules we must reload async_mod to
pick up the patched objects; mention importlib.reload(async_mod), db_mod, and
events_mod by name so future maintainers understand the dependency and ordering.
In `@apps/ai-engine/tests/test_first_batch_streaming.py`:
- Around line 147-152: The class attribute instances on FakeMinioClient is a
mutable list shared across instances which can cause test pollution; update the
declaration to explicitly annotate it as a ClassVar (e.g., instances:
ClassVar[list["FakeMinioClient"]]) to indicate module-level/stateful intent and
satisfy RUF012, or alternatively remove the shared class list and manage
per-test instance tracking (e.g., make instances an instance attribute or use a
pytest fixture) while keeping references to FakeMinioClient and its instances
attribute so the tests still clear or inspect it appropriately.
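For the `ClassVar` route, the annotation looks like this; a stripped-down stand-in for the real `FakeMinioClient`, showing only the shared-registry pattern the finding describes:

```python
from typing import ClassVar


class FakeMinioClient:
    # Shared, class-level registry of all constructed clients.
    # Annotated as ClassVar so linters (RUF012) know the sharing is
    # intentional rather than an accidental mutable default.
    instances: ClassVar[list["FakeMinioClient"]] = []

    def __init__(self):
        FakeMinioClient.instances.append(self)
```

Tests that rely on the registry should still clear it between runs (e.g. in a fixture) since the list persists across instances by design.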
In `@apps/ai-engine/tests/test_streaming_contracts.py`:
- Around line 398-402: Add the strict=True parameter to the zip used in the loop
that compares batch.segments and matched so the loop becomes zip(batch.segments,
matched, strict=True); this enforces equal lengths and will fail fast if counts
diverge—update the for loop that currently reads for batch_seg, matched_seg in
zip(batch.segments, matched): to use strict=True on the zip call.
In `@apps/backend-api/src/modules/media/workers/media.processor.ts`:
- Around line 117-123: The aiPayload creation is passing job.data.targetLanguage
directly without validation or local default; either validate/normalize it here
or add a clear inline comment stating that the "targetLanguage" default ('vi')
is applied downstream in the AI engine/queue types; update the
AiProcessingJobPayload population around aiPayload (and reference
job.data.targetLanguage) to either set a local default/validation or add the
comment indicating where the default originates (queue.types.ts / AI engine) so
future maintainers know why it's omitted here.
In `@apps/backend-api/src/modules/socket/socket.gateway.ts`:
- Around line 131-147: The media_leave handler (handleLeaveMedia) currently lets
any socket leave a room without verifying authentication; mirror the media_join
behavior by calling getAuthenticatedUserId(client) at the start of
handleLeaveMedia and return { ok: false, error: 'unauthenticated' } (or the
existing error shape) if no user id is present, then proceed to compute room via
getMediaRoom(mediaId), await client.leave(roomName), log and return; this keeps
join/leave auth consistent and uses the same error handling flow as media_join.
- Around line 26-32: The WebSocketGateway decorator currently sets cors.origin:
'*' which is insecure for production; update the WebSocketGateway configuration
in socket.gateway.ts (the `@WebSocketGateway`({ ... }) block) to read allowed
origins from an environment variable (e.g., process.env.ALLOWED_WS_ORIGINS) and
use that value (or a parsed array / whitelist check or function that returns
true only for matching origins) instead of '*', falling back to a safe default
for development; ensure the cors.origin field is set to the parsed whitelist or
origin-check function so only trusted frontend domains can connect.
In `@apps/backend-api/src/modules/socket/socket.types.ts`:
- Around line 74-76: isRecord currently returns true for arrays because typeof
[] === 'object'; update the isRecord type guard to explicitly reject arrays by
adding an Array.isArray check (e.g., return typeof value === 'object' && value
!== null && !Array.isArray(value)) so that isRecord(value: unknown): value is
UnknownRecord only passes for plain objects; adjust any callers relying on
arrays being accepted if needed.
In `@apps/mobile-app/src/app/(app)/player.tsx`:
- Around line 11-89: PlayerPlaceholderScreen currently has both the back
Pressable and the primary action Pressable calling router.replace to
ROUTES.PROCESSING, making the screen behave like an immediate redirect or purely
informational; if this is intentional (a temporary stub), add a clear code
comment above the PlayerPlaceholderScreen function explaining it's a placeholder
for a future player and why both controls return to ROUTES.PROCESSING, otherwise
update one or both handlers (the two onPress callbacks calling router.replace
with ROUTES.PROCESSING) to perform the correct distinct actions (e.g., navigate
back, open player, or show a toast) so the UI behavior matches user
expectations.
In `@apps/mobile-app/src/components/media/UploadModals.tsx`:
- Around line 52-55: The commented-out navigation line inside handleCloseYT
should be removed to avoid dead code; edit the handleCloseYT function (which
currently calls setYtVisible(false)) and delete the commented
`router.replace(ROUTES.HOME)` so the handler only closes the modal without
unused/commented navigation code.
- Around line 25-26: Remove the leftover commented-out state declaration for
sheetVisible to keep the component clean: delete the line "// const
[sheetVisible, setSheetVisible] = useState(true);" in UploadModals.tsx so only
the active state declarations like "const [ytVisible, setYtVisible] =
useState(false);" remain; no other changes are required.
- Around line 64-67: Remove the unsafe "as any" on the router.replace call and
instead declare and use a proper typed route params type for the processing
route (e.g., declare a ProcessingRouteParams = { id: string } in your route
types), update your route configuration to associate ROUTES.PROCESSING with that
params type, and then call router.replace with the correctly typed params object
(router.replace({ pathname: ROUTES.PROCESSING, params: { id: newItem.id } })) so
TypeScript enforces that newItem.id matches the expected string id type; keep
references to ROUTES.PROCESSING, router.replace and newItem.id when applying the
change.
In `@apps/mobile-app/src/hooks/useLanguagePreference.ts`:
- Around line 57-69: The effect in useEffect currently lists i18n in its
dependency array but the effect only calls hydrateLanguagePreference() (which
relies on the module-level i18nInstance) and therefore should not re-run when
the useTranslation() i18n ref changes; remove i18n from the dependency array and
use an empty array [] so hydrateLanguagePreference() runs once on mount (keep
the isMounted cleanup logic as-is), referencing the useEffect,
hydrateLanguagePreference, and i18n symbols to locate the change.
In `@apps/mobile-app/src/hooks/useThrottle.ts`:
- Line 57: The JSDoc for useThrottle should warn that the bypassFn dependency
must be stable to avoid re-running the effect on every render; update the
comment for the useThrottle hook (and mention bypassFn) to advise callers to
memoize bypassFn with useCallback or that the hook could internally stabilize it
(e.g., store it in a ref) to prevent unintended throttle resets. Include a short
example suggestion: "memoize bypassFn with useCallback" and note the current
dependency array ([value, interval, bypassFn]) relies on bypassFn stability.
In `@apps/mobile-app/src/services/socket.service.ts`:
- Around line 100-105: The disconnect() method currently nulls this.socket but
doesn't reset the connection state; update disconnect() (in the same class where
this.socket and the connecting flag are defined) to also set this.connecting =
false (and clear any related connection state such as pending reconnection
timers or listeners if present) so the duplicate-connection guard can work
correctly after a disconnect.
- Around line 79-97: The inline arrow callbacks in the socket.on handlers (e.g.,
the handlers for "media_progress", "media_chunk_ready", "media_batch_ready",
"media_completed", "media_failed") return the result of cb(event) which triggers
the lint rule; change each listener iteration from a concise arrow (cb =>
cb(event)) to a block-bodied callback that calls cb(event) without returning a
value (e.g., (cb) => { cb(event); }) for progressListeners, chunkReadyListeners,
batchReadyListeners, completedListeners, and failedListeners to satisfy
useIterableCallbackReturn.
In `@apps/mobile-app/src/types/media.ts`:
- Around line 33-54: Extract the repeated fields into a new base interface
(e.g., MediaArtifactBase) containing objectKey, url, size, and lastModified, and
have MediaChunkArtifact, MediaTranslatedBatchArtifact, and MediaFinalArtifact
extend that base; update MediaChunkArtifact to only declare chunkIndex,
MediaTranslatedBatchArtifact to only declare batchIndex, and remove the
duplicated fields from those three interfaces so they inherit the shared shape.
In `@checkpoint.md`:
- Around line 447-458: Update the fenced code block showing the MinIO storage
convention to include a language specifier by changing the opening fence from
``` to ```text so it renders properly; locate the block that begins with the
tree starting "processed/{mediaId}/" (contains "chunks/", "translated_batches/",
"final.json") and add "text" after the opening backticks.
# ── Step B: Context analysis (once per job, on first chunk) ──
if not context_analyzed:
    context_analyzed = True
    try:
        sample = texts[:10]
        ctx = llm.analyze_context(sample, tgt)
        context_result = ctx
        logger.info(
            f"📊 Context analysis: style={ctx.detected_style}, "
            f"pronouns={ctx.detected_pronouns}"
        )
    except Exception as e:
        logger.warning(f"Context analysis failed (continuing without): {e}")
        context_result = None

# ── Step C: LLM refinement (optional, best-effort) ──
final_translations = nmt_translations
if context_result is not None and src != tgt:
    try:
        t1 = _time.time()
        refined = llm.refine_batch(
            texts, nmt_translations, context_result, tgt
        )
        if refined is not None:
            final_translations = refined
            logger.info(
                f"✨ LLM refinement batch {batch_index}: "
                f"{len(sentences)} segments in {_time.time() - t1:.2f}s"
            )
        else:
            logger.debug(
                f"⏭️ Refinement returned None for batch {batch_index} "
                f"— using NMT output"
            )
    except Exception as e:
        logger.warning(
            f"Refinement failed for batch {batch_index} — using NMT: {e}"
        )

# Apply translations to sentences
for s, t in zip(sentences, final_translations):
    s.translation = t
Add strict=True to zip() to catch length mismatches early.
The zip(sentences, final_translations) at line 430 should use strict=True to fail fast if the NMT or LLM refinement returns a different number of translations than input sentences.
🐛 Proposed fix
# Apply translations to sentences
- for s, t in zip(sentences, final_translations):
+ for s, t in zip(sentences, final_translations, strict=True):
      s.translation = t

The broad Exception catches at lines 401 and 424 are acceptable here since they're for optional best-effort operations (context analysis, LLM refinement) with proper logging and fallback to NMT output.
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 401-401: Do not catch blind exception: Exception
(BLE001)
[warning] 424-424: Do not catch blind exception: Exception
(BLE001)
[warning] 430-430: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/ai-engine/src/async_pipeline.py` around lines 390 - 431, The loop that
applies translations uses zip(sentences, final_translations) which can silently
drop items if the lengths differ; update the loop to use zip(..., strict=True)
so it raises a ValueError immediately on length mismatch between sentences and
final_translations (reference: the for loop pairing sentences and
final_translations, and variables sentences, final_translations, and zip), and
ensure any surrounding try/except (if present) lets that error surface or is
handled where appropriate so mismatches are caught early.
    )

user_msg = f"Text Samples:\n" + "\n".join(text_samples)
Remove extraneous f prefix from string without placeholders.
This f-string has no interpolation expressions.
🔧 Proposed fix
- user_msg = f"Text Samples:\n" + "\n".join(text_samples)
+ user_msg = "Text Samples:\n" + "\n".join(text_samples)
🧰 Tools
🪛 Ruff (0.15.6)
[error] 60-60: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/ai-engine/src/core/llm_provider.py` at line 60, The assignment to
user_msg uses an unnecessary f-string prefix: change the line that sets user_msg
(user_msg = f"Text Samples:\n" + "\n".join(text_samples)) to use a plain string
literal without the f prefix so it becomes a normal concatenation (i.e., remove
the leading "f" from the string literal assigned to user_msg).
    texts: List[str],
    context: ContextAnalysisResult
) -> List[str]:
def generate(self, prompt: str, system_prompt: str = None) -> str:
Use explicit Optional or | None type annotation.
PEP 484 prohibits implicit Optional when default is None.
🔧 Proposed fix
- def generate(self, prompt: str, system_prompt: str = None) -> str:
+ def generate(self, prompt: str, system_prompt: str | None = None) -> str:
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 178-178: PEP 484 prohibits implicit Optional
Convert to T | None
(RUF013)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/ai-engine/src/core/llm_provider.py` at line 178, The generate signature
uses a default None for system_prompt but lacks an explicit Optional annotation;
update the function signature for generate in llm_provider.py to annotate
system_prompt as Optional[str] (or str | None for Python 3.10+) so the type
reflects the None default, e.g., change def generate(self, prompt: str,
system_prompt: str = None) -> str to use the explicit Optional/union type and
adjust any imports (typing.Optional) if needed.
if source_lang not in NLLB_LANG_MAP:
    logger.warning(
        f"NMTTranslator: source_lang '{source_lang}' not in NLLB_LANG_MAP — "
        "using raw string as language token"
    )
if target_lang not in NLLB_LANG_MAP:
    logger.warning(
        f"NMTTranslator: target_lang '{target_lang}' not in NLLB_LANG_MAP — "
        "using raw string as language token"
    )

src_code: str = NLLB_LANG_MAP.get(source_lang, source_lang)
tgt_code: str = NLLB_LANG_MAP.get(target_lang, target_lang)
🧩 Analysis chain
🏁 Scripts executed (repository: KMA-Final-Project/project):
find . -name "nmt_translator.py" -type f
cat -n ./apps/ai-engine/src/core/nmt_translator.py | head -200
grep -r "translate_batch\|NMTTranslator" --include="*.py" | head -50
grep -B5 -A5 "nmt.translate_batch" ./apps/ai-engine/src/async_pipeline.py
grep -B20 "nmt.translate_batch" ./apps/ai-engine/src/async_pipeline.py | grep -E "(src|tgt)\s*=" | head -10
grep -B50 "nmt.translate_batch" ./apps/ai-engine/src/async_pipeline.py | head -80
grep -B100 "def _source_lang\|def _translate_and_upload" ./apps/ai-engine/src/async_pipeline.py | grep -E "(_source_lang|target_lang|i18n)" | head -20
grep -A10 "def _source_lang" ./apps/ai-engine/src/async_pipeline.py
grep -B30 "def _source_lang" ./apps/ai-engine/src/async_pipeline.py | head -50
grep -B100 "source_lang_holder\|async def consumer" ./apps/ai-engine/src/async_pipeline.py | grep -E "(source_lang_holder|target_lang|def.*process|mobile|i18n)" | head -20
grep -n "source_lang_holder\|target_lang" ./apps/ai-engine/src/async_pipeline.py | head -30
sed -n '100,250p' ./apps/ai-engine/src/async_pipeline.py
grep -n "detected_source_lang\|_detect_source_language" ./apps/ai-engine/src/async_pipeline.py
sed -n '70,95p' ./apps/ai-engine/src/async_pipeline.py
grep -r "detected_lang\|i18n\.language\|Whisper" ./apps/ai-engine/src --include="*.py" | grep -E "(detected_lang|i18n|mobile)" | head -20
grep -B5 -A5 "transcription_result\[\"info\"\].language" ./apps/ai-engine/src/core/smart_aligner.py
Fail fast on unsupported language codes.
Lines 152-153 silently accept any unknown language code by passing it raw to the NLLB model. If an upstream caller passes an unsupported locale (e.g., "en-US", "fr-CA" from user configuration), the model receives an invalid Flores-200 token instead of failing explicitly. Add language-code normalization and validation before tokenization to catch invalid codes early.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/ai-engine/src/core/nmt_translator.py` around lines 141 - 153, The code
currently falls back to passing raw source_lang/target_lang strings to
NLLB_LANG_MAP, which can silently allow invalid Flores-200 tokens; update
NMTTranslator to normalize and validate language codes before mapping: implement
normalization of source_lang and target_lang (e.g., lowercasing and stripping
regional variants like splitting on '-' or '_', mapping "en-US" -> "en"), then
lookup normalized codes in NLLB_LANG_MAP (referencing NLLB_LANG_MAP,
source_lang, target_lang, src_code, tgt_code) and if a normalized code is not
present raise a clear exception (e.g., ValueError) or return a descriptive error
instead of passing the raw token to the model so failures are explicit and
early.
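The normalize-then-validate step the agent prompt describes might look like this sketch. The trimmed `NLLB_LANG_MAP` and the `resolve_nllb_code` name are illustrative; the real map covers all Flores-200 codes.

```python
# Illustrative subset; the real NLLB_LANG_MAP covers all Flores-200 codes.
NLLB_LANG_MAP = {
    "en": "eng_Latn",
    "vi": "vie_Latn",
    "fr": "fra_Latn",
    "zh": "zho_Hans",
}


def resolve_nllb_code(lang: str) -> str:
    """Normalize a locale like 'en-US' to its base code, then require a
    Flores-200 mapping, raising instead of passing a raw token to the model."""
    base = lang.strip().lower().replace("_", "-").split("-")[0]
    try:
        return NLLB_LANG_MAP[base]
    except KeyError:
        raise ValueError(
            f"Unsupported language code: {lang!r} (normalized: {base!r})"
        ) from None
```

With this in place, "en-US" or "fr-CA" from user configuration resolves cleanly, and a genuinely unsupported code fails loudly before tokenization.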
Rules:
1. Translate the meaning of sentences naturally.
2. DO NOT return the original text. You MUST translate it to {target_lang}.
3. Maintain the approximate length but prioritize meaning.
4. Use the "Context-First" approach:
   - Apply the specific Tone/Style.
   - {pronoun_enforcement}
5. Do not merge sentences unless necessary for grammar.
6. STRICTLY Return a JSON list. Do not wrap it in a dictionary key like "translations".
7. **LANGUAGE ENFORCEMENT**: The output must be PURE {target_lang}. Do NOT include any source language characters (e.g. No Chinese chars in Vietnamese output).
1. Return EXACTLY {count} refined translations as a JSON array of strings.
2. NEVER merge or split lines — one input line = one output line.
3. If the DRAFT is already good, return it unchanged.
4. Preserve numbers, proper nouns, and key terms from the keyword list.
5. Fix awkward phrasing, wrong pronouns, or unnatural word order.
{pronoun_rule}
6. Output must be PURE target language — no source-language characters.
7. Return ONLY a JSON array. No explanation, no wrapping object.
Remove the proper-noun conflict in the refinement rules.
Line 128 says to preserve proper nouns and key terms, but Line 131 bans source-language characters outright. For same-script pairs and brand names, that contradiction pushes the model to rewrite or drop names. Relax the ban so unchanged names/brands are still allowed when appropriate.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/ai-engine/src/core/prompts.py` around lines 124 - 132, The refinement
prompt has a contradiction between the “preserve proper nouns and key terms”
rule and the strict ban on source-language characters; update the prompt
template so rule 6 allows unchanged proper nouns/brand names and key terms when
they appear verbatim and use the same script as the target language while still
forbidding other source-language content. Concretely, modify the Rules block
(the prompt string that contains "Rules:" and the {pronoun_rule} placeholder) to
change rule 6 into a conditional allowance like “6. Output must be PURE target
language — except unchanged proper nouns, brand names, and key terms that appear
verbatim in the same script as the target language; all other source-language
characters are prohibited.” Ensure the change references the existing
preservation rule (rule 4) so the two rules are consistent and update any tests
or examples that assert the previous absolute ban.
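One way the relaxed rule could read as a prompt fragment (illustrative constant name and wording only; the surrounding template in `prompts.py` will differ):

```python
# Hypothetical fragment for the refinement prompt's Rules block.
# Rule 6 now carves out verbatim proper nouns/brand names so it no
# longer contradicts rule 4's "preserve key terms" requirement.
RELAXED_RULE_6 = (
    "6. Output must be PURE target language — except unchanged proper nouns, "
    "brand names, and key terms that appear verbatim in the same script as "
    "the target language; all other source-language characters are prohibited."
)
```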
try:
    pool = _get_db_pool()
    conn = pool.getconn()
    try:
        with conn:
            with conn.cursor() as cur:
                cur.execute(sql, values)
    finally:
        pool.putconn(conn)
except Exception as e:
    logger.error(f"DB update failed for media {media_id}: {e}")
Do not silently continue after a status write failure.
If update_media_status() only logs here, the worker can still emit COMPLETED/FAILED while Postgres keeps the row in its previous state. After a refresh or missed socket event, clients fall back to stale status indefinitely. Bubble this up or add a retry path before broadcasting terminal events.
🧰 Tools
🪛 Ruff (0.15.6)
[warning] 118-118: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/ai-engine/src/db.py` around lines 108 - 119, The try/except in
update_media_status is swallowing DB write failures (the block using
_get_db_pool(), pool.getconn(), cur.execute(sql, values), media_id and logger)
so callers may broadcast terminal events despite Postgres not being updated;
change this block to either implement a small retry loop with backoff around
cur.execute(sql, values) (and still putconn in the finally) or, if retries fail,
re-raise the exception after logging so the caller sees the failure instead of
continuing; ensure update_media_status surfaces errors (raises) so the worker
can avoid emitting COMPLETED/FAILED when the DB write failed.
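A sketch of the retry-then-raise behavior described above, with illustrative names and backoff parameters (the real `update_media_status` would wrap its pool/cursor block in `run_write`):

```python
import time
import logging

logger = logging.getLogger(__name__)

def execute_with_retry(run_write, attempts: int = 3, base_delay: float = 0.5) -> None:
    """Retry a DB write with exponential backoff, then re-raise.

    `run_write` is a zero-argument callable performing the write. By
    re-raising after the final attempt, the caller cannot broadcast
    COMPLETED/FAILED while Postgres still holds the previous status.
    """
    for attempt in range(1, attempts + 1):
        try:
            run_write()
            return
        except Exception as exc:
            if attempt == attempts:
                logger.error("DB write failed after %d attempts: %s", attempts, exc)
                raise  # surface the failure to the worker
            delay = base_delay * (2 ** (attempt - 1))
            logger.warning(
                "DB write failed (attempt %d), retrying in %.1fs", attempt, delay
            )
            time.sleep(delay)
```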
const sourcePath = path.resolve(process.cwd(), 'prisma/schema.prisma');
const targetPath = path.resolve(
  process.cwd(),
  '../../../final_project/schema.prisma',
);
Fragile path resolution depends on working directory.
The script uses process.cwd() which assumes it's always run from apps/backend-api/. If invoked from a different directory (e.g., monorepo root), both source and target paths will resolve incorrectly.
Consider using import.meta.dirname (Node 20.11+) or path.dirname(fileURLToPath(import.meta.url)) to anchor paths relative to the script's location rather than the working directory.
🛠️ Proposed fix using import.meta

 import { copyFile, mkdir } from 'node:fs/promises';
 import path from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));

 async function main() {
-  const sourcePath = path.resolve(process.cwd(), 'prisma/schema.prisma');
+  const sourcePath = path.resolve(__dirname, '../prisma/schema.prisma');
   const targetPath = path.resolve(
-    process.cwd(),
-    '../../../final_project/schema.prisma',
+    __dirname,
+    '../../../../final_project/schema.prisma',
   );

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
-const sourcePath = path.resolve(process.cwd(), 'prisma/schema.prisma');
-const targetPath = path.resolve(
-  process.cwd(),
-  '../../../final_project/schema.prisma',
-);
+import { copyFile, mkdir } from 'node:fs/promises';
+import path from 'node:path';
+import { fileURLToPath } from 'node:url';
+
+const __dirname = path.dirname(fileURLToPath(import.meta.url));
+
+async function main() {
+  const sourcePath = path.resolve(__dirname, '../prisma/schema.prisma');
+  const targetPath = path.resolve(
+    __dirname,
+    '../../../../final_project/schema.prisma',
+  );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/backend-api/scripts/sync-schema.ts` around lines 5 - 9, The path
resolution for sourcePath and targetPath currently uses process.cwd(), which is
fragile; change the script to compute a scriptDir anchored to the file's
location (e.g., via import.meta.url or fileURLToPath(import.meta.url) +
path.dirname) and then build sourcePath and targetPath relative to that
scriptDir instead of process.cwd(), updating any references to
sourcePath/targetPath to use the new anchored paths.
const safeChunkIndex = Math.max(0, Math.trunc(chunkIndex));

return {
  ...normalized,
  chunkCount: Math.max(normalized.chunkCount, safeChunkIndex + 1),
  latestChunkIndex: maxNullable(normalized.latestChunkIndex, safeChunkIndex),
🧩 Analysis chain
🏁 Script executed:
# First, let's find the file and examine the merge functions
fd media-artifact-summary.ts

Repository: KMA-Final-Project/project
Length of output: 129
🏁 Script executed:
# Also search for the toNonNegativeInteger function
rg "toNonNegativeInteger" --type ts

Repository: KMA-Final-Project/project
Length of output: 419
🏁 Script executed:
# Read the file to examine the implementation
cat -n apps/backend-api/src/modules/media/media-artifact-summary.ts | head -120

Repository: KMA-Final-Project/project
Length of output: 4175
🏁 Script executed:
# Quick verification of JavaScript Math behavior
node << 'EOF'
console.log('Math.trunc(NaN):', Math.trunc(NaN));
console.log('Math.trunc(Infinity):', Math.trunc(Infinity));
console.log('Math.max(0, NaN):', Math.max(0, NaN));
console.log('Math.max(0, Infinity):', Math.max(0, Infinity));
console.log('Number.isFinite(NaN):', Number.isFinite(NaN));
console.log('Number.isFinite(Infinity):', Number.isFinite(Infinity));
EOF

Repository: KMA-Final-Project/project
Length of output: 235
Replace unsafe manual guards with toNonNegativeInteger() in merge functions.
Math.trunc(NaN) returns NaN, and Math.max(0, NaN) returns NaN. Non-finite chunkIndex/batchIndex parameters will propagate invalid values into chunkCount/latestChunkIndex and corrupt the summary object. The helper toNonNegativeInteger() already exists and properly guards against non-finite numbers using Number.isFinite().
Suggested fix
export function mergeChunkArtifactSummary(
current: unknown,
chunkIndex: number,
): ProcessedArtifactSummary {
const normalized = normalizeArtifactSummary(current);
- const safeChunkIndex = Math.max(0, Math.trunc(chunkIndex));
+ const safeChunkIndex = toNonNegativeInteger(chunkIndex);
return {
...normalized,
chunkCount: Math.max(normalized.chunkCount, safeChunkIndex + 1),
latestChunkIndex: maxNullable(normalized.latestChunkIndex, safeChunkIndex),
@@
export function mergeBatchArtifactSummary(
current: unknown,
batchIndex: number,
): ProcessedArtifactSummary {
const normalized = normalizeArtifactSummary(current);
- const safeBatchIndex = Math.max(0, Math.trunc(batchIndex));
+ const safeBatchIndex = toNonNegativeInteger(batchIndex);
return {
...normalized,
translatedBatchCount: Math.max(
      normalized.translatedBatchCount,

Also applies to: 105-113
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/backend-api/src/modules/media/media-artifact-summary.ts` around lines 91
- 96, Replace the manual guards that compute safeChunkIndex/safeBatchIndex with
the existing toNonNegativeInteger() helper to avoid propagating NaN/Infinity
into summaries: wherever chunkIndex or batchIndex is currently normalized via
Math.trunc/Math.max (e.g., the computation of safeChunkIndex and subsequent uses
updating chunkCount and latestChunkIndex, and the analogous block around lines
105-113 for batchIndex), call toNonNegativeInteger(chunkIndex) /
toNonNegativeInteger(batchIndex) and use those results to compute chunkCount,
latestChunkIndex, batchCount, and latestBatchIndex so non-finite inputs are
converted to 0 and cannot corrupt the summary object.
private async handleMediaUpdate(message: string): Promise<void> {
  const parsedPayload: unknown = JSON.parse(message);
  const event = parseMediaEvent(parsedPayload);

  if (!event) {
    this.logger.warn(
      `Received invalid media_updates payload: ${JSON.stringify(redactMediaPayload(parsedPayload))}`,
    );
    return;
  }
Wrap JSON.parse in try-catch for clearer error handling.
JSON.parse(message) on line 60 can throw a SyntaxError if the Redis message contains malformed JSON. While the outer promise chain catches this, the error message will be generic. Consider wrapping the parse in a try-catch to log the specific parsing failure with the redacted payload.
🛡️ Proposed fix

 private async handleMediaUpdate(message: string): Promise<void> {
-  const parsedPayload: unknown = JSON.parse(message);
+  let parsedPayload: unknown;
+  try {
+    parsedPayload = JSON.parse(message);
+  } catch {
+    this.logger.warn(`Received malformed JSON on media_updates channel`);
+    return;
+  }
   const event = parseMediaEvent(parsedPayload);

📝 Committable suggestion
 private async handleMediaUpdate(message: string): Promise<void> {
-  const parsedPayload: unknown = JSON.parse(message);
+  let parsedPayload: unknown;
+  try {
+    parsedPayload = JSON.parse(message);
+  } catch {
+    this.logger.warn(`Received malformed JSON on media_updates channel`);
+    return;
+  }
   const event = parseMediaEvent(parsedPayload);

   if (!event) {
     this.logger.warn(
       `Received invalid media_updates payload: ${JSON.stringify(redactMediaPayload(parsedPayload))}`,
     );
     return;
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/backend-api/src/modules/socket/socket.service.ts` around lines 59 - 68,
handleMediaUpdate currently calls JSON.parse(message) which can throw a
SyntaxError; wrap the parse in a try/catch inside handleMediaUpdate, and on
catch call this.logger.warn (or error) with a descriptive message including the
redacted payload via redactMediaPayload(message or parsed attempt) and the error
details, then return early so parseMediaEvent is not called; ensure references
to parseMediaEvent and redactMediaPayload remain unchanged.
function withAlpha(color: string, alphaHex: string): string {
  if (color.startsWith("#") && color.length === 7) {
    return `${color}${alphaHex}`;
  }

  return color;
}
withAlpha only handles 7-character hex colors.
The helper assumes #RRGGBB format, but theme colors could be rgb(), rgba(), hsl(), shorthand hex (#RGB), or 8-character hex with existing alpha. This could produce invalid color strings like rgb(255, 0, 0)16 or #FFF16.
Consider using a color library or extending the helper to handle other formats gracefully.
🛠️ Safer fallback approach
function withAlpha(color: string, alphaHex: string): string {
if (color.startsWith("#") && color.length === 7) {
return `${color}${alphaHex}`;
}
-
- return color;
+ // For non-hex formats, return original color unchanged
+ // (animations will still work, just without alpha modification)
+ console.warn(`withAlpha: unsupported color format "${color}"`);
+ return color;
 }

📝 Committable suggestion
 function withAlpha(color: string, alphaHex: string): string {
   if (color.startsWith("#") && color.length === 7) {
     return `${color}${alphaHex}`;
   }

-  return color;
+  // For non-hex formats, return original color unchanged
+  // (animations will still work, just without alpha modification)
+  console.warn(`withAlpha: unsupported color format "${color}"`);
+  return color;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/mobile-app/src/components/media/ProcessingProgressRing.tsx` around lines
30 - 36, withAlpha currently only appends an alpha hex to 7-character hex
strings and will produce invalid colors for other formats; update withAlpha
(used by ProcessingProgressRing) to safely handle formats: detect and expand
4-char hex (`#RGB` -> `#RRGGBB`) then append/replace alpha for 6- or 8-char hex,
preserve rgb()/rgba()/hsl() strings (or convert using a color library), and
return the original color unchanged if unsupported; alternatively replace the
helper with a small color utility (e.g., tinycolor2 or polished) to parse any
CSS color, set the desired alpha, and return a valid CSS color string.
feat(ai-engine): ⚡ add prewarming heavy components