Walkthrough
Adds embeddings support: a new EmbeddingService, a new POST /api/ai/embeddings route with validation and error handling (the route appears duplicated in-file), Zod/OpenAPI schemas and TS types, a manual integration test script, and reduced AI usage log verbosity.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (inconclusive)
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@backend/src/api/routes/ai/index.routes.ts`:
- Around line 477-513: Add a rate limiter middleware to the embeddings route:
create an express-rate-limit instance (e.g., embeddingsRateLimiter using
rateLimit with sensible windowMs/max settings) and insert it into the middleware
chain for the router.post('/embeddings') route (for example:
router.post('/embeddings', embeddingsRateLimiter, verifyUser, ...)). Ensure you
import rateLimit from 'express-rate-limit', configure the limiter, and apply it
before the handler that calls EmbeddingService.getInstance().createEmbeddings to
enforce limits on this cost-heavy endpoint.
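A minimal sketch of that wiring, assuming express-rate-limit v6+ and its windowMs/max options; `verifyUser` and `embeddingsHandler` stand in for the route's existing auth middleware and handler:

```ts
import { Router, type RequestHandler } from 'express';
import rateLimit from 'express-rate-limit';

const router = Router();

// Hypothetical limits; tune windowMs/max to expected embedding traffic.
const embeddingsRateLimiter = rateLimit({
  windowMs: 60 * 1000, // 1-minute window
  max: 20,             // at most 20 requests per window per client
  standardHeaders: true,
  legacyHeaders: false,
});

// Placeholders for the middleware and handler already defined in
// index.routes.ts (the handler is the one calling EmbeddingService).
declare const verifyUser: RequestHandler;
declare const embeddingsHandler: RequestHandler;

// The limiter runs first, so over-quota requests never reach the provider.
router.post('/embeddings', embeddingsRateLimiter, verifyUser, embeddingsHandler);
```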
In `@backend/src/services/ai/ai-usage.service.ts`:
- Around line 93-121: In trackEmbeddingUsage, stop computing outputTokens when
either totalTokens or inputTokens is undefined and ensure outputTokens is
clamped to >=0 so zeros are preserved; specifically, change the logic in
trackEmbeddingUsage to: only compute outputTokens when both inputTokens and
totalTokens are numbers, set outputTokens = Math.max(0, totalTokens -
inputTokens), and pass null for input_tokens/output_tokens in the INSERT only
when the corresponding value is undefined (not when it's 0), updating the values
array and logger usage accordingly.
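A minimal sketch of the guarded computation, pulled into a standalone helper for illustration (in the real service this logic sits inline before the ai_usage INSERT):

```ts
// Illustrative helper; the actual code lives inline in trackEmbeddingUsage.
function resolveTokenCounts(
  inputTokens?: number,
  totalTokens?: number,
): { input_tokens: number | null; output_tokens: number | null } {
  // Derive outputTokens only when both operands are real numbers, clamping
  // to >= 0 so a provider quirk can never record negative usage.
  const outputTokens =
    typeof inputTokens === 'number' && typeof totalTokens === 'number'
      ? Math.max(0, totalTokens - inputTokens)
      : undefined;

  // Map undefined -> null for the INSERT values array, but keep real zeros:
  // `?? null` only replaces null/undefined, never 0.
  return {
    input_tokens: inputTokens ?? null,
    output_tokens: outputTokens ?? null,
  };
}
```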
In `@backend/tests/manual/test-ai-embeddings.sh`:
- Line 219: The script always calls print_success and exits 0 even when track_test_failure has set TEST_FAILED; update the end of the script (around the print_success "🎉 AI Embeddings test completed!" call) to check the TEST_FAILED variable and, if it is set, exit with a non-zero status (e.g., call print_failure or echo an error and exit 1); otherwise call print_success and exit 0. Reference the track_test_failure function and the TEST_FAILED variable to locate where to add this conditional exit.
In `@shared-schemas/src/ai-api.schema.ts`:
- Around line 159-189: The embeddingObjectSchema currently forces embedding:
z.array(z.number()) which breaks for encoding_format="base64"; change
embeddingObjectSchema to use a union for embedding:
z.union([z.array(z.number()), z.string()]) so embeddings can be either number[]
or base64 string, and update any code that maps/normalizes responses (the
service mapping that currently does an unsafe "as number[]" cast) to stop
casting—preserve the embedding field as number[] | string and add explicit
handling: if embedding is a string decode base64 to numbers in a dedicated
helper, otherwise pass through the number[]; update signatures/types accordingly
so embedding is typed as number[] | string (symbols: embeddingObjectSchema,
embeddingsResponseSchema and the service mapping that performs the unsafe cast).
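A sketch of the suggested schema and normalization; `embeddingObjectSchema` is the symbol under review, while the `object`/`index` field names and the little-endian float32 layout of base64 payloads are assumptions based on the OpenAI-style response shape, and the two helpers are hypothetical names:

```ts
import { z } from 'zod';

// Embedding may arrive as number[] (encoding_format="float") or as a
// base64 string (encoding_format="base64").
const embeddingObjectSchema = z.object({
  object: z.literal('embedding'),
  index: z.number().int(),
  embedding: z.union([z.array(z.number()), z.string()]),
});

// Assumed helper: base64 embeddings encode a little-endian float32 buffer.
function decodeBase64Embedding(encoded: string): number[] {
  const bytes = Uint8Array.from(Buffer.from(encoded, 'base64'));
  return Array.from(new Float32Array(bytes.buffer));
}

// Replaces the unsafe `as number[]` cast in the service mapping.
function normalizeEmbedding(embedding: number[] | string): number[] {
  return typeof embedding === 'string' ? decodeBase64Embedding(embedding) : embedding;
}
```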
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/src/services/ai/embedding.service.ts`:
- Line 4: The import for AIConfigService in embedding.service.ts is missing the
.js extension which breaks ESM resolution; update the relative import statement
that references AIConfigService (symbol: AIConfigService) to include the .js
extension (e.g., change './ai-config.service' to './ai-config.service.js') so it
matches other imports like './ai-usage.service.js' and '@/utils/logger.js' and
ensures consistent ESM module resolution at runtime.
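The suggested one-line change:

```diff
-import { AIConfigService } from './ai-config.service';
+import { AIConfigService } from './ai-config.service.js';
```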
♻️ Duplicate comments (1)
backend/src/services/ai/embedding.service.ts (1)
48-67: Duplicate: guard and preserve token counts for embeddings usage. This hits the same concern already raised about embeddings usage tracking: token counts should be guarded and zero values preserved so usage data isn't skewed. Please apply the earlier fix in AIUsageService to this path as well.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@backend/src/services/ai/embedding.service.ts`:
- Around line 56-68: The embedding service is incorrectly calling
this.aiUsageService.trackChatUsage(...) and using chat-oriented token math;
replace that with a dedicated embedding tracking call: add a new method on
AiUsageService (e.g., trackEmbeddingUsage(aiConfigId, inputTokens, model,
metadata?)) and call it from embedding.service.ts instead of trackChatUsage,
passing tokenUsage.promptTokens as inputTokens and a fixed 0 for outputTokens
(or omit output tokens), and ensure the new method persists a
usage_type='embedding' (or adds a usage_type column in the ai_usage schema) so
embedding usage can be distinguished in analytics/billing; update any
callers/tests accordingly to use trackEmbeddingUsage and options.model for the
model id.
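A sketch of the proposed method and the call-site swap; the signature and usage_type='embedding' come from the comment above, while the persistence helper and column names are illustrative:

```ts
// Sketch of the proposed AiUsageService method; the real service owns the
// actual ai_usage INSERT.
class AiUsageService {
  async trackEmbeddingUsage(
    aiConfigId: string,
    inputTokens: number,
    model: string,
    metadata?: Record<string, unknown>,
  ): Promise<void> {
    // Embeddings have no completion, so output tokens are fixed at 0 and
    // rows are tagged usage_type='embedding' for analytics/billing.
    await this.insertUsageRow({
      ai_config_id: aiConfigId,
      usage_type: 'embedding',
      input_tokens: inputTokens,
      output_tokens: 0,
      model,
      metadata,
    });
  }

  private async insertUsageRow(row: Record<string, unknown>): Promise<void> {
    /* existing INSERT into ai_usage */
  }
}

// Call site in embedding.service.ts, replacing trackChatUsage:
// await this.aiUsageService.trackEmbeddingUsage(
//   aiConfig.id,
//   tokenUsage.promptTokens,
//   options.model,
// );
```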
🧹 Nitpick comments (2)
backend/src/services/ai/embedding.service.ts (2)
31-31: Consider logging when AI config is not found for the model. If `findByModelId` returns null, the code proceeds silently without usage tracking. Adding a debug/warning log here would help identify misconfigured models during development.

♻️ Suggested improvement

```diff
  const aiConfig = await this.aiConfigService.findByModelId(options.model);
+ if (!aiConfig) {
+   logger.debug('No AI config found for model, usage tracking will be skipped', {
+     model: options.model,
+   });
+ }
  const response = await this.openRouterProvider.sendRequest((client) =>
```
83-91: Consider using `logger.error` for exceptions. Using `warn` level for errors that cause exceptions to be thrown may make production issues harder to detect in alerting systems that filter by log level.

♻️ Suggested change

```diff
  } catch (error) {
-   logger.warn('Embedding error', {
+   logger.error('Embedding generation failed', {
      error: error instanceof Error ? error.message : String(error),
      model: options.model,
    });
```
Fermionic-Lyu
left a comment
Code looks good to me.
Summary
Add a new endpoint to support embeddings: POST /api/ai/embeddings

How did you test this change?
Ran test-ai-embeddings.sh manually.
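For reference, a request against the new endpoint might look like the following; the path and encoding_format come from the review above, while the host, auth header, model id, and `input` field are assumptions about the request shape:

```ts
// Illustrative only; adjust host, token, and model to your environment.
const response = await fetch('http://localhost:3000/api/ai/embeddings', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    Authorization: `Bearer ${process.env.API_TOKEN}`, // route requires auth (verifyUser)
  },
  body: JSON.stringify({
    model: 'text-embedding-3-small', // hypothetical model id
    input: ['hello world'],
    encoding_format: 'float',
  }),
});
const { data } = await response.json();
console.log(data[0].embedding.length); // embedding vector dimensionality
```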
Summary by CodeRabbit

- New Features: POST /api/ai/embeddings endpoint backed by a new EmbeddingService, with request validation and error handling.
- Documentation: Zod/OpenAPI schemas and TypeScript types for the embeddings API.
- Tests: a manual integration test script for the embeddings endpoint.
- Chores: reduced AI usage log verbosity.