Frontmatter
| id | 8344 |
| title | Fix Duplicate ID Error in VectorService |
| state | Closed |
| labels | bug, ai |
| assignees | tobiu |
| createdAt | Jan 6, 2026, 12:24 AM |
| updatedAt | Jan 6, 2026, 12:32 AM |
| githubUrl | https://github.com/neomjs/neo/issues/8344 |
| author | tobiu |
| commentsCount | 1 |
| parentIssue | null |
| subIssues | [] |
| subIssuesCompleted | 0 |
| subIssuesTotal | 0 |
| blockedBy | [] |
| blocking | [] |
| closedAt | Jan 6, 2026, 12:32 AM |
Fix Duplicate ID Error in VectorService
tobiu assigned to @tobiu on Jan 6, 2026, 12:25 AM

tobiu
Jan 6, 2026, 12:31 AM
Input from Gemini 3 Pro:
✦ I have applied the fix to `VectorService.mjs` to prevent duplicate IDs from being added to the upsert batch.

Fix Details:
- Introduced a `processedIds` Set in the `embed` method.
- When iterating through `knowledgeBase` chunks, I now check whether the `chunkId` (hash) has already been processed in the current run (`processedIds.has(chunkId)`).
- This ensures that even if the source generation produced duplicate chunks (same content, same hash), only the first occurrence is sent to ChromaDB, satisfying the unique ID constraint.

This resolves the crash and ensures robust handling of potentially overlapping source scans.
tobiu closed this issue on Jan 6, 2026, 12:32 AM
The `VectorService` is encountering a "duplicate ID" error during batch upsert. This occurs if the generated `ai-knowledge-base.jsonl` file contains duplicate entries (same content hash), causing the same ID to be pushed into `chunksToProcess` multiple times within the same batch.

Fix: Update `VectorService.mjs` to ensure `chunksToProcess` only contains unique IDs by checking against a local `processedIds` Set during the filter loop.
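The dedup logic described above can be sketched roughly as follows. This is an illustrative sketch, not the actual `VectorService.mjs` code: the helper name `collectUniqueChunks` and the chunk shape (`{hash, content}`) are assumptions; only the `processedIds` Set, `chunkId` hash check, and `chunksToProcess` array come from the issue text.

```javascript
/**
 * Hypothetical sketch of the fix: filter knowledge-base chunks so that the
 * upsert batch sent to ChromaDB never contains the same id twice.
 * @param {Array<{hash: String, content: String}>} knowledgeBase assumed chunk shape
 * @returns {Array<{hash: String, content: String}>} unique chunks, first occurrence wins
 */
function collectUniqueChunks(knowledgeBase) {
    const processedIds    = new Set(); // ids already queued in this run
    const chunksToProcess = [];

    for (const chunk of knowledgeBase) {
        const chunkId = chunk.hash; // content hash doubles as the ChromaDB id

        // Skip any chunk whose id is already in the batch, so duplicate
        // source entries (same content, same hash) cannot violate the
        // unique ID constraint on upsert.
        if (!processedIds.has(chunkId)) {
            processedIds.add(chunkId);
            chunksToProcess.push(chunk);
        }
    }

    return chunksToProcess;
}
```

Keeping the first occurrence is safe here because duplicate ids imply identical content hashes, so the dropped entries carry no additional information.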