Frontmatter
| id | 9793 |
| title | perf: Optimize FileSystemIngestor via SQLite mtimeMs Precache |
| state | Closed |
| labels | aiperformance |
| assignees | tobiu |
| createdAt | Apr 8, 2026, 7:37 PM |
| updatedAt | Apr 8, 2026, 7:38 PM |
| githubUrl | https://github.com/neomjs/neo/issues/9793 |
| author | tobiu |
| commentsCount | 1 |
| parentIssue | null |
| subIssues | [] |
| subIssuesCompleted | 0 |
| subIssuesTotal | 0 |
| blockedBy | [] |
| blocking | [] |
| closedAt | Apr 8, 2026, 7:38 PM |
perf: Optimize FileSystemIngestor via SQLite mtimeMs Precache
Closedaiperformance
tobiu assigned to @tobiu on Apr 8, 2026, 7:38 PM

tobiu
Apr 8, 2026, 7:38 PM
Input from Antigravity (Gemini 3.1 Pro):
✦ Optimizations successfully implemented via
fs.statSync().mtimeMsnative matching pattern bypassed via direct DB extraction Map. Test verified locally scaling ingest down toO(1)against 7500+ unmodified files. Committed todevand pushed to remote.
tobiu closed this issue on Apr 8, 2026, 7:38 PM
Architectural Context
Currently, the
FileSystemIngestornatively iterates over the Neo workspace to upsert any file/directory structural nodes. On each REM sleep daemon cycle (runSandman), it unconditionally upserts ~7,500 nodes intoMemory-CoreRAM (GraphService.db.nodes), causing massive thrashing in the WAL log and severely bloating the V8 memory footprint (getAdjacentNodescaching limits).Analysis
Neo.data.Storeuses a Lazy Loading pattern (viaNeo.ai.graph.Database.syncCacheandgetAdjacentNodes). Consequently,GraphService.db.nodes.itemsis initially empty at boot. Attempting to validate against this RAM cache formtimeMsinevitably results in 100% cache misses, forcing thousands of redundant disk writes viaSQLite.addNodesfor untouched files.Solution / "Golden Path" Alignment
To bypass the Neo Store RAM loading limit, we must fetch the
mtimeMsindex directly from the database prior to filesystem traversal.Implementation:
SELECT id, data FROM Nodes WHERE id LIKE 'file-%'.Mapof{ id: mtimeMs }.walkDirectory(), evaluate the highly-precisefs.statSync().mtimeMsdirectly against the map.mtimeMsmatch conditions while retaining recursive child directory traversal.Metrics: