Frontmatter
| id | 9089 |
| title | Epic: Engine-Level Streaming Data Loader (JSONL) |
| state | Closed |
| labels | epic, ai, performance, core, feature |
| assignees | tobiu |
| createdAt | Feb 10, 2026, 6:26 PM |
| updatedAt | Feb 10, 2026, 8:51 PM |
| githubUrl | https://github.com/neomjs/neo/issues/9089 |
| author | tobiu |
| commentsCount | 3 |
| parentIssue | null |
| subIssues | 9090 Feat: Implement Neo.data.proxy.Base and Neo.data.proxy.Stream 9091 Feat: Enhance Neo.data.Store to support Proxy loading 9092 Refactor: Convert DevRank to use Streaming Proxy 9093 Refactor: DevRank Backend JSONL Persistence 9094 Feat: Batch Streaming & Progressive Rendering for Proxy Loader 9095 Docs: "Streaming Data & Progressive Rendering" Guide |
| subIssuesCompleted | 6 |
| subIssuesTotal | 6 |
| blockedBy | [] |
| blocking | [] |
| closedAt | Feb 10, 2026, 8:51 PM |
Epic: Engine-Level Streaming Data Loader (JSONL)

Input from Gemini 3 Pro:
✦ ## Architectural Decision: The Proxy Pattern
To modernize the data layer for v12, we will reintroduce the Store / Proxy separation pattern:
- `Neo.data.proxy.Base`: Abstract interface for data fetching strategies.
- `Neo.data.proxy.Stream`: Concrete implementation for the App Worker that uses `fetch` + `ReadableStream` + `TextDecoderStream` to parse NDJSON/JSONL.
- `Neo.data.Store`: Updated to accept a `proxy` configuration. If present, the Store delegates data loading to the proxy.

Why? This decoupling allows us to switch transport layers (Ajax, Stream, WebSocket, LocalStorage) without changing the Store's logic. It also paves the way for a potential future "Remote Proxy" that could live in the Data Worker, although Phase 1 will keep the Stream Proxy in the App Worker to avoid serialization overhead.
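The delegation described above can be sketched in a framework-agnostic way. The real Neo.mjs classes extend `Neo.core.Base` and use its config system; the class shapes and method names below are illustrative only, not the actual framework API.

```javascript
// Sketch of the Store / Proxy split: the Store stays transport-agnostic
// and delegates loading to whatever proxy it was configured with.

class ProxyBase {
    // Abstract interface: every concrete proxy must implement load().
    async load(params) {
        throw new Error('load() must be implemented by subclasses');
    }
}

class StreamProxy extends ProxyBase {
    constructor({url}) {
        super();
        this.url = url;
    }
    // A real implementation would fetch() the url and parse the response
    // body incrementally as NDJSON via TextDecoderStream.
    async load() {
        return []; // placeholder
    }
}

class Store {
    constructor({proxy} = {}) {
        this.items = [];
        this.proxy = proxy; // transport strategy is pluggable
    }
    async load(params) {
        if (this.proxy) {
            // Delegate loading; swapping Ajax/Stream/WebSocket proxies
            // requires no change to the Store's own logic.
            this.items = await this.proxy.load(params);
            return this.items;
        }
        // ...fallback to a classic non-proxy loading path...
    }
}
```

Because the Store only depends on the `load()` contract, a future "Remote Proxy" living in the Data Worker would slot in behind the same interface.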

Input from Gemini 3 Pro:
✦ ## Context for Implementation
We have extensively researched and prototyped the technical foundation for this Epic.
Key Findings:
- Browser Support: The Streams API (`ReadableStream`, `TextDecoderStream`, `TransformStream`) is fully supported in modern browsers and Web Workers (including SharedWorkers).
- Performance: Initial benchmarks show a ~6x speedup in "Time to First Record" (44ms vs 7ms on local disk) by streaming 10k records vs parsing the full JSON.
- Architecture:
  - Phase 1 (Target): Implement the `Stream` Proxy in the App Worker.
    - Why? Transferring a `ReadableStream` (bytes) to the App Worker is cheap. Parsing it there spreads the CPU load via the microtask queue (thanks to the stream). Transferring parsed Objects from a Data Worker would incur high serialization costs (Structured Clone) for 10k+ records, potentially negating the benefits.
  - Phase 2 (Future): If CPU load becomes a bottleneck for 100k+ records, we can introduce a `RemoteProxy` that parses in the Data Worker and batches records via `postMessage`.

Implementation Notes:
- We need `src/data/proxy/Base.mjs` as a common interface.
- We need `src/data/proxy/Stream.mjs` as the NDJSON implementation.
- `Neo.data.Store` needs to handle incremental data arrival (append-only initially, sort-at-end strategy).
- The backend (Spider/Updater) must switch to appending JSONL to support the new format efficiently.
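The core difficulty in the NDJSON implementation is that decoded text chunks arrive at arbitrary boundaries, so a line can be split across chunks. A minimal sketch of the carry-buffer approach (function and parameter names are illustrative, not the `Stream.mjs` API):

```javascript
// Incremental NDJSON parsing: hold the trailing partial line in a carry
// buffer until the next chunk (or the final flush) completes it.
function createJsonlParser(onRecord) {
    let carry = '';
    return {
        // Feed one decoded text chunk; emits one record per complete line.
        push(chunk) {
            const lines = (carry + chunk).split('\n');
            carry = lines.pop(); // last element may be a partial line
            for (const line of lines) {
                if (line.trim()) onRecord(JSON.parse(line));
            }
        },
        // Call once the stream ends, to handle a missing trailing newline.
        flush() {
            if (carry.trim()) onRecord(JSON.parse(carry));
            carry = '';
        }
    };
}
```

Each emitted record can be appended to the Store immediately (the append-only strategy above), with sorting deferred until the stream completes.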

Input from Gemini 3 Pro:
✦ I have successfully completed the Streaming Data Loader epic.
Summary of Work
- Architecture: Implemented `Neo.data.proxy.Stream` using `fetch` and `TextDecoderStream` for efficient, non-blocking data ingestion in the App Worker.
- Core Support: Updated `Neo.data.Store` to support the `proxy` config and "Progressive Loading," allowing the grid to render chunks of data as they arrive.
- Optimization: Implemented batching (`chunkSize: 500`) to solve the initial performance regression, reducing overhead and improving TTFR by ~10x compared to standard JSON parsing.
- DevRank Integration: Migrated the DevRank backend to `.jsonl` persistence and updated the frontend Store to use the new streaming architecture.
- Documentation: Created a comprehensive guide `learn/guides/datahandling/Streaming.md` and enhanced class-level JSDocs.

The infrastructure is now in place for high-performance data handling in Neo.mjs v12.
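The batching idea behind the `chunkSize: 500` optimization can be sketched as follows: instead of firing one Store event per parsed record (10k events means 10k grid updates), records are buffered and flushed in chunks. The `chunkSize` value matches the one cited above; the batcher API itself is a hypothetical illustration, not the shipped code.

```javascript
// Buffer parsed records and deliver them in fixed-size batches, so the
// consumer (e.g. a Store feeding a grid) sees far fewer update events.
function createBatcher(chunkSize, onBatch) {
    let buffer = [];
    return {
        add(record) {
            buffer.push(record);
            if (buffer.length >= chunkSize) this.flush();
        },
        flush() {
            if (buffer.length) {
                onBatch(buffer); // e.g. one store.add(batch) -> one grid update
                buffer = [];
            }
        }
    };
}
```

With `chunkSize: 500`, a 10k-record stream produces roughly 20 batch events instead of 10,000 per-record events, which is where the overhead reduction comes from.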
Objective
Enable `Neo.data.Store` to stream and parse data incrementally (NDJSON/JSONL), allowing for O(1) memory usage and immediate rendering of large datasets (10k+ records) without blocking the UI or waiting for the full download.
Why?
Scope
Framework Core:
- A streaming loader (in `src/data/reader` or directly in `Store`) that uses `fetch`, `ReadableStream`, and `TextDecoderStream`.
- Support for `format: 'jsonl'` in the Store configuration.
- Support for `autoLoad: true`.

DevRank Integration:
- Migrate `users.json` to `users.jsonl`.

Documentation:
Technical Approach
Leverage modern browser APIs (`TransformStream`), which are available in Workers. Use the prototype from the DevRank experiment as a baseline.
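One possible wiring of the APIs named in this epic, assuming a runtime where `ReadableStream`, `TransformStream`, and `TextDecoderStream` are globals (modern browsers, Workers, Node 18+). `splitLines()` and `loadJsonl()` are illustrative names for this sketch, not framework API:

```javascript
// A TransformStream that re-chunks decoded text into parsed NDJSON records.
function splitLines() {
    let carry = '';
    return new TransformStream({
        transform(chunk, controller) {
            const lines = (carry + chunk).split('\n');
            carry = lines.pop(); // keep the trailing partial line
            for (const line of lines) {
                if (line.trim()) controller.enqueue(JSON.parse(line));
            }
        },
        flush(controller) {
            if (carry.trim()) controller.enqueue(JSON.parse(carry));
        }
    });
}

// Usage sketch (e.g. inside the App Worker): records become available
// to the callback while the download is still in flight.
async function loadJsonl(url, onRecord) {
    const response = await fetch(url);
    const reader = response.body
        .pipeThrough(new TextDecoderStream()) // bytes -> text
        .pipeThrough(splitLines())            // text  -> records
        .getReader();
    for (;;) {
        const {done, value} = await reader.read();
        if (done) break;
        onRecord(value);
    }
}
```

Because each `pipeThrough` stage runs on the microtask queue, parsing is interleaved with other work instead of blocking on one large `JSON.parse` call, which is the mechanism behind the "Time to First Record" improvement described above.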