Shard
Neural Mesh
Live Network Status
Network data unavailable; single-node mode active.
Traditional Cloud AI vs. Shard Network
| | Traditional Cloud AI | Shard Network |
|---|---|---|
| Cost | $0.002–$0.06 / 1K tokens | Free (compute-for-access) |
| Privacy | Your data on someone else's server | Localhost-first routing |
| Scalability | Buy more GPUs | More users = more GPUs |
| Resilience | Single point of failure | Self-healing P2P mesh |
| Latency | Network RTT + queue wait | Local draft + network verification |
| API | Proprietary | OpenAI-compatible drop-in |
How Contribution Works
Scouts draft likely next tokens in the browser, Shard nodes verify them in parallel, and clients receive trusted output at lower latency.
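The draft-and-verify flow above follows the shape of speculative decoding. A minimal sketch with toy integer "models" (the function names and the toy models are illustrative, not Shard's actual implementation): the Scout proposes a cheap run of tokens, and the verifier keeps the longest agreeing prefix plus one corrected token.

```python
def speculative_step(context, draft_model, target_model, k=4):
    """One speculative-decoding round: the draft model (Scout) proposes
    k tokens cheaply, the target model (Shard verifier) checks them, and
    the client keeps the agreeing prefix plus one verifier correction."""
    # Scout phase: autoregressively propose k draft tokens.
    drafts, ctx = [], list(context)
    for _ in range(k):
        t = draft_model(ctx)
        drafts.append(t)
        ctx.append(t)

    # Verifier phase: accept drafts until the first disagreement.
    accepted, ctx = [], list(context)
    for t in drafts:
        expected = target_model(ctx)
        if t != expected:
            accepted.append(expected)  # verifier's correction ends the round
            break
        accepted.append(t)
        ctx.append(t)
    return accepted


# Toy "models": the target always predicts last + 1; the draft agrees
# until it sees a 5, where it guesses wrong and gets corrected.
target = lambda ctx: ctx[-1] + 1
draft = lambda ctx: ctx[-1] + 1 if ctx[-1] < 5 else 0

print(speculative_step([3], draft, target, k=4))  # [4, 5, 6]
```

When draft and verifier agree on all k tokens, the client gets k tokens for a single verification pass, which is where the latency win comes from.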
Built With
- GGUF Runtime: runs quantized model weights efficiently so verifier nodes can serve high-quality output with lower memory pressure.
- libp2p: provides peer discovery and resilient transport so nodes can coordinate over a decentralized mesh.
- WebLLM: powers browser-based draft generation so contributors can help without installing native runtimes.
- WebGPU: lets modern browsers execute local draft inference quickly with GPU acceleration.
- Rust: drives the daemon and networking layer for deterministic performance and safety under load.
- FastAPI: supplies OpenAI-compatible HTTP endpoints and orchestration hooks for app integrations.
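Because the FastAPI layer is OpenAI-compatible, clients talk to it with the standard chat-completions wire format. A minimal sketch, assuming a local daemon at `localhost:8000` (the port and the `default-model` name here are placeholders, not documented values):

```python
import json

# Hypothetical local gateway address; swap in your daemon's actual port.
SHARD_BASE_URL = "http://localhost:8000/v1"

def build_chat_request(prompt: str, model: str = "default-model") -> dict:
    """Standard OpenAI chat-completions body; a client would POST this
    to SHARD_BASE_URL + "/chat/completions"."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # stream tokens as the mesh verifies them
    }

body = json.dumps(build_chat_request("Hello, mesh!"))
print(body)
```

The same payload works with the official `openai` Python client by pointing its `base_url` at the local gateway, which is what "drop-in" means in practice.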
Project Signals
Benchmark callout: a documented API performance sample shows 850 tokens/sec distributed throughput and 245 requests/min in live cluster mode, against a nominal single-node baseline of ~320 tokens/sec (roughly a 2.6x uplift).