Shard
Neural Mesh
Live Network Status
Network data unavailable; single-node mode active.
Traditional Cloud AI vs. Shard Network
| | Traditional Cloud AI | Shard Network |
|---|---|---|
| Cost | $0.002–$0.06 / 1K tokens | Free (compute-for-access) |
| Privacy | Your data on someone else's server | Localhost-first routing |
| Scalability | Buy more GPUs | More users = more GPUs |
| Resilience | Single point of failure | Self-healing P2P mesh |
| Latency | Network RTT + queue wait | Local draft + network verification |
| API | Proprietary | OpenAI-compatible drop-in |
How Contribution Works
Scouts draft likely next tokens in the browser, Shard nodes verify them in parallel, and clients receive trusted output at lower latency.
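The draft-and-verify flow above follows the shape of speculative decoding. A minimal sketch with toy integer "models" (the function names and the toy models are illustrative, not Shard's actual implementation): the Scout proposes a cheap run of tokens, and the verifier keeps the longest agreeing prefix plus one corrected token.

```python
def speculative_step(context, draft_model, target_model, k=4):
    """One speculative-decoding round: the draft model (Scout) proposes
    k tokens cheaply, the target model (Shard verifier) checks them, and
    the client keeps the agreeing prefix plus one verifier correction."""
    # Scout phase: autoregressively propose k draft tokens.
    drafts, ctx = [], list(context)
    for _ in range(k):
        t = draft_model(ctx)
        drafts.append(t)
        ctx.append(t)

    # Verifier phase: accept drafts until the first disagreement.
    accepted, ctx = [], list(context)
    for t in drafts:
        expected = target_model(ctx)
        if t != expected:
            accepted.append(expected)  # verifier's correction ends the round
            break
        accepted.append(t)
        ctx.append(t)
    return accepted


# Toy "models": the target always predicts last + 1; the draft agrees
# until it sees a 5, where it guesses wrong and gets corrected.
target = lambda ctx: ctx[-1] + 1
draft = lambda ctx: ctx[-1] + 1 if ctx[-1] < 5 else 0

print(speculative_step([3], draft, target, k=4))  # [4, 5, 6]
```

When draft and verifier agree on all k tokens, the client gets k tokens for a single verification pass, which is where the latency win comes from.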
Built With
- GGUF Runtime: runs quantized model weights efficiently so verifier nodes can serve high-quality output with lower memory pressure.
- libp2p: provides peer discovery and resilient transport so nodes can coordinate over a decentralized mesh.
- WebLLM: powers browser-based draft generation so contributors can help without installing native runtimes.
- WebGPU: lets modern browsers execute local draft inference quickly with GPU acceleration.
- Rust: drives the daemon and networking layer for deterministic performance and safety under load.
- FastAPI: supplies OpenAI-compatible HTTP endpoints and orchestration hooks for app integrations.
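Because the FastAPI layer is OpenAI-compatible, clients talk to it with the standard chat-completions wire format. A minimal sketch, assuming a local daemon at `localhost:8000` (the port and the `default-model` name here are placeholders, not documented values):

```python
import json

# Hypothetical local gateway address; swap in your daemon's actual port.
SHARD_BASE_URL = "http://localhost:8000/v1"

def build_chat_request(prompt: str, model: str = "default-model") -> dict:
    """Standard OpenAI chat-completions body; a client would POST this
    to SHARD_BASE_URL + "/chat/completions"."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # stream tokens as the mesh verifies them
    }

body = json.dumps(build_chat_request("Hello, mesh!"))
print(body)
```

The same payload works with the official `openai` Python client by pointing its `base_url` at the local gateway, which is what "drop-in" means in practice.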
Project Signals
Benchmark callout: a documented API performance sample shows 850 tokens/sec distributed throughput and 245 requests/min in live cluster mode, against a nominal single-node baseline of ~320 tokens/sec (roughly a 2.6x uplift).