PM Analysis · Full-Stack Voice AI · February 2026

Smallest.ai
Deep Dive

The company building the anti-wrapper — proprietary models across the full voice stack from waveform to response. How defensible is it really?
Not a Wrapper
$8.26M Seed · Sierra Ventures · 3one4
Awaaz Labs Pvt Ltd · SF + Bangalore

"Smallest.ai is the only voice AI company in India that owns the full inference loop — ASR, LLM, and TTS — under one roof. This is simultaneously their strongest moat and their greatest execution risk."

100ms TTS latency (Lightning V2)
20× cost reduction achieved ($0.20 → $0.01/min)

Core Claim: 3 proprietary models · Lightning (TTS), Electron (LLM), Pulse (ASR), all built in-house
Architecture Type: Full stack · Own inference loop vs. Bolna's multi-vendor orchestration
Positioning: Model company · Research papers published, CPAL 2026 presence, ASI framework proposed
01
What Smallest.ai Actually Builds
The model stack — theirs vs. vendor — and what's truly proprietary

Unlike Bolna's orchestration play, Smallest.ai's thesis is vertical integration — owning every layer of the voice pipeline. Their model family now covers the full stack, with Hydra (speech-to-speech) announced as a next-generation architecture.

Exhibit A — Smallest.ai Model Stack: Owned vs. Optional Third-Party
Telephony
resold / 3rd party
Twilio (+1, +91) · Custom BYOT SIP trunking · EU/UK/LATAM/APAC custom · vendor dependency
ASR / STT
Pulse — Proprietary
Pulse (36 languages) · Code-switching support · World's fastest RTF claimed · Streaming + batch · Deepgram (optional fallback) · own inference
Agent Logic
Atoms — Proprietary
Multi-node graph architecture · Per-node fallback definitions · Real-time barge-in · 100+ corner case handlers · Simulated pre-deployment testing · (Core Moat #1)
LLM
Electron V2 + BYO
Electron V2 (beats GPT mini) · 45ms TTFT · Fine-tunable on enterprise data · OpenAI GPT-4 (optional) · Anthropic Claude (optional) · Bring Your Own LLM · (Core Moat #2)
TTS / Synthesis
Lightning V2 — Proprietary
Lightning V2 (100ms TTFB) · Non-autoregressive architecture · <1GB VRAM · 30 languages, 1000s of accents · Voice clone from 10s audio · $0.05 per 10K chars · (Core Moat #3)
Post-Call
Analytics + Insights
Custom analytics dashboard · Call logs + disposition tracking · Per-token observability · Model stall detection · CRM via API / webhooks · (growing capability)
Hydra
Speech-to-Speech
Full duplex multimodal · Long context + tool calling · Emotional voice output · Eliminates ASR→LLM→TTS hops · (Next-gen architecture, announced)
Owned = Smallest.ai-trained proprietary model. Mixed = own model + optional 3rd party. Vendor = resold/API. Layers marked "Core Moat" are the proprietary moat layers.
"This design makes Smallest.ai one of the few companies globally to own the entire conversational loop from waveform to response at production scale." — 3one4 Capital investor memo
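The cascaded design above has one structural property worth making explicit: in an ASR→LLM→TTS chain, time-to-first-audio is roughly the sum of each stage's time-to-first-output. A minimal sketch of that budget, using the Electron V2 (45ms TTFT) and Lightning V2 (100ms TTFB) figures from the exhibit; the ASR and network numbers are illustrative assumptions, not published Smallest.ai figures:

```python
# Latency-budget sketch for a cascaded voice pipeline.
# "llm_first_token" and "tts_first_byte" use the claimed Electron V2 /
# Lightning V2 figures; the other two entries are assumptions.
CASCADE_MS = {
    "asr_final_transcript": 150,  # assumed: streaming ASR endpointing
    "llm_first_token": 45,        # Electron V2 TTFT (claimed)
    "tts_first_byte": 100,        # Lightning V2 TTFB (claimed)
    "network_overhead": 50,       # assumed: telephony + transport hops
}

def turn_latency_ms(stages: dict) -> int:
    """Stages run in sequence, so time-to-first-audio for a turn is
    approximately the sum of per-stage time-to-first-output."""
    return sum(stages.values())

print(turn_latency_ms(CASCADE_MS))  # → 345
```

This additive chain is exactly what Hydra targets: a speech-to-speech model collapses the first three stages into one, which is why the exhibit treats it as a next-generation architecture rather than an optimization.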
02
Vertical Integration Score
How much of the stack does each company actually own?

This is the defining difference between Smallest and every other voice AI company in India. The comparison below shows proprietary ownership by stack layer — this is what the "not a wrapper" claim rests on.

Exhibit B — Vertical Integration by Layer: Smallest.ai vs. Bolna vs. Retell
Layer · Smallest.ai · Bolna · Retell/Bland
ASR / STT · Pulse (own) · vendor · vendor
LLM / Brain · Electron V2 + BYO · vendor · vendor
TTS / Voice · Lightning V2 (own) · partial · vendor
Orchestration · Atoms (own) · orchestration layer · partial
On-Prem Deploy · full support · enterprise only · cloud only
Fine-tuning · enterprise + data · limited · not offered
Entries reflect the degree of proprietary ownership / control of each layer. Smallest.ai's ownership is disproportionately deeper across 5 of 6 dimensions.
03
Smallest vs. Bolna — Head to Head
Two different bets on how to win voice AI in India

These companies are not competing on the same strategy. Bolna is betting on distribution + orchestration; Smallest is betting on model ownership + vertical integration. One will look smarter in three years, depending on how the AI model commoditization curve plays out.

Exhibit C — Strategic Comparison: Smallest.ai vs. Bolna
Dimension · Smallest.ai · Bolna
Architecture · [WIN] Full vertical stack; owns ASR, LLM, TTS · Orchestration layer; stitches 3rd-party APIs
TTS Latency · [WIN] 100ms TTFB; Lightning V2 benchmarked vs ElevenLabs · ~150–250ms via ElevenLabs; pre-recorded buffers reduce perceived delay
LLM Control · [WIN] Electron V2 (own) + BYO; fine-tunable on private data · 100% 3rd-party (OpenAI, Claude); no fine-tuning pathway today
India Languages · 16 languages; strong Hindi/English · [WIN] 10+ Indian languages; 50+ accents, Hinglish, TRAI compliance
On-Prem / Security · [WIN] Full on-prem + air-gap; SOC2, HIPAA, ISO 27001, GDPR, PCI · On-prem enterprise only; India + US data residency
No-Code Builder · [WIN] 3-click agent creation; simulated pre-deployment testing · Agent builder exists; no visual flow designer (gap)
Traction · Millions of calls/month; BFSI + healthcare strong · [WIN] 200K calls/day; 1,050 paying customers, 25+ case studies
Pricing · [WIN] $0.01/min at scale; 4× cheaper than peers, transparent tiers · $0.03/min starting; volume discounts available
Research Depth · [WIN] Published papers, CPAL 2026; ASI framework, SonoEdit, Hydra · No research output; engineering-led
WIN badge indicates which company holds the structural advantage on that dimension today. This is not a zero-sum race — both can win in different enterprise segments.
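The pricing row rewards a quick back-of-envelope. A minimal sketch, where the per-minute rates come from the table above but the monthly volume is an assumed illustration, not a disclosed customer figure (integer cents keep the arithmetic exact):

```python
# Monthly cost at the table's quoted per-minute rates.
SMALLEST_CENTS_PER_MIN = 1   # $0.01/min at scale (claimed)
BOLNA_CENTS_PER_MIN = 3      # $0.03/min starting rate

minutes_per_month = 1_000_000  # assumed enterprise call volume

smallest_usd = SMALLEST_CENTS_PER_MIN * minutes_per_month // 100
bolna_usd = BOLNA_CENTS_PER_MIN * minutes_per_month // 100
print(smallest_usd, bolna_usd)  # → 10000 30000
```

Against Bolna's listed starting rate, the gap is 3×, not 4×; the "4× cheaper than peers" claim presumably benchmarks against a broader peer set, and Bolna's volume discounts would narrow it further.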
04
The Moat Map
Where defensibility actually lives in Smallest's stack
Exhibit D — Smallest.ai's Moat Layers (Bottom = Commodity → Top = Deepest Defense)
Telephony Infrastructure [COMMODITY]: Twilio/Plivo resold. Same as every competitor.
Agent Platform (Atoms) [LOW MOAT]: Graph-based orchestration with node-level fallbacks. Replicable in 6–9 months.
Lightning V2 TTS Performance [REAL MOAT]: Non-autoregressive architecture. 100ms TTFB benchmark lead. 4× cost advantage.
Electron V2, Fine-Tunable Voice LLM [STRONG MOAT]: SLM purpose-built for spoken language. Enterprises train on proprietary data → lock-in.
Vertical Fine-Tuning + Enterprise Data Loops [DATA FLYWHEEL]: BFSI, healthcare, retail-specific models trained on customer data. Compounding advantage.
Hydra, Speech-to-Speech Architecture [POTENTIAL FORTRESS]: Full-duplex multimodal. Eliminates the ASR→LLM→TTS latency chain entirely. If this ships, the moat widens to years.
Research Credibility + Talent Loop [DEEPEST MOAT]: Published papers, CPAL 2026, ASI framework. Attracts researchers other voice AI companies can't hire.
The top two layers are speculative but structurally important. If Hydra ships and works, Smallest leapfrogs all orchestration-based competitors simultaneously.
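The "low moat" verdict on Atoms is easier to judge with the pattern in front of you. A minimal sketch of graph orchestration with per-node fallbacks, the design the moat map describes; every name here is hypothetical, since Smallest.ai's internal API is not public at this level of detail:

```python
# Hypothetical sketch of an Atoms-style node graph with per-node fallbacks.
from typing import Callable, Optional

class Node:
    """One step in the conversation graph, with its own fallback handler."""
    def __init__(self, name: str,
                 primary: Callable[[str], str],
                 fallback: Optional[Callable[[str], str]] = None):
        self.name = name
        self.primary = primary
        self.fallback = fallback

    def run(self, payload: str) -> str:
        # Per-node fallback: a failing primary handler degrades
        # gracefully instead of failing the whole call.
        try:
            return self.primary(payload)
        except Exception:
            if self.fallback is None:
                raise
            return self.fallback(payload)

def run_graph(nodes: list, payload: str) -> str:
    """Linear walk for simplicity; a real graph would branch per node."""
    for node in nodes:
        payload = node.run(payload)
    return payload

# Toy flow: the intent node's primary handler stalls, so its fallback
# keeps the call alive with a scripted default.
def flaky_intent(_: str) -> str:
    raise RuntimeError("model stall")

graph = [
    Node("intent", flaky_intent, fallback=lambda t: "intent:unknown"),
    Node("respond", lambda t: f"reply for {t}"),
]
print(run_graph(graph, "hello"))  # → reply for intent:unknown
```

The skeleton is the easy part, which is the point of the "replicable in 6–9 months" estimate; the 100+ corner case handlers, barge-in logic, and simulated pre-deployment testing listed under Atoms are where the time actually goes.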
05
Product Gaps & Vulnerabilities
Where the full-stack bet creates exposure

The same vertical integration that creates Smallest's moat also creates a specific class of risk: everything is their problem. When Bolna's TTS degrades, it's ElevenLabs' problem. When Smallest's does, it's theirs. Here are the real gaps a PM would flag.

Exhibit E — Gap Analysis: Where Smallest.ai is Exposed
Customer Traction Disclosure [Severity: HIGH]
Current state: "Millions of calls/month"; no customer count, no named case studies equivalent to Bolna's 1,050 customers. Stealth go-to-market.
Risk: Harder to win enterprise deals without social proof. Sales cycles drag. Harder to raise a Series A on the ARR story.

Indian Language Depth vs. Bolna [Severity: HIGH]
Current state: Pulse supports 36 languages, but India-specific depth (Hinglish, 50+ accents, TRAI compliance, DND) lags Bolna's 2+ year head start.
Risk: Loses India enterprise deals in tier-2/3 cities and vernacular-heavy sectors (rural BFSI, agri-fintech).

Integration Ecosystem [Severity: HIGH]
Current state: No native CRM connectors. No Zapier/Make listed. API-only for most enterprise integrations.
Risk: Non-technical buyers cannot deploy without developer support. Blocks SMB/mid-market.

Hydra Execution Risk [Severity: HIGH]
Current state: Full-duplex speech-to-speech announced but not production-ready. Speech-to-speech is a hard ML problem; GPT-4o Voice has shown its own limitations.
Risk: If Hydra is delayed, the roadmap premium embedded in their valuation deflates.

Outbound Campaign Infrastructure [Severity: MED]
Current state: No mention of batch calling campaigns, concurrent call tiers, or campaign management UI. Bolna does 200K calls/day with campaign tools.
Risk: Misses the large SMB outbound use case (sales, collections, reminders), the biggest call volume segment in India.

Model Maintenance Burden [Severity: MED]
Current state: 3 proprietary models (Lightning, Electron, Pulse) to maintain, plus Hydra in development. Each needs continuous retraining as data and use cases evolve.
Risk: Engineering bandwidth gets consumed by model upkeep rather than product features. Bolna can ship faster at the application layer.

Multi-Channel (Email, Chat, Social) [Severity: LOW]
Current state: Website mentions "voice, email, chat, social" but the actual product is voice-first. Channel breadth appears aspirational.
Risk: Credibility gap if enterprise evaluators test omnichannel claims and find voice-only depth.
06
How Hard Is This to Clone?
The replication cost curve — model research is a different beast

This is where Smallest diverges sharply from Bolna. You cannot replicate Smallest by reading a GitHub repo. The proprietary models require ML research talent, compute, and curated speech data that take years — not months — to accumulate.

Exhibit F — Replication Effort by Layer (3-person engineering team + 2 ML researchers)
Layer · Time Est. · Hardest Part
Basic agent pipeline (ASR + LLM + TTS, 3rd party) · 2 weeks · Nothing hard
Atoms-style graph (multi-node orchestration) · 5–8 weeks · Edge case handling
TTS model matching Lightning V2 quality · 6–12 months (100K+ hrs speech data) · Training data + VRAM optimization
SLM matching Electron V2 (voice-optimized reasoning) · 9–18 months (PhD-level work) · Benchmarking + hallucination control
ASR at Pulse quality (36-language streaming) · 8–14 months (multilingual data) · Code-switching training data
On-prem + compliance (ISO 27001, HIPAA, air-gap) · 4–6 months · Audit + certification cost
Speech-to-speech like Hydra (full duplex multimodal) · 18–36 months (frontier ML research) · No proven blueprint exists yet
Total to replicate: 24–36 months for a well-funded team (5+ people including 2 senior ML researchers). Unlike Bolna, there is no open-source shortcut. The models are the product.
07
Strategic Risk Map
The full-stack bet's specific failure modes
Exhibit G — Risk Matrix: Smallest.ai's Existential Threats
High Impact / High Likelihood: OpenAI releases cheap S2S · Hydra ships late · Engineering bottleneck from 3-model upkeep
High Impact / Low Likelihood: Google enters India voice AI · ElevenLabs acquires Retell · TRAI bans automated outbound calls
Low Impact / High Likelihood: Bolna launches no-code builder (hurts conversion) · Competitor copies TTS architecture · Pricing pressure from Vapi/Bland
Low Impact / Low Likelihood: Key researcher leaves · On-prem customer security breach
The high-impact / high-likelihood quadrant is the most critical: if OpenAI's Realtime API (or a similar offering) becomes cheap enough, it may commoditize the SLM advantage that Electron V2 provides. Hydra must ship before that window closes.
Final PM Verdict — Smallest.ai
Replication Time: ~30 months
For a well-funded 5-person team including 2 ML researchers. No open-source shortcut. The models ARE the product. This is a moat measured in time.

Defensibility Horizon: ~5 years
Deeper than Bolna's 3-year window. If Hydra ships and Electron fine-tuning compounds per enterprise customer, this becomes a 5–7 year structural advantage.

The Bull Case

Smallest.ai is building the infrastructure layer of Indian voice AI — not the application. They own TTS, ASR, and LLM simultaneously, which means their cost structure at scale is fundamentally better than every orchestration-based competitor. If Hydra ships, they leapfrog the entire ASR→LLM→TTS latency chain and become the only real-time speech-to-speech platform in production. The 20× cost reduction they've already demonstrated (from $0.20 to $0.01/min) is the template for what happens next.

The Bear Case

Full-stack ownership means full-stack maintenance burden. Three proprietary models to continuously retrain, benchmark, and defend — while Bolna ships product features twice as fast by composing APIs. The traction gap is real: Bolna has 1,050 named customers and 200K calls/day; Smallest says "millions of calls/month" with no customer names. In enterprise sales, social proof is a product feature. The clock is ticking: OpenAI Realtime, Gemini Live, and similar products are making the SLM advantage narrower every quarter.


Analysis based on: smallest.ai product pages, GitHub SDK, 3one4 Capital investor memo, $8M seed announcement, comparison blogs (Smallest vs Retell, vs Poly AI, vs Observe.AI), ElevenLabs benchmark data, and Tracxn profile. Feb 2026. Legal entity: Awaaz Labs Private Limited.