How calls are extracted
A "call" in SWARM is a structured record stating that a specific member predicted X would do Y at time T. Calls are extracted from group messages in three layers.
Layer 1 — free filter
Pure regex + rules. Drops obvious noise (gm/wagmi/lfg, emoji-only, short all-caps, duplicates within 5 min) before any AI call. Costs zero. Filters out about 35-50% of typical group traffic.
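A minimal sketch of what such a rule-based pre-filter could look like. This is an illustration and not SWARM's actual rule set: the `FreeFilter` class, the `SHORT_CAPS_MAX_LEN` cutoff, and the exact regex are assumptions, and "emoji-only" is approximated as "contains no letters or digits".

```python
import re

# Assumed noise vocabulary, taken from the examples in the text above.
NOISE_WORDS = re.compile(r"^(gm|wagmi|lfg)[\s!.]*$", re.IGNORECASE)
DUPLICATE_WINDOW_SECONDS = 5 * 60   # "duplicates within 5 min"
SHORT_CAPS_MAX_LEN = 12             # assumption: the cutoff for "short" is not documented


def has_no_words(text: str) -> bool:
    """Approximates emoji-only: non-empty text without any letter or digit."""
    stripped = text.strip()
    return bool(stripped) and not any(ch.isalnum() for ch in stripped)


def is_short_all_caps(text: str) -> bool:
    """True for short messages whose letters are all upper-case."""
    stripped = text.strip()
    letters = [ch for ch in stripped if ch.isalpha()]
    return (
        bool(letters)
        and len(stripped) <= SHORT_CAPS_MAX_LEN
        and all(ch.isupper() for ch in letters)
    )


class FreeFilter:
    """Hypothetical Layer 1 filter; keeps per-group state for duplicate detection."""

    def __init__(self) -> None:
        # (group_id, normalized text) -> timestamp of the last occurrence, in seconds
        self._last_seen: dict[tuple[str, str], float] = {}

    def should_drop(self, group_id: str, text: str, timestamp: float) -> bool:
        normalized = " ".join(text.lower().split())
        key = (group_id, normalized)
        previous = self._last_seen.get(key)
        self._last_seen[key] = timestamp
        if previous is not None and timestamp - previous <= DUPLICATE_WINDOW_SECONDS:
            return True  # same text repeated in the same group within the window
        return (
            bool(NOISE_WORDS.match(text.strip()))
            or has_no_words(text)
            or is_short_all_caps(text)
        )
```

Messages for which `should_drop` returns `False` would continue on to Layer 2; everything else is discarded without an API call.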
Layer 2 — classification
Every minute, the classifier cron batches up to 25 unclassified messages per group and sends them to Claude Haiku 4.5 with a locked system prompt. Output schema:
{
  msgId, category, confidence,
  usefulness_score, risk_score, sentiment,
  tickers[], contracts[], narratives[],
  reasoning
}
Categories: alpha_call, market_warning, research, token_mention, narrative, meme, question, answer, noise, social, admin.
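The batching and schema check described for Layer 2 can be sketched as follows. The `Message` dataclass, `build_batches`, and `validate_result` are hypothetical names, the model call itself is left out, and treating classifier confidence as a number in the range 0..1 is an assumption based on the 0.65 threshold used later.

```python
from collections import defaultdict
from dataclasses import dataclass

BATCH_LIMIT = 25  # "up to 25 unclassified messages per group"

CATEGORIES = {
    "alpha_call", "market_warning", "research", "token_mention", "narrative",
    "meme", "question", "answer", "noise", "social", "admin",
}

REQUIRED_FIELDS = {
    "msgId", "category", "confidence", "usefulness_score", "risk_score",
    "sentiment", "tickers", "contracts", "narratives", "reasoning",
}


@dataclass
class Message:
    msg_id: str
    group_id: str
    text: str
    classified: bool = False


def build_batches(messages: list[Message]) -> dict[str, list[Message]]:
    """Collect at most BATCH_LIMIT unclassified messages per group, in input order."""
    batches: dict[str, list[Message]] = defaultdict(list)
    for message in messages:
        if message.classified:
            continue
        if len(batches[message.group_id]) < BATCH_LIMIT:
            batches[message.group_id].append(message)
    return dict(batches)


def validate_result(result: dict) -> bool:
    """Check one classifier output record against the schema listed above."""
    if not REQUIRED_FIELDS <= result.keys():
        return False
    if result["category"] not in CATEGORIES:
        return False
    confidence = result["confidence"]
    return isinstance(confidence, (int, float)) and 0 <= confidence <= 1
```

Messages beyond the limit stay unclassified and would be picked up on a later cron run.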
Layer 2.5 — call extraction
When Layer 2 categorizes a message as alpha_call with confidence >= 0.65 AND a contract address is present, the extractor pulls the structured Call record:
{
  ticker, chain, contractAddress,
  direction: "long" | "short" | "watch",
  confidence: 0..1,
  rejectedReason: null | "..."
}
Common rejectedReason values: low_confidence (extractor confidence < 0.5), ambiguous_direction, bot_link_spam, nonexistent_contract (DexScreener can't resolve the address), duplicate_of_user (the same caller called the same token in the last 48h).
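The gate into Layer 2.5 and the rejection checks could be expressed roughly like this. It is a sketch under stated assumptions: the function names are hypothetical, the order in which reasons are checked is not documented, `bot_link_spam` is omitted because its detection is not described, and the DexScreener lookup is represented only by a boolean passed in by the caller.

```python
from typing import Optional

CLASSIFIER_THRESHOLD = 0.65    # Layer 2 confidence needed to attempt extraction
EXTRACTOR_THRESHOLD = 0.5      # below this the call is rejected as low_confidence
DUPLICATE_WINDOW_HOURS = 48    # window for duplicate_of_user
DIRECTIONS = ("long", "short", "watch")


def should_extract(classification: dict) -> bool:
    """Gate: alpha_call, confidence >= 0.65, and at least one contract address."""
    return (
        classification["category"] == "alpha_call"
        and classification["confidence"] >= CLASSIFIER_THRESHOLD
        and len(classification["contracts"]) > 0
    )


def rejected_reason(
    call: dict,
    contract_resolves: bool,
    hours_since_same_call: Optional[float],
) -> Optional[str]:
    """Return a rejectedReason string, or None if the call is accepted.

    contract_resolves stands in for a DexScreener lookup result.
    hours_since_same_call is None when this caller has not called the token before.
    """
    if call["confidence"] < EXTRACTOR_THRESHOLD:
        return "low_confidence"
    if call["direction"] not in DIRECTIONS:
        # Assumption: an unclear direction shows up as a missing or unknown value.
        return "ambiguous_direction"
    if not contract_resolves:
        return "nonexistent_contract"
    if (
        hours_since_same_call is not None
        and hours_since_same_call < DUPLICATE_WINDOW_HOURS
    ):
        return "duplicate_of_user"
    return None
```

Keeping the reason as a field on the record, instead of silently discarding the call, means rejected calls remain available for later inspection.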
Why the layering
One LLM call per message at 50k messages/day per group is cost-prohibitive. Layer 1 removes the obvious junk for free. Layer 2 batches and uses the cheap model. Only Layer 2.5 (call extraction) runs the higher-precision pass, and only on the small fraction of messages classified as alpha_call.
Net cost: under $0.20/day for a 250-member group with 1,500 messages/day. Pro and Elite tiers raise the cap to $1/day and $3/day respectively, sufficient for groups generating 10k-50k messages/day.
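To make the funnel concrete, the figures above can be combined into a rough estimate of Layer 2 load. The helper `layer2_load` is hypothetical, and the batch count is only a lower bound: it assumes every batch is full, whereas a per-minute cron may well send partially filled batches. No pricing is modeled here.

```python
import math


def layer2_load(
    messages_per_day: int,
    filtered_fraction: float,
    batch_limit: int = 25,
) -> tuple[int, int]:
    """Return (messages reaching Layer 2, minimum number of full batches)."""
    survivors = round(messages_per_day * (1 - filtered_fraction))
    min_batches = math.ceil(survivors / batch_limit)
    return survivors, min_batches


# The 1,500 messages/day group, at both ends of the 35-50% Layer 1 filter range.
low_filter = layer2_load(1500, 0.35)    # (975, 39)
high_filter = layer2_load(1500, 0.50)   # (750, 30)
```

So a group of that size sends somewhere between 750 and 975 messages a day to the classifier, in at least 30 to 39 requests, compared with 1,500 requests if every message were classified individually.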
