
How calls are extracted

A "call" in SWARM is a structured record stating that a given member predicted X would do Y at time T. Calls are extracted from group messages in three layers.

Layer 1 — free filter

Pure regex + rules. Drops obvious noise (gm/wagmi/lfg, emoji-only, short all-caps, duplicates within 5 min) before any model request is made. Costs zero. Filters out about 35-50% of typical group traffic.
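The rules above can be sketched as a small stateful filter. This is a minimal illustration, not the production rule set: the class name `FreeFilter`, the reason strings, the 12-character limit for "short all-caps", and keeping one filter instance per group are all assumptions.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, Optional

NOISE_WORDS = {"gm", "wagmi", "lfg"}   # from the list above
DUPLICATE_WINDOW_S = 5 * 60            # "duplicates within 5 min"
SHORT_CAPS_MAX_LEN = 12                # assumed cutoff for "short"


@dataclass
class FreeFilter:
    """Hypothetical Layer 1 filter; assumes one instance per group."""
    # normalized text -> timestamp (seconds) when it was last seen
    last_seen: Dict[str, float] = field(default_factory=dict)

    def drop_reason(self, text: str, ts: float) -> Optional[str]:
        """Return why a message is dropped, or None to pass it to Layer 2."""
        stripped = text.strip()
        norm = " ".join(stripped.lower().split())
        tokens = re.findall(r"\w+", norm)

        # No letters or digits at all: emoji, punctuation, or whitespace.
        if not tokens:
            return "emoji_only"
        # Every word is a known noise word ("gm", "gm gm", "LFG!!!").
        if all(t in NOISE_WORDS for t in tokens):
            return "noise_word"
        if stripped.isupper() and len(stripped) <= SHORT_CAPS_MAX_LEN:
            return "short_all_caps"

        # Same normalized text seen again inside the window.
        # Note: this sketch never evicts old entries.
        previous = self.last_seen.get(norm)
        self.last_seen[norm] = ts
        if previous is not None and ts - previous <= DUPLICATE_WINDOW_S:
            return "duplicate"
        return None
```

Checking the cheapest rules first and touching the duplicate table last keeps dropped noise out of the table entirely.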

Layer 2 — classification

Every minute, the classifier cron batches up to 25 unclassified messages per group and sends them to Claude Haiku 4.5 with a locked system prompt. Output schema:

{
  msgId, category, confidence,
  usefulness_score, risk_score, sentiment,
  tickers[], contracts[], narratives[],
  reasoning
}

Categories: alpha_call, market_warning, research, token_mention, narrative, meme, question, answer, noise, social, admin.
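As a rough sketch of the batching step and of checking what comes back, the snippet below groups unclassified messages into per-group batches of at most 25 and validates a result against the category list. The function names, the `groupId` field, and the assumption that the classifier's `confidence` lies in 0..1 are illustrative; the model request itself is left out.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

BATCH_SIZE = 25  # "up to 25 unclassified messages per group"

CATEGORIES = frozenset({
    "alpha_call", "market_warning", "research", "token_mention",
    "narrative", "meme", "question", "answer", "noise", "social", "admin",
})


def next_batches(unclassified: Iterable[dict]) -> Dict[str, List[dict]]:
    """Pick at most BATCH_SIZE messages per group for one cron run.

    Assumes each message dict carries a hypothetical 'groupId' key and
    that the input is ordered oldest first; overflow waits for the next run.
    """
    batches: Dict[str, List[dict]] = defaultdict(list)
    for msg in unclassified:
        batch = batches[msg["groupId"]]
        if len(batch) < BATCH_SIZE:
            batch.append(msg)
    return dict(batches)


def validate_result(result: dict) -> bool:
    """Reject results with an unknown category or out-of-range confidence."""
    confidence = result.get("confidence")
    return (
        result.get("category") in CATEGORIES
        and isinstance(confidence, (int, float))
        and 0.0 <= confidence <= 1.0
    )
```

Validating the category against a fixed set catches a model reply that drifts from the schema before it reaches Layer 2.5.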

Layer 2.5 — call extraction

When Layer 2 categorizes a message as alpha_call with confidence >= 0.65 AND a contract address is present, the extractor pulls the structured Call record:

{
  ticker, chain, contractAddress,
  direction: "long" | "short" | "watch",
  confidence: 0..1,
  rejectedReason: null | "..."
}

Common rejectedReason values: low_confidence (extractor confidence below 0.5), ambiguous_direction, bot_link_spam, nonexistent_contract (DexScreener can't resolve the address), duplicate_of_user (the same caller called the same token in the last 48h).
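The gate and the rejection rules can be sketched as two small functions. The thresholds come from the text above; the function names, the order of the checks, and treating any direction outside long/short/watch as ambiguous are assumptions. bot_link_spam is left out because no rule for it is given here, and the DexScreener lookup is represented by a boolean the caller supplies.

```python
from typing import Optional

CLASSIFIER_MIN_CONFIDENCE = 0.65   # Layer 2 gate
EXTRACTOR_MIN_CONFIDENCE = 0.5     # below this: low_confidence
DUPLICATE_WINDOW_S = 48 * 3600     # same caller, same token, last 48h
DIRECTIONS = ("long", "short", "watch")


def should_extract(classification: dict) -> bool:
    """True when a Layer 2 result qualifies for Layer 2.5."""
    return (
        classification["category"] == "alpha_call"
        and classification["confidence"] >= CLASSIFIER_MIN_CONFIDENCE
        and len(classification.get("contracts", [])) > 0
    )


def rejected_reason(
    call: dict,
    *,
    contract_resolves: bool,
    last_call_ts: Optional[float],
    now: float,
) -> Optional[str]:
    """Return a rejectedReason for an extracted call, or None to accept it.

    contract_resolves: result of an external lookup (not performed here).
    last_call_ts: when this caller last called this token, if ever (seconds).
    """
    if call["confidence"] < EXTRACTOR_MIN_CONFIDENCE:
        return "low_confidence"
    if call.get("direction") not in DIRECTIONS:  # assumed representation
        return "ambiguous_direction"
    if not contract_resolves:
        return "nonexistent_contract"
    if last_call_ts is not None and now - last_call_ts < DUPLICATE_WINDOW_S:
        return "duplicate_of_user"
    return None
```

Keeping the gate separate from the rejection rules mirrors the text: the gate decides whether the extractor runs at all, while rejectedReason is recorded on a call the extractor already produced.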

Why the layering

One LLM call per message at 50k messages/day per group is cost-prohibitive. Layer 1 removes the obvious junk for free. Layer 2 batches and uses the cheap model. Only Layer 2.5 (call extraction) runs the higher-precision pass, and only on the small fraction of messages classified as alpha_call.

Net cost: under $0.20/day for a 250-member group with 1500 messages/day. Pro and Elite tiers raise the daily cost cap to $1 and $3 respectively, sufficient for groups generating 10-50k messages/day.
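To see how few classifier requests the 1500 messages/day example implies, the arithmetic can be worked through from the figures on this page. The sketch assumes one batch per group per cron run (1440 runs/day) and expresses the Layer 1 filter rate as a whole percentage. It estimates request counts only, not prices.

```python
import math
from typing import Tuple


def classifier_request_bounds(
    messages_per_day: int,
    filtered_percent: int,
    batch_size: int = 25,
    runs_per_day: int = 24 * 60,
) -> Tuple[int, int, int]:
    """Return (messages reaching Layer 2, min requests, max requests) per day.

    min: every batch is full.
    max: every batch holds one message, limited by the number of cron
    runs (assumes at most one batch per group per run).
    """
    remaining = messages_per_day * (100 - filtered_percent) // 100
    fewest = math.ceil(remaining / batch_size)
    most = min(remaining, runs_per_day)
    return remaining, fewest, most


# Layer 1 removes 35-50% of traffic, so for 1500 messages/day:
print(classifier_request_bounds(1500, 35))  # (975, 39, 975)
print(classifier_request_bounds(1500, 50))  # (750, 30, 750)
```

At this volume messages arrive at roughly one per minute, so real batches will usually be far from full and the request count sits nearer the upper bound. At tens of thousands of messages per day the bounds converge, because the count is limited by the 1440 cron runs.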