02 · Viral
One source. Many forms. A loop, not a launch.
The artifact has to travel and produce signal — citations, embeds, lab patches, policymaker briefings — the kind of feedback that comes back and shapes the next cycle.
Failure mode · results too flat to drive attention
If frontier labs cluster, the news angle disappears.
3 mitigations: per-attack-class breakdowns, defense-in-depth framing, divergence fallback
If the headline number suggests "all models look about the same," reporters won't bite. The structural finding has to survive the possibility that aggregate scores are close.
- →Per-attack-class breakdowns surface divergence even when averages are similar.
- →Defense-in-depth framing: Anthropic deploys layered defense, Google deploys none. That gap doesn't show up in any single overall score.
- →Frontier-divergence fallback is ready if Gemini outperforms expectations, with model-vs-model variance available as the lede.
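The first mitigation can be sketched in a few lines. The model names, attack classes, and scores below are invented placeholders, not leaderboard results; they only illustrate how two models with identical averages can still diverge sharply per attack class.

```python
from statistics import mean

# Hypothetical attack-success rates per attack class.
# Placeholder values for illustration only, not real results.
scores = {
    "model_a": {"class_1": 0.10, "class_2": 0.50, "class_3": 0.30},
    "model_b": {"class_1": 0.50, "class_2": 0.10, "class_3": 0.30},
}

def aggregate(model: str) -> float:
    """Headline number: the plain average across attack classes."""
    return mean(scores[model].values())

def per_class_gap(a: str, b: str) -> dict[str, float]:
    """Absolute score difference between two models for each attack class."""
    return {cls: round(abs(scores[a][cls] - scores[b][cls]), 2) for cls in scores[a]}

def largest_divergence(a: str, b: str) -> tuple[str, float]:
    """The attack class where the two models differ most (first one on ties)."""
    gaps = per_class_gap(a, b)
    cls = max(gaps, key=gaps.get)
    return cls, gaps[cls]
```

With these placeholder values the two averages match, yet the per-class view still surfaces a 0.4 gap, which is the kind of divergence the breakdown is meant to expose.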
1
source
Technical summary
Defense-in-depth findings
Robustness data
Co-authored methodology
→
1
engine
Claim library + production stack
Source-span grounding
Multi-model orchestration
Reviewer surface (SSO)
Custom evals before ship
→
10+
artifacts
Launch outputs
Leaderboard site
Methodology page
Press kit + spokespeople
Hero video + infographics
Embed widgets + social cards
Op-ed + podcast talking points
See the engine on a single research paper
Paper example →
The same engine adapts to other source types too: video segments (interview cuts → editorial), podcast appearances (transcript → talking points + clip kits), research datasets (data → embedded explainer + press visuals), multi-language (single source → localized variants). These sit beyond the leaderboard launch but are available if FAR.AI's catalog grows in that direction.
02a · See it applied
The leaderboard, end to end.
Open the full mock below, then explore the surfaces inside. Each one is a concrete output of the engine.
Open the leaderboard · Launch →
A concrete mock of the AI Safety Leaderboard.
Built off the FAR.AI farai/ template: defense-in-depth framing across four frontier models × seven risk domains × three defensive layers. The full launch surface, not a landing page.
farai.makeitresonate.com/leaderboard
01 →
Defense-in-depth heatmap
Three stacked layers (input moderation, model-level refusal, output moderation) × seven risk domains. Anthropic's coverage vs. Google's zero, visible at a glance.
02 →
Press kit
60-second reporter pull, spokesperson panel, launch sequencing pinned to dates, briefing-readiness panel, four story angles, op-ed thesis. Reporter-grade.
03 →
Methodology page
Defense-in-depth explainer in plain language, operational layer-scoring definitions, limitations, methodology FAQ. The structural answer to methodology disputes.
04 →
What-If sandbox
Click any cell to cycle layer state. Preset scenarios including "universal jailbreak breaks model-level refusal." Lets reporters discover the structural finding themselves.
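A minimal sketch of the data model behind the heatmap and the What-If sandbox, under stated assumptions. The three layer names come from the heatmap card; the domain names, the three cell states, and every function name are hypothetical, since the mock's actual implementation is not described here.

```python
from copy import deepcopy

# Layer names follow the heatmap card.
LAYERS = ("input_moderation", "model_refusal", "output_moderation")
# Placeholder names: the seven risk domains are not enumerated in this section.
DOMAINS = tuple(f"domain_{i}" for i in range(1, 8))
# Assumed cell states; the mock may use a different set.
STATES = ("absent", "partial", "present")

def new_grid(state: str = "present") -> dict:
    """Build a 7-domain x 3-layer grid with every cell in the same state."""
    return {d: {layer: state for layer in LAYERS} for d in DOMAINS}

def cycle_cell(grid: dict, domain: str, layer: str) -> str:
    """Advance one cell to the next state, wrapping around (the 'click')."""
    current = STATES.index(grid[domain][layer])
    grid[domain][layer] = STATES[(current + 1) % len(STATES)]
    return grid[domain][layer]

def apply_universal_jailbreak(grid: dict) -> dict:
    """Preset scenario: model-level refusal fails in every domain."""
    broken = deepcopy(grid)
    for d in DOMAINS:
        broken[d]["model_refusal"] = "absent"
    return broken

def layers_holding(grid: dict, domain: str) -> int:
    """Count layers still fully present for a domain."""
    return sum(1 for layer in LAYERS if grid[domain][layer] == "present")
```

In this sketch a fully layered grid keeps two layers per domain after the jailbreak preset, while a grid that relied on model-level refusal alone drops to zero, which is the structural finding the sandbox lets reporters find for themselves.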
02b · The optimization loop
Each cycle compounds. The engine gets sharper, not just newer.
Most launches ship and walk away. Ours is built so the signal that comes back — what got cited, what got embedded, what fell flat — tunes the engine for the next batch. Variants we generate by default; experiments we bake in; underperformers we retire fast.
01 · Source
Research findings
FAR.AI technical work
Methodology
Robustness data
→
02 · Engine
Variants generated by default
Multi-format production
A/B-ready hooks
Source-span grounded
Reviewer in the loop
→
03 · Distribution
Right artifact, right door
Tier-1 embargo
Press kit + embeds
Audience-tailored ship
→
04 · Signal → tune
Measurement, then iterate
Citations + embeds + briefings
Per-variant performance
Retire underperformers
Double down on what worked
↷
Signal flows back to the engine. Next batch ships sharper hooks, tuned voice, retired losers. Cost per cited artifact drops; hit rate rises. Signal also flows back to FAR.AI — next research is calibrated by what landed.
02c · Multi-wave, not one day
A durable story arc, not a one-day hit.
The earned-media plan is built around four waves. Each new model release re-enters the cycle. The leaderboard becomes the reference, not the launch.
Wave 1 · Pre-launch
T −30 → T −1
- Tier 1 print exclusive locked under embargo
- T −14: formal lab disclosure with 14-day factual-review window
- Government briefings, influencer mapping, asset finalization
Wave 2 · Launch day
Tuesday, July 14 · 9:00 AM ET
- Embargo lifts. Print exclusive breaks.
- Tier 2 simultaneous pitch (~25 named outlets)
- Owned channels go live. Spokespeople on standby.
Wave 3 · Second wave
T +1 → T +30
- Adam G. op-ed placed (post-coverage cycle)
- Podcast circuit + influencer engagement
- Vertical follow-ups: cyber, science, policy, international
Wave 4 · Living rhythm
T +30 → ongoing
- Each new frontier-model release = new test cycle = new media hook
- Quarterly state-of-defense-in-depth update
- Embed and citation tracking compounds the reference
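The T-offsets in the four waves can be turned into calendar dates with a short sketch. The launch year is an assumption: the plan says "Tuesday, July 14" without a year, and 2026 is one year in which July 14 falls on a Tuesday. The milestone names are hypothetical labels.

```python
from datetime import date, timedelta

# Assumption: the source gives no year; July 14 is a Tuesday in 2026.
LAUNCH_YEAR = 2026
LAUNCH = date(LAUNCH_YEAR, 7, 14)

def t(offset_days: int) -> date:
    """Resolve a T-offset (negative = before launch) to a calendar date."""
    return LAUNCH + timedelta(days=offset_days)

# Hypothetical labels for the offsets named in the wave plan.
MILESTONES = {
    "wave_1_start": t(-30),
    "lab_disclosure": t(-14),
    "wave_1_end": t(-1),
    "launch_day": t(0),
    "wave_3_start": t(1),
    "wave_3_end": t(30),
}

# The factual-review window opens at lab disclosure and runs 14 days.
REVIEW_WINDOW_CLOSES = MILESTONES["lab_disclosure"] + timedelta(days=14)
```

Under this reading, the 14-day factual-review window that opens at T −14 closes on launch day itself, so any lab corrections would arrive with no buffer before the embargo lifts.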
A living leaderboard
“The results reflect a point in time. When a model is released, we test it and capture how it performs at that moment. As new models come out, we test those as well. Over time, that creates a record of how systems evolve and whether safety is actually improving.”
Ed Yee · Head of Strategic Projects, FAR.AI
02d · The other cadence
Not every finding is a launch. The engine has a fast-response track too.
The leaderboard’s waves cover the planned launch and its quarterly updates. Between cycles, FAR.AI’s researchers will keep finding things — new jailbreaks, new exploits, new red-team results. Adam Penenberg’s framing: those shouldn’t go silent or wait for the next big drop. A simple, repeatable flow turns each meaningful finding into a potential news event.
01
Finding identified
Short internal summary: what it is, why it matters, who it affects.
→
02
Company outreach
Optional but documented. Track response or non-response.
→
03
Rapid comms draft
3–5 sentence plain-English summary, one takeaway, one quote (Adam G. or relevant researcher).
→
04
Internal handoff
Research → FAR.AI comms → Thunder11 (PR) + Newsroom Studio (Adam P., long-form). Roles defined upfront, not negotiated each time.
→
05
Distribution
Targeted reporters and newsletter writers. Framed as a discrete “finding” or “incident,” not a full report.
Open questions
Five things we want your judgment on.
Anchored on the work in our scope: visual + interactive + content engine. PR sequencing, reporter pitch, and op-ed placement are Thunder11’s lane — we’re not asking you to litigate those here.
- Child safety prominence. The heatmap has a toggle: default treats child safety as one of seven domains, “child safety forward” pins it as the lead row. Which version is right for the public surface, and how forward should that finding be in the visual framing?
- Living-leaderboard framing. Ed’s “point in time, evolves with each model release” is the through-line for our content engine. Does that hold up when reporters and policymakers actually use it? Where does it strain?
- Audience layering. Four audience messages shipped in the press kit: general public, policy, technical, frontier labs. Where does the framing break, and which audience is hardest to reach with a single visual surface?
- Tone in the visual treatment. “Industry improvement, not embarrassment” is the North Star. Does our visual treatment of the gap (Anthropic at #1, Google at #4) read that way, or does it slide into mockery? What would you adjust?
- Highest-risk visual or framing move. What’s the surface, claim, or visual choice in our work most likely to backfire under reporter or lab pressure? What would you change before launch?
If we’ve got bandwidth: where does the content-engine concept need to flex to land with FAR.AI’s actual programming cadence (events, papers, ongoing research)?
Appendix · Optional reading
Technical review in the loop — the full version.
Brief plug appears in 03 · Rigorous. This appendix walks through how the workflow actually runs — what's reviewed where, how researcher time stays scoped, and why it could empower the broader FAR.AI brand and comms team beyond the leaderboard.
Try the reviewer tool · Open demo →
Time-boxed queues. Source-span always visible. One review propagates everywhere.
Sign in as Ed, Kellin, or Adam G. Walk through a queue of paraphrased claims paired with verbatim source spans. Approve, suggest edit inline, flag, or veto. Each action propagates to every downstream artifact. Audit trail timestamps every decision.
farai.makeitresonate.com/reviewer-tool
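One way "one review propagates everywhere" could work is to have every artifact reference claims by ID rather than copy their text. The sketch below assumes that design; the class names, fields, and status values are hypothetical and are not taken from the reviewer tool itself. The four actions mirror the ones listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Actions named in the reviewer-tool description.
ACTIONS = {"approve", "suggest_edit", "flag", "veto"}

@dataclass
class Claim:
    claim_id: str
    paraphrase: str    # the paraphrased claim shown to the reviewer
    source_span: str   # the verbatim source span it is grounded in
    status: str = "pending"
    used_in: list[str] = field(default_factory=list)  # artifacts referencing this claim

@dataclass
class ClaimLibrary:
    claims: dict[str, Claim] = field(default_factory=dict)
    audit_trail: list[tuple] = field(default_factory=list)

    def add(self, claim: Claim) -> None:
        self.claims[claim.claim_id] = claim

    def review(self, claim_id: str, reviewer: str, action: str) -> list[str]:
        """Record one decision and return every artifact it affects."""
        if action not in ACTIONS:
            raise ValueError(f"unknown action: {action}")
        claim = self.claims[claim_id]
        claim.status = action
        # Timestamp every decision for the audit trail.
        self.audit_trail.append((datetime.now(timezone.utc), reviewer, claim_id, action))
        return list(claim.used_in)

    def shippable(self, artifact: str) -> bool:
        """True only if every claim the artifact uses is approved.
        An artifact that uses no claims counts as shippable here."""
        return all(c.status == "approve"
                   for c in self.claims.values() if artifact in c.used_in)
```

Because artifacts hold only claim IDs in this sketch, a single veto blocks every surface that uses the claim without anyone re-reviewing each artifact.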
How review works in practice
About 2 hours of researcher review per major paper. Scoped upfront, not open-ended.
6 principles: long-form veto, methodology co-ownership, claim library, embed templates, time, revision scope
- →Long-form artifacts (interactive site, article, video): the lead researcher signs off at two checkpoints (outline before build, finished artifact before ship). ~30 minutes each. Outline-stage rejection kills the build. No sunk-cost arguments.
- →Methodology page + press-kit Q&A: co-authored with a named FAR.AI team from draft one. Co-owned, not reviewed. The structural answer to the methodology-disputes failure mode.
- →Social posts: claim library built with the researcher once. Every post pulls from that library. Researcher scans the week's set in a single Friday pass. No researcher ever sees the same phrasing twice.
- →Embed widgets: every numeric claim carries an inline methodology caveat. Reviewed at the template level, once, not per deployment.
- →Researcher time: ~2 hours per major paper across the full lifecycle, scoped into the engagement upfront. If that number's wrong for FAR.AI's team, we want to know before it's baked in.
- →Revision scope: two review rounds per artifact (outline + finished). Anything beyond that is a change order with a new timeline. The guardrail against open-ended comment threads that kill engagements at month four.