Methodology
A reader who agrees with our conclusion should still know how we got there. This page documents the methodology behind every story RabixAI publishes — the testing, the scoring, the sourcing weights, and the editorial gates.
How we choose what to cover
Our newsroom monitors RSS feeds, official vendor announcements, academic preprints (arXiv, OpenReview), regulator dockets, and high-signal community channels (developer forums, GitHub release feeds). A story enters the pipeline when at least one of three thresholds is met:
- A frontier-lab release (model launch, capability change, pricing shift) with verifiable first-party documentation.
- A regulatory event — published rule, court filing, government agency statement — with a primary-source URL.
- A measurable industry shift — funding round closed, acquisition announced, executive transition — with at least two independent confirmations.
Stories that clear none of the three thresholds are not published. Speculation, rumour aggregation, and Twitter-thread summaries do not constitute newsworthy events under our methodology.
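The intake rule can be read as a simple predicate over a candidate story. The sketch below is illustrative only: the `Candidate` type, its field names, and `clearsThreshold` are hypothetical and not taken from our pipeline, and checking that a URL is present stands in for the human step of verifying it.

```typescript
// Hypothetical shapes for illustration; not the newsroom's actual schema.
type Candidate =
  | { kind: "frontier_release"; firstPartyDocUrl?: string }
  | { kind: "regulatory_event"; primarySourceUrl?: string }
  | { kind: "industry_shift"; independentConfirmations: number }
  | { kind: "other" };

// Returns true when the candidate meets the threshold for its kind.
// Presence of a URL is only a proxy here; verification is an editorial task.
function clearsThreshold(c: Candidate): boolean {
  switch (c.kind) {
    case "frontier_release":
      return Boolean(c.firstPartyDocUrl);
    case "regulatory_event":
      return Boolean(c.primarySourceUrl);
    case "industry_shift":
      return c.independentConfirmations >= 2;
    default:
      // Speculation, rumour aggregation, thread summaries, and so on.
      return false;
  }
}
```

Under these assumptions, an acquisition rumour with a single confirmation would be held, while the same story with two independent confirmations would enter the pipeline.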
How we test tools
Tool reviews on RabixAI follow a fixed evaluation framework. Every tool is tested against the same task set within its category (writing, coding, research, data analysis, image, audio, agents). We log:
- Output quality on the same set of standardised prompts.
- Latency under cold and warm conditions.
- Pricing per unit of useful output (token, request, render).
- Compliance signals — data residency, audit logs, SOC 2, HIPAA where applicable.
- Real-world failure modes — hallucinations, refusals, edge cases.
We disclose every account tier we tested under (free, paid, enterprise), every prompt set used, and every model version. We do not accept review units in exchange for favourable coverage. Every tool is paid for at its public list price.
How we score comparisons
Comparison articles ("X vs Y") use a weighted scorecard:
- 40% capability — measured against the standardised task set.
- 20% reliability — uptime and consistency over the test window.
- 15% pricing — value per dollar at typical workloads.
- 15% governance — compliance certifications, deployment options, data handling.
- 10% ergonomics — developer experience, documentation, integration depth.
Every comparison article links to the raw scorecard so readers can re-weight against their own priorities. We do not declare an overall "winner" without showing the inputs.
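For readers who want to re-weight a scorecard, the arithmetic is a weighted average. The following is a minimal sketch, assuming each dimension is scored on a common 0 to 100 scale; the scale, the type names, and the `weightedScore` function are assumptions for illustration, not a description of our internal tooling.

```typescript
type Dimension = "capability" | "reliability" | "pricing" | "governance" | "ergonomics";
type Scores = Record<Dimension, number>;

// Default weights as listed in the scorecard above.
const DEFAULT_WEIGHTS: Scores = {
  capability: 0.4,
  reliability: 0.2,
  pricing: 0.15,
  governance: 0.15,
  ergonomics: 0.1,
};

// Weighted average of per-dimension scores. Weights are normalised by their
// sum, so reader-supplied weights do not have to add up to exactly 1.
function weightedScore(scores: Scores, weights: Scores = DEFAULT_WEIGHTS): number {
  const dims = Object.keys(weights) as Dimension[];
  const total = dims.reduce((sum, d) => sum + weights[d], 0);
  if (total <= 0) throw new Error("weights must sum to a positive number");
  return dims.reduce((sum, d) => sum + scores[d] * weights[d], 0) / total;
}

// Example: a reader who cares mostly about governance can pass custom weights.
const governanceFirst: Scores = {
  capability: 2, reliability: 2, pricing: 1, governance: 4, ergonomics: 1,
};
```

Calling `weightedScore(scores)` uses the published weights; calling `weightedScore(scores, governanceFirst)` applies the reader's own priorities to the same raw inputs.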
How we weight sources
Not every source carries equal weight. Our internal sourcing taxonomy:
- High reliability — first-party vendor docs and announcements, regulator filings, peer-reviewed papers, named on-the-record interviews, primary datasets.
- Medium reliability — secondary reporting from outlets with editorial standards, industry analyst reports, public financial filings.
- Low reliability — anonymous social posts, unsigned blog posts, community speculation. These can inform a story but cannot be the only source for any factual claim.
A claim sourced exclusively from low-reliability material is either presented as unverified or held until corroboration is obtained.
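The corroboration rule can be sketched as a small function over the reliability tiers of a claim's sources. The names `Reliability`, `ClaimStatus`, and `claimStatus` are hypothetical, and treating a claim with no sources at all the same as a low-only claim is an assumption of this sketch.

```typescript
type Reliability = "high" | "medium" | "low";
type ClaimStatus = "publishable" | "unverified_or_hold";

// A claim needs at least one medium- or high-reliability source before it is
// stated as fact. Assumption: a claim with no sources is treated like a
// low-only claim.
function claimStatus(sources: Reliability[]): ClaimStatus {
  const hasCorroboration = sources.some((tier) => tier !== "low");
  return hasCorroboration ? "publishable" : "unverified_or_hold";
}
```

Whether an "unverified_or_hold" claim is published with an explicit unverified label or held back remains an editor's decision; the sketch only separates the two groups.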
How we use AI
AI tools assist drafting, research synthesis, headline generation, and illustration — never final editorial judgement. Every paragraph is reviewed and edited by a human editor before publication. Read the full AI use disclosure.
The publish gate
Before any draft becomes an indexable article, it clears an eleven-point automated checklist enforced in our publishing service:
- Status is needs_review (no skipping the editor).
- Word count ≥ 800 (no thin content).
- Research packet present — at least one source ID logged.
- SEO complete — title 30–70 chars, description 100–170 chars, canonical URL set.
- Editorial excerpt ≥ 120 chars.
- Canonical URL absolute and HTTPS.
- Fact-check verdict logged as PASS within the past 24 hours.
- No keyword cannibalisation — primary keyword unique among indexable content.
- OG image asset verified — no broken hero in social shares.
- Headline quality — no AI-slop phrases ("beyond the hype", "the reality of"), at least one named entity.
- No stale tense — body text does not reference prior years in forward-looking constructions.
A draft failing any single check cannot be promoted. The full checklist lives in the codebase at services/publisher/checklist.ts.
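To give a sense of how such a gate can be structured, here is a minimal sketch covering seven of the eleven checks. It is not the contents of services/publisher/checklist.ts: the `Draft` shape, the field names, and `failedChecks` are assumptions made for illustration, and the checks that need outside state (keyword uniqueness, OG image verification, headline quality, stale tense) are left out.

```typescript
// Hypothetical draft shape; field names are assumptions, not the real schema.
interface Draft {
  status: string;
  wordCount: number;
  sourceIds: string[];
  seoTitle: string;
  seoDescription: string;
  excerpt: string;
  canonicalUrl: string;
  factCheck?: { verdict: "PASS" | "FAIL"; loggedAt: Date };
}

type Check = { name: string; ok: (d: Draft, now: Date) => boolean };

const DAY_MS = 24 * 60 * 60 * 1000;
const between = (n: number, lo: number, hi: number): boolean => n >= lo && n <= hi;

const CHECKS: Check[] = [
  { name: "status", ok: (d) => d.status === "needs_review" },
  { name: "word_count", ok: (d) => d.wordCount >= 800 },
  { name: "research_packet", ok: (d) => d.sourceIds.length >= 1 },
  {
    name: "seo",
    ok: (d) =>
      between(d.seoTitle.length, 30, 70) &&
      between(d.seoDescription.length, 100, 170) &&
      d.canonicalUrl.length > 0,
  },
  { name: "excerpt", ok: (d) => d.excerpt.length >= 120 },
  // Absolute and HTTPS: must start with the https scheme followed by a host.
  { name: "canonical_https", ok: (d) => /^https:\/\/[^\s/]+/.test(d.canonicalUrl) },
  {
    name: "fact_check",
    ok: (d, now) =>
      d.factCheck !== undefined &&
      d.factCheck.verdict === "PASS" &&
      now.getTime() - d.factCheck.loggedAt.getTime() <= DAY_MS,
  },
];

// Returns the names of every failed check; an empty array means the draft
// clears this subset of the gate.
function failedChecks(d: Draft, now: Date): string[] {
  return CHECKS.filter((c) => !c.ok(d, now)).map((c) => c.name);
}
```

Returning the full list of failures, instead of stopping at the first one, lets an editor fix everything in a single pass. Promotion would be allowed only when the list is empty.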
Refresh and retirement
Articles older than 90 days are automatically flagged for review. The editor decides whether to refresh (update with new data and republish), retire (set contentStatus = needs_review and remove from public indexes), or leave with a "last reviewed" timestamp.
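The 90-day flag amounts to a date comparison. A minimal sketch follows, assuming age is measured from the publication timestamp and that "older than 90 days" means strictly more than 90 days; the function name and both assumptions are illustrative.

```typescript
const REVIEW_AFTER_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True when the article is strictly older than the review window.
// Assumption: age is counted from publication, not from the last review.
function needsReview(publishedAt: Date, now: Date): boolean {
  const ageDays = (now.getTime() - publishedAt.getTime()) / MS_PER_DAY;
  return ageDays > REVIEW_AFTER_DAYS;
}
```

The flag only queues the article; the choice between refreshing, retiring, and leaving it with a "last reviewed" timestamp stays with the editor.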
We do not silently update articles to bury errors. Substantive updates are noted at the bottom of the piece with the date and reason.