"The test of regulated intelligence is not whether the model is accurate in a demo. It is whether a compliance officer can read its documentation, understand its limits, and sign their name under the decision it informs. Governance is what makes a score deployable inside a regulated institution."
The question is whether expected outcomes match realized ones.
We check three things. Do the probabilities mean what they say, so that a 70 percent prediction is right about 70 percent of the time? Does the model hold up on real cases it never saw? And does it match outcomes that actually happened across the base of 475M+ real court records? A model joins the 23,706 in production only once it clears that bar on real data.
Promotion is earned. We hold back the models we cannot defend.
A model does not reach production because we built it. It reaches production because it survived testing against real outcomes. The fleet sits at the end of a funnel, and most of what we build never reaches it.
We would rather hold a model back than ship one we cannot stand behind. That discipline is the difference between a dashboard and a data-science institution.
Every production model carries a documented model card: intended use, training-data vintage and provenance, validation performance, known limits, and the jurisdiction and case type it was built for. There are no black boxes in the fleet. If a model is deployed, its card is available to the compliance team that has to approve it. A score with no documented basis is not a product Criterica ships.
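As an illustration only, the card fields listed above could be represented as a small record with a completeness check. This is a minimal sketch, not Criterica's schema: the class name `ModelCard`, the field names, and the helper `missing_fields` are all hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Dict, List


@dataclass
class ModelCard:
    """Hypothetical model card; field names are illustrative assumptions."""
    model_id: str
    intended_use: str
    training_data_vintage: str
    training_data_provenance: str
    validation_performance: Dict[str, float]
    known_limits: List[str]
    jurisdiction: str
    case_type: str


def missing_fields(card: ModelCard) -> List[str]:
    """Return the names of card fields that are empty (blank string, empty dict or list)."""
    return [f.name for f in fields(card) if not getattr(card, f.name)]


def is_documented(card: ModelCard) -> bool:
    """A card counts as documented only when no field is empty."""
    return not missing_fields(card)
```

Under this sketch, a model whose card has any empty field would be reported by `missing_fields` and held back until the gap is filled.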
Each model is checked three ways: do its probabilities mean what they say, does it hold up on real cases it never saw, and does it match outcomes that actually happened across our base of 475M+ real court records. The question is never whether a model sounds confident. It is whether its predictions came true. Only models that clear that bar join the 23,706 in production.
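The first check, whether probabilities mean what they say, is commonly measured with a calibration table: group held-out predictions by predicted probability and compare each group's average prediction with the rate at which the outcome actually occurred. The sketch below assumes equal-width bins and an expected-calibration-error summary. The function names and the binning choice are illustrative assumptions, not a description of Criterica's pipeline.

```python
from typing import List, Sequence, Tuple


def calibration_table(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted probability, observed rate, count) per non-empty bin."""
    if len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must have the same length")
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if members:
            mean_pred = sum(p for p, _ in members) / len(members)
            observed = sum(y for _, y in members) / len(members)
            rows.append((mean_pred, observed, len(members)))
    return rows


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Count-weighted average gap between predicted and observed rates."""
    total = len(probs)
    return sum(
        n / total * abs(mean_pred - observed)
        for mean_pred, observed, n in calibration_table(probs, outcomes, n_bins)
    )
```

Run on cases the model never saw, a well-calibrated model produces rows where the first two values are close and an error near zero. Any threshold for "clearing the bar" would be a policy choice that this sketch does not set.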
Outcome prediction adjacent to courts draws fairness scrutiny, and it should. Criterica audits for disparate performance across protected and proxy dimensions and treats fairness testing as a standing feature of the validation pipeline, not an afterthought bolted on before a deal. Where a model shows performance Criterica cannot defend, it does not promote it. The audit is documented so a reviewer can see what was tested and what was found.
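One simple form a disparate-performance check can take is to compute the same error metric separately for each group and report the spread. The sketch below uses the Brier score (mean squared error of the predicted probability) as the metric. The metric, the function names, and the idea of a single gap number are assumptions for illustration; the source does not specify which measures the audit uses.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def brier_by_group(
    probs: Sequence[float], outcomes: Sequence[int], groups: Sequence[Hashable]
) -> Dict[Hashable, float]:
    """Mean squared error of predicted probabilities, computed per group."""
    if not (len(probs) == len(outcomes) == len(groups)):
        raise ValueError("probs, outcomes, and groups must have the same length")
    sums: Dict[Hashable, float] = defaultdict(float)
    counts: Dict[Hashable, int] = defaultdict(int)
    for p, y, g in zip(probs, outcomes, groups):
        sums[g] += (p - y) ** 2
        counts[g] += 1
    return {g: sums[g] / counts[g] for g in sums}


def performance_gap(scores: Dict[Hashable, float]) -> float:
    """Difference between the worst and best per-group score."""
    return max(scores.values()) - min(scores.values())
```

Recording both the per-group scores and the gap gives a reviewer the two things the paragraph above names: what was tested and what was found.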
Every score is accompanied by the reasons behind it: the features that drove the prediction and the direction each one pushed. A decision-maker should never have to defend a number they cannot explain. Explainability is not a separate report you request. It travels with the score, so the person putting capital or a reserve behind it can articulate why.
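To make "travels with the score" concrete, here is a minimal sketch that returns a score and its top reasons in one payload. It assumes a simple logistic model, where each feature's contribution to the log-odds is its weight times its value. The source does not say which attribution method is used, and the function name, payload keys, and feature names are hypothetical.

```python
import math
from typing import Dict, List


def score_with_reasons(
    weights: Dict[str, float], bias: float, features: Dict[str, float], top_k: int = 3
) -> Dict[str, object]:
    """Score one case and attach the features that pushed the score most."""
    # Per-feature contribution to the log-odds (valid for a linear/logistic model).
    contributions = {name: weights[name] * value for name, value in features.items()}
    logit = bias + sum(contributions.values())
    score = 1.0 / (1.0 + math.exp(-logit))
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    reasons: List[Dict[str, object]] = [
        {
            "feature": name,
            "contribution": value,
            "direction": "raises" if value > 0 else "lowers",
        }
        for name, value in ranked[:top_k]
        if value != 0
    ]
    return {"score": score, "reasons": reasons}


# Hypothetical feature names, for illustration only.
result = score_with_reasons(
    weights={"feature_a": 2.0, "feature_b": -1.0},
    bias=0.0,
    features={"feature_a": 1.0, "feature_b": 1.0},
)
```

Because the reasons are part of the same return value as the score, a caller cannot receive one without the other.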
Data handling, access controls, encryption, and tenant isolation are designed to support SOC 2 and ISO 27001 control objectives. Client data is segregated per tenant and access is scoped and logged. Criterica does not claim a certification it does not hold; instead, the controls and their documentation are built to satisfy the control objectives a security review measures against, and that documentation is available to counterparties under diligence.
Documentation is packaged for institutions that operate under model risk management practices consistent with SR 11-7 style expectations: clear statements of model purpose and limitations, independent validation evidence, ongoing monitoring, and a documented development process. The objective is not to assert compliance on a buyer's behalf. It is to give a bank's model risk function the artifacts it needs to conduct its own review and approve deployment.
For insurance buyers, documentation is prepared to align with NAIC model governance guidance: governance roles, model inventory, validation and testing evidence, and limits on use. As with the banking framework, Criterica supplies the documentation a carrier's governance function requires, across the 89 jurisdictions in the fleet, so procurement and compliance can evaluate and approve the model against their own standards.
A note on framework language. Criterica does not represent that it holds SOC 2 or ISO 27001 certification, and does not assert that any deployment is "compliant" with SR 11-7 or NAIC guidance on a buyer's behalf. Those determinations belong to the buyer's own auditors, model risk function, and regulators. Criterica's role is to prepare the documentation and controls so those teams can conduct their review against their own standards and reach their own conclusion.