Research, Pathwize

Research

The case for verifiable human judgment

Frontier labs increasingly train and evaluate on the judgment of human experts, yet that judgment is barely verifiable today. We argue that gapless, signed provenance, expert and batch trust scores, and live inter-rater agreement turn trust from a promise into something a reviewer can check.

Dr. Helena Vogt

Head of Research

June 18, 2026

AI, Training data

Why synthetic data hits a wall

Mareike Hoffmann

Data Research

December 17, 2025

AI, Evaluation

Measuring what models can't fake

Dr. Helena Vogt

Head of Research

March 16, 2026

Economic, Labor market

Where demand for frontier experts is heading

Pathwize Research

Research team

January 2, 2026

AI, Human expertise

How human expertise shapes modern AI

Jonas Albrecht

Research

January 5, 2026

Provenance, Compliance

The EU AI Act for data teams

Lena Brandt

Compliance Research

August 14, 2025

AI, Evaluation

Inter-rater agreement as a live signal

Dr. Helena Vogt

Head of Research

August 1, 2025

Latest research


AI, Evaluation

Measuring what models can't fake

Most evaluations grade from text alone, flattening the reasoning and edge cases that decide correctness. We describe an expert-graded benchmark built to resist gaming, including undisclosed-AI submissions.

Dr. Helena Vogt

May 27, 2026

AI, Provenance

The case for verifiable human judgment

Frontier labs increasingly train on the judgment of human experts, yet that judgment is barely verifiable today. We argue that signed provenance and live agreement turn trust into something a reviewer can check.

Dr. Helena Vogt

April 28, 2026

Economic, Expert economy

Why scarce experts churn, and what it costs

Incumbents optimise for volume and speed, treating scarce specialists as interchangeable gig workers. We look at the churn this creates and how it quietly degrades data quality over time.

Pathwize Research

March 16, 2026

AI, Integrity

Detecting AI disguised as human feedback

When contractors secretly use LLMs, the data meant to capture human judgment is poisoned. We share the timing, telemetry and content signals we use to catch undisclosed-AI submissions in the loop.

Jonas Albrecht

Lena Brandt

February 9, 2026

AI, Human expertise

How human expertise shapes modern AI

Exploring why human judgment, skill and domain expertise remain essential to training modern AI, and how credentialed people shape the frontier where synthetic data can't reach.

Jonas Albrecht

January 5, 2026
