L6
The Competency Evidence System
Competence emerges from process, not from examination
Evidence accumulates from the accompanied learning process, never from separate examinations. Three dimensions: autonomy (did the learner solve it independently?), transfer (can they apply it in a new context?), reflection (can they explain why?). Inherently fraud-resistant: faking months of Socratic dialogue costs more cognitive effort than learning the material. Separation of powers: the Mentor generates evidence; an independent Audit-Agent verifies it; the Knowledge Graph anchors it.
Three dimensions
Competence is not proven by separate exams. It emerges from the accompanied learning process. The Mentor witnesses the cognitive journey, and from this continuous observation, evidence accumulates.
Three dimensions define the evidence. Autonomy: did the learner solve it independently, or with heavy guidance? Transfer: can the learner apply the concept in a new context (fractions learned through recipes, applied to an architecture problem)? Reflection: can the learner explain why something works, not just that it works?
Each dimension is observed continuously, not measured at a single moment. The result is not a grade. It is a fingerprint of how this learner thinks about this domain.
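The continuous, per-dimension profile can be sketched as a running aggregate over evidence events. This is a minimal illustration, not the system's actual data model: the names `EvidenceEvent` and `CompetencyFingerprint`, and the exponential-moving-average smoothing with an assumed factor, are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class EvidenceEvent:
    """One observed moment in the accompanied learning process (hypothetical schema)."""
    concept: str
    autonomy: float    # 0.0 = fully guided, 1.0 = solved independently
    transfer: float    # 0.0 = original context only, 1.0 = novel context
    reflection: float  # 0.0 = knows that it works, 1.0 = explains why

@dataclass
class CompetencyFingerprint:
    """Continuous aggregation: each observation nudges the profile slightly,
    so no single moment determines the result - there is no 'grade'."""
    alpha: float = 0.1  # smoothing factor (assumed value)
    profile: dict = field(default_factory=dict)  # concept -> (autonomy, transfer, reflection)

    def observe(self, ev: EvidenceEvent) -> None:
        a, t, r = self.profile.get(ev.concept, (0.5, 0.5, 0.5))
        self.profile[ev.concept] = (
            a + self.alpha * (ev.autonomy - a),
            t + self.alpha * (ev.transfer - t),
            r + self.alpha * (ev.reflection - r),
        )
```

Because each event moves the profile by only a small step, the fingerprint reflects hundreds of interactions rather than a single performance, which is what makes it a description of how the learner thinks rather than a score.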
Why it cannot be faked cheaply
A traditional exam requires the right answer at the right moment; cheating is a point event. Process evidence accumulated over months would require consistently faking every Socratic dialogue, simulating transfer across contexts, and maintaining a coherent profile over hundreds of interactions.
The threat model takes AI-assisted fraud seriously. A learner using a second AI to feed answers into the Mentor's dialogues could maintain a coherent fake profile without the sudden capability jumps the Audit-Agent normally checks for. The architecture answers with layered defences: behavioural biometrics (typing dynamics, hesitation patterns, response latency distributions) that are harder to simulate than content; periodic challenge probes — unexpected, timed tasks with latency constraints incompatible with an intermediary; and the progressive de-adaptation mechanism in L3, which strips scaffolding at critical thresholds.
No single defence is foolproof. The combination raises the cost of sustained fraud above the cost of actually learning the material.
The Mentor is the teacher, not the examiner.
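One of the layered defences above, the timed challenge probe, can be sketched as a latency-distribution check: relaying a probe through a second AI adds round-trip delay that falls outside the learner's own response-time distribution. The function name, baseline data, and z-score threshold below are all assumptions for illustration.

```python
import statistics

def probe_is_plausible(baseline_latencies, probe_latency, max_z=3.0):
    """Flag a probe response whose latency sits far above the learner's
    own distribution - the signature of an intermediary relaying the task.
    Fast responses are never penalised; only implausibly slow ones are."""
    mu = statistics.mean(baseline_latencies)
    sigma = statistics.stdev(baseline_latencies)
    z = (probe_latency - mu) / sigma
    return z <= max_z

baseline = [4.1, 3.8, 5.0, 4.4, 4.7, 3.9]   # seconds, hypothetical learner baseline
print(probe_is_plausible(baseline, 4.5))    # → True  (within the learner's own range)
print(probe_is_plausible(baseline, 12.0))   # → False (latency incompatible with no intermediary)
```

A real system would combine this with typing dynamics and hesitation patterns; no single check is decisive, which is exactly the point of layering.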
Separation of powers
The architecture enforces separation of powers. The Mentor (L1) generates evidence through accompaniment. An independent Audit-Agent — performing the function of a notary — verifies whether the performance is genuine: does it fit the profile? Are there sudden capability jumps? Stylistic inconsistencies?
This is the same Audit-Agent introduced in L1, operating in a different mode: in L1 it monitors the Mentor for bias and drift; in L6 it validates the evidence the Mentor produces. Systemic calibration operates across the entire system through statistical anomaly detection.
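The capability-jump check the Audit-Agent performs can be illustrated with a simple rolling-baseline detector. The window size and jump threshold are invented for this sketch; the paper does not specify the statistical method.

```python
def capability_jumps(scores, window=5, jump_threshold=0.3):
    """Flag indices where a competency score leaps above the recent rolling
    mean by more than `jump_threshold` - the kind of discontinuity that
    does not fit a genuine learning trajectory."""
    flagged = []
    for i in range(window, len(scores)):
        recent = scores[i - window:i]
        if scores[i] - sum(recent) / window > jump_threshold:
            flagged.append(i)
    return flagged

# Gradual progress, then a sudden leap - the leap gets flagged:
history = [0.30, 0.32, 0.35, 0.34, 0.38, 0.40, 0.95, 0.97]
print(capability_jumps(history))  # → [6, 7]
```

Note that the index after the jump is also flagged while the rolling window still contains pre-jump values; a production detector would handle that decay explicitly.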
The teacher and the examiner are different roles, in different execution contexts, with different incentive structures. A platform that bundles them creates an incentive to grade learners well in order to keep them paying. The architecture makes that bundling impossible.
Anchoring evidence to the graph
Evidence gains weight through graph anchoring. Fractions used successfully in algebra, then in integral calculus, then in physics problems, are implicitly confirmed hundreds of times. The higher floors validate the lower ones.
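The anchoring idea can be sketched as a small dependency graph in which every successful downstream use of a concept implicitly re-confirms it. The class and method names are illustrative; the actual Knowledge Graph representation is not specified in this section.

```python
from collections import defaultdict

class EvidenceGraph:
    """Edges point from a prerequisite concept to a concept built on it."""

    def __init__(self):
        self.used_by = defaultdict(set)  # concept -> concepts that used it

    def record_use(self, prerequisite, dependent):
        self.used_by[prerequisite].add(dependent)

    def downstream_confirmations(self, concept):
        """Count all transitively dependent concepts: each one is an
        implicit confirmation of the lower floor."""
        seen, stack = set(), [concept]
        while stack:
            for nxt in self.used_by[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return len(seen)

g = EvidenceGraph()
g.record_use("fractions", "algebra")
g.record_use("algebra", "integral calculus")
g.record_use("integral calculus", "physics")
print(g.downstream_confirmations("fractions"))  # → 3
```

The higher floors validating the lower ones is, in graph terms, just this transitive count: evidence for fractions grows every time algebra, calculus, or physics succeeds on top of it.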
A credential issued from this system therefore carries information no traditional credential can. It does not say "passed the exam." It says "demonstrated this competency at this level of autonomy and transfer, confirmed by N downstream uses, witnessed continuously between dates X and Y." The credential travels with the evidence graph behind it, and the graph is verifiable without exposing the underlying process.
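Under loose assumptions, such a credential could be represented as a small record plus a commitment to the evidence graph. Every field name below is hypothetical, and a plain SHA-256 over canonical JSON stands in for whatever commitment scheme the real system uses to make the graph verifiable without exposing the underlying dialogues.

```python
import hashlib
import json
from dataclasses import dataclass

def commit(evidence: dict) -> str:
    """Hash commitment over the evidence graph: the credential carries only
    this digest, so a verifier can check integrity without seeing the
    process itself (SHA-256 is an illustrative stand-in)."""
    payload = json.dumps(evidence, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

@dataclass(frozen=True)
class Credential:
    competency: str
    autonomy_level: float     # observed level of independent work
    transfer_level: float     # observed breadth of cross-context use
    downstream_uses: int      # N confirmations from the evidence graph
    witnessed_from: str       # start of the continuous observation window
    witnessed_to: str         # end of the observation window
    evidence_root: str        # commitment to the evidence graph
```

The contrast with "passed the exam" is visible in the fields themselves: the record carries levels, a confirmation count, and an observation window rather than a single score.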
Reference
Architecture paper, Section 5, L6. DOI: 10.5281/zenodo.18759134. CC BY 4.0.