22 April 2026
Preference inference as infrastructure: the MEE framework
Ask most teams how their agents handle user preferences and you'll get a list of if-statements. The user likes terse summaries, so terse summaries are hard-coded. That holds until the preferences conflict, or change, or were never stated out loud. The MEE framework (preference inference treated as a first-class, weighted signal layer rather than a set of rules) starts from a different premise: preference is something you infer from evidence and re-weight over time, and that changes what an agent can do.
The if-statement version of preference looks reasonable until it meets reality. The user likes terse summaries, so terse summaries are hard-coded. Then three things happen. The preferences conflict: terse, except on an unfamiliar topic, where more context is wanted. They change: what was true in March is wrong by May. And most of them were never stated at all; they only show up in how the person reacts. A rule cannot hold “usually terse, more verbose on unfamiliar ground, and drifting terser over time.” A rule is a snapshot of a moving thing.
The reframe is to stop treating preference as a setting and start treating it as a signal: something you infer from evidence and re-weight as the evidence accumulates. And then to put it where settings never sit, underneath every model, as a layer they all read, rather than inside each application as a config file each one maintains separately.
Inference, not assumption
The mechanism matters, because “infer preferences” is where most systems quietly start guessing. The engine reads your conversation history across providers and extracts signals against ten cognitive dimensions: how you want things formatted, how much you want claims sourced, how you handle ambiguity, how fast you move from options to a decision. Each signal is scored, not switched on. The score weights specificity, evidence, recency, stability, and portability, and subtracts for contradictions. A candidate that scores below the threshold is marked unstable, and an unstable signal does not get applied; the engine asks instead. That single rule is the line between inference and assumption.
Signals also do not graduate to truth on first sight. They move through gates: staged, stable, rejected. A preference becomes stable only after it has been observed across many sessions and more than one provider. A one-off, said in a bad mood, stays staged. A contradicted one is rejected. And when a preference genuinely changes, the old one is not deleted; it is dated. The ledger keeps the timeline, because a preference history you can read is worth more than a single current value you have to trust.
Memory remembers what you said. This compiles how you want to be answered.
The output is not a profile of tags. It is a set of instructions the model receives before you type, and every instruction carries its own evidence. It shows the dimension the rule belongs to, its strength, and the number of observations behind it: “cite sources when making empirical claims · evidence orientation · high · 31 observations.” Reject the rule and it is gone. Nothing about it is a black box, because every line traces back to something you actually did.
Why it is infrastructure, not a feature
The test of infrastructure is that the same primitive runs everything built on top of it. This one does. The harness we run NOMARK's own work through uses the same engine to read every instruction we give it, classifying the input, resolving preferences with the same weighted scorer, asking only for what it cannot responsibly infer. Inference has a floor written into it: fields are inferable, defaultable, or must-ask, and the engine never infers a must-ask field. Personalisation that guesses on the things that carry real consequence is not personalisation; it is risk wearing its coat.
The shift underneath all of it is small to state and large in effect. The default answer to AI personalisation has been memory: make the model remember what you said. Recall is necessary and nowhere near sufficient. Knowing your facts is not the same as knowing how to answer you, and the second one is the part that has to be inferred, weighted, dated, and carried across every model you use. Treated as rules, preference is brittle and invisible. Treated as a weighted, evidence-tracked signal layer, it is something you can audit and correct. That is the whole argument for building it as infrastructure. The product is at nomark.ai; the framework is the point.