A design language is tested at its ugly edges, so this demo is deliberately a failure: three days into an intermittent-502 investigation, every hypothesis has been refuted, the strongest lead has turned out to be a small-sample artifact (recorded as an inflection, not erased), and nothing has reached grounded. The outcome points at an open question: the deliverable is an eliminated-causes map and a concrete next probe, not a confident story. If the agent can't say "I know", the interface must make "I don't know yet, and here is exactly what I ruled out" a first-class, useful answer.