The data model behind a shape, what adherence modes do to capture and extraction, and why the same shape governs capture, extraction, and retrieval.
A shape is the contract for a slice of your graph. It defines the entity types
that can exist, the properties each type carries, and the relationships that
connect them. The same shape that decides what you can capture also decides what
extraction is allowed to produce and what retrieval reads back. When you author a
shape in the Workbench, you are setting the rules for all
three at once.
Property: a field on an entity type, with a kind and required/optional status. Example: content (required), domain.
Relationship: a typed edge between two entity types. Example: evidence SUPPORTS finding.
When you materialize the shape, its declared types and relationships become the
only kinds of node and edge the runtime will write for that project. An entity that
does not match a type in the shape, or a property the type does not declare, has
nowhere to land.
shape
└─ entity type
   ├─ property
   └─ relationship → entity type
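The hierarchy above can be sketched as plain Python data classes. This is an illustrative model only, not Penumbra's internal representation or SDK: every class name, the "string" kind, the shape name research_sketch, the placement of content and domain on finding, and the edge direction from evidence to finding are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Property:
    name: str
    kind: str               # value kind, e.g. "string" (kind names are assumed)
    required: bool = False


@dataclass(frozen=True)
class Relationship:
    name: str               # edge type, e.g. "SUPPORTS"
    target: str             # name of the entity type the edge points to


@dataclass
class EntityType:
    name: str
    properties: list[Property] = field(default_factory=list)
    relationships: list[Relationship] = field(default_factory=list)


@dataclass
class Shape:
    name: str
    entity_types: dict[str, EntityType] = field(default_factory=dict)

    def allows_edge(self, source: str, edge: str, target: str) -> bool:
        """True only if the source type declares this edge to this target."""
        source_type = self.entity_types.get(source)
        if source_type is None or target not in self.entity_types:
            return False
        return Relationship(edge, target) in source_type.relationships


# Hypothetical shape built from the example rows above.
shape = Shape("research_sketch", {
    "evidence": EntityType(
        "evidence",
        relationships=[Relationship("SUPPORTS", "finding")],
    ),
    "finding": EntityType(
        "finding",
        properties=[
            Property("content", "string", required=True),
            Property("domain", "string"),
        ],
    ),
})
```

In this sketch an edge exists only where the source type declares it, so asking for the reverse edge (finding SUPPORTS evidence) returns False.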
Every shape carries an adherence mode that controls how strictly capture and
extraction must conform to the model. You set it on the shape; it applies to
everything written through that shape.
strict
Only declared types, properties, and relationships are accepted. A capture
or extraction that introduces an unknown type or an undeclared property, or
that omits a required field, is rejected rather than written.
Use strict when the graph feeds something downstream that depends on a fixed
schema (typed accessors, exports, reporting) and you want guarantees over
coverage.
loose
The declared model is the guide, not a hard boundary. Capture and extraction
aim for the shape but tolerate additional properties and partial matches,
so source material that does not fit cleanly still produces entities.
Use loose when you are exploring a domain or ingesting messy material and
would rather keep an imperfect entity than drop it.
Adherence is a property of the shape, not of a single call. To change how
conformant writes must be, change the shape’s mode and re-materialize.
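To make the difference between the two modes concrete, here is a minimal sketch of a conformance check. It is not the Penumbra runtime: Adherence, ShapeViolation, check_write, and the DECLARED table are hypothetical names, and the treatment of an unknown type under loose (reported, not rejected) is an assumption, since the description above only covers additional properties and partial matches.

```python
from enum import Enum


class Adherence(Enum):
    STRICT = "strict"
    LOOSE = "loose"


class ShapeViolation(Exception):
    """Raised when a strict shape rejects a write."""


# Hypothetical declared model: type name -> {property name: is_required}
DECLARED = {"finding": {"content": True, "domain": False}}


def check_write(entity_type: str, props: dict, mode: Adherence) -> list[str]:
    """Return the conformance problems found; raise instead if mode is strict."""
    declared = DECLARED.get(entity_type)
    problems = []
    if declared is None:
        # Assumption: loose mode reports an unknown type rather than rejecting it.
        problems.append(f"unknown type: {entity_type}")
    else:
        problems += [f"undeclared property: {p}" for p in props if p not in declared]
        problems += [
            f"missing required: {p}"
            for p, required in declared.items()
            if required and p not in props
        ]
    if problems and mode is Adherence.STRICT:
        raise ShapeViolation("; ".join(problems))
    return problems  # loose: the write proceeds, problems are informational
```

The mode is passed in once per shape in this sketch, mirroring the point that adherence belongs to the shape and not to an individual call.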
This is why the modeling work pays off in practice. If a type is missing a
property, extraction cannot capture that field no matter how clearly it appears in
the source — there is nowhere to put it. If a relationship is not declared,
neither capture nor extraction can draw that edge, so traversal later will not
find it. The shape is the ceiling on what your graph can know.
Before you extract a new kind of source, check that the shape already has the
types and properties you expect to find. Authoring the model first is cheaper
than re-extracting after you notice the gap.
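One way to run that check before extracting is to compare the fields you expect against what the shape declares. The helper below is a hypothetical sketch over plain dictionaries, not a Penumbra API; the property names (title, url, published_at, name) and the author type are invented for illustration.

```python
def find_gaps(declared: dict[str, set[str]],
              expected: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each expected type to the fields the shape does not declare."""
    gaps = {}
    for type_name, fields in expected.items():
        if type_name not in declared:
            # The whole type is missing, so every expected field is a gap.
            gaps[type_name] = sorted(fields)
            continue
        missing = sorted(set(fields) - set(declared[type_name]))
        if missing:
            gaps[type_name] = missing
    return gaps


# What the shape declares today (hypothetical property names).
declared = {"source": {"title", "url"}, "finding": {"content", "domain"}}

# What you expect to find in the new kind of source material.
expected = {"source": {"title", "url", "published_at"}, "author": {"name"}}

gaps = find_gaps(declared, expected)
# gaps == {"source": ["published_at"], "author": ["name"]}
```

An empty result means the shape already covers everything you expect; anything else is a property or type to add before extracting.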
You do not start from nothing. Penumbra ships a set of system shapes that are
already materialized and ready to use, including two you will reach for early.
A starting model for research work, with the entity types inquiry, source,
evidence, finding, open_question, and research_note. Use it as-is to
structure investigation, or open it in the Workbench and extend it.
A shape governs structure; planes govern role. The two are
orthogonal — the same shape’s entities can sit on different planes depending on
how they were written.
Each plane and what lives there:
Semantic: the default canonical surface. Capture and extraction land here.
Memory: where pb.memory writes and reads; recall defaults here.
Archival: retired memory and history, kept out of the active surface.
Meta: hidden internal infrastructure. You do not write to it.
In practice: when you pb.capture or pb.extract through a shape, the entities
land on the semantic plane. When you write through pb.memory, the memory
shape’s entities land on the memory plane, and recall reads them back from
there by default. Same data model, different plane, different default visibility.
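The routing described above amounts to a small lookup from write path to plane. The sketch below encodes it as plain Python for illustration; the Plane enum, the LANDING_PLANE table, and landing_plane are hypothetical names, and the string keys stand in for pb.capture, pb.extract, and pb.memory without calling them.

```python
from enum import Enum


class Plane(Enum):
    SEMANTIC = "semantic"
    MEMORY = "memory"
    ARCHIVAL = "archival"
    META = "meta"


# Write path -> plane where its entities land, per the description above.
# Archival and meta are absent: this section names no write path that lands
# on archival, and you do not write to meta.
LANDING_PLANE = {
    "capture": Plane.SEMANTIC,
    "extract": Plane.SEMANTIC,
    "memory": Plane.MEMORY,
}


def landing_plane(write_path: str) -> Plane:
    """Look up the plane a write path lands on; unknown paths are an error."""
    try:
        return LANDING_PLANE[write_path]
    except KeyError:
        raise ValueError(f"no write path named {write_path!r}") from None
```

Keeping the plane out of the shape definition in this sketch reflects the point that structure and role are orthogonal: the same entity type could appear under either key.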