mareforma.open(path=None, *, ...)
Open the epistemic graph and return an EpistemicGraph.
| Name | Type | Default | Description |
|---|---|---|---|
path | str | Path | None | None | Project root. Graph stored at <path>/.mareforma/graph.db. Created on first use. |
key_path | str | Path | None | None | Ed25519 private key (PEM). None → use the XDG default ~/.config/mareforma/key. If the path does not exist, the graph operates unsigned. |
require_signed | bool | False | Raise KeyNotFoundError if no key is found at key_path. |
rekor_url | str | None | None | Sigstore-Rekor transparency log endpoint. When set, every signed claim is submitted at INSERT time. Use mareforma.signing.PUBLIC_REKOR_URL for the public instance. |
require_rekor | bool | False | Raise SigningError if rekor_url is unset or initial submission fails. |
trust_insecure_rekor | bool | False | Skip SSRF validation on rekor_url (only for private Rekor instances on internal networks). |
rekor_log_pubkey_pem | bytes | None | None | PEM-encoded Rekor log operator public key. Opts the session into RFC 6962 Merkle inclusion-proof verification: every signed-claim submit and every refresh_unsigned() re-fetches the entry and cryptographically verifies the audit path against the log’s signed checkpoint. Verification failure refuses to mark transparency_logged=1. Supports Ed25519 + ECDSA secp256r1; other curves raise RekorInclusionError(reason="unsupported_key"). Mutually exclusive with rekor_log_pubkey_path. The supplied PEM persists to .mareforma/rekor_log_pubkey.pem as a TOFU pin. |
rekor_log_pubkey_path | str | Path | None | None | Filesystem path to a PEM file holding the Rekor log operator public key. Read once at open() time; equivalent to passing the bytes via rekor_log_pubkey_pem. Mutually exclusive with rekor_log_pubkey_pem. |
EpistemicGraph
Raises
DatabaseError: if the database cannot be opened or the schema cannot migrateKeyNotFoundError: ifrequire_signed=Trueand no key is foundSigningError: ifrequire_rekor=Trueand the Rekor URL is unset or unreachable; or if the supplied Rekor log pubkey conflicts with the TOFU pin on.mareforma/rekor_log_pubkey.pemValueError: if bothrekor_log_pubkey_pemandrekor_log_pubkey_pathare supplied (mutually exclusive)
mareforma bootstrap once to generate an Ed25519 keypair at
~/.config/mareforma/key. After that, every assert_claim auto-signs.
TOFU pin behavior. When rekor_log_pubkey_pem (or rekor_log_pubkey_path) is supplied, the canonical DER bytes of the public key are persisted to .mareforma/rekor_log_pubkey.pem. Every subsequent mareforma.open() on the same project compares the supplied key against the pinned PEM and refuses silent rotation; you must delete the pin file to intentionally rotate. The first-pin write uses O_CREAT|O_EXCL, so two concurrent open() calls with different keys cannot race past existence checks and silently overwrite each other; the loser raises SigningError("...pinned to a different key by a concurrent ... call").
mareforma.schema()
Return the epistemic schema: valid values, defaults, and state transitions.
dict: stable across patch releases within a major version.
mareforma.restore(project_root, *, claims_toml=None, rekor_log_pubkey_pem=None)
Rebuild a fresh graph.db from claims.toml for catastrophic-loss
recovery. Fresh-only: refuses to run if graph.db already has any
claims. Every signature is verified before any row is inserted;
fail-all-or-nothing.
| Name | Type | Default | Description |
|---|---|---|---|
project_root | str | Path | required | Project directory. graph.db is reconstructed under <root>/.mareforma/. |
claims_toml | str | Path | None | None | Path to source TOML. Defaults to <project_root>/claims.toml. |
rekor_log_pubkey_pem | bytes | None | None | PEM-encoded Rekor log operator public key. When supplied, every [rekor_inclusions] entry’s Merkle inclusion proof is cryptographically verified before replay. Verification failure raises RestoreError(kind='rekor_inclusion_invalid'). When None, sidecar entries are replayed unverified. |
dict with validators_restored and claims_restored counts.
Raises mareforma.db.RestoreError with .kind field:
.kind | Meaning |
|---|---|
graph_not_empty | Target graph.db already has claims. |
toml_not_found | claims.toml does not exist. |
toml_malformed | TOML parse error. |
enrollment_unverified | A validator’s enrollment envelope failed verification. |
claim_unverified | A claim’s signature_bundle or validation_signature failed verification. |
mode_inconsistent | Signed-mode graph (validators enrolled) contains a claim with no signature. |
orphan_signer | A claim is signed by a keyid not present in the validators section. |
rekor_inclusion_invalid | A [rekor_inclusions] entry failed verification, references a missing claim, or is missing required fields. |
EpistemicGraph
Returned bymareforma.open(). Do not instantiate directly.
assert_claim(text, *, classification, generated_by, supports, contradicts, source_name, idempotency_key, status, artifact_hash, seed)
Assert a claim into the graph.
| Name | Type | Default | Description |
|---|---|---|---|
text | str | required | Falsifiable assertion. Cannot be empty or whitespace. Capped at 100,000 characters. |
classification | str | "INFERRED" | INFERRED | ANALYTICAL | DERIVED |
generated_by | str | None | "agent" | Agent identifier. Independence signal for REPLICATED detection. |
supports | list[str] | None | None | Upstream claim_ids or DOIs this claim rests on. Cycles are rejected (CycleDetectedError). |
contradicts | list[str] | None | None | Claim_ids this finding is in explicit tension with. |
source_name | str | None | None | Data source. Stored as-is; required for ANALYTICAL to be meaningful. |
idempotency_key | str | None | None | Unique key: same key returns existing claim_id, no INSERT. |
status | str | "open" | open | contested | retracted |
artifact_hash | str | None | None | SHA-256 hex digest of the output bytes backing the claim. When both converging peers supply a hash, the hashes must match for REPLICATED. |
seed | bool | False | Insert directly at ESTABLISHED with a signed seed envelope. Only an enrolled validator can produce a seed. Used to bootstrap the ESTABLISHED-upstream chain on a fresh project. |
str: claim_id UUID
Raises
ValueError: iftextis empty,classificationis invalid,statusis invalid, or text exceeds 100k charsCycleDetectedError: ifsupports[]would create a cycle (A → ... → A)IdempotencyConflictError: if the sameidempotency_keyreplays with a differentartifact_hashIllegalStateTransitionError: if a transition violates the state-machine triggersChainIntegrityError: if theprev_hashappend-only chain check failsDatabaseError: on SQLite write failure
supports[] with different generated_by values, all are promoted to REPLICATED. The promotion fires only when at least one upstream is itself ESTABLISHED (Cochrane / GRADE evidence chain, strict by default). When both peers supply artifact_hash, the hashes must match.
Idempotency
idempotency_key solves two distinct problems:
Retry safety: same key returns the existing claim_id with no INSERT. Use whenever a run may be interrupted and retried:
claim_id even with different text, without needing explicit supports= links:
"target_T_condition_C", not a random run ID.
A replay that supplies a different artifact_hash than the original raises IdempotencyConflictError rather than silently dropping the new hash.
query(text=None, *, min_support=None, classification=None, limit=20, include_unverified=False)
Query claims ordered by support level (descending) then recency (descending).
| Name | Type | Default | Description |
|---|---|---|---|
text | str | None | None | Case-insensitive substring filter on claim text. |
min_support | str | None | None | PRELIMINARY | REPLICATED | ESTABLISHED |
classification | str | None | None | INFERRED | ANALYTICAL | DERIVED |
limit | int | 20 | Maximum number of results. |
include_unverified | bool | False | When False, PRELIMINARY claims whose signing key is not in the validators table are excluded. Pass True to surface unverified preliminary claims. |
include_invalidated | bool | False | When False, claims marked invalid by a signed contradiction_verdicts row (t_invalid IS NOT NULL) are excluded. Pass True for audit / history queries. |
list[dict]: each dict contains:
claim_id, text, classification, support_level, idempotency_key,
validated_by, validated_at, status, source_name, generated_by,
supports_json, contradicts_json, comparison_summary, branch_id,
unresolved, signature_bundle, transparency_logged, artifact_hash,
validation_signature, validator_keyid, prev_hash,
ev_risk_of_bias, ev_inconsistency, ev_indirectness,
ev_imprecision, ev_pub_bias, evidence_json, statement_cid,
t_invalid, created_at, updated_at. Plus two reputation projections computed at query time:
validator_reputation: int: for ESTABLISHED rows, the count of ESTABLISHED claims signed by the same validator.0otherwise.generator_enrolled: bool:Trueiff the claim’s signing keyid is in the validators table.
ValueError if min_support or classification is invalid.
search(query, *, min_support=None, classification=None, limit=20, include_unverified=False)
Full-text search over claim text using SQLite FTS5 (unicode61 tokenizer,
diacritics folded). Returns claim dicts ordered by FTS5 rank (best
match first). Same projection as query().
"*", "**") are refused; they would scan
the whole table.
Parameters: same as query() except text becomes query and is
required (must be a non-empty FTS5 MATCH expression). include_invalidated
applies identically.
Raises ValueError on empty / pure-wildcard / malformed FTS5 syntax.
record_replication_verdict(*, verdict_id, cluster_id, member_claim_id, other_claim_id, method, confidence)
Insert a signed replication verdict. The graph’s loaded signer signs the
verdict; its keyid must be enrolled in the project’s validators table
(chain walk back to a self-signed root, same gate as validate()).
The OSS core accepts verdicts; the predicates that generate them
(semantic-cluster, cross-method, hash-match, shared-resolved-upstream)
live outside the OSS and call this method to write their output. Any
third-party verdict-issuer can integrate against this protocol.
| Name | Type | Default | Description |
|---|---|---|---|
verdict_id | str | required | Caller-supplied unique id. |
cluster_id | str | required | Shared across all verdicts in one replication cluster. |
member_claim_id | str | required | The claim being asserted as replicated. |
other_claim_id | str | None | None | Optional second member of the pair (None for single-row cross-method verdicts). |
method | str | required | One of hash-match / semantic-cluster / shared-resolved-upstream / cross-method. |
confidence | dict | None | None | Confidence values keyed by guard (e.g. cosine, nli_forward). Never fused into a single score. |
PRELIMINARY to
REPLICATED (only when still PRELIMINARY AND status='open' AND t_invalid IS NULL). INSERT + promotion run in one BEGIN IMMEDIATE
transaction so a concurrent contradiction cannot land between the writes.
Raises VerdictIssuerError: no signer loaded, signer’s keyid not
enrolled (or chain broken), method not in the allowed enum, referenced
claim_id missing.
record_contradiction_verdict(*, verdict_id, member_claim_id, other_claim_id, confidence)
Insert a signed contradiction verdict. Sets t_invalid on the older of
the two referenced claims via the contradiction_invalidates_older AFTER
INSERT trigger; default query() / search() then excludes the
invalidated claim.
| Name | Type | Default | Description |
|---|---|---|---|
verdict_id | str | required | Caller-supplied unique id. |
member_claim_id | str | required | First claim in the contradiction. |
other_claim_id | str | required | Second claim. Must differ; self-contradiction (member == other) is refused at the Python layer AND by a SQL CHECK constraint. |
confidence | dict | None | None | Confidence values describing the contradiction. |
VerdictIssuerError: same gates as record_replication_verdict,
plus self-contradiction.
replication_verdicts(*, member_claim_id=None, cluster_id=None, include_invalidated=False)
List signed replication verdicts, optionally filtered.
query(). Pass include_invalidated=True for
audit-mode listings.
Returns list[dict]: each dict carries verdict_id, cluster_id,
member_claim_id, other_claim_id, method, confidence_json,
issuer_keyid, signature (raw bytes), created_at.
contradiction_verdicts(*, claim_id=None, include_invalidated=False)
List signed contradiction verdicts, optionally filtered by either side
of the pair.
include_invalidated=True since the
contradiction verdict IS the evidence for invalidation; auditing “why
was this invalidated” requires audit mode.
Returns list[dict]: each dict carries verdict_id,
member_claim_id, other_claim_id, confidence_json, issuer_keyid,
signature, created_at.
get_validator_reputation()
Returns {validator_keyid: count} for every enrolled validator. Count
is the number of ESTABLISHED claims whose validation envelope was
signed by that keyid. Validators with zero promotions appear with
count=0. Derived state, recomputed on every call.
query_for_llm(text=None, *, min_support=None, classification=None, limit=20)
Same shape as query() with two changes: the text and comparison_summary fields are sanitized (zero-width / bidi / control characters stripped, length capped) AND wrapped in <untrusted_data>...</untrusted_data> delimiters; metadata labels (source_name, generated_by, validated_by) are sanitized but not wrapped.
Use this when retrieved claims will be spliced into an LLM prompt: claim text is written by earlier agents and may contain stored prompt-injection payloads.
<untrusted_data> is data) is the caller’s responsibility.
For one-off content that doesn’t come from the graph, mareforma.sanitize_for_llm(...) and mareforma.wrap_untrusted(...) are public primitives.
get_claim(claim_id)
Return a single claim dict by ID.
claim_id: str
Returns dict | None: None if not found. Same field shape as query().
validate(claim_id, *, validated_by=None, evidence_seen=None)
Promote a REPLICATED claim to ESTABLISHED. Identity-gated.
mareforma bootstrap or mareforma.open(key_path=...)) AND that key must be enrolled in the project’s validators table. The first key opened against a fresh graph auto-enrolls as the root validator. The validation event itself is signed: a DSSE-style envelope binding (claim_id, validator_keyid, validated_at, evidence_seen) is persisted to the row’s validation_signature column, so the promotion is independently verifiable.
validated_by is a cosmetic display label. The authenticated identity is the keyid embedded in the signed envelope.
evidence_seen is an optional list of claim_ids the validator declares to have reviewed before signing. None is normalized to [] and bound into the signed envelope as a positive “I reviewed nothing” admission. Each cited entry must be a strict-v4 UUID matching an existing claim with created_at <= validated_at. The validator’s enumeration is self-declared (mareforma cannot prove what was actually read), but the envelope shifts “a human pressed a button” to “a human pressed a button AND named the evidence they consulted.”
Parameters
| Name | Type | Default | Description |
|---|---|---|---|
claim_id | str | required | UUID of the claim to validate. |
validated_by | str | None | None | Display label of the human reviewer. The authoritative identity is the keyid in the signed envelope. |
evidence_seen | list[str] | None | None | Claim_ids the reviewer consulted. Normalized to [] and always bound into the signed envelope. |
validation_signature is supplied directly to db.validate_claim (advanced/test path), mareforma also decodes the envelope’s signed payload and refuses if its evidence_seen field disagrees with the evidence_seen kwarg. The signed envelope and the validated list must bind the same citations exactly (same items, same order); a direct caller cannot launder fraudulent citations through the on-disk envelope.
Raises
ClaimNotFoundError: if the claim does not existValueError: ifsupport_levelis notREPLICATED, no signer is loaded, or the loaded signer is not an enrolled validatorEvidenceCitationError: if anyevidence_seenentry is not a strict-v4 UUID, does not point to an existing claim, post-datesvalidated_at, or disagrees with the validation envelope’s signedevidence_seenfield
health()
Single-call audit summary aggregating core counters. Pure observability over existing surfaces; no side effects.
dict[str, int] with the seven keys above.
refresh_unresolved()
Retry external DOI verification for every claim currently flagged unresolved=1.
supports[]/contradicts[] are HEAD-checked against Crossref and DataCite at assert_claim time. If the registries are unreachable, the claim persists with unresolved=True and is ineligible for REPLICATED promotion until refresh_unresolved() confirms the DOIs.
Returns dict with keys checked, resolved, still_unresolved.
refresh_all_dois()
Force-re-resolve every DOI referenced anywhere in the graph, bypassing the 30-day positive cache. Use when you suspect a referenced DOI has been retracted or its registry state has changed since assertion.
newly_failed counts DOIs whose cache state flipped from resolved to unresolved (the drift signal operators usually want). Does NOT mutate support_level or per-claim unresolved flags; re-running a HEAD check is not strong enough evidence to demote across the trust ladder.
Returns dict with keys checked, still_resolved, now_unresolved, newly_failed.
refresh_convergence()
Retry convergence detection (PRELIMINARY → REPLICATED) for every claim flagged convergence_retry_needed=1. Without this method, a swallowed SQLite error during detection would leave the claim stuck at PRELIMINARY forever.
dict with keys checked, promoted, still_pending.
refresh_unsigned()
Retry transparency-log submission for every signed-but-unlogged claim when the graph was opened with rekor_url=....
rekor_url is unset.
Two recovery paths:
- Sidecar replay: when the original Rekor submission succeeded but the claims-row UPDATE failed (recorded in
rekor_inclusions), the stored coords are re-attached to the row in a single local UPDATE. No network call, no duplicate Rekor entry. - Re-submit: when no sidecar row exists, the envelope is submitted to Rekor again. Used only when the original submission has no persisted record.
assert_claim) is skipped with a warning.
Returns dict with keys checked, logged, still_unlogged.
find_dangling_supports()
Return UUID-shaped supports[] entries pointing to claims that do not exist in this graph. DOIs and other free-form strings are external references and are NOT flagged.
list[dict] of {"claim_id", "dangling_ref"} dicts sorted deterministically. Empty list when the graph is clean.
classify_supports(values)
Classify each entry in a supports[] / contradicts[] list as claim | doi | external. Pure-function (no network, no DB read).
list[dict] of {"value", "type"} dicts in input order.
enroll_validator(pubkey_pem, *, identity, validator_type="human")
Enroll an additional validator on this project. The currently loaded signer (which must itself be enrolled) signs the enrollment envelope.
| Name | Type | Default | Description |
|---|---|---|---|
pubkey_pem | bytes | required | Ed25519 public key in PEM form. |
identity | str | required | Human-readable label. Capped at 256 chars; control characters and Unicode display-spoofing forms are rejected. |
validator_type | str | "human" | "human" or "llm". Self-declared honesty signal, bound into the signed enrollment envelope. LLM-typed validators may sign validation envelopes but cannot promote a claim past REPLICATED: validate() refuses at the core. |
ValueError: no signer loaded, or the loaded signer is not enrolledInvalidIdentityError: identity contains rejected charactersInvalidValidatorTypeError:validator_typeis not'human'or'llm'ValidatorAlreadyEnrolledError: key already enrolled (the message distinguishes a normal duplicate from a chain-broken row)
list_validators()
Return all enrolled validators for this project, ordered by enrolled_at.
list[dict]: each dict carries keyid, pubkey_pem, identity, validator_type, enrolled_at, enrolled_by_keyid, enrollment_envelope.
get_tools(*, generated_by="agent")
Return [query_graph, assert_finding] as plain Python callables with behavioral contracts in their docstrings. Wrap with any framework’s tool adapter in one line.
| Name | Type | Default | Description |
|---|---|---|---|
generated_by | str | "agent" | Agent identifier baked into the closure. Set this to the calling agent’s identity so REPLICATED detection works across independent runs. |
list: [query_graph, assert_finding]
query_graph(topic, min_support="PRELIMINARY") -> str: routes throughquery_for_llm. Returns a JSON string of matching claims with free-text fields sanitized and wrapped in<untrusted_data>...</untrusted_data>.assert_finding(text, classification="INFERRED", supports=None, contradicts=None, source="") -> str: returnsclaim_id.
close()
Close the graph database connection.
__exit__ calls close() automatically.
Exceptions
Each exception lives in the submodule that raises it. Import from the submodule shown in the table.from mareforma import RekorInclusionError.
| Exception | Module | Raised by | Meaning |
|---|---|---|---|
DatabaseError | mareforma.db | open() and any write | SQLite error or unmigratable schema |
ClaimNotFoundError | mareforma.db | validate(), get_claim() callers | The claim does not exist |
SignedClaimImmutableError | mareforma.db | update_claim() | Tried to mutate a signed-payload field on a signed row |
IdempotencyConflictError | mareforma.db | assert_claim() | Idempotency-key replay with a conflicting artifact_hash |
IllegalStateTransitionError | mareforma.db | DB triggers | Transition violates the state-machine (e.g. PRELIMINARY → ESTABLISHED) |
ChainIntegrityError | mareforma.db | DB write | prev_hash append-only chain check failed |
CycleDetectedError | mareforma.db | assert_claim(), update_claim() | supports[] would form a cycle |
VerdictIssuerError | mareforma.db | record_replication_verdict(), record_contradiction_verdict() | No signer loaded, signer not enrolled (chain walk fails), invalid method enum, referenced claim missing, or self-contradiction (member == other) |
EvidenceCitationError | mareforma.db | validate() with evidence_seen=... | A cited entry isn’t a strict-v4 UUID, points to a non-existent claim, post-dates validated_at, or disagrees with the envelope’s evidence_seen field |
InvalidValidationEnvelopeError | mareforma.db | db.validate_claim() | Validation envelope is structurally / cryptographically invalid: malformed JSON, wrong payloadType, signer not enrolled, signature does not verify against the claimed signer, or payload binds a different claim_id / validator_keyid / timestamp than the row being promoted |
KeyNotFoundError | mareforma.signing | open(require_signed=True) | No private key at key_path |
SigningError | mareforma.signing | Rekor submission, TOFU pin conflict | Rekor URL unset / unreachable / invalid; or supplied log pubkey conflicts with the pinned key on .mareforma/rekor_log_pubkey.pem |
InvalidEnvelopeError | mareforma.signing | verify_envelope, claim_predicate_from_envelope | Envelope payload is malformed, wrong payloadType, or subject and predicate disagree on claim_id / text digest |
RekorInclusionError | mareforma.signing | verify_rekor_inclusion, verify_rekor_checkpoint, fetch_inclusion_proof, fetch_log_pubkey, and the submit + refresh paths when rekor_log_pubkey_pem is set | RFC 6962 inclusion proof failed cryptographic verification. Stable .reason token: missing_proof, malformed_proof, bad_root_hex, bad_proof_hex, merkle_root_mismatch, checkpoint_missing, checkpoint_malformed, checkpoint_root_mismatch, checkpoint_unsigned, checkpoint_bad_sig, unsupported_key. Callers pattern-match on the reason without parsing English. |
InvalidIdentityError | mareforma.validators | enroll_validator() | Identity contains rejected characters |
ValidatorAlreadyEnrolledError | mareforma.validators | enroll_validator() | Validator row already exists |
BundleVerificationError | mareforma.export_bundle | mareforma verify | Signed export bundle failed DSSE or per-claim subject digest |
from mareforma import RekorInclusionError without remembering the
submodule. The submodule paths in the table tell you where the source
lives.
Schema-mismatch message (raised when open_db() finds a graph.db whose user_version differs from mareforma’s _SCHEMA_VERSION): "graph.db has user_version=N but this mareforma expects user_version=M. The dev branch does not migrate schemas. Delete .mareforma/graph.db to start fresh; claims.toml is a human-readable record of the prior state."
v0.3.3 surface
mareforma.open_db_from_db_path(db_path)
Open the graph DB from a direct path to graph.db, not a project root. Honours the supplied filename when it sits outside the conventional <root>/.mareforma/graph.db layout (where mareforma.open() would re-derive that path).
ingest, ask, narrative) when --db points outside .mareforma/.
Capability-shaped predicate URI constants
mareforma.predicate_types exposes URN-form constants for every reserved predicate. Re-exported at the top level so adapter authors can from mareforma import TOOL_CALL_V1.
Core-owned (writer in the core):
CLAIM_V1:urn:mareforma:predicate:claim:v1EPISTEMIC_GRAPH_V1:urn:mareforma:predicate:epistemic-graph:v1CLAIM_WITH_ROLES_V1:urn:mareforma:predicate:claim-with-roles:v1
mareforma.adapters.* or a third-party adapter):
TOOL_CALL_V1:urn:mareforma:predicate:tool-call:v1CONTAINER_EXEC_V1:urn:mareforma:predicate:container-exec:v1CODE_VARIATION_V1:urn:mareforma:predicate:code-variation:v1HYPOTHESIS_V1:urn:mareforma:predicate:hypothesis:v1LITERATURE_INSIGHT_V1:urn:mareforma:predicate:literature-insight:v1SCIENCE_SKILL_V1:urn:mareforma:predicate:science-skill:v1META_CLAIM_V1:urn:mareforma:predicate:meta-claim:v1WORKSHOP_EVENT_V1:urn:mareforma:predicate:workshop-event:v1AGENT_TRACE_V1,INGESTED_TRACE_V1,LLM_OUTPUT_V1,REVIEW_V1,PEER_REVIEW_V1,ELO_MATCH_V1,TOURNAMENT_BRACKET_V1: additional reserved namespaces.- Wet-lab assay family:
WET_LAB_ASSAY_V1plus_FLOW_CYTOMETRY,_SEQUENCING,_IMAGING,_PROTEOMICS,_ELECTROPHYSIOLOGYsiblings. REPLICATION_ATTESTATION_V1,COMPOUNDING_ATTESTATION_V1,SEMANTIC_GROUNDING_V1,DOI_RESOLUTION_V1.
mareforma.events
Typed Protocol contract for adapter event sources.
EventSource and EventHandler are @runtime_checkable Protocols: isinstance(obj, EventSource) at adapter construction time fails loudly on a missing subscribe / unsubscribe / handle_event attribute. EventPayload and ClaimResult are TypedDicts. Source-name constants prevent string-typo dispatch bugs.
mareforma.tools
Structural contract for any wrappable callable.
mareforma.canonicalize
Registry-based canonicalizer surface for adapter authors. Distinct from the internal envelope canonicaliser (mareforma._canonical); adapters use the public registry so claim result_canonical_form fields can name forms by registered string.
mareforma.canonicalize also registers the specialty forms rdkit-canonical-smiles-v1, fasta-nfc-v1, pdb-atom-sorted-v1 via the auto-imported specialty submodule. RDKit canonicaliser gracefully falls back to NFC-stripped UTF-8 when rdkit isn’t installed.
mareforma.derivation
Core-derived classification. Deterministically derives ANALYTICAL vs INFERRED from a static source-code profile plus dynamic runtime-log templates.
pip install mareforma[derivation] (tree_sitter + tree_sitter_python); log-template extraction (Drain parser) is pure stdlib.
mareforma.hooks
Claude Code PreToolUse handler. Records every tool invocation as a prov:Activity row in the project’s .mareforma/graph.db.
Opt in via .claude/settings.json:
agent_activities table is part of the canonical schema; the hook routes through mareforma.db.open_db so it inherits foreign-keys PRAGMA + schema validation. Hook exit is always 0; failures log to stderr but never propagate.
mareforma.adapters.*
Three opt-in adapter packages: see Mareforma adapters (under the Adapter framework section) for full integration examples. Quick reference:
| Package | Install | What it does |
|---|---|---|
mareforma.adapters.clawinstitute | pip install mareforma[clawinstitute] | Workshop-event hook with HttpxClient, EventSource Protocol implementation, 3-layer content sanitisation, 8 typed exceptions. |
mareforma.adapters.tooluniverse | pip install mareforma[tooluniverse] | ProvenanceToolAdapter wraps any mareforma.tools.Tool; each call records a signed tool-call:v1 claim. Container-exec class tools route to container-exec:v1. |
mareforma.adapters.gemini | pip install mareforma[gemini] | Read-only OutputIngester for 4 Gemini for Science capabilities. Per-capability REQUIRED_FIELDS validation; string payload values sanitised. |
v0.3.4 surface
mareforma.trust
The trust layer turns a free-text claim into a structured finding: a
content-addressed proposition, a pre-registered prediction, a computed bearing,
and a derived status. See Findings for the narrative. It
is additive: every finding still rides a signed claim, so it appears in
query() with a support_level like any other claim.
Proposition(subject, relation, object, direction=Direction.UNSPECIFIED, scope={}, magnitude=None)
Frozen, value-typed, content-addressed claim about the world.
| Method | Returns | Description |
|---|---|---|
content_id() | str | sha256 over normalized (subject, relation, object, scope, direction, magnitude), the answer. Same truth conditions give the same id. |
frame_id() | str | Same hash with direction and magnitude dropped, the question. |
is_falsifiable() | bool | True iff direction is committed (not UNSPECIFIED) and scope is non-empty. |
same_as(other) | bool | Equal content_id. |
contradicts(other) | bool | Same frame_id, contrary directions. |
to_dict() / from_dict(d) | dict / Proposition | Round-trip serialization. |
Direction enum: INCREASES, DECREASES, NO_EFFECT (one contrary family),
PRESENT, ABSENT (a second), UNSPECIFIED (the rejection sentinel, never
stored). REGISTRABLE_DIRECTIONS is the closed set of the five storable values.
normalize_token(s) exposes the NFC + casefold + whitespace-collapse used at
identity time.
Prediction(test_type, alpha=0.05, *, direction_of_interest=None, equivalence_lower=None, equivalence_upper=None, preregistered=False, inference_regime=InferenceRegime.FREQUENTIST)
The pre-registered decision rule. TestType.SUPERIORITY requires
direction_of_interest (DirectionOfInterest.INCREASE / DECREASE);
TestType.EQUIVALENCE requires equivalence_lower < equivalence_upper. Raises
ValueError on an inconsistent combination.
EffectEstimate(estimate_value, effect_type, scale=Scale.RAW, *, p_value=None, ci_lower=None, ci_upper=None, ci_level=None, n_total=None)
A point estimate plus the uncertainty the gate needs. Supply a p_value, a full
(ci_lower, ci_upper, ci_level) triple, or both. Validates on construction and
raises InconsistentEstimateError for a non-finite value, a partial CI triple,
p_value outside [0, 1], ci_level outside (0, 1), a CI that does not
bracket the estimate, or a non-positive n_total.
EffectType:SMD,Hedges_g,OR,logOR,RR,HR,COR,ZCOR,MD,ROM,beta,log2FC,GEN(metaformeasurevalues).Scale:raw/log.null_value(effect_type, scale): the “no effect” value:1for raw-scale ratio types (OR/RR/HR/ROM),0otherwise.Contrast(control_type=ControlType.NEGATIVE)andEvidenceLine(estimate, data_id, contrast=…, modality=…, provenance_id=…, design_type=…)build the evidence tree; a finding may carry several lines.data_idis the dataset guard the run-distinct independence count uses so the same dataset is not counted twice.
compute_bearing(estimate, prediction) -> Bearing
The gate. Returns a frozen Bearing(direction, significant) where direction
is BearingDirection.SUPPORTS / REFUTES / NEUTRAL. Raises
InconsistentEstimateError when the estimate cannot drive the requested gate
(e.g. an equivalence test with no CI, or a CI at the wrong level for the
prediction’s alpha).
gates_for(prediction) returns the same decision rule as an ordered
short-circuit gates[] chain, and evaluate_gates(estimate, gates) runs it.
A one-element chain is bearing-identical to compute_bearing; a multi-gate
chain raises NotImplementedError until its precedence is designed.
compute_status(independent_support, independent_refute) -> Status
The state machine: UNTESTED (0, 0); CONTESTED (≥1 support and ≥1 refute);
REFUTED (≥1 refute, 0 support); CORROBORATED (≥2 support, 0 refute);
PRELIMINARY (exactly 1 support, 0 refute). compute_frame_status(contrary_independent_support)
returns FrameStatus.CONTESTED when a contrary proposition in the same frame
has ≥1 independent supporting line, else CONSISTENT. Independence is counted by
distinct run (generated_by) with a data_id guard, so one run yields at most
one support and one refute. STATUS_POLICY is the policy stamp
("status_policy@v2"), independent of the package version.
Errors: TrustError (base), NonFalsifiablePropositionError,
InconsistentEstimateError, NoRegisteredPlanError, FindingPlanForkError.
EpistemicGraph trust methods
register_proposition(proposition) -> str
Register a falsifiable Proposition; returns its content_id. Idempotent on
content_id. Raises NonFalsifiablePropositionError if the proposition has no
direction or an empty scope.
register_plan(proposition, prediction, *, generated_by=None) -> str
Pre-register a decision rule against a proposition, before the numbers are
seen. Registers the proposition, writes the predictions row with
preregistered=1, and writes its own signed plan attestation claim (under
idempotency key plan:{plan_id}, Rekor-anchorable like any other claim).
Returns the content-addressed plan_id. Idempotent: re-registering the same
prediction is a no-op on both the row and the claim. Raises
NonFalsifiablePropositionError on a non-falsifiable proposition.
submit_finding(proposition, prediction, estimate=None, *, data_id=None, lines=None, generated_by=None, control_type=None, modality=None, provenance_id=None, design_type=None, code_ref=None, idempotency_key=None) -> dict
Submit a finding against an already-registered plan. Same shape and return as
assert_finding, with one difference: the plan must already exist (else
NoRegisteredPlanError), and the finding’s signed supports[] cites the plan
attestation’s claim_id, so the plan → finding edge is cryptographic, not
denormalised metadata. The finding’s identity is its full data_id set: a
re-submission carrying a partial overlap or a different plan raises
FindingPlanForkError rather than silently returning the prior bearing. The
existence check and the writes run in one transaction.
assert_finding(proposition, prediction, estimate=None, *, data_id=None, lines=None, generated_by=None, control_type=None, modality=None, provenance_id=None, design_type=None, code_ref=None, idempotency_key=None) -> dict
Record a finding: validate the inputs, compute a bearing per line, write a signed
claim as the attestation, persist the evidence tree, and derive the proposition’s
status. Pass either estimate + data_id (one line) or lines (a sequence of
EvidenceLine, the multi-line evidence tree), never both. A finding with no
lines raises ValueError; a generated_by that is blank or whitespace raises
ValueError, since independence is counted by run.
dict with finding_id, content_id, plan_id, claim_id,
bearing ({"direction", "significant"}), bearings (the per-line list, one
entry for a single-line finding), status, idempotent (bool), and
proposition_status (the full view below).
Idempotent on the finding’s data_id set: re-asserting the same dataset(s)
returns the prior finding rather than double-counting it. All validation runs
before the signed claim is written, so a rejected finding never leaves an
orphan claim. Raises NonFalsifiablePropositionError /
InconsistentEstimateError on bad input.
The one-shot composes register_plan + submit_finding internally; its
synthesised plan is flagged preregistered=0, so a genuine up-front
pre-registration via register_plan stays distinguishable from it.
proposition_status(proposition_or_content_id) -> dict | None
The retrieval view for one proposition. Accepts a content_id or a
Proposition. None if not registered.
Returns dict with content_id, frame_id, direction, status,
independent_support, independent_refute, frame_status, status_policy.
get_proposition(content_id) -> dict | None
The stored proposition row as a dict, or None.
query_frame(frame_id_or_proposition, *, min_status=None) -> list[dict]
Everything known about a question (frame), each entry a proposition_status
view. Accepts a frame_id or a Proposition. min_status filters to a floor on
the UNTESTED < PRELIMINARY < CORROBORATED ladder (the only valid floors;
REFUTED / CONTESTED are off the ladder and excluded by any floor). Raises
ValueError on an invalid floor.