v0.3.6 - 2026-06-17
The multi-line evidence tree. A finding can now carry several evidence lines instead of one, and Status counts independence by distinct run rather than
distinct dataset. Additive on the schema (stays at v1, no migration). The
single-line API is unchanged, and so are its Status outcomes for findings from
distinct runs. See Findings.
Added
- submit_finding / assert_finding take a lines=[EvidenceLine, ...] argument in place of the single estimate + data_id pair, recording several datasets or arms under one proposition and prediction. A finding with no lines raises ValueError; a finding where any line fails the gate rolls back whole. The finding’s identity is its full data_id set. The return dict gains bearings, the per-line bearing list.
- Per-line bearing: each line’s bearing is recomputed on read, so a multi-line finding whose lines disagree reads as CONTESTED.
Changed
- Status independence is now run-distinct: support and refute count distinct generated_by (run) with a data_id guard. One run contributes at most one support and one refute, so a single run cannot reach CORROBORATED on its own. Two findings on one proposition from the same run that previously read CORROBORATED now read PRELIMINARY; findings from distinct runs are unaffected. Policy stamp moves to status_policy@v2, recomputed on read with no migration.
- A blank or whitespace-only generated_by is rejected at the finding write; a missing or default token writes but emits a health event.
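The run-distinct counting described above can be pictured with a small sketch. This is not mareforma's implementation: the Finding record, the function names, and the threshold of two independent supports are assumptions inferred from the wording of this entry, and the data_id guard is left out.

```python
from collections import namedtuple

# Hypothetical record for illustration; real finding rows carry more fields.
Finding = namedtuple("Finding", ["generated_by", "data_id", "bearing"])


def independent_counts(findings):
    """Count supports and refutes by distinct run (generated_by)."""
    support_runs = {f.generated_by for f in findings if f.bearing == "supports"}
    refute_runs = {f.generated_by for f in findings if f.bearing == "refutes"}
    return len(support_runs), len(refute_runs)


def reaches_corroborated(findings, threshold=2):
    """Assumed rule: at least `threshold` independent supports and no refutes."""
    supports, refutes = independent_counts(findings)
    return supports >= threshold and refutes == 0


# Two datasets from one run count once; two runs count twice.
same_run = [
    Finding("agent/run-1", "dataset-a", "supports"),
    Finding("agent/run-1", "dataset-b", "supports"),
]
distinct_runs = [
    Finding("agent/run-1", "dataset-a", "supports"),
    Finding("agent/run-2", "dataset-b", "supports"),
]
```

Under these assumptions the same_run list yields one independent support, which is why it would read PRELIMINARY rather than CORROBORATED.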
v0.3.5 - 2026-06-15
The pre-registration split. v0.3.4 shipped the trust layer as a single assert_finding call. v0.3.5 separates the two earned steps of the
hypothetico-deductive method: register the decision rule before the numbers
are seen, then submit the outcome against it. The plan → finding edge becomes
cryptographic. Additive only: no schema migration, schema stays at v1, and
assert_finding / assert_claim keep working unchanged. See
Findings.
Added
- EpistemicGraph.register_plan(proposition, prediction): pre-register a decision rule. Writes the predictions row (preregistered=1) and its own signed plan attestation claim under idempotency key plan:{plan_id}, Rekor-anchorable like any other claim. Idempotent. Returns the content-addressed plan_id.
- EpistemicGraph.submit_finding(proposition, prediction, estimate, *, data_id, ...): submit a finding against an already-registered plan. Requires the plan to exist (NoRegisteredPlanError), computes the bearing, and writes the finding’s signed claim whose supports[] cites the plan attestation, so the plan → finding edge is signed, not denormalised metadata. A finding already recorded for (content_id, data_id) under a different plan_id raises FindingPlanForkError.
- mareforma.trust errors NoRegisteredPlanError and FindingPlanForkError.
- mareforma.trust gates chain: Gate, gates_for(prediction), and evaluate_gates(estimate, gates) re-express the decision rule as an ordered short-circuit chain over the existing prediction columns. A one-element chain is bearing-identical to compute_bearing. Pure Python, no new schema column.
Changed
- assert_finding now composes register_plan + submit_finding internally. Its synthesised plan is flagged preregistered=0, so a genuine up-front pre-registration stays distinguishable from a one-shot. Return shape, idempotency on (content_id, data_id), atomicity, and derived status are unchanged.
- register_plan and submit_finding emit to the health/activity log.
v0.3.4 - 2026-06-11
The trust layer: structured findings with a computed bearing and a derived status. A free-text claim becomes a content-addressed proposition bound to a pre-registered prediction; the direction of evidence is computed from the registered rule and the result, never self-declared; and a count over independent data derives the status. Additive only: six new tables, schema stays at v1, and every finding still rides a signed claim as its attestation. See Findings.
Added
mareforma.trust
- Proposition: a content-addressed, falsifiable claim. content_id is the answer (subject, relation, object, scope, direction, magnitude); frame_id is the question (direction and magnitude dropped). The same truth conditions collapse to one node across hosts and languages.
- Prediction: a pre-registered decision rule. Superiority and equivalence (TOST) gates.
- EffectEstimate / EvidenceLine / Contrast: the one-line evidence tree with metafor-named effect fields; rejects inconsistent input.
- compute_bearing: the gate. Returns supports / refutes / neutral, computed rather than declared.
- compute_status / compute_frame_status: the count-based status (UNTESTED, PRELIMINARY, CORROBORATED, REFUTED, CONTESTED) over independent data, versioned as status_policy@v1.
EpistemicGraph trust methods: register_proposition, assert_finding,
proposition_status, get_proposition, query_frame. assert_finding
validates, computes the bearing, writes a signed claim, persists the evidence
tree, and derives the status in one call; idempotent on (content_id, data_id).
Schema: six additive tables (propositions, predictions, findings,
evidence_lines, contrasts, effect_estimates), prediction table
append-only. Schema stays at v1; an existing v0.3.3 graph.db gains them on
next open_db().
Notes
- The superiority gate is one-sided at alpha. A supplied p-value is read as two-sided (the metafor/escalc convention), so significance is p < 2*alpha, matching the (1 - 2*alpha) confidence-interval path.
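The agreement between the two paths in that note can be checked with a short sketch. It assumes a normal approximation and uses hypothetical function names; it is not the compute_bearing code, and the direction check a superiority gate would also need is omitted.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def two_sided_p(estimate, se):
    """Two-sided p-value for estimate/se under a normal approximation."""
    z = abs(estimate / se)
    return 2 * (1 - _STD_NORMAL.cdf(z))


def significant_by_p(p_two_sided, alpha=0.05):
    """One-sided test at alpha, given a two-sided p-value."""
    return p_two_sided < 2 * alpha


def significant_by_ci(estimate, se, alpha=0.05):
    """Same decision via the (1 - 2*alpha) confidence interval excluding zero."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha)
    lower = estimate - z_crit * se
    upper = estimate + z_crit * se
    return lower > 0 or upper < 0
```

With alpha = 0.05, an estimate 1.8 standard errors from zero passes on both paths, and one 1.5 standard errors from zero fails on both.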
v0.3.3 — 2026-05-29
Adapter framework and substrate primitives. Five new primitives in core (events, tools, canonicalize, derivation, hooks) plus three opt-in adapters under mareforma.adapters.* and a literature-ingest
CLI. Schema stays at v1; existing v0.3.2 graph.db auto-applies the
new literature_claims and agent_activities tables on next
open_db().
Added
Substrate primitives
- mareforma.events: EventSource / EventHandler Protocols, typed EventPayload and ClaimResult, source-name constants (SOURCE_CLAWINSTITUTE, SOURCE_TOOLUNIVERSE, SOURCE_GEMINI, SOURCE_CLAUDE_CODE_PRETOOLUSE) so adapters dispatch on constants, not string literals.
- mareforma.tools: Tool Protocol (name, version, call(**kwargs) -> ToolResult), ToolResult TypedDict, ReplayResult dataclass. The structural contract any wrappable callable satisfies.
- mareforma.canonicalize: registry-based canonicalizer surface for adapter authors. Default json-c14n-v1 (RFC 8785 JCS) plus dsse-jcs-nfc-v1 (same bytes the signed-envelope layer produces). Importing mareforma.canonicalize registers rdkit-canonical-smiles-v1, fasta-nfc-v1, pdb-atom-sorted-v1 via the specialty submodule.
- mareforma.derivation: substrate-derived classification. Deterministically derives ANALYTICAL vs INFERRED from a static source-code profile plus dynamic log templates (Drain parser). Source-profile extraction requires the [derivation] extra (tree_sitter); log-template extraction is pure stdlib.
- mareforma.hooks: Claude Code PreToolUse handler (python -m mareforma.hooks) records every tool invocation as a prov:Activity row. The agent_activities table is part of the canonical schema.
mareforma.predicate_types (re-exported at the top level):
TOOL_CALL_V1, CONTAINER_EXEC_V1, CODE_VARIATION_V1,
HYPOTHESIS_V1, LITERATURE_INSIGHT_V1, SCIENCE_SKILL_V1,
META_CLAIM_V1, WORKSHOP_EVENT_V1. Adapters import the
constants — a typo on a constant name fails at import; a typo on a
URI string would silently mis-classify a claim.
Three opt-in adapters under mareforma.adapters.*:
- mareforma.adapters.clawinstitute: generic ClawInstitute workshop-event hook. EventHook implements the EventSource Protocol; HttpxClient uses a pooled httpx.Client with follow_redirects=False and URL-quotes path segments. Eight typed exceptions share ClawInstituteApiError as parent. Untrusted workshop content runs through three sanitisation layers (raw-byte cap → sanitize_for_llm → wrap_untrusted). Handler exceptions during dispatch() are caught and returned as ClaimResult(error=…) so a misbehaving subscriber cannot block peers.
- mareforma.adapters.tooluniverse: wrap any mareforma.tools.Tool so each .call(**kwargs) records a signed tool-call:v1 claim with arguments digest, result digest, tool config fingerprint, timing. Container-exec class tools route to container-exec:v1. Over-cap results raise ResultTooLargeError.
- mareforma.adapters.gemini: read-only ingest for Gemini for Science outputs (4 capabilities: code-variation, hypothesis, literature-insight, science-skill). Per-capability REQUIRED_FIELDS validation runs before assert_claim; string payload values flow through sanitize_for_llm; reserved keys (predicate_type, capability) are adapter-owned.
mareforma ingest <file>,
mareforma ask "<query>", mareforma narrative. Paper claim drafts
live in their own literature_claims table (separate from the
signed claims table). FTS5 BM25 search escapes embedded quotes;
the narrative exporter flags structural and polarity-heuristic
contradictions inline.
mareforma.db.open_db_from_db_path() — opens a graph DB from a
direct file path. Honours the supplied filename instead of silently
re-deriving <root>/.mareforma/graph.db.
rich is now a core dependency.
Changed
- Schema is additive on every open_db(). literature_claims, literature_claims_fts (with insert / delete / update triggers), and agent_activities tables are created via an _ADDITIVE_TABLES_SQL script that runs on both fresh and v1-initialised graphs. Existing v0.3.2 databases pick up the new tables on first open with no migration required.
- cli.py lazy-loads ingest / ask / narrative subcommands so mareforma --help / --version / bootstrap / validator add do not pay the rich + tomli_w import cost.
Fixed
- mareforma.derivation.source_profile: import guard catches Exception (tree-sitter ABI mismatch surfaces as TypeError / RuntimeError, not ImportError). Module-prefix matching requires a dot separator so urllib_legacy.get no longer matches the urllib import. Dead-zone walker no longer marks except clause bodies as dead (was silently demoting ANALYTICAL agents to INFERRED on error-handling paths).
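The module-prefix fix is easy to see in isolation. The sketch below is illustrative only; the function names are hypothetical and the real source_profile matcher may be structured differently.

```python
def naive_prefix_match(dotted_name, module):
    """The buggy shape: any shared string prefix counts as a match."""
    return dotted_name.startswith(module)


def module_prefix_match(dotted_name, module):
    """Match the module itself or a name below it, split on a dot."""
    return dotted_name == module or dotted_name.startswith(module + ".")
```

With the dot separator required, urllib.request.urlopen still matches the urllib import, while urllib_legacy.get does not.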
Removed
- truncate_oversized=True option on mareforma.adapters.tooluniverse.ProvenanceToolAdapter. Truncating canonicalised JSON at an arbitrary byte boundary produces bytes no replayer can re-derive; the adapter now always raises ResultTooLargeError.
v0.3.2 — 2026-05-27
Internal restructure + one restore-time verification improvement. Schema stays at v1; no migration required. All existing from mareforma.db import X and from mareforma.signing import Y
import paths continue to work unchanged.
Changed
- mareforma/signing.py split into mareforma/signing/ subpackage. signing/core.py carries DSSE PAE, canonical Statement v1, key management, envelope sign/verify, and bootstrap_key. signing/rekor.py carries Rekor submission, RFC 6962 Merkle inclusion-proof verification, checkpoint parsing, SSRF defense, and log-pubkey fetch.
- mareforma/db.py split into mareforma/db/ subpackage. db/core.py carries the live-write path, queries, verdicts, Rekor saga, and TOML backup. db/_schema_sql.py carries the DDL constant. db/errors.py carries the exception hierarchy. db/restore.py carries restore() and its verification helpers.
Added
- rekor_inclusions sidecar round-trip through claims.toml. _backup_claims_toml emits a [rekor_inclusions] section carrying each sidecar row’s uuid, log_index, integrated_time, raw_response_b64, and recorded_at. restore() replays entries into the sidecar table after the corresponding claim INSERT, inside the same fail-all-or-nothing transaction. Closes the restore-time gap where Merkle inclusion proofs could not be re-verified post-restore.
- Two drift-warning classes for the sidecar restore path: RekorSidecarSectionAbsentWarning (TOML has no [rekor_inclusions] section; expected for pre-v0.3.2 files) and RekorSidecarEntryMissingWarning (section exists but lacks an entry for a Rekor-logged claim; suspicious).
- CI guard tests walk each submodule source file via AST and assert every defined name is importable AND accessible via getattr on the package. Fails CI if a future contributor adds a name without mirroring it in __init__.py.
- Restore-time sidecar validation. Orphan rekor_inclusions entries and entries missing required fields raise RestoreError.
Compatibility
- claims.toml files from v0.3.0 / v0.3.1 (no [rekor_inclusions] section) restore successfully on v0.3.2 with a RekorSidecarSectionAbsentWarning. Run refresh_unsigned() after restore to re-fetch inclusion proofs from the log.
v0.3.1 — 2026-05-22
Additive release. Schema stays at v1; new columns land via in-place ALTER TABLE ADD COLUMN on the non-signed-integrity
surface. First mareforma.open() after upgrade auto-adds:
claims.predicate_payload, claims.original_signature_bundle,
and doi_cache.content_digest. None are part of the signed
envelope or chain hash, so every existing claim’s signed bytes
round-trip byte-equal and signatures re-verify under the new
code.
Added
- EpistemicGraph.query_provenance(claim_id, depth=4): agent-readable lineage view of a claim, with focal row + role-actor signatures + recursive upstream / downstream walks + inbound contradictions + replication verdicts in one deterministic dict.
- Rebuildable claim_supports cache. Edge denormalisation in a separate SQLite file (.mareforma/claim_supports_cache.db). Recursive-CTE walkers serve provenance queries in O(depth × deg). Auto-rebuilt on stale / missing detection; 50k-claim p99 < 300ms.
- claim-with-roles:v1 multi-signature DSSE envelopes. New mareforma.signing.sign_claim_with_roles + verify_envelope_multi let asserters carry per-role (planner / executor / reviewer / validator) signatures inside one envelope. Legacy single-sig envelopes verify under the existing verify_envelope unchanged.
- PROV-O JSON-LD exporter + four-invariant hand-rolled validator. mareforma export --format=prov-o.
- GRADE certainty surface. Optional study_design field on EvidenceVector (randomised-trial / observational / case-series / not-applicable) + new EvidenceVector.certainty() returning the GRADE four-tier band.
- DOI metadata drift detection. New doi_cache.content_digest column + EpistemicGraph.find_drifted_dois(limit=N).
- Refutation taxonomy + filter. New refutation_status() presenter (clean / contradicted / contested / retracted) and a composable refutation_filter kwarg on query() / search().
- Grounding sensor protocol. New mareforma.Verifier Protocol + MockNLIVerifier reference impl. EpistemicGraph.assert_claim(grounding_sensor=verifier) snapshots the verdict (score + rationale) into the signed Statement v1 predicate at assertion time.
- Predicate URI reservations. BUILTIN_URIS expanded from 3 to 21 entries reserving substrate-owned slots plus 18 adapter URIs.
- Operational health log + stats CLI. Append-only .mareforma/health.jsonl records per-op operational signal. New mareforma stats [--last N] [--json] renders rolling rates.
- Public EpistemicGraph.update_claim wrapper around db.update_claim. Status mutations are EDITORIAL; cryptographically-traceable changes use the retract-and-supersede pattern.
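A depth-bounded recursive-CTE walk of the kind the claim_supports cache entry describes might look like the following. The table layout, column names, and the upstream function are hypothetical; the actual cache schema is not given in this changelog.

```python
import sqlite3


def upstream(conn, claim_id, depth=4):
    """Return (claim, hops) pairs reachable by following supports edges."""
    return conn.execute(
        """
        WITH RECURSIVE walk(id, hops) AS (
            SELECT supports_id, 1 FROM claim_supports WHERE claim_id = ?
            UNION
            SELECT cs.supports_id, walk.hops + 1
            FROM claim_supports AS cs JOIN walk ON cs.claim_id = walk.id
            WHERE walk.hops < ?
        )
        SELECT id, MIN(hops) FROM walk GROUP BY id ORDER BY MIN(hops), id
        """,
        (claim_id, depth),
    ).fetchall()


# Hypothetical edge table: each row says claim_id cites supports_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE claim_supports (claim_id TEXT, supports_id TEXT)")
conn.executemany(
    "INSERT INTO claim_supports VALUES (?, ?)",
    [("c", "b"), ("b", "a"), ("a", "root")],
)
```

The hop counter bounds the recursion, so the walk terminates at the requested depth regardless of how the edges are shaped.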
v0.3.0 — 2026-05-13
Breaking change from v0.2.x. Schema does not migrate from older versions; delete .mareforma/graph.db to start fresh. claims.toml
at the project root is a human-readable record of the prior state —
the prev_hash chain and per-claim signatures cannot be reconstructed
from it, so it is a reference not a backup.
What ships in v0.3.0:
- Ed25519 claim signing with optional Sigstore-Rekor transparency log
- Artifact-hash gate on REPLICATED: converging peers that both supply a SHA-256 must agree
- Identity-gated graph.validate() with a per-project validators table and signed enrollment chain
- DOI resolution against Crossref + DataCite with a persistent cache
- DB-layer state-machine triggers + append-only prev_hash chain: the storage layer rejects illegal transitions
- Cycle / self-loop detection on supports[] at INSERT and UPDATE
- ESTABLISHED-upstream requirement for REPLICATED + signed seed-claim bootstrap (Cochrane / GRADE evidence chains; no replication-of-noise)
- JSON-LD export in a mareforma-native vocabulary
- SCITT-style signed export bundle + mareforma verify CLI
- In-toto Statement v1 + DSSE v1 PAE envelope on every signed claim, GRADE 5-domain EvidenceVector inside every signed predicate, signed verdict-issuer protocol that any third party can integrate against (see below)
- RFC 8785-strict canonical JSON for every signed payload: cross-language verifiers in Go, Rust, or JS now read byte-identical bytes from a mareforma envelope. Adds rfc8785>=0.1 runtime dep.
- Operator surfaces: graph.health() single-call audit summary, graph.refresh_convergence() to retry promotions whose detection swallowed an error, graph.refresh_all_dois() to force-re-check DOIs for retraction drift, graph.find_dangling_supports() to audit UUID refs that point nowhere, graph.classify_supports() to inspect the substrate’s claim/doi/external classification.
- Validation envelope binds evidence_seen: pass graph.validate(claim_id, evidence_seen=[upstream_id, ...]) to record which claims the validator reviewed before signing. Bound into the signed payload alongside (claim_id, validator_keyid, validated_at). Empty list is a positive “I reviewed nothing” admission. Substrate verifies each cited claim exists and predates validation.
- Rekor saga atomicity via a new rekor_inclusions sidecar table. When a Rekor submission succeeds but the local row-UPDATE fails, the sidecar preserves the coords so refresh_unsigned() replays the UPDATE without re-submitting (no duplicate log entries). Append-only at the trigger level.
- Strict UUIDv4 in claim_id pattern. Non-v4 UUID-shapes in supports[] are now classified as external references rather than dangling claim_ids.
- RFC 6962 Merkle inclusion-proof verification (opt-in). Pass rekor_log_pubkey_pem (or rekor_log_pubkey_path) to mareforma.open() and every signed-claim submit + every refresh_unsigned() re-fetches the entry from Rekor and cryptographically verifies the Merkle audit path against the log’s signed checkpoint. Verification failure refuses to mark transparency_logged=1. Supports Ed25519 (private Rekor) + ECDSA secp256r1 (public Sigstore Rekor). The supplied PEM persists to .mareforma/rekor_log_pubkey.pem as a TOFU pin: silent rotation is refused on subsequent opens; the first-pin write is atomic (O_CREAT|O_EXCL). New RekorInclusionError exception with a stable .reason token taxonomy. Since v0.3.2 the rekor_inclusions sidecar round-trips through claims.toml; restore(rekor_log_pubkey_pem=...) re-verifies each entry’s inclusion proof against the pinned key.
- Defense-in-depth on db.validate_claim. Direct callers of the substrate function (not just EpistemicGraph.validate) get the full gate sequence: cryptographic envelope verification, LLM-type ceiling refusal, self-validation refusal, payload-field equality vs the row + kwargs. New InvalidValidationEnvelopeError for structural / cryptographic envelope failures, distinct from EvidenceCitationError for citation-list failures.
- All documented exceptions re-exported at the top level. from mareforma import RekorInclusionError works without remembering the submodule path. 19 exception classes total, alphabetical under MareformaError.
- In-toto Statement v1 + DSSE v1 PAE envelope. Every signed claim is now a DSSE envelope (payloadType=application/vnd.in-toto+json) wrapping an in-toto Statement v1 (predicateType=urn:mareforma:predicate:claim:v1). Standards-aligned; introspectable by cosign, GUAC, and any in-toto-aware tool without a mareforma-specific verifier. The signature covers the DSSE Pre-Authentication Encoding (PAE), not the payload bytes alone: a signature on (typeA, payload) cannot be replayed as a signature on (typeB, payload).
- GRADE 5-domain EvidenceVector carried inside every signed claim’s predicate. Five downgrade domains (risk_of_bias, inconsistency, indirectness, imprecision, publication_bias) each in [-2, 0], three upgrade flags (large_effect, dose_response, opposing_confounding), rationale dict (required for any nonzero domain), and reporting_compliance list. Bound into the signature; denormalized into ev_* columns for queryable filters; restore re-derives the canonical bytes and refuses any TOML-tampered upgrade.
- Verdict-issuer protocol. Two new tables, replication_verdicts and contradiction_verdicts, accept signed verdicts from any enrolled validator. The OSS substrate ratifies what enrolled identities sign; the predicates that PRODUCE verdicts (semantic-cluster, cross-method, hash-match, shared-resolved-upstream, contradiction-detection) live outside the OSS and call Graph.record_replication_verdict() / Graph.record_contradiction_verdict(). New VerdictIssuerError exception covers the gates: signer must be enrolled (chain walk back to a self-signed root), referenced claim must exist, method must be in the allowed enum, contradiction member != other.
- t_invalid derived state. New nullable column on claims. The contradiction_invalidates_older AFTER INSERT trigger on contradiction_verdicts sets t_invalid on the older of the two referenced claims (lex-smaller claim_id as deterministic tie-break when timestamps collide; idempotent via WHERE t_invalid IS NULL). validate_claim refuses to promote a t_invalid claim: a signed contradiction is terminal evidence.
- include_invalidated kwarg on graph.query(), graph.search(), graph.replication_verdicts(), graph.contradiction_verdicts(). Defaults to False: invalidated claims and the verdicts that reference them are excluded from default reads. Pass True for audit / history queries.
- Append-only over the signed predicate. New claims_signed_fields_no_laundering BEFORE UPDATE trigger refuses direct-SQL mutation of any signed-predicate column on rows whose signature_bundle IS NOT NULL. Value-comparison fires only when something actually changed, so multi-column UPDATEs that re-emit unchanged values pass through. A tampered Python interpreter cannot relax this.
- Append-only verdicts. *_append_only + *_no_delete triggers refuse UPDATE on signed columns and any DELETE on both verdict tables. The envelope is the source of truth.
- PRAGMA foreign_keys = ON. Set on every open_db(). The verdict tables’ FK references to validators(keyid) and claims(claim_id) are now enforced: direct-SQL INSERTs with fabricated keyids fail at the SQL layer, not just in Python.
- Subject ↔ predicate consistency. claim_predicate_from_envelope() refuses envelopes where subject[0].name or subject[0].digest.sha256 disagree with the predicate’s claim_id or text. Caught at the envelope-decode layer.
- Restore extensions. claims.toml round-trip now covers both verdict tables (signatures base64-encoded). Each verdict’s signature is cryptographically verified against the enrolled issuer’s pubkey before INSERT. Verdicts are replayed in created_at order so the trigger’s WHERE t_invalid IS NULL guard preserves the truthful first-invalidation moment. transparency_logged=true in TOML is downgraded to 0 when the bundle has no rekor block: hand-edited TOML cannot fake a Rekor inclusion. New adversarial tests for tampered EvidenceVector, swapped statement_cid, tampered verdict fields, and forged issuer_keyid.
- New modules: mareforma._canonical (NFC + sorted-keys + no-whitespace + allow_nan=False canonical JSON), mareforma._statement (in-toto Statement v1 builder + statement_cid computation), mareforma._evidence (stdlib-dataclass EvidenceVector with __post_init__ validator). No pydantic dependency added; mareforma stays at 5 runtime deps.
- mareforma.signing.dsse_pae() is public so external verifiers can independently re-derive the bytes the signature covers. canonical_statement(claim_fields, evidence) replaces the legacy canonical_payload for chain-hash + signature inputs; the old shim is removed because it silently desynced from production bytes.
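For readers re-deriving the signed bytes by hand, the Pre-Authentication Encoding from the DSSE v1 specification can be sketched in a few lines. This is written from the public spec, not copied from mareforma.signing.dsse_pae(), whose exact signature is not shown in this changelog; the pae name here is illustrative.

```python
def pae(payload_type: str, payload: bytes) -> bytes:
    """DSSE v1 PAE: "DSSEv1" SP LEN(type) SP type SP LEN(body) SP body.

    Lengths are ASCII decimal byte counts, so the type is bound into the
    signed bytes alongside the payload.
    """
    type_bytes = payload_type.encode("utf-8")
    return b"DSSEv1 %d %b %d %b" % (
        len(type_bytes),
        type_bytes,
        len(payload),
        payload,
    )


# Same payload under two payload types yields different bytes to sign.
statement = b'{"_type":"https://in-toto.io/Statement/v1"}'
as_intoto = pae("application/vnd.in-toto+json", statement)
as_seed = pae("application/vnd.mareforma.seed+json", statement)
```

Because the two encodings differ, a signature over one cannot be replayed as a signature over the other, which is the replay property the first bullet describes.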
Identity, signing, transparency
- Ed25519 claim signing. mareforma bootstrap once to generate a keypair at ~/.config/mareforma/key (XDG-compliant, mode 0600). Every assert_claim then signs before INSERT. The signed payload binds claim_id, text, classification, generated_by, supports, contradicts, source_name, artifact_hash, and created_at, so any tamper breaks verification.
- Append-only invariant. Signed claims refuse mutation of any signed-surface field. update_claim(text=...) / update_claim(supports=...) / update_claim(contradicts=...) on a signed row raise SignedClaimImmutableError. status and comparison_summary remain editable.
- Sigstore-Rekor transparency log. mareforma.open(rekor_url=mareforma.signing.PUBLIC_REKOR_URL) submits every signed claim at INSERT time. Submission failure persists the claim with transparency_logged=0 and blocks REPLICATED until graph.refresh_unsigned() succeeds.
- Identity-gated graph.validate(). The loaded signer must be enrolled in the project’s validators table. The first key opened against a fresh graph auto-enrolls as the root validator (silent self-signed enrollment with a UserWarning). The validation event itself is signed: a DSSE-style envelope binding (claim_id, validator_keyid, validated_at) is persisted to the row’s validation_signature column.
- New mareforma validator add / mareforma validator list subcommands. Each enrollment is signed by the parent validator and is_enrolled walks the chain back to a self-signed root before accepting a row, so direct sqlite INSERTs with a fabricated parent do not pass. Singleton-root invariant + 64-hop walk cap defend against DoS-by-planted-chain.
Storage-layer state machine
- DB-layer state-machine triggers. Two BEFORE triggers enforce PRELIMINARY → REPLICATED → ESTABLISHED at the storage layer; direct PRELIMINARY → ESTABLISHED is rejected; ESTABLISHED rows require validation_signature. Illegal transitions surface as IllegalStateTransitionError with a parsed <from>-><to> string instead of an opaque CHECK CONSTRAINT FAILED.
- Append-only hash chain. New claims.prev_hash column carries sha256(prev_chain_link || canonical_payload). UNIQUE partial index + BEGIN IMMEDIATE together prevent branched chains from concurrent writers or manual SQL tamper. New ChainIntegrityError.
- Cycle / self-loop detection. A claim whose supports[] would create a cycle (directly or via a chain) raises CycleDetectedError at INSERT and at UPDATE. Forward-walk DFS, depth-capped at 1024 hops. DOI strings in supports[] are not graph nodes and are skipped.
- ESTABLISHED-upstream requirement for REPLICATED. REPLICATED promotion now requires at least one ESTABLISHED claim in the peer’s supports[]. Matches Cochrane / GRADE evidence-chain methodology and stops replication-of-noise. Strict by default.
- Seed-claim bootstrap. graph.assert_claim(text=..., seed=True) inserts a claim directly at ESTABLISHED with a signed seed envelope (payload type application/vnd.mareforma.seed+json, binds claim_id + validator_keyid + seeded_at). Only enrolled validators can produce seeds, which bootstraps the trust chain on a fresh graph without a back door.
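The hash-chain formula above, sha256(prev_chain_link || canonical_payload), can be illustrated with a minimal sketch. The genesis value, the use of raw digests rather than hex strings, and the helper names are assumptions; the stored encoding in claims.prev_hash is not specified here.

```python
import hashlib


def chain_link(prev_link: bytes, canonical_payload: bytes) -> bytes:
    """One link: sha256 over the previous link concatenated with the payload."""
    return hashlib.sha256(prev_link + canonical_payload).digest()


def build_chain(payloads, genesis=b""):
    """Fold payloads into a list of links, starting from an assumed genesis."""
    links = []
    prev = genesis
    for payload in payloads:
        prev = chain_link(prev, payload)
        links.append(prev)
    return links


def verify_chain(payloads, links, genesis=b""):
    """Recompute every link and compare against the stored ones."""
    return build_chain(payloads, genesis) == links


payloads = [b'{"claim":"a"}', b'{"claim":"b"}', b'{"claim":"c"}']
links = build_chain(payloads)
tampered = [payloads[0], b'{"claim":"B"}', payloads[2]]
```

Editing any earlier payload changes every later link, so a verifier that recomputes the chain detects the tamper without needing the signatures.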
Artifact-hash gate
- artifact_hash parameter on assert_claim (Python API) and --artifact-hash flag on mareforma claim add (CLI). Accepts a SHA-256 hex digest of the output bytes (figure, CSV, model) backing the claim. Normalised to lowercase, validated as 64-char hex, persisted to the new artifact_hash column, and bound into the signed payload.
- REPLICATED gate. When two converging peers BOTH supply a hash, the hashes must match for REPLICATED to fire. When either omits the hash, the gate is bypassed and identity-only REPLICATED applies: the signal is opt-in, not retroactive.
- Idempotency conflict. A replay that supplies a different artifact_hash than the original raises IdempotencyConflictError rather than silently dropping the new hash.
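The normalisation and gate rules above reduce to a few lines. The sketch below uses hypothetical helper names and raises a plain ValueError on bad input, which is an assumption about the error type.

```python
import hashlib
import re

_HEX64 = re.compile(r"[0-9a-f]{64}")


def normalise_artifact_hash(value):
    """Lowercase the digest and require exactly 64 hex characters."""
    if value is None:
        return None
    lowered = value.lower()
    if not _HEX64.fullmatch(lowered):
        raise ValueError("artifact_hash must be a 64-char SHA-256 hex digest")
    return lowered


def hash_gate_allows(hash_a, hash_b):
    """Both hashes supplied: they must match. Either missing: gate bypassed."""
    if hash_a is None or hash_b is None:
        return True
    return hash_a == hash_b


digest = hashlib.sha256(b"figure bytes").hexdigest()
```

A caller would hash the output file's bytes once, pass the digest on assertion, and let the gate compare it with a converging peer's digest when both are present.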
Prompt-safety substrate
- mareforma.prompt_safety module + graph.query_for_llm(). Sanitize-and-wrap helpers for feeding retrieved claim text into an LLM prompt. Strips zero-width / bidi-override / C0-C1 control characters, Goodside U+E0000 tag plane, variation selectors, interlinear annotation anchors, and the fullwidth <, >, and / lookalikes. Caps oversized fields at 100k chars with a visible truncation marker. Free-text fields are wrapped in <untrusted_data>...</untrusted_data>; forged delimiter tags inside the content are replaced with [stripped].
- get_tools() routes through query_for_llm. The query_graph tool that ships to LangChain / LangGraph / CrewAI / AutoGen / LlamaIndex / PydanticAI / Smol Agents / OpenAI SDK / Anthropic SDK now returns sanitized + wrapped text. A stored prompt-injection planted by a prior agent is no longer delivered verbatim to the consuming LLM.
- Sanitize-on-write. assert_claim runs sanitize_for_llm(text) before signing and persisting. Defense in depth: any consumer that reads claim.text directly gets a clean string. Hard cap of 100,000 characters; claims that consist entirely of zero-width / control characters are rejected with ValueError.
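A reduced sketch of the sanitize-and-wrap idea follows. It covers only zero-width and bidi-control characters and the forged-delimiter replacement, not the full character classes listed above; the function names and the truncation marker text are hypothetical and do not reflect the mareforma.prompt_safety API.

```python
import re

# Subset only: zero-width characters, BOM, and bidi embedding/override/isolate.
_INVISIBLE = re.compile("[\u200b-\u200d\ufeff\u202a-\u202e\u2066-\u2069]")
_FORGED_TAG = re.compile(r"</?\s*untrusted_data\s*>", re.IGNORECASE)
MAX_CHARS = 100_000


def sketch_sanitize(text):
    """Strip invisible characters first, then neutralise forged delimiters."""
    text = _INVISIBLE.sub("", text)
    text = _FORGED_TAG.sub("[stripped]", text)
    if len(text) > MAX_CHARS:
        text = text[:MAX_CHARS] + "[truncated]"  # marker text is an assumption
    return text


def sketch_wrap(text):
    """Wrap sanitised text in the untrusted-data delimiters."""
    return "<untrusted_data>" + sketch_sanitize(text) + "</untrusted_data>"
```

Stripping invisible characters before matching delimiters matters: a forged closing tag split by a zero-width space would otherwise slip past the tag pattern.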
Export
- JSON-LD export: mareforma-native vocabulary. Removed PROV-O references (prov:wasGeneratedBy, prov:used) from the JSON-LD @context; the previous export name-dropped the vocabulary without populating the full PROV-O graph. The export now declares @type='mare:Graph' and mare:mediaType='application/x-mareforma-graph+json'. The used key on source-bearing claims was renamed to usedSource (aliased to mare:usedSource). Every SIGNED_FIELDS member is always emitted on each claim node so downstream consumers (e.g. the bundle verifier below) can re-derive canonical_payload from a node alone.
- SCITT-style signed bundle. New mareforma export --bundle produces an in-toto Statement v1 wrapper around the JSON-LD export, with predicateType='urn:mareforma:predicate:epistemic-graph:v1' and a DSSE-style signature over the whole bundle. Subject names use the urn:mareforma:claim:<uuid> namespace; URN (not DNS) avoids a perpetual-ownership commitment on mareforma.dev. New mareforma verify <bundle.json> checks the DSSE signature AND every per-claim subject digest. New BundleVerificationError names the first failing check so callers can route between “corrupt” and “cross-version skew”.
DOI verification
- DOI resolution: every DOI in supports[] / contradicts[] is HEAD-checked against Crossref and DataCite at assert time. Unresolved DOIs mark the claim unresolved=True and block REPLICATED promotion. EpistemicGraph.refresh_unresolved() retries previously-failed resolutions.
- DOI resolver hardening: DOI suffix URL-encoded before interpolation (prevents host injection via # / @); follow_redirects=False (registry must answer directly); pooled httpx.Client with threading lock around lazy init (FD-leak-safe under concurrency); HTTP 429 from either registry skips the cache write; tight exception clause so programmer bugs surface in tracebacks.
- doi_cache table: 30-day TTL for resolved entries, 24-hour TTL for unresolved.
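Two of the rules above, suffix encoding and the split TTL, can be sketched with the standard library. The base URL is a placeholder, the helper names are hypothetical, and whether the real freshness comparison is strict is an assumption.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import quote

RESOLVED_TTL = timedelta(days=30)
UNRESOLVED_TTL = timedelta(hours=24)


def registry_url(base, doi):
    """Percent-encode the DOI suffix so '#', '@' and '/' cannot reshape the URL."""
    prefix, _, suffix = doi.partition("/")
    return base.rstrip("/") + "/" + prefix + "/" + quote(suffix, safe="")


def is_fresh(resolved, checked_at, now):
    """Resolved entries live 30 days; unresolved entries are retried after 24h."""
    ttl = RESOLVED_TTL if resolved else UNRESOLVED_TTL
    return now - checked_at < ttl


checked = datetime(2026, 1, 2, tzinfo=timezone.utc)
now = datetime(2026, 1, 31, tzinfo=timezone.utc)
url = registry_url("https://registry.example/works", "10.1234/abc#@evil.example")
```

The shorter TTL on unresolved entries means a DOI that failed during a registry outage is retried within a day rather than staying blocked for a month.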
Supply chain
- PyPI Trusted Publishing. Releases are published via OIDC-based GitHub Actions, not long-lived API tokens. pypa/gh-action-pypi-publish is SHA-pinned. actions/checkout and actions/setup-python are pinned by commit SHA, which closes the tag-squat / maintainer-compromise vector against the Trusted Publishing OIDC token.
- New SECURITY.md documents the disclosure channel (GitHub Private Vulnerability Reporting), supported-versions policy (latest pre-1.0 only), PyPI Trusted Publishing setup, cryptographic trust boundaries, and out-of-scope categories.
- Typosquat reservations. maraforma, mareform, mareforma-cli, mareforma-py, and mareforma-agent are reserved on PyPI as defensive placeholders that raise ImportError and point users back to the canonical package. mare-forma / mare_forma / mare.forma are auto-blocked by PyPI’s confusable-name check.
- New .github/CODEOWNERS and .github/dependabot.yml.
Agent surface
- mareforma.open() returns an EpistemicGraph; no @transform required. New parameters: key_path, require_signed, rekor_url, require_rekor, trust_insecure_rekor.
- EpistemicGraph methods: assert_claim, query, search, query_for_llm, get_claim, validate, refresh_unresolved, refresh_unsigned, enroll_validator, list_validators, get_validator_reputation, get_tools, close.
- get_tools(generated_by="agent/...") returns [query_graph, assert_finding] as plain Python callables. One-line wrap for Anthropic SDK, OpenAI SDK, LangChain, LangGraph, CrewAI, AutoGen, LlamaIndex, PydanticAI, Smol Agents.
- mareforma.schema(): runtime introspection of valid values, defaults, state transitions, and schema version.
- mareforma.restore(project_root): rebuild a fresh graph.db from claims.toml for catastrophic-loss recovery. Fresh-only, fail-all-or-nothing on signature verification.
- CLI: mareforma bootstrap, mareforma validator add / validator list (with --type human|llm), mareforma claim add/list/show/update/validate, mareforma status, mareforma export [--bundle], mareforma verify <bundle>, mareforma restore [<toml-path>].
Validator type and reputation
- Validator type signal. validators.validator_type TEXT CHECK IN ('human','llm'), bound into the signed enrollment envelope. Default 'human'. The substrate refuses promotion past REPLICATED on an LLM-typed validator’s signature alone (LLMValidatorPromotionError); a human-typed co-signer is required. Self-validation (claim signer == validation signer) is also refused (SelfValidationError).
- Reputation-aware retrieval. query() and search() gain include_unverified: bool = False. PRELIMINARY claims whose signing key is not in the validators table are excluded by default. Result dicts carry derived validator_reputation (count of ESTABLISHED claims signed by the same validator) and generator_enrolled (bool). graph.get_validator_reputation() returns the bulk {keyid: count} map.
Full-text search
- FTS5 over claim text. New claims_fts virtual table (unicode61 tokenizer, diacritics folded) synced with claims via three INSERT/DELETE/UPDATE-of-text triggers. New graph.search() method exposes FTS5 ranked match. Phrase, prefix, boolean, and proximity operators all supported. Pure-wildcard queries refused.
claims.toml round-trip + restore
- claims.toml format extended. A [validators] section now travels alongside [claims], carrying signed enrollment envelopes so the restore path can verify the chain. Old files with no [validators] section continue to work as unsigned-mode.
- mareforma restore (CLI + Python API). Fresh-only rebuild from claims.toml. Refuses non-empty graph.db. Verifies every signature before any row is inserted. New RestoreError with .kind field naming the failure mode (graph_not_empty, toml_not_found, toml_malformed, enrollment_unverified, claim_unverified, mode_inconsistent, orphan_signer). Adversarial test class proves the round-trip catches tampered text, mutated signature bytes, missing signatures in signed-mode graphs, orphan signers, and validator-row tampering.
- _backup_claims_toml failure is now reported to stderr at ERROR level (was warnings.warn, which production loggers routinely suppress). graph.db remains authoritative.
Removed
- @transform decorator and BuildContext: pipeline layer removed.
- MareformaObserver, LangChainAdapter: execution tracing removed.
- Pipeline CLI commands: init, add-source, explain, build, log, diff, cross-diff, trace.
v0.2.1 — 2026-05-08
- ctx.params: runtime parameter injection from TOML
- query_claims(): read primitive for the epistemic graph
- delete_claims_by_generated_by(): delete claims by source agent
- Fixed LangChainAdapter import path
v0.2.0 — 2026-04-08
- mareforma.agent: framework-agnostic agent provenance module
- MareformaObserver: context manager recording agent events to graph.db
- LangChainAdapter: LangChain callback handler
v0.1.0 — 2026-03-25
Initial release. @transform decorator, ctx.claim(), mareforma build,
SQLite epistemic graph, claims.toml backup.