Skip to content

Tutorial 4: Store & Recall — Artifacts and Memory

This tutorial focus on the how Aethergraph can memorize what happened before. Learn how to save outputs, log results/events, and recall them later using AetherGraph’s two persistence pillars:

  • Artifacts — durable assets (files/dirs/JSON/text) stored by content address (CAS URI) with labels & metrics for ranking and search.
  • Memory — a structured event & result log with fast “what’s the latest?” simple recent‑history queries.

We’ll build this up step‑by‑step with short, copy‑ready snippets.


0. What you’ll use

# Access services from your NodeContext
arts = context.artifacts()   # artifact store
mem  = context.memory()      # event & result log

Mental model: Artifacts hold large, immutable outputs. Memory records what happened and the small named values you need to recall quickly.


1. Save something → get a URI → open it later

A. Ingest an existing file

art = await arts.save_file(
    path="/tmp/report.pdf",
    kind="report",                 # a short noun; you’ll filter/rank by this
    labels={"exp": "A"},           # 1–3 filters you actually plan to query
    # metrics={"bleu": 31.2},      # optional if you’ll rank later
)

# When you need a real path again:
path = await arts.as_local_file(art)

Why CAS? It prevents accidental overwrites and gives you a stable handle you can pass around (in Memory, dashboards, etc.).

B. Stream‑write (no temp file), atomically

async with arts.writer(kind="plot", planned_ext=".png") as w:
    w.write(png_bytes)
# on exit → the artifact is committed and indexed

Tip: Prefer writer(...) for programmatically produced bytes.


2. Record results you’ll want to recall fast

Use Memory for structured results and lightweight logs.

A. Record a typed result (fast recall by name)

await mem.record_tool_result(
    tool="train.step",
    outputs=[
        {"name": "val_acc",  "kind": "number", "value": 0.912},
        {"name": "ckpt_uri", "kind": "uri",    "value": uri},
    ],
)

recent = await mem.recent_tool_results(tool="train.step", limit=10) # retrieve the last tool result events

B. Log arbitrary events (structured but lightweight)

await mem.record(
    kind="train_log",
    data={"epoch": 1, "loss": 0.25, "acc": 0.91},
)

recent = await mem.recent(kinds=["train_log"], limit=3) # recent is an Event

You will need to load the data from seriazalized recent.text (channel.memory() docs)

Need only the decoded payloads?

logs = await mem.recent_data(kinds=["train_log"], limit=3) # this returns the `data` saved in memory, not Event

3. Search, filter, and rank artifacts

A. Search by labels you saved earlier

hits = await arts.search(
    kind="report",
    labels={"exp": "A"},    # exact‑match filter across indexed labels
)

B. Pick “best so far” by a metric

best = await arts.best(
    kind="checkpoint",
    metric="val_acc",   # must exist in artifact.metrics
    mode="max",         # or "min"
    scope="run",        # limit to current run | graph | node
)
if best:
    best_path = await arts.as_local_file(best)

Attach metrics={"val_acc": ...} when saving to enable ranking later.


4. Practical recipes

A. Save small JSON/Text directly

cfg_art = await arts.save_json({"lr": 1e-3, "batch": 64})
log_art = await arts.save_text("training finished ok")

B. Browse everything produced in this run

all_run_outputs = await arts.list(scope="run")

C. Pin something to keep forever

await arts.pin(artifact_id=cfg_art.artifact_id)

D. Keep Memory lean but persistent

  • Memory acts like a fixed‑length hot queue for fast recall (last_by_name, recent).
  • All events are persisted for later inspection, but only a rolling window stays hot in KV for speed.

5. Minimal reference (schemas & helpers)

You rarely need all fields. Here are the useful bits to recognize in code and logs.

@dataclass
class Artifact:
    artifact_id: str
    uri: str           # CAS URI
    kind: str          # short noun (e.g., "checkpoint", "report")
    labels: dict[str, Any]
    metrics: dict[str, Any]
    preview_uri: str | None = None
    pinned: bool = False
@dataclass
class Event:
    event_id: str
    ts: str
    kind: str          # e.g., "tool_result", "train_log"
    topic: str | None = None
    inputs: list[Value] | None = None
    outputs: list[Value] | None = None
    metrics: dict[str, float] | None = None
    text: str | None = None   # JSON string or message text
    version: int = 1

Helper (already built‑in) that returns decoded payloads from recent:

async def recent_data(*, kinds: list[str], limit: int = 50) -> list[Any]

6. When to use what

Need Use Why
Keep a file/dir for later artifacts.save_file(...) / writer(...) Durable, deduped, indexed
Store small JSON/Text artifacts.save_json/text Convenience, still indexed
Log structured progress/events memory.recordrecent/recent_data Lightweight trace
Pick the best checkpoint/report artifacts.best(kind, metric, mode) Built‑in ranking
List everything from current run artifacts.list(scope="run") One‑liner browse