GUMem Python SDK

Use the GUMem API Key only from server-side code. Do not expose it in browsers, mobile apps, client-side scripts, or public repositories.

Before you start

You need:

  • Python 3.9 or later.
  • A valid GUMem API Key.
  • A Python server-side runtime, such as FastAPI, Django, Flask, a background worker, a queue job, or an agent service.

For local testing, set the API key as an environment variable:

bash
export GUMEM_API_KEY="<your-gumem-api-key>"
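
A server process can read the key once at startup and fail fast when it is missing, instead of failing later on the first request. The following is a minimal sketch, assuming the GUMEM_API_KEY variable name from the command above; load_gumem_api_key is a hypothetical helper and not part of the SDK.

```python
import os


def load_gumem_api_key(env_var: str = "GUMEM_API_KEY") -> str:
    """Return the API key from the environment, or raise if it is unset or blank."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(f"{env_var} is not set; export it before starting the server")
    return api_key
```

The returned value can then be passed as api_key when constructing the client.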

Install

bash
pip install steamory-agent-kit-gumem

The SDK uses httpx for HTTP requests and provides both the synchronous GUMemClient and the asynchronous AsyncGUMemClient. Request types are described with TypedDict, so you can use type hints while still passing regular Python dict values.

Quickstart

The following example covers the shortest integration path: initialize the synchronous client, create a Session, write a few conversation messages, and retrieve relevant Memory.

python
import os

from gumem import GUMemClient

gumem = GUMemClient(api_key=os.environ["GUMEM_API_KEY"])

session = gumem.sessions.create({
    "user_id": "user_123",
    "title": "Team scheduling session",
})

session.add_messages([
    {
        "role": "user",
        "content": (
            "For team scheduling, use Berlin when I mention Europe "
            "and Toronto when I mention the Americas."
        ),
    },
    {
        "role": "assistant",
        "content": (
            "Got it. I will use Berlin for Europe scheduling "
            "and Toronto for Americas scheduling."
        ),
    },
])

memory = session.get_memory({
    "query": "which city should be used for Europe team scheduling",
    "details": True,
})

print(memory.get("data"))

Checkpoint

After this code runs, you should have a Session object and a Memory result in memory.get("data"). Use session.id as the stable identifier for adding future messages and retrieving future Memory.

In production, call gumem.close() when your application shuts down, or use a context manager to close the httpx.Client owned by the SDK automatically.

python
import os

from gumem import GUMemClient

with GUMemClient(api_key=os.environ["GUMEM_API_KEY"]) as gumem:
    health = gumem.health()
    print(health)

Async usage

Asynchronous services can use AsyncGUMemClient. It exposes the same resources as the synchronous client, but network methods must be awaited.

python
import asyncio
import os

from gumem import AsyncGUMemClient


async def main() -> None:
    async with AsyncGUMemClient(api_key=os.environ["GUMEM_API_KEY"]) as gumem:
        session = await gumem.sessions.create({
            "user_id": "user_123",
            "title": "Async support session",
        })

        await session.add_message({
            "role": "user",
            "content": "For Europe planning, keep Berlin as the default city.",
        })

        memory = await session.get_memory({
            "query": "default city for Europe planning",
            "details": True,
        })

        print(memory.get("data"))


# Run the coroutine when this file is executed as a script.
asyncio.run(main())

Main synchronous and asynchronous API equivalents:

| Sync | Async | Description |
| --- | --- | --- |
| GUMemClient | AsyncGUMemClient | Client entry point. |
| Session | AsyncSession | Local Session handle. |
| gumem.health() | await gumem.health() | Check service health. |
| gumem.sessions.create() | await gumem.sessions.create() | Create a Session. |
| gumem.sessions.from_id() | gumem.sessions.from_id() | Restore a local handle from an existing Session ID without making a network request. |
| session.add_message() | await session.add_message() | Write one message. |
| session.add_messages() | await session.add_messages() | Write multiple messages. |
| session.get_memory() | await session.get_memory() | Retrieve Memory. |
| gumem.user_actions.create() | await gumem.user_actions.create() | Write a user Action. |
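
Because the async methods are awaitable, several Memory lookups can run concurrently with asyncio.gather. The following is a minimal sketch, assuming session is an AsyncSession whose get_memory behaves as described above; recall_many is a hypothetical helper and not part of the SDK.

```python
import asyncio
from typing import Any, Dict, List


async def recall_many(session: Any, queries: List[str]) -> Dict[str, Any]:
    """Run one get_memory call per query concurrently and map each query to its data."""
    results = await asyncio.gather(
        *(session.get_memory({"query": query, "details": True}) for query in queries)
    )
    # gather preserves input order, so results line up with queries.
    return {query: result.get("data") for query, result in zip(queries, results)}
```

If any single lookup raises, asyncio.gather propagates the first exception by default, so wrap the call in the error handling your service already uses.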

Use GUMem in an assistant turn

In a real agent or assistant system, use this flow:

  1. Receive the user input in your Python backend.
  2. Retrieve relevant Memory for the current session.id.
  3. Add that Memory to your model context.
  4. After the model replies, write the user message and assistant reply back to GUMem.

python
import os
from typing import Any, Dict

from gumem import GUMemClient

gumem = GUMemClient(api_key=os.environ["GUMEM_API_KEY"])


def generate_assistant_reply(input: Dict[str, Any]) -> str:
    # Replace this function with your OpenAI, Anthropic, Gemini, or local model call.
    return f"I will use the recalled memory for: {input['user_content']}"


def assistant_turn(session_id: str, user_content: str) -> Dict[str, Any]:
    session = gumem.sessions.from_id(session_id)

    memory = session.get_memory({
        "query": user_content,
        "details": True,
    })

    assistant_reply = generate_assistant_reply({
        "user_content": user_content,
        "recalled_memory": memory.get("data"),
    })

    session.add_messages([
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": assistant_reply},
    ])

    return {
        "reply": assistant_reply,
        "memory": memory.get("data"),
    }
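
Step 3 of the flow, adding recalled Memory to the model context, could be sketched as follows. This is an illustration only: build_model_messages is a hypothetical helper, the chat-style message list is an assumption about your model API, and the shape of memory.get("data") is not assumed, so it is serialized as JSON.

```python
import json
from typing import Any, Dict, List


def build_model_messages(user_content: str, recalled_memory: Any) -> List[Dict[str, str]]:
    """Build a chat-style message list with recalled Memory in the system prompt."""
    system_prompt = "You are a helpful assistant."
    if recalled_memory:
        # The structure of the Memory payload is not assumed here; dump it as JSON text.
        memory_text = json.dumps(recalled_memory, ensure_ascii=False, default=str)
        system_prompt += "\n\nRelevant memory about this user:\n" + memory_text
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_content},
    ]
```

A function like this would be called inside generate_assistant_reply before the request is sent to the model provider.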

Checkpoint

In production, store session.id when a new conversation is created. Later requests can restore the local Session handle with gumem.sessions.from_id(session_id); this does not make a network request.
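
One way to apply this is to keep a mapping from your own conversation IDs to GUMem Session IDs. The sketch below uses an in-memory mapping as a stand-in for a database table; get_or_create_session and the conversation_id key are hypothetical names, while sessions.create, sessions.from_id, and session.id are the SDK members described on this page.

```python
from typing import Any, MutableMapping


def get_or_create_session(
    gumem: Any,
    session_ids: MutableMapping[str, str],
    conversation_id: str,
    user_id: str,
) -> Any:
    """Return a Session handle for a conversation, creating the Session on first use."""
    session_id = session_ids.get(conversation_id)
    if session_id is not None:
        # Known conversation: rebuild the handle locally, no network request.
        return gumem.sessions.from_id(session_id)

    # New conversation: create the Session once and remember its ID.
    session = gumem.sessions.create({"user_id": user_id})
    session_ids[conversation_id] = session.id
    return session
```

In a multi-process deployment the mapping would need to live in shared storage, because each process otherwise creates its own Session for the same conversation.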

Client configuration

By default, only api_key is required. You do not need to configure host when using the default GUMem service.

python
import os

from gumem import GUMemClient

gumem = GUMemClient(api_key=os.environ["GUMEM_API_KEY"])

Pass additional options when you need a custom deployment, timeout behavior, default headers, or a custom httpx.Client:

python
import httpx

from gumem import GUMemClient

client = httpx.Client(transport=httpx.HTTPTransport(retries=2))

gumem = GUMemClient(
    api_key="gumem_api_key",
    host="gumem.asix.inc",
    timeout_ms=30_000,
    headers={"X-Service": "assistant-api"},
    client=client,
)

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| api_key | str | Required | GUMem API Key. The SDK sends Authorization: Api-Key <api_key>. Values that already start with Api-Key are not prefixed again. |
| host | str \| None | gumem.asix.inc | GUMem service host. Plain host values are normalized to HTTPS, explicit http:// and https:// URLs are preserved, and a trailing / is removed. |
| timeout_ms | int \| None | 30000 | Default request timeout in milliseconds. Set it to 0 to disable the SDK timeout. |
| headers | Mapping[str, str] \| None | None | Default headers. The SDK merges them into every request. |
| client | httpx.Client \| None | None | Synchronous HTTP client for proxies, tests, custom transports, or observability wrappers. The async client accepts httpx.AsyncClient. |
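
The host and api_key rules in the table can be illustrated with two small functions. These are a sketch of the documented behavior, not the SDK's actual code; normalize_host and authorization_header are hypothetical names, and the real implementation may differ in details such as whitespace or case handling.

```python
def normalize_host(host: str) -> str:
    """Default to HTTPS, keep explicit http:// or https://, and drop one trailing slash."""
    host = host.strip()
    if not host.startswith(("http://", "https://")):
        host = "https://" + host
    if host.endswith("/"):
        host = host[:-1]
    return host


def authorization_header(api_key: str) -> str:
    """Prefix the key with 'Api-Key ' unless it already starts with 'Api-Key'."""
    if api_key.startswith("Api-Key"):
        return api_key
    return f"Api-Key {api_key}"
```

Under these rules, gumem.asix.inc and https://gumem.asix.inc/ resolve to the same base URL, while an explicit http://localhost:8080 for local testing is left on HTTP.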

Per-request options can override timeout and headers:

python
memory = session.get_memory(
    {"query": "Europe planning"},
    options={
        "timeout_ms": 10_000,
        "headers": {"X-Trace-Id": "trace_123"},
    },
)

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| options.timeout_ms | int \| None | No | Override timeout for this request. |
| options.headers | Mapping[str, str] \| None | No | Add or override headers for this request. |

Health check

gumem.health(options=None) checks the GUMem service health endpoint. Use it in startup checks, deployment verification, or monitoring probes.

python
health = gumem.health()

print(health)

With the async client:

python
health = await gumem.health()

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| options | RequestOptions | No | Per-request options. Use it to override timeout and headers. |
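
For a startup check, the health call can be wrapped in a short retry loop so that a briefly unavailable service does not abort the deployment. The following is a minimal sketch; wait_until_healthy, its attempt count, and its delay are assumptions, and the broad except clause only keeps the example independent of the SDK (real code could catch GUMemError instead).

```python
import time
from typing import Any


def wait_until_healthy(gumem: Any, attempts: int = 5, delay_seconds: float = 1.0) -> Any:
    """Call gumem.health() until it succeeds; re-raise the last error if every attempt fails."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return gumem.health()
        except Exception:
            # Out of attempts: surface the original error to the caller.
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
```

The function returns whatever gumem.health() returns, so the caller can log it or inspect it further.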

Sessions

A Session is the GUMem object that holds one user conversation and its Memory retrieval context. Prefer the methods on the Session handle. Use lower-level resource methods only when passing raw Session IDs is more convenient for your application.

Create a Session

gumem.sessions.create(input, options=None) creates a Session and returns a Session object. user_id is required by the current GUMem API.

python
session = gumem.sessions.create({
    "user_id": "user_123",
    "title": "Support chat",
    "metadata": {
        "source": "web-chat",
        "locale": "en-US",
    },
})

print(session.id)
print(session.raw_response)

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| input | SessionCreateRequest | Yes | Request body for creating a Session. |
| input.user_id | str | Yes | User ID from your application. GUMem uses it to associate the Session with user Memory. |
| input.title | str \| None | No | Session title, useful for storing a conversation topic or source label. |
| input.metadata | dict[str, Any] \| None | No | Custom metadata such as channel, locale, product area, or business trace information. |
| options | RequestOptions | No | Per-request options. |

session.raw_response keeps the original response envelope returned by GUMem when the Session was created. If GUMem returns a successful response without data.session_id, the SDK raises a ValueError with the message "GUMem API did not return data.session_id".

Restore a Session handle

gumem.sessions.from_id(session_id) creates a local Session handle from an existing Session ID without making a network request.

python
session = gumem.sessions.from_id("session_123")

session.add_message({
    "role": "user",
    "content": "Continue with the city preference from earlier.",
})

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| session_id | str | Yes | Existing Session ID that was created by GUMem and stored in your application. |

Add messages

session.add_message(message, options=None) writes one message.

python
session.add_message({
    "role": "user",
    "content": "For Europe planning, keep Berlin as the default city.",
    "metadata": {
        "channel": "chat",
    },
})

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| message | Message | Yes | One message to write to the current Session. |
| message.role | str | Yes | Message role, such as user, assistant, system, or a custom role from your application. |
| message.content | str | Yes | Message text. GUMem uses it to build and retrieve Memory. |
| message.id | str \| None | No | Message ID from your application. Use it to connect GUMem writes with your message table, logs, or traces. |
| message.metadata | dict[str, Any] | No | Message-level metadata such as channel, model name, locale, or business tags. |
| message.timestamp | str \| datetime \| None | No | Time when the message happened. datetime values are serialized to ISO strings. |
| message.created_at | str \| datetime \| None | No | Time when the message was created. Use it when this differs from the event timestamp. |
| message.status | pending \| chunked \| processed \| failed | No | Message processing status. |
| options | RequestOptions | No | Per-request options. |
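
When messages come from your own message table, a small builder keeps the optional fields consistent. The following is a minimal sketch using only field names from the table above; make_message is a hypothetical helper, and defaulting the timestamp to the current UTC time is an assumption about what your application wants.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def make_message(
    role: str,
    content: str,
    message_id: Optional[str] = None,
    metadata: Optional[Dict[str, Any]] = None,
    timestamp: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build a message dict, leaving optional fields out when they are not provided."""
    message: Dict[str, Any] = {
        "role": role,
        "content": content,
        # Fall back to the current UTC time when the caller has no event time.
        "timestamp": timestamp or datetime.now(timezone.utc),
    }
    if message_id is not None:
        message["id"] = message_id
    if metadata:
        message["metadata"] = metadata
    return message
```

The result is a regular dict, so it can be passed directly to session.add_message or included in a list for session.add_messages.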

session.add_messages(input, options=None) writes multiple messages. input can be either a message list or an object with a messages field.

python
session.add_messages([
    {"role": "user", "content": "I prefer Berlin for Europe meetings."},
    {"role": "assistant", "content": "I will remember Berlin for Europe meetings."},
])

session.add_messages({
    "messages": [
        {"role": "user", "content": "Toronto works best for Americas meetings."},
    ],
    "user_id": "user_123",
})

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| input | list[Message] \| AddMessagesRequest | Yes | Message list, or a request object with a messages field. |
| input.messages | list[Message] | Yes | Messages to write in bulk. |
| input.user_id | str \| None | No | Optional user ID. Use it when your backend endpoint needs to pass user ownership explicitly. |
| options | RequestOptions | No | Per-request options. |
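
The two accepted input shapes can be thought of as one request object. The sketch below shows how a bare list might be wrapped into the object form; it illustrates the documented behavior and is not the SDK's own code, and as_add_messages_request is a hypothetical name.

```python
from typing import Any, Dict, List, Union


def as_add_messages_request(
    payload: Union[List[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Return the object form: wrap a bare list, or copy a dict that has 'messages'."""
    if isinstance(payload, list):
        return {"messages": list(payload)}
    if "messages" not in payload:
        raise ValueError("request object must contain a 'messages' field")
    return dict(payload)
```

Use the object form when you also need to send user_id; the list form is enough for plain message writes.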

If passing raw Session IDs is more convenient, use the lower-level method:

python
gumem.sessions.add_messages("session_123", {
    "messages": [
        {"role": "user", "content": "hello"},
    ],
})

Get Memory

session.get_memory(params=None, options=None) retrieves relevant Memory for the current Session.

python
memory = session.get_memory({
    "query": "which city should be used for Europe scheduling",
    "details": True,
})

print(memory.get("data"))

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| params | GetSessionMemoryParams | No | Memory retrieval parameters. |
| params.query | str | No | Current user question or business query. The SDK sends it as a query string parameter. |
| params.details | bool | No | Whether to request more detailed retrieval results. The SDK sends it as a query string parameter. |
| params.recall_config | RecallConfig \| None | No | Retrieval configuration. When the recall_config field is present, the SDK uses the POST context endpoint. |
| options | RequestOptions | No | Per-request options. |

Pass recall_config to tune the recent Message count and apply exact metadata filters to recall scope:

python
memory = session.get_memory({
    "query": "which city should be used for the user request",
    "details": True,
    "recall_config": {
        "MessageRecentLimit": 20,
        "MetadataFilters": {
            "ping": "pong",
        },
    },
})

Common RecallConfig fields:

Field names follow the current SDK/API. Descriptions use GUMem's public terms.

| Name | Type | Description |
| --- | --- | --- |
| MessageRecentLimit | int | Number of recent Message items to retrieve. |
| MetadataFilters | dict | A simple metadata key-value dictionary for exact recall filtering. Keys must be non-empty strings, and values must be strings, numbers, or booleans. |
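
The MetadataFilters constraints can be checked in your own code before a request is sent. The following is a minimal sketch of such a check, based only on the rules stated in the table; validate_metadata_filters is a hypothetical helper, and the SDK or API may apply its own validation as well.

```python
from typing import Any, Dict


def validate_metadata_filters(filters: Dict[Any, Any]) -> Dict[Any, Any]:
    """Check the documented MetadataFilters rules and return the dict unchanged."""
    for key, value in filters.items():
        if not isinstance(key, str) or not key:
            raise ValueError(f"filter key must be a non-empty string, got {key!r}")
        # Strings, numbers, and booleans pass; None, lists, and dicts are rejected.
        if not isinstance(value, (str, int, float, bool)):
            raise ValueError(
                f"filter value for {key!r} must be a string, number, or boolean"
            )
    return filters
```

The validated dict can be placed under the MetadataFilters key of recall_config.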

Lower-level usage:

python
memory = gumem.sessions.get_memory("session_123", {
    "query": "Europe scheduling",
    "details": True,
})

User actions

User Actions are useful for clicks, page views, business operations, preference changes, and other non-conversation events. GUMem can use them to build Memory from user behavior.

python
from datetime import datetime, timezone

gumem.user_actions.create({
    "user_id": "user_123",
    "timestamp": datetime(2026, 4, 22, 1, 2, 3, tzinfo=timezone.utc),
    "content": "User opened the Europe team scheduling page",
    "session_id": "session_123",
    "event_type": "page_view",
    "page": "team_scheduling",
    "anchors": {"region": "Europe", "city": "Berlin"},
    "metadata": {"source": "assistant-api"},
})

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| input | ActionLogInput | Yes | User Action request body. |
| input.user_id | str | Yes | User ID from your application. |
| input.timestamp | str \| datetime | Yes | Time when the event happened. datetime values are serialized to ISO strings. |
| input.content | str | Yes | Event description. Write it as natural language that is useful for Memory. |
| input.log_id | str \| None | No | Event ID from your application. |
| input.session_id | str \| None | No | Related GUMem Session ID. |
| input.device_id | str \| None | No | Device ID. |
| input.app | str \| None | No | Application or product name. |
| input.platform | str \| None | No | Platform, such as web, ios, android, or backend. |
| input.event_type | str \| None | No | Event type, such as page_view, click, purchase, or preference_update. |
| input.page | str \| None | No | Page or product area. |
| input.anchors | dict[str, str] \| None | No | Structured anchors such as region, city, document ID, or product ID. |
| input.metadata | dict[str, Any] \| None | No | Custom metadata. |
| input.entities | list[str] \| None | No | Related entity names. |
| options | RequestOptions | No | Per-request options. |

With the async client:

python
await gumem.user_actions.create({
    "user_id": "user_123",
    "timestamp": "2026-04-22T01:02:03Z",
    "content": "User opened the Europe team scheduling page",
})
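
If your backend records many events of the same kind, a small builder keeps the required fields and the natural-language content consistent. The following is a minimal sketch using field names from the parameter table; page_view_action and its content wording are hypothetical and not part of the SDK.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def page_view_action(
    user_id: str,
    page: str,
    session_id: Optional[str] = None,
    occurred_at: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Build a page_view User Action with a natural-language content string."""
    action: Dict[str, Any] = {
        "user_id": user_id,
        # Fall back to the current UTC time when no event time is supplied.
        "timestamp": occurred_at or datetime.now(timezone.utc),
        "content": f"User opened the {page} page",
        "event_type": "page_view",
        "page": page,
    }
    if session_id is not None:
        action["session_id"] = session_id
    return action
```

The returned dict can be passed to gumem.user_actions.create with either the synchronous or the asynchronous client.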

Error handling

The SDK separates GUMem API errors, network errors, and timeout errors.

python
from gumem import GUMemApiError, GUMemConnectionError, GUMemTimeoutError

try:
    gumem.sessions.create({"user_id": "user_123"})
except GUMemApiError as error:
    print(error.status_code, error.status_text, error.detail, error.body)
except GUMemTimeoutError as error:
    print(f"Timed out after {error.timeout_ms}ms")
except GUMemConnectionError as error:
    print("Network failure", error.cause)

| Error | When it is raised | Useful fields |
| --- | --- | --- |
| GUMemApiError | The GUMem API returns a non-2xx response. | status_code, status, status_text, headers, body, detail. |
| GUMemTimeoutError | The request does not complete within the configured timeout. | timeout_ms, cause. |
| GUMemConnectionError | The request fails before GUMem returns a response, such as DNS, connection, or transport errors. | cause. |
| GUMemError | Base class for SDK errors. | Use it to catch SDK exceptions together. |
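
Timeouts and connection failures are often transient, so callers sometimes retry them with a growing delay. The following is a minimal sketch of such a wrapper; call_with_retry, its attempt count, and its delays are assumptions, and the exception types to retry are passed in so the example does not depend on the SDK.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retry(
    operation: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay_seconds: float = 0.5,
) -> T:
    """Run operation, retrying with exponential backoff on the given exception types."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retry_on:
            # Last attempt failed: let the original exception propagate.
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay_seconds * (2 ** attempt))
    raise AssertionError("unreachable")
```

With the SDK, a read could be wrapped as call_with_retry(lambda: session.get_memory({"query": "Europe planning"}), retry_on=(GUMemTimeoutError, GUMemConnectionError)). This page does not say whether GUMem deduplicates repeated writes, so be careful about retrying add_message or add_messages after a timeout.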

Public types / exported surface

The Python SDK exports commonly used clients, errors, and types from the gumem package root. Application code usually only needs the clients and a small set of request types.

python
from gumem import (
    ActionLogInput,
    AsyncGUMemClient,
    GetSessionMemoryParams,
    GUMemApiError,
    GUMemClient,
    Message,
    RecallConfig,
    SessionCreateRequest,
)

| Export | Description |
| --- | --- |
| GUMemClient | Synchronous client entry point. |
| AsyncGUMemClient | Asynchronous client entry point. |
| Session | Synchronous Session handle returned by gumem.sessions.create() or gumem.sessions.from_id(). |
| AsyncSession | Asynchronous Session handle returned by the sessions resource on AsyncGUMemClient. |
| SessionCreateRequest | Request type for creating a Session. |
| Message | Message type for writing to a Session. |
| AddMessagesRequest | Request type for writing multiple messages. |
| GetSessionMemoryParams | Parameter type for retrieving Memory. |
| RecallConfig | Memory retrieval configuration type. |
| ActionLogInput | Request type for writing a user Action. |
| GUMemEnvelope | GUMem API response envelope, typed as dict[str, Any]. |
| RequestOptions | Per-request options type. |
| GUMemApiError, GUMemConnectionError, GUMemTimeoutError, GUMemError | SDK error types. |

The Python SDK v1 intentionally keeps the public business surface focused: it does not expose threads, get_context, or user_actions.query. To continue working with an existing Session, use gumem.sessions.from_id(session_id).