GUMem Python SDK
Use the GUMem API Key only from server-side code. Do not expose it in browsers, mobile apps, client-side scripts, or public repositories.
Before you start
You need:
- Python 3.9 or later.
- A valid GUMem API Key.
- A Python server-side runtime, such as FastAPI, Django, Flask, a background worker, a queue job, or an agent service.
For local testing, set the API key as an environment variable:
export GUMEM_API_KEY="<your-gumem-api-key>"
Install
pip install steamory-agent-kit-gumem
The SDK uses httpx for HTTP requests and provides both the synchronous GUMemClient and the asynchronous AsyncGUMemClient. Request types are described with TypedDict, so you can use type hints while still passing regular Python dict values.
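To illustrate what TypedDict request types mean in practice, the sketch below defines a local stand-in type. SessionCreateInput and build_session_input are hypothetical names used only for this illustration; they are not the SDK's actual definitions. The point is that a type checker can validate the keys while the runtime value remains an ordinary dict.

```python
from typing import Any, Dict, TypedDict


class SessionCreateInput(TypedDict, total=False):
    """Hypothetical stand-in for an SDK request type (illustration only)."""

    user_id: str
    title: str
    metadata: Dict[str, Any]


def build_session_input(user_id: str, title: str) -> SessionCreateInput:
    # A type checker validates the keys; at runtime this is a regular dict.
    return {"user_id": user_id, "title": title}


payload = build_session_input("user_123", "Team scheduling session")
print(type(payload) is dict)
```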
Quickstart
The following example covers the shortest integration path: initialize the synchronous client, create a Session, write a few conversation messages, and retrieve relevant Memory.
import os
from gumem import GUMemClient
gumem = GUMemClient(api_key=os.environ["GUMEM_API_KEY"])
session = gumem.sessions.create({
    "user_id": "user_123",
    "title": "Team scheduling session",
})
session.add_messages([
    {
        "role": "user",
        "content": (
            "For team scheduling, use Berlin when I mention Europe "
            "and Toronto when I mention the Americas."
        ),
    },
    {
        "role": "assistant",
        "content": (
            "Got it. I will use Berlin for Europe scheduling "
            "and Toronto for Americas scheduling."
        ),
    },
])
memory = session.get_memory({
    "query": "which city should be used for Europe team scheduling",
    "details": True,
})
print(memory.get("data"))
Checkpoint
After this code runs, you should have a Session object and a Memory result in memory.get("data"). Use session.id as the stable identifier for adding future messages and retrieving future Memory.
In production, call gumem.close() when your application shuts down, or use a context manager to close the httpx.Client owned by the SDK automatically.
import os
from gumem import GUMemClient
with GUMemClient(api_key=os.environ["GUMEM_API_KEY"]) as gumem:
    health = gumem.health()
    print(health)
Async usage
Asynchronous services can use AsyncGUMemClient. It exposes the same resources as the synchronous client, but network methods must be awaited.
import asyncio
import os
from gumem import AsyncGUMemClient
async def main() -> None:
    async with AsyncGUMemClient(api_key=os.environ["GUMEM_API_KEY"]) as gumem:
        session = await gumem.sessions.create({
            "user_id": "user_123",
            "title": "Async support session",
        })
        await session.add_message({
            "role": "user",
            "content": "For Europe planning, keep Berlin as the default city.",
        })
        memory = await session.get_memory({
            "query": "default city for Europe planning",
            "details": True,
        })
        print(memory.get("data"))
# Run the coroutine from a synchronous entry point.
asyncio.run(main())
Main synchronous and asynchronous API equivalents:
| Sync | Async | Description |
|---|---|---|
| GUMemClient | AsyncGUMemClient | Client entry point. |
| Session | AsyncSession | Local Session handle. |
| gumem.health() | await gumem.health() | Check service health. |
| gumem.sessions.create() | await gumem.sessions.create() | Create a Session. |
| gumem.sessions.from_id() | gumem.sessions.from_id() | Restore a local handle from an existing Session ID without making a network request. |
| session.add_message() | await session.add_message() | Write one message. |
| session.add_messages() | await session.add_messages() | Write multiple messages. |
| session.get_memory() | await session.get_memory() | Retrieve Memory. |
| gumem.user_actions.create() | await gumem.user_actions.create() | Write a user Action. |
Use GUMem in an assistant turn
In a real agent or assistant system, use this flow:
- Receive the user input in your Python backend.
- Retrieve relevant Memory for the current session.id.
- Add that Memory to your model context.
- After the model replies, write the user message and assistant reply back to GUMem.
import os
from typing import Any, Dict
from gumem import GUMemClient
gumem = GUMemClient(api_key=os.environ["GUMEM_API_KEY"])
def generate_assistant_reply(input: Dict[str, Any]) -> str:
    # Replace this function with your OpenAI, Anthropic, Gemini, or local model call.
    return f"I will use the recalled memory for: {input['user_content']}"

def assistant_turn(session_id: str, user_content: str) -> Dict[str, Any]:
    # Restore the local handle; this does not make a network request.
    session = gumem.sessions.from_id(session_id)
    memory = session.get_memory({
        "query": user_content,
        "details": True,
    })
    assistant_reply = generate_assistant_reply({
        "user_content": user_content,
        "recalled_memory": memory.get("data"),
    })
    # Write both sides of the turn back to GUMem after the model replies.
    session.add_messages([
        {"role": "user", "content": user_content},
        {"role": "assistant", "content": assistant_reply},
    ])
    return {
        "reply": assistant_reply,
        "memory": memory.get("data"),
    }
Checkpoint
In production, store session.id when a new conversation is created. Later requests can restore the local Session handle with gumem.sessions.from_id(session_id); this does not make a network request.
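One way to keep track of stored Session IDs is a small get-or-create lookup keyed by your application's conversation ID. The sketch below is an assumption about how an application might organize this, not part of the SDK: SessionIdStore is a hypothetical helper, and create_session is a stand-in for a call such as gumem.sessions.create({...}).id. A real service would back the mapping with a database table or cache rather than an in-process dict.

```python
from typing import Callable, Dict


class SessionIdStore:
    """Maps an application conversation ID to a stored GUMem Session ID.

    Hypothetical helper for illustration; not provided by the SDK.
    """

    def __init__(self, create_session: Callable[[str], str]) -> None:
        # create_session stands in for creating a Session and returning its ID.
        self._create_session = create_session
        self._ids: Dict[str, str] = {}

    def get_or_create(self, conversation_id: str, user_id: str) -> str:
        session_id = self._ids.get(conversation_id)
        if session_id is None:
            # First request for this conversation: create and remember the ID.
            session_id = self._create_session(user_id)
            self._ids[conversation_id] = session_id
        return session_id


# Stand-in factory so the sketch runs without network access.
store = SessionIdStore(lambda user_id: f"session_for_{user_id}")
first = store.get_or_create("conversation_1", "user_123")
second = store.get_or_create("conversation_1", "user_123")
print(first == second)
```

The returned ID can then be passed to gumem.sessions.from_id(session_id) on later requests.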
Client configuration
By default, only api_key is required. You do not need to configure host when using the default GUMem service.
import os
from gumem import GUMemClient
gumem = GUMemClient(api_key=os.environ["GUMEM_API_KEY"])
Pass additional options when you need a custom deployment, timeout behavior, default headers, or a custom httpx.Client:
import httpx
from gumem import GUMemClient
client = httpx.Client(transport=httpx.HTTPTransport(retries=2))
gumem = GUMemClient(
    api_key="gumem_api_key",
    host="gumem.asix.inc",
    timeout_ms=30_000,
    headers={"X-Service": "assistant-api"},
    client=client,
)
| Option | Type | Default | Description |
|---|---|---|---|
| api_key | str | Required | GUMem API Key. The SDK sends Authorization: Api-Key <api_key>. Values that already start with Api-Key are not prefixed again. |
| host | str \| None | gumem.asix.inc | GUMem service host. Plain host values are normalized to HTTPS, explicit http:// and https:// URLs are preserved, and a trailing / is removed. |
| timeout_ms | int \| None | 30000 | Default request timeout in milliseconds. Set it to 0 to disable the SDK timeout. |
| headers | Mapping[str, str] \| None | None | Default headers. The SDK merges them into every request. |
| client | httpx.Client \| None | None | Synchronous HTTP client for proxies, tests, custom transports, or observability wrappers. The async client accepts httpx.AsyncClient. |
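The host and api_key rules in the table can be pictured with the following sketch. It restates the documented behavior under the assumption that the rules apply exactly as written; normalize_host and build_authorization are hypothetical names, and this is not the SDK's source code.

```python
def normalize_host(host: str) -> str:
    """Sketch of the documented host rules (illustration only)."""
    host = host.strip()
    if not host.startswith(("http://", "https://")):
        # Plain host values default to HTTPS.
        host = f"https://{host}"
    # Remove a single trailing slash, as the table describes.
    if host.endswith("/"):
        host = host[:-1]
    return host


def build_authorization(api_key: str) -> str:
    """Add the Api-Key scheme unless the value already starts with it."""
    if api_key.startswith("Api-Key"):
        return api_key
    return f"Api-Key {api_key}"


print(normalize_host("gumem.asix.inc"))
print(normalize_host("http://localhost:8080/"))
print(build_authorization("example-key"))
```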
Per-request options can override timeout and headers:
memory = session.get_memory(
    {"query": "Europe planning"},
    options={
        "timeout_ms": 10_000,
        "headers": {"X-Trace-Id": "trace_123"},
    },
)
| Name | Type | Required | Description |
|---|---|---|---|
| options.timeout_ms | int \| None | No | Override timeout for this request. |
| options.headers | Mapping[str, str] \| None | No | Add or override headers for this request. |
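A plausible way to think about how per-request options combine with client defaults is sketched below. The function resolve_request_settings is hypothetical, and two details are assumptions rather than documented behavior: that a per-request timeout_ms of None falls back to the client default, and that a disabled timeout maps to None (which is how httpx expresses "no timeout").

```python
from typing import Any, Dict, Mapping, Optional, Tuple


def resolve_request_settings(
    default_timeout_ms: int,
    default_headers: Mapping[str, str],
    options: Optional[Mapping[str, Any]] = None,
) -> Tuple[Optional[float], Dict[str, str]]:
    """Merge client defaults with per-request options (illustration only)."""
    options = options or {}
    timeout_ms = options.get("timeout_ms")
    if timeout_ms is None:
        # Assumption: an unset per-request timeout uses the client default.
        timeout_ms = default_timeout_ms
    # 0 disables the timeout; otherwise convert milliseconds to seconds.
    timeout_seconds = None if timeout_ms == 0 else timeout_ms / 1000
    # Per-request headers are added to, and take precedence over, the defaults.
    headers = {**default_headers, **(options.get("headers") or {})}
    return timeout_seconds, headers


timeout, headers = resolve_request_settings(
    30_000,
    {"X-Service": "assistant-api"},
    {"timeout_ms": 10_000, "headers": {"X-Trace-Id": "trace_123"}},
)
print(timeout, headers)
```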
Health check
gumem.health(options=None) checks the GUMem service health endpoint. Use it in startup checks, deployment verification, or monitoring probes.
health = gumem.health()
print(health)
With the async client:
health = await gumem.health()
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| options | RequestOptions | No | Per-request options. Use it to override timeout and headers. |
Sessions
A Session is the GUMem object that holds one user conversation and its Memory retrieval context. Prefer the methods on the Session handle. Use lower-level resource methods only when passing raw Session IDs is more convenient for your application.
Create a Session
gumem.sessions.create(input, options=None) creates a Session and returns a Session object. user_id is required by the current GUMem API.
session = gumem.sessions.create({
    "user_id": "user_123",
    "title": "Support chat",
    "metadata": {
        "source": "web-chat",
        "locale": "en-US",
    },
})
print(session.id)
print(session.raw_response)
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| input | SessionCreateRequest | Yes | Request body for creating a Session. |
| input.user_id | str | Yes | User ID from your application. GUMem uses it to associate the Session with user Memory. |
| input.title | str \| None | No | Session title, useful for storing a conversation topic or source label. |
| input.metadata | dict[str, Any] \| None | No | Custom metadata such as channel, locale, product area, or business trace information. |
| options | RequestOptions | No | Per-request options. |
session.raw_response keeps the original response envelope returned by GUMem when the Session was created. If GUMem returns a successful response without data.session_id, the SDK raises a ValueError with the message GUMem API did not return data.session_id.
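The check described above can be pictured roughly as follows. This is a minimal sketch of the documented behavior, assuming the envelope is a dict with an optional data object; extract_session_id is a hypothetical name and not the SDK's internal function.

```python
from collections.abc import Mapping
from typing import Any


def extract_session_id(envelope: Mapping[str, Any]) -> str:
    """Return data.session_id from a response envelope or raise ValueError."""
    data = envelope.get("data")
    session_id = data.get("session_id") if isinstance(data, Mapping) else None
    if not session_id:
        # Same message as the one documented for the SDK.
        raise ValueError("GUMem API did not return data.session_id")
    return str(session_id)


print(extract_session_id({"data": {"session_id": "session_123"}}))
```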
Restore a Session handle
gumem.sessions.from_id(session_id) creates a local Session handle from an existing Session ID without making a network request.
session = gumem.sessions.from_id("session_123")
session.add_message({
    "role": "user",
    "content": "Continue with the city preference from earlier.",
})
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| session_id | str | Yes | Existing Session ID that was created by GUMem and stored in your application. |
Add messages
session.add_message(message, options=None) writes one message.
session.add_message({
    "role": "user",
    "content": "For Europe planning, keep Berlin as the default city.",
    "metadata": {
        "channel": "chat",
    },
})
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| message | Message | Yes | One message to write to the current Session. |
| message.role | str | Yes | Message role, such as user, assistant, system, or a custom role from your application. |
| message.content | str | Yes | Message text. GUMem uses it to build and retrieve Memory. |
| message.id | str \| None | No | Message ID from your application. Use it to connect GUMem writes with your message table, logs, or traces. |
| message.metadata | dict[str, Any] | No | Message-level metadata such as channel, model name, locale, or business tags. |
| message.timestamp | str \| datetime \| None | No | Time when the message happened. datetime values are serialized to ISO strings. |
| message.created_at | str \| datetime \| None | No | Time when the message was created. Use it when this differs from the event timestamp. |
| message.status | pending \| chunked \| processed \| failed | No | Message processing status. |
| options | RequestOptions | No | Per-request options. |
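For the timestamp and created_at fields, the following sketch shows what converting a datetime to an ISO string looks like with the standard library. It assumes the conversion is equivalent to datetime.isoformat(); the SDK's exact output format (for example, a Z suffix versus +00:00) is not stated here, and serialize_timestamp is a hypothetical helper. Using timezone-aware datetime values avoids ambiguity about which zone a timestamp refers to.

```python
from datetime import datetime, timezone
from typing import Union


def serialize_timestamp(value: Union[str, datetime]) -> str:
    """Pass strings through and convert datetime values to ISO 8601."""
    if isinstance(value, datetime):
        return value.isoformat()
    return value


stamp = serialize_timestamp(datetime(2026, 4, 22, 1, 2, 3, tzinfo=timezone.utc))
print(stamp)  # 2026-04-22T01:02:03+00:00
```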
session.add_messages(input, options=None) writes multiple messages. input can be either a message list or an object with a messages field.
session.add_messages([
    {"role": "user", "content": "I prefer Berlin for Europe meetings."},
    {"role": "assistant", "content": "I will remember Berlin for Europe meetings."},
])
session.add_messages({
    "messages": [
        {"role": "user", "content": "Toronto works best for Americas meetings."},
    ],
    "user_id": "user_123",
})
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| input | list[Message] \| AddMessagesRequest | Yes | Message list, or a request object with a messages field. |
| input.messages | list[Message] | Yes | Messages to write in bulk. |
| input.user_id | str \| None | No | Optional user ID. Use it when your backend endpoint needs to pass user ownership explicitly. |
| options | RequestOptions | No | Per-request options. |
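The two accepted input shapes can be thought of as normalizing to one request body. The sketch below illustrates that idea under the assumption that a bare list is equivalent to an object whose messages field holds the list; normalize_add_messages_input and the simplified Message alias are hypothetical and not taken from the SDK.

```python
from typing import Any, Dict, List, Union

Message = Dict[str, Any]  # simplified stand-in for the SDK's Message type


def normalize_add_messages_input(
    value: Union[List[Message], Dict[str, Any]],
) -> Dict[str, Any]:
    """Return a request body with a messages field for either input shape."""
    if isinstance(value, list):
        # A bare list becomes the messages field of the request object.
        return {"messages": value}
    if "messages" not in value:
        raise ValueError("expected a message list or a dict with a 'messages' field")
    # Copy so the caller's dict is not mutated later.
    return dict(value)


body = normalize_add_messages_input([{"role": "user", "content": "hello"}])
print(body)
```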
If passing raw Session IDs is more convenient, use the lower-level method:
gumem.sessions.add_messages("session_123", {
    "messages": [
        {"role": "user", "content": "hello"},
    ],
})
Get Memory
session.get_memory(params=None, options=None) retrieves relevant Memory for the current Session.
memory = session.get_memory({
    "query": "which city should be used for Europe scheduling",
    "details": True,
})
print(memory.get("data"))
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| params | GetSessionMemoryParams | No | Memory retrieval parameters. |
| params.query | str | No | Current user question or business query. The SDK sends it as a query string parameter. |
| params.details | bool | No | Whether to request more detailed retrieval results. The SDK sends it as a query string parameter. |
| params.recall_config | RecallConfig \| None | No | Retrieval configuration. When the recall_config field is present, the SDK uses the POST context endpoint. |
| options | RequestOptions | No | Per-request options. |
Pass recall_config to tune the recent Message count and apply exact metadata filters to recall scope:
memory = session.get_memory({
    "query": "which city should be used for the user request",
    "details": True,
    "recall_config": {
        "MessageRecentLimit": 20,
        "MetadataFilters": {
            "ping": "pong",
        },
    },
})
Common RecallConfig fields:
Field names follow the current SDK/API. Descriptions use GUMem's public terms.
| Name | Type | Description |
|---|---|---|
| MessageRecentLimit | int | Number of recent Message items to retrieve. |
| MetadataFilters | dict | A simple metadata key-value dictionary for exact recall filtering. Keys must be non-empty strings, and values must be strings, numbers, or booleans. |
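If you build MetadataFilters from application data, it can help to check the documented constraints before sending the request. The sketch below is a hypothetical client-side check, not something the SDK is stated to provide; validate_metadata_filters is an illustrative name, and treating int and float as "numbers" is an assumption.

```python
from typing import Any, Dict, Mapping


def validate_metadata_filters(filters: Mapping[Any, Any]) -> Dict[str, Any]:
    """Check keys and values against the documented MetadataFilters rules."""
    for key, value in filters.items():
        if not isinstance(key, str) or not key:
            raise ValueError("MetadataFilters keys must be non-empty strings")
        # bool is a subclass of int, so it is listed only for readability.
        if not isinstance(value, (str, int, float, bool)):
            raise ValueError(
                f"unsupported value type for {key!r}: {type(value).__name__}"
            )
    return dict(filters)


checked = validate_metadata_filters({"ping": "pong", "priority": 2, "vip": True})
print(checked)
```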
Lower-level usage:
memory = gumem.sessions.get_memory("session_123", {
    "query": "Europe scheduling",
    "details": True,
})
User actions
User Actions are useful for clicks, page views, business operations, preference changes, and other non-conversation events. GUMem can use them to build Memory from user behavior.
from datetime import datetime, timezone
gumem.user_actions.create({
    "user_id": "user_123",
    "timestamp": datetime(2026, 4, 22, 1, 2, 3, tzinfo=timezone.utc),
    "content": "User opened the Europe team scheduling page",
    "session_id": "session_123",
    "event_type": "page_view",
    "page": "team_scheduling",
    "anchors": {"region": "Europe", "city": "Berlin"},
    "metadata": {"source": "assistant-api"},
})
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| input | ActionLogInput | Yes | User Action request body. |
| input.user_id | str | Yes | User ID from your application. |
| input.timestamp | str \| datetime | Yes | Time when the event happened. datetime values are serialized to ISO strings. |
| input.content | str | Yes | Event description. Write it as natural language that is useful for Memory. |
| input.log_id | str \| None | No | Event ID from your application. |
| input.session_id | str \| None | No | Related GUMem Session ID. |
| input.device_id | str \| None | No | Device ID. |
| input.app | str \| None | No | Application or product name. |
| input.platform | str \| None | No | Platform, such as web, ios, android, or backend. |
| input.event_type | str \| None | No | Event type, such as page_view, click, purchase, or preference_update. |
| input.page | str \| None | No | Page or product area. |
| input.anchors | dict[str, str] \| None | No | Structured anchors such as region, city, document ID, or product ID. |
| input.metadata | dict[str, Any] \| None | No | Custom metadata. |
| input.entities | list[str] \| None | No | Related entity names. |
| options | RequestOptions | No | Per-request options. |
With the async client:
await gumem.user_actions.create({
    "user_id": "user_123",
    "timestamp": "2026-04-22T01:02:03Z",
    "content": "User opened the Europe team scheduling page",
})
Error handling
The SDK separates GUMem API errors, network errors, and timeout errors.
from gumem import GUMemApiError, GUMemConnectionError, GUMemTimeoutError
try:
    gumem.sessions.create({"user_id": "user_123"})
except GUMemApiError as error:
    print(error.status_code, error.status_text, error.detail, error.body)
except GUMemTimeoutError as error:
    print(f"Timed out after {error.timeout_ms}ms")
except GUMemConnectionError as error:
    print("Network failure", error.cause)
| Error | When it is raised | Useful fields |
|---|---|---|
| GUMemApiError | The GUMem API returns a non-2xx response. | status_code, status, status_text, headers, body, detail. |
| GUMemTimeoutError | The request does not complete within the configured timeout. | timeout_ms, cause. |
| GUMemConnectionError | The request fails before GUMem returns a response, such as DNS, connection, or transport errors. | cause. |
| GUMemError | Base class for SDK errors. | Use it to catch SDK exceptions together. |
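Because timeout and connection errors are reported separately from API errors, an application can choose to retry only the transient kinds. The sketch below shows one generic retry-with-backoff pattern; it is an application-level assumption, not an SDK feature. TransientError is a local stand-in so the example runs on its own; in real code you might pass (GUMemTimeoutError, GUMemConnectionError) as retryable. Consider whether a retried write could be recorded twice before retrying calls that add messages.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Local stand-in for timeout or connection errors (illustration only)."""


def call_with_retry(
    operation: Callable[[], T],
    retryable: Tuple[Type[Exception], ...] = (TransientError,),
    attempts: int = 3,
    base_delay_seconds: float = 0.5,
) -> T:
    """Run operation, retrying retryable errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                # Out of attempts: surface the last error to the caller.
                raise
            time.sleep(base_delay_seconds * (2 ** attempt))
    raise AssertionError("unreachable")


calls = []


def flaky_operation() -> str:
    # Fails twice, then succeeds, to exercise the retry path.
    calls.append(1)
    if len(calls) < 3:
        raise TransientError("temporary failure")
    return "ok"


result = call_with_retry(flaky_operation, attempts=3, base_delay_seconds=0)
print(result, len(calls))
```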
Public types / exported surface
The Python SDK exports commonly used clients, errors, and types from the gumem package root. Application code usually only needs the clients and a small set of request types.
from gumem import (
    ActionLogInput,
    AsyncGUMemClient,
    GetSessionMemoryParams,
    GUMemApiError,
    GUMemClient,
    Message,
    RecallConfig,
    SessionCreateRequest,
)
| Export | Description |
|---|---|
| GUMemClient | Synchronous client entry point. |
| AsyncGUMemClient | Asynchronous client entry point. |
| Session | Synchronous Session handle returned by gumem.sessions.create() or gumem.sessions.from_id(). |
| AsyncSession | Asynchronous Session handle returned by the sessions resource on AsyncGUMemClient. |
| SessionCreateRequest | Request type for creating a Session. |
| Message | Message type for writing to a Session. |
| AddMessagesRequest | Request type for writing multiple messages. |
| GetSessionMemoryParams | Parameter type for retrieving Memory. |
| RecallConfig | Memory retrieval configuration type. |
| ActionLogInput | Request type for writing a user Action. |
| GUMemEnvelope | GUMem API response envelope, typed as dict[str, Any]. |
| RequestOptions | Per-request options type. |
| GUMemApiError, GUMemConnectionError, GUMemTimeoutError, GUMemError | SDK error types. |
The Python SDK v1 intentionally keeps the public business surface focused: it does not expose threads, get_context, or user_actions.query. To continue working with an existing Session, use gumem.sessions.from_id(session_id).