The Memory Challenge for AI Agents
Large Language Models are stateless. Each interaction starts fresh, with no inherent memory of previous conversations, learned preferences, or accumulated knowledge. For simple chatbots, this is fine. But for Creator Agents that need to maintain consistent personas, remember user preferences, and build upon past interactions, memory is essential.
The challenge is scale. A Creator Agent might have thousands of conversations across multiple platforms, generating millions of tokens of context. We can't feed all of this into the LLM for every response—context windows are limited and expensive. We need a way to store, retrieve, and utilize only the most relevant memories.
The Core Problem
How do we give AI agents long-term memory that is searchable, scalable, and cost-effective while maintaining the context they need for coherent, personalized interactions?
Why Vector Databases?
Traditional databases store data in structured tables. Vector databases store data as high-dimensional vectors—mathematical representations of meaning. This allows for semantic search: finding information based on conceptual similarity rather than keyword matching.
For AI agents, this is transformative. Instead of searching for "user likes coffee," the agent can search for concepts semantically similar to "user preferences" and retrieve memories about coffee, tea, morning routines, and caffeine habits—all in a single query.
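To make "conceptual similarity" concrete, here is a toy sketch using cosine similarity over embeddings, assuming the OpenAI Python SDK and the same embedding model used later in this post; the phrases and expected score ordering are illustrative, not output from our system:

```python
# Toy demonstration of semantic similarity: related phrases score higher
# under cosine similarity than unrelated ones. Illustrative only.
import math
from openai import OpenAI  # pip install openai

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> list[float]:
    """Return the embedding vector for a piece of text."""
    resp = client.embeddings.create(model="text-embedding-3-large", input=text)
    return resp.data[0].embedding

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query = embed("user preferences")
print(cosine(query, embed("user likes coffee in the morning")))  # expect: higher
print(cosine(query, embed("block gas limits on mainnet")))       # expect: lower
```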
The Embedding Pipeline
Here's how memory storage works in Pygmalion's architecture; a code sketch follows the list:
- Agent has a conversation or generates content
- Important information is extracted and formatted as text
- Text is converted to a vector embedding using an embedding model
- Vector + metadata is stored in Weaviate with the agent's ID
- On-chain hash of the memory is anchored for verifiability
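A minimal sketch of those five steps in Python, assuming a v3-style weaviate-client and the OpenAI SDK; the class and property names mirror the schema shown later in this post, but the function itself is illustrative rather than Pygmalion's production code:

```python
# Sketch of the storage pipeline: embed extracted text, store it in Weaviate,
# and return a hash for later on-chain anchoring. Illustrative, not production code.
import hashlib
import weaviate            # pip install "weaviate-client<4" (v3-style API)
from openai import OpenAI  # pip install openai

openai_client = OpenAI()
wv_client = weaviate.Client("http://localhost:8080")  # assumed local endpoint

def store_memory(agent_id: str, content: str, timestamp: str) -> bytes:
    # Step 3: convert extracted text to a vector embedding.
    embedding = openai_client.embeddings.create(
        model="text-embedding-3-large",
        input=content,
    ).data[0].embedding

    # Step 4: store vector + metadata in Weaviate under the agent's ID.
    wv_client.data_object.create(
        data_object={"content": content, "agentId": agent_id, "timestamp": timestamp},
        class_name="AgentMemory",
        vector=embedding,
    )

    # Step 5: hash the memory for anchoring. Note this stand-in uses NIST
    # SHA3-256; the on-chain contract uses keccak256, which is different.
    return hashlib.sha3_256(f"{content}|{timestamp}|{agent_id}".encode()).digest()
```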
Weaviate Integration
After evaluating multiple vector databases (Pinecone, Milvus, Qdrant, Chroma), we selected Weaviate for several key reasons:
- GraphQL Interface: Flexible querying that our developers love
- Hybrid Search: Combines vector similarity with keyword matching
- Modular AI: Built-in vectorization modules for common embedding models
- Self-Hostable: Can run on our infrastructure for data sovereignty
- Multi-Tenancy: Native support for isolating different agents' memories
Architecture Overview
The memory stack flows through four layers: the Creator Agent (generates and retrieves memories), OpenAI text-embedding-3-large (vectorization), Weaviate (vector storage and semantic search), and the anchoring contract (on-chain memory verification).
The Optimization Challenge
Our initial implementation anchored every memory on-chain. While this provided maximum verifiability, it was prohibitively expensive. Each anchor operation cost ~50,000 gas, and with agents generating hundreds of memories daily (at 300 memories, that works out to 15 million gas per agent, per day), costs quickly became unsustainable.
Batched Anchoring Strategy
The breakthrough came from implementing batched Merkle tree anchoring instead of individual transactions:
```solidity
// Before: individual anchoring
// Gas cost: ~50,000 per memory
struct Memory {
    bytes32 hash;
    uint256 timestamp;
    address agentId;
}
Memory[] public memories;

function anchorMemory(bytes32 memoryHash) external {
    memories.push(Memory({
        hash: memoryHash,
        timestamp: block.timestamp,
        agentId: msg.sender
    }));
}
```

```solidity
// After: batched Merkle anchoring
// Gas cost: ~25,000 for the entire batch
struct Batch {
    bytes32 root;
    uint256 memoryCount;
    uint256 timestamp;
}
Batch[] public batches;

function anchorBatch(bytes32 merkleRoot, uint256 count) external {
    batches.push(Batch({
        root: merkleRoot,
        memoryCount: count,
        timestamp: block.timestamp
    }));
}
```
By batching memories and anchoring only the Merkle root, we cut effective per-memory gas costs from ~50,000 to ~15,000, a 70% reduction. Combined with other optimizations, we achieved our target 40% overall cost reduction while maintaining verifiability.
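For reference, here is one way the off-chain side might compute the root that anchorBatch commits to. The pairing and odd-node duplication rules are assumptions, since the post doesn't specify the exact tree layout:

```python
# Off-chain Merkle root computation over a batch of 32-byte memory hashes.
# Tree layout (pairwise keccak, duplicate last node on odd levels) is assumed.
from eth_utils import keccak  # pip install eth-utils

def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a non-empty list of memory hashes up to a single root."""
    assert leaves, "batch must contain at least one memory hash"
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate last node on odd-sized levels
        level = [keccak(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]
```

Whatever the batch size, the resulting 32 bytes are all that anchorBatch stores on-chain.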
Performance Improvements
Headline metrics: gas costs down 40% overall, context retention up 3x, and improved query latency.
Context Retention Improvements
Beyond cost savings, the Weaviate integration dramatically improved context retention. Our previous solution used simple keyword matching and had difficulty retrieving relevant historical context. With semantic search:
- Agents now recall relevant memories from conversations weeks ago
- Context retrieval accuracy improved from 62% to 89%
- User satisfaction scores for "feeling understood" increased 45%
- Agents can now maintain consistent personas across thousands of interactions
Technical Implementation Details
Weaviate Schema Design
Our Weaviate schema is designed for multi-tenant agent memory:
```json
{
  "class": "AgentMemory",
  "vectorizer": "text2vec-openai",
  "moduleConfig": {
    "text2vec-openai": {
      "vectorizeClassName": false
    }
  },
  "properties": [
    {
      "name": "content",
      "dataType": ["text"],
      "moduleConfig": {
        "text2vec-openai": {
          "skip": false,
          "vectorizePropertyName": false
        }
      }
    },
    {
      "name": "agentId",
      "dataType": ["text"],
      "moduleConfig": { "text2vec-openai": { "skip": true } }
    },
    {
      "name": "memoryType",
      "dataType": ["text"],
      "moduleConfig": { "text2vec-openai": { "skip": true } }
    },
    {
      "name": "timestamp",
      "dataType": ["date"],
      "moduleConfig": { "text2vec-openai": { "skip": true } }
    },
    {
      "name": "merkleRoot",
      "dataType": ["text"],
      "moduleConfig": { "text2vec-openai": { "skip": true } }
    }
  ]
}
```
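Registering that class is a one-time setup step. Here is a sketch using a v3-style Python client; the endpoint and file name are assumptions:

```python
# One-time schema registration; create_class raises if the class already exists.
import json
import weaviate  # pip install "weaviate-client<4"

client = weaviate.Client("http://localhost:8080")  # assumed endpoint
with open("agent_memory_schema.json") as f:        # the JSON class definition above
    agent_memory_class = json.load(f)

client.schema.create_class(agent_memory_class)
```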
Memory Retrieval Query
Here's how we retrieve relevant memories during agent conversations:
```graphql
{
  Get {
    AgentMemory(
      nearText: {
        concepts: ["user query context"]
        certainty: 0.75
      }
      where: {
        operator: And
        operands: [
          {
            path: ["agentId"]
            operator: Equal
            valueText: "agent-123"
          },
          {
            path: ["timestamp"]
            operator: GreaterThan
            valueDate: "2026-01-01T00:00:00Z"
          }
        ]
      }
      limit: 10
    ) {
      content
      memoryType
      timestamp
      _additional { certainty }
    }
  }
}
```

Note that the match certainty is returned through Weaviate's `_additional` block rather than as a regular property.
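The same query from application code, using the v3-style Python client; the agent ID and cutoff date are the placeholder values from the GraphQL above:

```python
# Python equivalent of the GraphQL query above (v3-style weaviate-client).
import weaviate

client = weaviate.Client("http://localhost:8080")  # assumed endpoint

result = (
    client.query.get("AgentMemory", ["content", "memoryType", "timestamp"])
    .with_near_text({"concepts": ["user query context"], "certainty": 0.75})
    .with_where({
        "operator": "And",
        "operands": [
            {"path": ["agentId"], "operator": "Equal", "valueText": "agent-123"},
            {"path": ["timestamp"], "operator": "GreaterThan",
             "valueDate": "2026-01-01T00:00:00Z"},
        ],
    })
    .with_additional(["certainty"])
    .with_limit(10)
    .do()
)
memories = result["data"]["Get"]["AgentMemory"]
```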
Memory Types and Prioritization
Not all memories are equal. We categorize memories to improve retrieval quality:
- Core Identity: Agent's persona, values, communication style (highest priority)
- User Preferences: Individual user likes, dislikes, history (high priority)
- Conversation History: Recent exchanges for context continuity (medium priority)
- Knowledge Base: Facts, data, learned information (low priority, high volume)
Each memory type has different retrieval parameters—core identity memories are always included, while knowledge base memories are only retrieved when highly relevant.
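One way to encode those per-type retrieval parameters is a simple lookup table; every threshold and limit below is an illustrative assumption, not our tuned production values:

```python
# Per-type retrieval parameters: higher-priority memory types bypass search
# or use looser certainty thresholds. All values are illustrative.
RETRIEVAL_PARAMS = {
    "core_identity":        {"always_include": True,  "certainty": None, "limit": 5},
    "user_preferences":     {"always_include": False, "certainty": 0.70, "limit": 5},
    "conversation_history": {"always_include": False, "certainty": 0.75, "limit": 10},
    "knowledge_base":       {"always_include": False, "certainty": 0.85, "limit": 3},
}
```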
State Anchoring and Verifiability
While we've moved away from anchoring every memory individually, we haven't compromised on verifiability. Here's how our batched anchoring works:
- Memories are collected in a batch over a time window (e.g., 1 hour)
- Each memory is hashed individually: memoryHash = keccak256(content + timestamp + agentId)
- All memory hashes are combined into a Merkle tree
- Only the Merkle root is anchored on-chain
- Individual memories can be verified against the anchored root
Verifiability Preserved
Anyone can verify that a specific memory was part of a batch by providing the memory hash and Merkle proof. The on-chain root serves as a cryptographic commitment to the entire batch.
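A verification sketch that mirrors the tree construction shown earlier; the proof format (one sibling hash plus a left/right position flag per level) is an assumption:

```python
# Verify that a memory hash belongs to a batch anchored by `root`.
from eth_utils import keccak  # pip install eth-utils

def verify_memory(memory_hash: bytes,
                  proof: list[tuple[bytes, bool]],
                  root: bytes) -> bool:
    """Walk the proof from leaf to root; True if the recomputed root matches."""
    node = memory_hash
    for sibling, sibling_is_left in proof:
        node = keccak(sibling + node) if sibling_is_left else keccak(node + sibling)
    return node == root
```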
Scaling Considerations
As our agent network grows, we're preparing for significant scale:
- Horizontal Scaling: Weaviate cluster can add nodes as memory volume grows
- Tiered Storage: Hot memories on fast SSDs, archived memories in cheaper storage
- Memory Compression: Summarization of old memories to reduce storage without losing meaning
- Selective Anchoring: Only high-value memories are anchored; routine conversation isn't
What's Next
Our memory optimization work is ongoing. Upcoming improvements include:
- Multi-Modal Memory: Storing and retrieving images, audio, and video alongside text
- Cross-Agent Memory Sharing: Agents can share relevant knowledge while preserving privacy
- Memory Decay: Automatically deprioritizing outdated information
- User-Controlled Memory: Users can view, edit, and delete what agents remember about them
Conclusion: Memory as Infrastructure
Memory isn't a feature—it's infrastructure. Just as databases are essential to traditional applications, vector memory is essential to AI agents. Our Weaviate integration provides the foundation for agents that learn, adapt, and build genuine relationships with their audiences.
The 40% gas cost reduction and 3x context retention improvement represent significant milestones, but they're just the beginning. As we continue to optimize and expand our memory infrastructure, Creator Agents will become increasingly sophisticated, personalized, and valuable.
The future of AI isn't just about smarter models—it's about systems that can remember, learn, and grow. That's what we're building at Pygmalion.
Pygmalion Protocol
Sovereign Identity Protocol for AI Creator Agents
Published on January 28, 2026