Outputs

How Daydreams agents send information and responses.

Outputs are how Daydreams agents communicate results or send information to external systems or users. If Inputs are how agents "listen," Outputs are how they "speak" or "act" based on the LLM's reasoning.

Examples of outputs include:

  • Sending a message to a Discord channel or Telegram chat.
  • Posting a tweet.
  • Returning a response in a CLI session.
  • Calling an external API based on the agent's decision (though Actions are often better for this if a response is needed).

Defining an Output

Outputs are defined using the output helper function exported from @daydreamsai/core. Each definition specifies how the agent should structure information for a particular output channel and how to execute the sending logic.

import {
  output,
  context,
  type AnyAgent,
  type ContextState, // Base context state type
  type OutputRef, // Type for the log entry
} from "@daydreamsai/core";
import { z } from "zod";
 
// Assume myDiscordClient.sendMessage exists
declare const myDiscordClient: {
  sendMessage: (channelId: string, content: string) => Promise<any>;
};
declare const myContext: any; // Placeholder for your context type
 
const discordMessageOutput = output({
  // Required: A unique identifier for this output type. Used by the LLM.
  type: "discord:message",
 
  // Optional: Description for the LLM.
  description: "Sends a message to a specific Discord channel.",
 
  // Optional: Instructions for the LLM on usage.
  instructions: "Use this to reply to the user in the Discord channel.",
 
  // Optional: Zod schema for the main content of the output.
  // The LLM provides this content *inside* the <output> tag.
  // Defaults to z.string() if omitted.
  schema: z.string().describe("The message content to send."),
 
  // Optional: Zod schema for additional attributes the LLM must provide
  // *on* the <output> tag itself.
  attributes: z.object({
    channelId: z.string().describe("The ID of the Discord channel to send to."),
    replyToUserId: z
      .string()
      .optional()
      .describe("User ID to mention in the reply."),
  }),
 
  // Required (usually): The function that performs the actual sending logic.
  handler: async (data, ctx, agent) => {
    // 'data' contains the validated content (from schema).
    // 'ctx' includes the ContextState and the OutputRef for this specific call.
    // Access attributes parsed from the <output> tag via ctx.outputRef.params.
    const { channelId, replyToUserId } = ctx.outputRef.params ?? {};
    const content = data; // Access validated content from schema
 
    let messageContent = content;
    if (replyToUserId) {
      messageContent = `<@${replyToUserId}> ${content}`;
    }
 
    console.log(`Sending to Discord channel ${channelId}: ${messageContent}`);
    // Example: await myDiscordClient.sendMessage(channelId, messageContent);
    await new Promise((res) => setTimeout(res, 50)); // Simulate async
 
    // Optional: Return data to update the OutputRef log.
    // Can also return an array of OutputRefResponse for multiple logs.
    return {
      data: { content: messageContent, channelId }, // Updated data for the log
      params: ctx.outputRef.params, // Typically keep original params
      processed: true, // Mark this output as fully handled
    };
  },
 
  // Optional: Custom formatting for the OutputRef log.
  format: (res) => {
    // Note: 'res' is the OutputRef after the handler possibly updated it
    const outputData = Array.isArray(res.data) ? res.data[0] : res.data; // Adjust if handler returns array
    return `Sent Discord message to ${res.params?.channelId}: "${outputData?.content ?? res.content}"`;
  },
 
  // Optional: Examples for the LLM.
  examples: [
    `<output type="discord:message" channelId="12345">Hello there!</output>`,
    `<output type="discord:message" channelId="67890" replyToUserId="user123">Got it!</output>`,
  ],
 
  // Optional: Setup logic run when the agent starts.
  install: async (agent) => {
    /* ... */
  },
 
  // Optional: Conditionally enable this output based on context.
  enabled: (ctx: ContextState) => {
    // Example: Only enable if the current context is a discord channel
    // return ctx.context.type === 'discord:channel';
    return true;
  },
 
  // Optional: Associate with a specific context type.
  // context: myContext,
});
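
Once defined, an output must be registered with the agent so it can be offered to the LLM. The exact configuration shape can differ between versions of @daydreamsai/core, so treat the outputs field below as an assumption to check against the API reference; a rough sketch:

import { createDreams } from "@daydreamsai/core";
 
// Assumption: `myModel` stands in for whatever LLM provider you configure.
declare const myModel: any;
 
const agent = createDreams({
  model: myModel,
  // Assumption: outputs are supplied as a map keyed by their `type`;
  // some versions accept an array or expect them inside an extension instead.
  outputs: {
    "discord:message": discordMessageOutput,
  },
});
 
await agent.start();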

Key Parameters:

  • type (string): Unique identifier used in <output type="...">.
  • description/instructions (string, optional): Help the LLM understand what the output does and when to use it.
  • schema (Zod Schema, optional): Defines the structure and validates the content placed inside the <output> tag by the LLM. Defaults to z.string().
  • attributes (Zod Schema, optional): Defines and validates attributes placed on the <output> tag itself (e.g., <output type="discord:message" channelId="...">). These provide necessary parameters for the handler.
  • handler (Function): Executes the logic to send the information externally. It receives:
    • data: The validated content from the schema.
    • ctx: The context state (ContextState) augmented with the specific outputRef for this call (OutputRef). Attributes parsed from the tag are found in ctx.outputRef.params.
    • agent: The agent instance.
    • It can optionally return an OutputRefResponse (or array thereof) to update the log entry or mark it as processed.
  • format (Function, optional): Customizes the log representation of the OutputRef.
  • examples (string[], optional): Provides concrete examples to the LLM on how to structure the <output> tag.
  • install / enabled / context (Functions/Context, optional): Similar to Actions and Inputs for setup, conditional availability, and context scoping.
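
Most of these parameters are optional; a minimal output only needs a type and a handler, since the content schema defaults to z.string(). A small sketch with illustrative names:

import { output } from "@daydreamsai/core";
import { z } from "zod";
 
// Minimal output: the LLM supplies a plain string and the handler prints it.
const consoleOutput = output({
  type: "console:log",
  schema: z.string().describe("Text to print to the console."),
  handler: async (data) => {
    console.log(`[agent] ${data}`);
    return { data: { message: data }, processed: true };
  },
});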

LLM Interaction

  1. Availability: Enabled outputs are presented to the LLM within the <available-outputs> tag in the prompt, including their type, description, instructions, content schema (content_schema), attribute schema (attributes_schema), and examples.
  2. Invocation: The LLM generates an output by including an <output> tag in its response stream, matching one of the available types. It must provide any required attributes defined in the attributes schema and the content inside the tag matching the schema.
    <output type="discord:message" channelId="123456789">
      This is the message content generated by the LLM.
    </output>

Execution Flow

  1. Parsing: When the framework parses an <output> tag from the LLM stream (handleStream in streaming.ts), it extracts the type, attributes, and content.
  2. Log Creation: An initial OutputRef log is created (getOrCreateRef in streaming.ts).
  3. Processing: Once the tag is fully parsed (el.done), the engine calls handleOutput (handlers.ts).
  4. Validation: handleOutput finds the corresponding output definition by type. It validates the extracted content against the output.schema and the extracted attributes against the output.attributes schema.
  5. Handler Execution: If validation passes, handleOutput executes the output.handler function, passing the validated content (data) and the context state augmented with the outputRef (ctx). Attributes are accessed via ctx.outputRef.params. (A simplified sketch of steps 4-5 follows this list.)
  6. External Action: The handler performs the necessary external operation (e.g., sending the Discord message).
  7. Logging: The handler can optionally return data to update the OutputRef log. The OutputRef is added to the WorkingMemory.

Outputs vs. Actions

While outputs and actions share similar structures, they serve different purposes:

| Feature | Actions | Outputs |
| --- | --- | --- |
| Primary purpose | Two-way interaction (call -> result -> LLM) | One-way communication (LLM -> external) |
| Return value | Result is crucial for next LLM step | Result usually not directly needed by LLM |
| State mutation | Commonly used to update context state | Can update state but less common |
| Usage pattern | LLM requests data or triggers process | LLM communicates final response or notification |
| Error handling | Errors often returned to LLM for reaction | Errors handled internally (logged/retried) |

When to Use Outputs vs. Actions

  • Use outputs when: The primary goal is to communicate outward (send a message, display UI, log data), and you don't need the result of that communication for the LLM's immediate next reasoning step.
  • Use actions when: You need the result of the operation (e.g., data fetched from an API, status of a transaction) for the LLM to continue its reasoning process or make subsequent decisions.
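
To make the distinction concrete, here is a sketch of the same weather integration modeled both ways. The fetchWeather helper is hypothetical, and the action handler signature (validated arguments as the first parameter) should be checked against the Actions docs for your version:

import { action, output } from "@daydreamsai/core";
import { z } from "zod";
 
// Hypothetical external client -- replace with a real API call.
declare function fetchWeather(city: string): Promise<{ tempC: number }>;
 
// Action: the LLM needs the result to continue reasoning.
const getWeather = action({
  name: "get-weather",
  description: "Fetches the current temperature for a city.",
  schema: z.object({ city: z.string() }),
  handler: async (args: { city: string }) => {
    const { tempC } = await fetchWeather(args.city);
    return { tempC }; // fed back to the LLM for its next step
  },
});
 
// Output: the LLM communicates outward; nothing needs to flow back.
const weatherReport = output({
  type: "weather:report",
  description: "Delivers a human-readable weather summary to the user.",
  schema: z.string().describe("The summary text to deliver."),
  handler: async (data) => {
    console.log(data); // e.g. push to a chat channel instead
    return { data: { sent: true }, processed: true };
  },
});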

Best Practices for Outputs

  1. Keep outputs focused: Each output definition should have a single, clear responsibility (e.g., discord:message, log:event).
  2. Handle errors gracefully: The handler should contain try...catch blocks for external calls and report failures appropriately (e.g., log an error, perhaps emit an error event) without crashing the agent. See the sketch after this list.
  3. Consider asynchronous processing: For outputs involving potentially slow external systems, ensure the handler is async and handles the operation without blocking the main agent loop excessively.
  4. Track important outputs in context: If the fact that an output occurred is important for future agent decisions (e.g., remembering a message was sent), update the relevant context memory within the handler.
  5. Use descriptive names and schemas: Clearly define the type, schema, and attributes so the LLM understands exactly how to use the output. Provide good examples.
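
For practice 2, a sketch of a handler that wraps its external call in try...catch; myDiscordClient is the same placeholder declared earlier:

import { output } from "@daydreamsai/core";
import { z } from "zod";
 
declare const myDiscordClient: {
  sendMessage: (channelId: string, content: string) => Promise<any>;
};
 
const safeDiscordOutput = output({
  type: "discord:message",
  schema: z.string(),
  attributes: z.object({ channelId: z.string() }),
  handler: async (data, ctx) => {
    const { channelId } = ctx.outputRef.params ?? {};
    if (!channelId) {
      return { data: { error: "Missing channelId attribute" }, processed: true };
    }
    try {
      await myDiscordClient.sendMessage(channelId, data);
      return { data: { content: data, channelId }, processed: true };
    } catch (error) {
      // Log the failure and mark the output handled instead of crashing the loop.
      console.error(`discord:message failed for channel ${channelId}`, error);
      return {
        data: { error: error instanceof Error ? error.message : String(error) },
        processed: true,
      };
    }
  },
});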

Outputs allow the agent to respond and communicate, completing the interaction loop initiated by Inputs and guided by Actions.
