Prompting
How Daydreams structures prompts to guide LLM reasoning and actions.
The interaction between the Daydreams framework and the Large Language Model (LLM) is mediated through carefully structured prompts. These prompts provide the LLM with the necessary context, instructions, available tools (actions and outputs), and current state, guiding its reasoning process and constraining its output format.
The Main Prompt Template (mainPrompt)
The core prompt structure is defined in packages/core/src/prompts/main.ts within the mainPrompt configuration object. It uses a main template string (promptTemplate) composed of several sections identified by placeholders.
Each placeholder ({{intro}}, {{instructions}}, etc.) corresponds to static text providing overall guidance to the LLM on how it should behave within the framework.
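As an illustration of this structure, the sketch below shows how such a template and its static sections might fit together. Only the placeholder names {{intro}}, {{instructions}}, {{content}}, and {{response}} come from this page; the section wording, the nested placeholders inside the content section, and the expandSections helper are assumptions made for the example, not the framework's source.

```typescript
// Hypothetical sketch: placeholder names are taken from the text above;
// everything else is illustrative and may differ from the real main.ts.
const promptTemplateSketch = `{{intro}}

{{instructions}}

{{content}}

{{response}}`;

// Static guidance text per section (wording invented for illustration).
// The content section is assumed to hold nested placeholders that are
// filled in later, at each step.
const sectionTextSketch: Record<string, string> = {
  intro: "You are an agent that reasons step by step and then acts.",
  instructions: "Analyze the updates, then call actions or emit outputs.",
  content: "{{actions}}\n{{outputs}}\n{{contexts}}\n{{workingMemory}}\n{{updates}}",
  response: "Reply inside a <response> root tag.",
};

// Replace each {{name}} with its section text. Unknown placeholders are left
// untouched, and inserted text is not rescanned, so nested placeholders
// survive for a later pass.
function expandSections(template: string, sections: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string): string => {
    const value = sections[key];
    return value !== undefined ? value : match;
  });
}

const staticPromptSketch = expandSections(promptTemplateSketch, sectionTextSketch);
```

After this pass, staticPromptSketch contains the static guidance plus the still-unfilled placeholders of the content section, which is the part that changes from step to step.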
Dynamic Prompt Generation
At each step of the Agent Lifecycle, the framework dynamically generates the content for the {{content}} section of the promptTemplate. This ensures the LLM always has the most up-to-date information.
- Gathering Data (formatPromptSections): The formatPromptSections function (in packages/core/src/prompts/main.ts) collects the current state, including:
  - Available actions.
  - Available outputs.
  - Active contexts and their rendered state.
  - Recent WorkingMemory logs (both processed and unprocessed).
- Formatting to XML (packages/core/src/formatters.ts): Various helper functions (formatAction, formatContextState, formatOutputInterface, formatContextLog, formatValue, formatXml) convert these JavaScript objects and data structures into standardized XML strings. This XML format is designed to be clearly parsable by both the LLM and the framework's stream parser.
- Rendering (render): The render function (from packages/core/src/formatters.ts) injects these dynamically generated XML strings into the main promptTemplate, replacing placeholders like {{actions}}, {{outputs}}, {{contexts}}, {{workingMemory}}, and {{updates}}.
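The formatting and rendering steps can be sketched roughly as follows. This is a minimal illustration, not the framework's code: the ActionSketch shape, the element names inside each action, and the helpers escapeXml, actionToXml, and renderSketch are all assumptions, and the real helpers in packages/core/src/formatters.ts may differ in name, signature, and output.

```typescript
// Hypothetical action shape; the real action type carries more fields.
interface ActionSketch {
  name: string;
  description: string;
  schema: unknown; // JSON-schema-like description of the arguments
}

// Escape characters that would otherwise break the XML structure.
function escapeXml(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Serialize one action as an <action> element (element layout is assumed).
function actionToXml(action: ActionSketch): string {
  const safeName = escapeXml(action.name).replace(/"/g, "&quot;");
  return [
    `<action name="${safeName}">`,
    `<description>${escapeXml(action.description)}</description>`,
    `<schema>${escapeXml(JSON.stringify(action.schema))}</schema>`,
    `</action>`,
  ].join("\n");
}

// Inject values into {{placeholders}}; unknown placeholders stay as they are.
function renderSketch(template: string, data: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (match: string, key: string): string => {
    const value = data[key];
    return value !== undefined ? value : match;
  });
}

// Gather -> format -> render, for a single illustrative action.
const actionsSketch: ActionSketch[] = [
  {
    name: "createFile",
    description: "Create an empty file",
    schema: { type: "object", properties: { path: { type: "string" } } },
  },
];

const contentSketch = renderSketch("{{actions}}\n{{updates}}", {
  actions: `<available-actions>\n${actionsSketch.map(actionToXml).join("\n")}\n</available-actions>`,
  updates: "<updates></updates>",
});
```

Keeping the formatters separate from the renderer means each kind of data (actions, outputs, contexts, logs) can have its own serializer, while the template only needs to know the placeholder names.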
Key XML Sections in the Prompt
The dynamically generated {{content}} section typically includes these crucial XML blocks:
- <available-actions>: Lists all actions currently enabled for the agent. Each action includes its name, description, instructions, and argument schema (as JSON schema).
- <available-outputs>: Lists all outputs the agent can generate. Each output includes its type, description, instructions, content schema (content_schema), attribute schema (attributes_schema), and examples.
- <contexts>: Displays the state of currently active contexts, as rendered by their respective render functions.
- <working-memory>: Shows processed logs (InputRef, OutputRef, ActionCall, ActionResult, Thought, EventRef) from previous steps within the current run.
- <updates>: Shows new, unprocessed logs (typically new InputRefs or ActionResults from the previous step) that the LLM needs to analyze and react to in the current step.
Expected LLM Response Structure
The framework instructs the LLM (via the {{response}} section of the template) to structure its output using specific XML tags:
- <response>: The root tag for the entire response.
- <reasoning>: Contains the LLM's step-by-step thinking process. This is logged as a Thought.
- <action_call>: Used to invoke an action. The name must match an available action. The content (arguments) depends on the action's defined schema and callFormat (defaulting to JSON if the schema is complex, but can be XML). The framework parses this content accordingly.
- <output>: Used to generate an output. The type must match an available output. Any required attributes must be included, and the content must match the output's content schema.
The framework parses this XML structure from the LLM's response stream to trigger the corresponding handlers for actions and outputs.
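To make the response structure concrete, here is a small non-streaming sketch that extracts the three inner tags from a complete response string. It is only an approximation under stated assumptions: the framework itself uses a stream parser, and the sample response, the "text" output type, the ParsedElement shape, and the function names are hypothetical.

```typescript
// Hypothetical parsed-element shape, for illustration only.
interface ParsedElement {
  tag: "reasoning" | "action_call" | "output";
  attributes: Record<string, string>;
  content: string;
}

// Parse key="value" pairs from the raw attribute text of an opening tag.
function parseAttributes(raw: string): Record<string, string> {
  const attributes: Record<string, string> = {};
  const attrPattern = /([\w-]+)="([^"]*)"/g;
  let m: RegExpExecArray | null;
  while ((m = attrPattern.exec(raw)) !== null) {
    const key = m[1];
    const value = m[2];
    if (key !== undefined && value !== undefined) attributes[key] = value;
  }
  return attributes;
}

// Collect <reasoning>, <action_call>, and <output> elements in document order.
// Regex-based, so it assumes well-formed, non-nested tags.
function parseResponseSketch(response: string): ParsedElement[] {
  const elements: ParsedElement[] = [];
  const pattern =
    /<(reasoning|action_call|output)((?:\s+[\w-]+="[^"]*")*)\s*>([\s\S]*?)<\/\1>/g;
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(response)) !== null) {
    elements.push({
      tag: m[1] as ParsedElement["tag"],
      attributes: parseAttributes(m[2] ?? ""),
      content: (m[3] ?? "").trim(),
    });
  }
  return elements;
}

// A made-up response in the shape described above.
const sampleResponse = `<response>
<reasoning>The user wants a file, so I will create it.</reasoning>
<action_call name="createFile">{"path": "notes.txt"}</action_call>
<output type="text">Creating notes.txt now.</output>
</response>`;

const parsedSketch = parseResponseSketch(sampleResponse);
```

Each parsed element could then be routed by its tag: reasoning to a log entry, action_call to the handler named in its name attribute, and output to the handler for its type.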
Template Engine ({{...}})
The prompt template mentions a simple template engine using {{...}} syntax (e.g., {{calls[0].someValue}}, {{shortTermMemory.key}}). As noted in the prompt comments, its primary intended use is intra-turn data referencing: an action call can reference the anticipated result of a previous action call within the same LLM response.
Example: a writeFile call can reference the fileId expected to be returned by the createFile action called just before it within the same LLM response turn.
The framework resolves these templates before executing the action handlers (using resolveTemplates in packages/core/src/handlers.ts).
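A rough sketch of how such resolution might work is shown below. It is not the framework's resolveTemplates: the function names, the shape of the scope object (a calls array holding earlier action results), and the choice to throw on an unresolved reference are all assumptions made for illustration.

```typescript
// Look up a path such as "calls[0].fileId" inside a scope object.
function lookupPath(scope: unknown, path: string): unknown {
  const keys = path
    .replace(/\[(\d+)\]/g, ".$1") // calls[0] -> calls.0
    .split(".")
    .filter((k) => k.length > 0);
  let current: unknown = scope;
  for (const key of keys) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Recursively replace {{path}} templates in every string of the arguments.
// Hypothetical stand-in for the real resolveTemplates, whose signature and
// behavior may differ.
function resolveTemplatesSketch(args: unknown, scope: unknown): unknown {
  if (typeof args === "string") {
    return args.replace(
      /\{\{\s*([^}]+?)\s*\}\}/g,
      (match: string, path: string): string => {
        const value = lookupPath(scope, path);
        // Assumption: an unresolved reference is treated as an error.
        if (value === undefined) throw new Error(`Unresolved template: ${match}`);
        return String(value);
      }
    );
  }
  if (Array.isArray(args)) {
    return args.map((item: unknown) => resolveTemplatesSketch(item, scope));
  }
  if (args !== null && typeof args === "object") {
    const source = args as Record<string, unknown>;
    const resolved: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      resolved[key] = resolveTemplatesSketch(source[key], scope);
    }
    return resolved;
  }
  return args; // numbers, booleans, null pass through unchanged
}

// createFile ran first and returned a fileId (value invented for the example);
// writeFile's arguments reference it before its handler is executed.
const scopeSketch = { calls: [{ fileId: "file-123" }] };
const writeFileArgs = { fileId: "{{calls[0].fileId}}", content: "Hello" };
const resolvedArgs = resolveTemplatesSketch(writeFileArgs, scopeSketch);
```

Resolving templates just before each handler runs is what lets a later call depend on an earlier result: by the time writeFile executes, the createFile result is already available in the scope.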
This dynamic and structured prompting approach allows Daydreams to effectively leverage LLMs for complex orchestration tasks, providing them with the necessary information and tools while ensuring their output can be reliably processed.