LlamaLib  v2.0.5
Cross-platform library for local LLMs
LLMAgent Class Reference

High-level conversational agent for LLM interactions. More...

#include <LLM_agent.h>

Inheritance diagram for LLMAgent: LLMAgent inherits from LLMLocal, which in turn inherits from LLM.

Public Member Functions

 LLMAgent (LLMLocal *llm, const std::string &system_prompt="")
 Constructor for LLM agent.
 
std::string tokenize_json (const json &data) override
 Tokenize input.
 
std::string detokenize_json (const json &data) override
 Convert tokens back to text.
 
std::string embeddings_json (const json &data) override
 Generate embeddings with HTTP response support.
 
std::string completion_json (const json &data, CharArrayFn callback=nullptr, bool callbackWithJSON=true) override
 Generate completion (delegate to wrapped LLM)
 
std::string apply_template_json (const json &data) override
 Apply a chat template to message data.
 
std::string slot_json (const json &data) override
 Manage slots with HTTP response support.
 
void cancel (int id_slot) override
 Cancel request (delegate to wrapped LLM)
 
int get_next_available_slot () override
 Get available slot (delegate to wrapped LLM)
 
int get_slot_context_size () override
 Get slot context size (delegate to wrapped LLM)
 
virtual json build_completion_json (const std::string &prompt)
 Build completion JSON with agent's slot.
 
virtual std::string completion (const std::string &prompt, CharArrayFn callback=nullptr, bool return_response_json=false)
 Generate completion with agent's slot.
 
virtual json build_slot_json (const std::string &action, const std::string &filepath)
 Build slot operation JSON with agent's slot.
 
virtual std::string save_slot (const std::string &filepath)
 Save agent's slot state.
 
virtual std::string load_slot (const std::string &filepath)
 Load agent's slot state.
 
virtual void cancel ()
 Cancel agent's current request.
 
int get_slot ()
 Get current processing slot ID.
 
void set_slot (int id_slot)
 Set processing slot ID.
 
void set_system_prompt (const std::string &system_prompt_)
 Set system prompt.
 
std::string get_system_prompt () const
 Get current system prompt.
 
void set_history (const json &history_)
 Set conversation history.
 
json get_history () const
 Get conversation history.
 
void add_user_message (const std::string &content)
 Add a user message to conversation history.
 
void add_assistant_message (const std::string &content)
 Add an assistant message to conversation history.
 
void clear_history ()
 Clear all conversation history.
 
void remove_last_message ()
 Remove the last message from history.
 
void save_history (const std::string &filepath) const
 Save conversation history to file.
 
void load_history (const std::string &filepath)
 Load conversation history from file.
 
size_t get_history_size () const
 Get number of messages in history.
 
void set_overflow_strategy (ContextOverflowStrategy strategy, float target_ratio=0.5f, const std::string &summarize_prompt=SUMMARY_PROMPT)
 Configure how the agent handles context overflow.
 
ContextOverflowStrategy get_overflow_strategy () const
 Get the current overflow strategy.
 
std::string get_summarize_prompt () const
 Get the current summarize prompt.
 
std::string get_summary () const
 Get the current rolling summary (empty if none has been generated yet)
 
void set_summary (const std::string &summary_)
 Set the rolling summary directly (e.g. after loading from file)
 
std::string chat (const std::string &user_prompt, bool add_to_history=true, CharArrayFn callback=nullptr, bool return_response_json=false, bool debug_prompt=false)
 Conduct a chat interaction.
 
- Public Member Functions inherited from LLMLocal
virtual std::string save_slot (int id_slot, const std::string &filepath)
 Save slot state to file.
 
virtual std::string load_slot (int id_slot, const std::string &filepath)
 Load slot state from file.
 
- Public Member Functions inherited from LLM
virtual ~LLM ()=default
 Virtual destructor.
 
virtual std::vector< int > tokenize (const std::string &query)
 Tokenize text.
 
virtual std::string detokenize (const std::vector< int32_t > &tokens)
 Convert tokens to text.
 
virtual std::vector< float > embeddings (const std::string &query)
 Generate embeddings.
 
virtual void set_completion_params (json completion_params_)
 Set completion parameters.
 
virtual std::string get_completion_params ()
 Get current completion parameters.
 
virtual std::string completion (const std::string &prompt, CharArrayFn callback=nullptr, int id_slot=-1, bool return_response_json=false)
 Generate completion.
 
virtual void set_grammar (std::string grammar_)
 Set grammar for constrained generation.
 
virtual std::string get_grammar ()
 Get current grammar specification.
 
virtual std::string apply_template (const json &messages)
 Apply template to messages.
 

Public Attributes

const std::string USER_ROLE = "user"
 
const std::string ASSISTANT_ROLE = "assistant"
 
- Public Attributes inherited from LLM
int32_t n_keep = 0
 Number of tokens to keep from the beginning of the context.
 
std::string grammar = ""
 Grammar specification in GBNF format or JSON schema.
 
json completion_params
 JSON object containing completion parameters.
 

Protected Member Functions

void set_n_keep ()
 
json build_system_history () const
 Builds the history to send to the model including only the prompts.
 
json build_working_history (const std::string &user_prompt, bool include_history=true) const
 Build the full message list to send to the model.
 
bool handle_overflow (const std::string &user_prompt)
 Handle context overflow using the configured strategy before a chat call.
 
void truncate_history (const std::string &user_prompt)
 Remove oldest message pairs from the front until history fits within target_context_ratio.
 
void summarize_history (const std::string &user_prompt)
 Summarize the entire history (chunking if needed), embed the summary in the system message, then truncate if still needed.
 
virtual void add_message (const std::string &role, const std::string &content)
 Add a message to conversation history.
 
- Protected Member Functions inherited from LLMLocal
virtual std::string slot (int id_slot, const std::string &action, const std::string &filepath)
 Perform slot operation.
 
virtual json build_slot_json (int id_slot, const std::string &action, const std::string &filepath)
 Build JSON for slot operations.
 
virtual std::string parse_slot_json (const json &result)
 Parse slot operation result.
 
- Protected Member Functions inherited from LLM
virtual json build_apply_template_json (const json &messages)
 Build JSON for template application.
 
virtual std::string parse_apply_template_json (const json &result)
 Parse template application result.
 
virtual json build_tokenize_json (const std::string &query)
 Build JSON for tokenization.
 
virtual std::vector< int > parse_tokenize_json (const json &result)
 Parse tokenization result.
 
virtual json build_detokenize_json (const std::vector< int32_t > &tokens)
 Build JSON for detokenization.
 
virtual std::string parse_detokenize_json (const json &result)
 Parse detokenization result.
 
virtual json build_embeddings_json (const std::string &query)
 Build JSON for embeddings generation.
 
virtual std::vector< float > parse_embeddings_json (const json &result)
 Parse embeddings result.
 
virtual json build_completion_json (const std::string &prompt, int id_slot=-1)
 Build JSON for completion generation.
 
virtual std::string parse_completion_json (const json &result)
 Parse completion result.
 

Additional Inherited Members

- Static Public Member Functions inherited from LLM
static bool has_gpu_layers (const std::string &command)
 Check if command line arguments specify GPU layers.
 
static std::string LLM_args_to_command (const std::string &model_path, int num_slots=1, int num_threads=-1, int num_GPU_layers=0, bool flash_attention=false, int context_size=4096, int batch_size=2048, bool embedding_only=false, const std::vector< std::string > &lora_paths={})
 Convert LLM parameters to command line arguments.
 

Detailed Description

High-level conversational agent for LLM interactions.

Provides a conversation-aware interface that manages chat history and applies chat template formatting

Definition at line 82 of file LLM_agent.h.
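A minimal usage sketch (backend construction and model setup are illustrative; see the LLMLocal documentation for actual initialization):

```cpp
#include "LLM_agent.h"
#include <iostream>
#include <string>

int main() {
    // Illustrative: obtaining an LLMLocal backend depends on the LlamaLib
    // setup (model path, slots, context size); see LLM::LLM_args_to_command.
    LLMLocal* llm = /* construct or obtain an LLMLocal instance */ nullptr;

    // Wrap the backend in a conversation-aware agent with a system prompt.
    LLMAgent agent(llm, "You are a helpful assistant.");

    // chat() applies the template, generates a reply, and updates history.
    std::string reply = agent.chat("What is the capital of France?");
    std::cout << reply << std::endl;
}
```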

Constructor & Destructor Documentation

◆ LLMAgent()

LLMAgent::LLMAgent ( LLMLocal * llm,
const std::string & system_prompt = "" )

Constructor for LLM agent.

Parameters
llm: Pointer to LLMLocal instance to wrap
system_prompt: Initial system prompt for conversation context

Creates an agent that manages conversations with the specified LLM backend

Definition at line 5 of file LLM_agent.cpp.

Member Function Documentation

◆ add_assistant_message()

void LLMAgent::add_assistant_message ( const std::string & content)
inline

Add an assistant message to conversation history.

Parameters
content: Assistant message content

Convenience method for adding assistant messages

Definition at line 228 of file LLM_agent.h.

◆ add_message()

void LLMAgent::add_message ( const std::string & role,
const std::string & content )
protectedvirtual

Add a message to conversation history.

Parameters
role: Message role identifier
content: Message content text

Appends a new message to the conversation history

Definition at line 95 of file LLM_agent.cpp.

◆ add_user_message()

void LLMAgent::add_user_message ( const std::string & content)
inline

Add a user message to conversation history.

Parameters
content: User message content

Convenience method for adding user messages

Definition at line 223 of file LLM_agent.h.

◆ apply_template_json()

std::string LLMAgent::apply_template_json ( const json & data)
inlineoverridevirtual

Apply a chat template to message data.

Parameters
data: JSON object containing messages to format
Returns
Formatted string with template applied

Overrides the base method to apply the chat template to conversation data

Implements LLM.

Definition at line 123 of file LLM_agent.h.

◆ build_completion_json()

virtual json LLMAgent::build_completion_json ( const std::string & prompt)
inlinevirtual

Build completion JSON with agent's slot.

Parameters
prompt: Input prompt text
Returns
JSON object for completion request

Override that automatically uses the agent's assigned slot

Definition at line 150 of file LLM_agent.h.

◆ build_slot_json()

virtual json LLMAgent::build_slot_json ( const std::string & action,
const std::string & filepath )
inlinevirtual

Build slot operation JSON with agent's slot.

Parameters
action: Slot operation action ("save" or "restore")
filepath: File path for slot operation
Returns
JSON object for slot operation

Override that automatically uses the agent's assigned slot

Definition at line 168 of file LLM_agent.h.

◆ build_system_history()

json LLMAgent::build_system_history ( ) const
protected

Builds the history to send to the model including only the prompts.

Returns
JSON array: [system+summary]

Definition at line 34 of file LLM_agent.cpp.

◆ build_working_history()

json LLMAgent::build_working_history ( const std::string & user_prompt,
bool include_history = true ) const
protected

Build the full message list to send to the model.

Parameters
user_prompt: The current user message to append
include_history: Whether to include the chat history
Returns
JSON array: [system+summary, ...history, user_prompt]

Definition at line 44 of file LLM_agent.cpp.

◆ cancel() [1/2]

virtual void LLMAgent::cancel ( )
inlinevirtual

Cancel agent's current request.

Cancels any running request on the agent's slot

Definition at line 184 of file LLM_agent.h.

◆ cancel() [2/2]

void LLMAgent::cancel ( int id_slot)
inlineoverridevirtual

Cancel request (delegate to wrapped LLM)

Parameters
id_slot: Slot ID of the request to cancel

Implements LLMLocal.

Definition at line 133 of file LLM_agent.h.

◆ chat()

std::string LLMAgent::chat ( const std::string & user_prompt,
bool add_to_history = true,
CharArrayFn callback = nullptr,
bool return_response_json = false,
bool debug_prompt = false )

Conduct a chat interaction.

Parameters
user_prompt: User's input message
add_to_history: Whether to add messages to conversation history
callback: Optional callback for streaming responses
return_response_json: Whether to return the full JSON response
debug_prompt: Whether to display the complete prompt (default: false)
Returns
Assistant's response text or JSON

Main chat method that processes user input, applies conversation context, generates a response, and optionally updates conversation history

Definition at line 63 of file LLM_agent.cpp.
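Assuming CharArrayFn is a plain function pointer taking a C-string chunk (the exact signature is defined in the library's headers), streaming output could look like:

```cpp
// Hypothetical callback; the exact CharArrayFn signature is defined by LlamaLib.
static void on_chunk(const char* chunk) {
    std::cout << chunk << std::flush;  // print tokens as they arrive
}

// agent is an existing LLMAgent
std::string full_reply = agent.chat(
    "Tell me a short story.",
    /*add_to_history=*/true,
    on_chunk,
    /*return_response_json=*/false);
```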

◆ clear_history()

void LLMAgent::clear_history ( )

Clear all conversation history.

Removes all messages from the conversation history

Definition at line 27 of file LLM_agent.cpp.

◆ completion()

virtual std::string LLMAgent::completion ( const std::string & prompt,
CharArrayFn callback = nullptr,
bool return_response_json = false )
inlinevirtual

Generate completion with agent's slot.

Parameters
prompt: Input prompt text
callback: Optional streaming callback
return_response_json: Whether to return a JSON response
Returns
Generated completion text or JSON

Override that automatically uses the agent's assigned slot

Definition at line 158 of file LLM_agent.h.
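Unlike chat(), this sends the prompt as-is, without conversation history or template formatting. A sketch:

```cpp
// Raw completion on the agent's assigned slot; no chat history is applied.
std::string text = agent.completion("Once upon a time");

// Request the full JSON response instead of plain text.
std::string response_json = agent.completion("Once upon a time", nullptr, true);
```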

◆ completion_json()

std::string LLMAgent::completion_json ( const json & data,
CharArrayFn callback = nullptr,
bool callbackWithJSON = true )
inlineoverridevirtual

Generate completion (delegate to wrapped LLM)

Parameters
data: JSON completion request
callback: Optional streaming callback
callbackWithJSON: Whether the callback uses JSON
Returns
Generated completion

Implements LLM.

Definition at line 117 of file LLM_agent.h.

◆ detokenize_json()

std::string LLMAgent::detokenize_json ( const json & data)
inlineoverridevirtual

Convert tokens back to text.

Parameters
data: JSON object containing token IDs
Returns
JSON string containing detokenized text

Overrides the base method to convert token sequences back to text

Implements LLM.

Definition at line 104 of file LLM_agent.h.

◆ embeddings_json()

std::string LLMAgent::embeddings_json ( const json & data)
inlineoverridevirtual

Generate embeddings with HTTP response support.

Parameters
data: JSON object containing embedding request
Returns
JSON string with embedding data

Used internally for server-based embedding generation

Implements LLM.

Definition at line 110 of file LLM_agent.h.

◆ get_history()

json LLMAgent::get_history ( ) const
inline

Get conversation history.

Returns
JSON array containing conversation history

Returns the complete conversation history as JSON

Definition at line 216 of file LLM_agent.h.

◆ get_history_size()

size_t LLMAgent::get_history_size ( ) const
inline

Get number of messages in history.

Returns
Number of messages in conversation history

Returns the count of messages currently stored in history

Definition at line 251 of file LLM_agent.h.

◆ get_next_available_slot()

int LLMAgent::get_next_available_slot ( )
inlineoverridevirtual

Get available slot (delegate to wrapped LLM)

Returns
Available slot ID

Implements LLMLocal.

Definition at line 137 of file LLM_agent.h.

◆ get_overflow_strategy()

ContextOverflowStrategy LLMAgent::get_overflow_strategy ( ) const
inline

Get the current overflow strategy.

Definition at line 271 of file LLM_agent.h.

◆ get_slot()

int LLMAgent::get_slot ( )
inline

Get current processing slot ID.

Returns
Current slot ID

Returns the slot ID used for this agent's operations

Definition at line 190 of file LLM_agent.h.

◆ get_slot_context_size()

int LLMAgent::get_slot_context_size ( )
inlineoverridevirtual

Get slot context size (delegate to wrapped LLM)

Returns
Slot context size

Implements LLMLocal.

Definition at line 141 of file LLM_agent.h.

◆ get_summarize_prompt()

std::string LLMAgent::get_summarize_prompt ( ) const
inline

Get the current summarize prompt.

Definition at line 274 of file LLM_agent.h.

◆ get_summary()

std::string LLMAgent::get_summary ( ) const
inline

Get the current rolling summary (empty if none has been generated yet)

Definition at line 277 of file LLM_agent.h.

◆ get_system_prompt()

std::string LLMAgent::get_system_prompt ( ) const
inline

Get current system prompt.

Returns
Current system prompt string

Definition at line 206 of file LLM_agent.h.

◆ handle_overflow()

bool LLMAgent::handle_overflow ( const std::string & user_prompt)
protected

Handle context overflow using the configured strategy before a chat call.

Parameters
user_prompt: The user prompt string about to be sent
Returns
true if history was modified

Definition at line 172 of file LLM_agent.cpp.

◆ load_history()

void LLMAgent::load_history ( const std::string & filepath)

Load conversation history from file.

Parameters
filepath: Path to the history file to load

Loads conversation history from a JSON file, replacing current history

Definition at line 133 of file LLM_agent.cpp.

◆ load_slot()

virtual std::string LLMAgent::load_slot ( const std::string & filepath)
inlinevirtual

Load agent's slot state.

Parameters
filepath: Path to load slot state from
Returns
Operation result string

Restores the agent's processing state from file

Definition at line 180 of file LLM_agent.h.

◆ remove_last_message()

void LLMAgent::remove_last_message ( )

Remove the last message from history.

Removes the most recently added message

Definition at line 101 of file LLM_agent.cpp.

◆ save_history()

void LLMAgent::save_history ( const std::string & filepath) const

Save conversation history to file.

Parameters
filepath: Path to save the history file

Saves the current conversation history as JSON to the specified file

Definition at line 109 of file LLM_agent.cpp.
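Together with load_history(), this lets a conversation persist across sessions (file name illustrative):

```cpp
// End of session: persist the conversation as JSON.
agent.save_history("conversation.json");

// Later, with a fresh agent: restore and continue where we left off.
agent.load_history("conversation.json");
agent.chat("Where were we?");
```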

◆ save_slot()

virtual std::string LLMAgent::save_slot ( const std::string & filepath)
inlinevirtual

Save agent's slot state.

Parameters
filepath: Path to save slot state
Returns
Operation result string

Saves the agent's current processing state to file

Definition at line 174 of file LLM_agent.h.
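Saving the slot state preserves the agent's processed context, so a restored conversation need not be re-evaluated from scratch; it is typically paired with save_history(). File names are illustrative:

```cpp
// Persist both the message history and the slot's processed state.
agent.save_history("conversation.json");
agent.save_slot("conversation.slot");

// In a later session:
agent.load_history("conversation.json");
agent.load_slot("conversation.slot");
```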

◆ set_history()

void LLMAgent::set_history ( const json & history_)
inline

Set conversation history.

Parameters
history_: JSON array of chat messages

Replaces current conversation history with provided messages

Definition at line 211 of file LLM_agent.h.

◆ set_n_keep()

void LLMAgent::set_n_keep ( )
protected

Definition at line 55 of file LLM_agent.cpp.

◆ set_overflow_strategy()

void LLMAgent::set_overflow_strategy ( ContextOverflowStrategy strategy,
float target_ratio = 0.5f,
const std::string & summarize_prompt = SUMMARY_PROMPT )
inline

Configure how the agent handles context overflow.

Parameters
strategy: The overflow strategy to use
target_ratio: Fraction of context to target after truncation (0.0–1.0, default 0.5)
summarize_prompt: Prompt used to ask the LLM to summarize the history

Definition at line 259 of file LLM_agent.h.
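The enumerator name below is illustrative; see the ContextOverflowStrategy documentation for the actual values:

```cpp
// When the context would overflow, summarize older history down to
// roughly half of the slot's context size (enumerator name illustrative).
agent.set_overflow_strategy(ContextOverflowStrategy::Summarize, 0.5f);
```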

◆ set_slot()

void LLMAgent::set_slot ( int id_slot)

Set processing slot ID.

Parameters
id_slot: Slot ID to use for operations

Assigns a specific slot for this agent's processing (not available for remote LLMClient)

Definition at line 11 of file LLM_agent.cpp.

◆ set_summary()

void LLMAgent::set_summary ( const std::string & summary_)
inline

Set the rolling summary directly (e.g. after loading from file)

Definition at line 280 of file LLM_agent.h.

◆ set_system_prompt()

void LLMAgent::set_system_prompt ( const std::string & system_prompt_)
inline

Set system prompt.

Parameters
system_prompt_: New system prompt text

Sets the system prompt and clears conversation history

Definition at line 202 of file LLM_agent.h.
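Note that changing the system prompt clears the conversation history, so save the history first if it should be kept:

```cpp
json kept = agent.get_history();          // optionally keep the old messages
agent.set_system_prompt("You are a terse assistant.");  // clears history
agent.set_history(kept);                  // restore them under the new prompt
```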

◆ slot_json()

std::string LLMAgent::slot_json ( const json & data)
inlineoverridevirtual

Manage slots with HTTP response support.

Parameters
data: JSON object with slot operation
Returns
JSON response string

Used internally for server-based slot management

Implements LLMLocal.

Definition at line 129 of file LLM_agent.h.

◆ summarize_history()

void LLMAgent::summarize_history ( const std::string & user_prompt)
protected

Summarize the entire history (chunking if needed), embed the summary in the system message, then truncate if still needed.

Definition at line 214 of file LLM_agent.cpp.

◆ tokenize_json()

std::string LLMAgent::tokenize_json ( const json & data)
inlineoverridevirtual

Tokenize input.

Parameters
data: JSON object containing text to tokenize
Returns
JSON string with token data

Implements LLM.

Definition at line 98 of file LLM_agent.h.

◆ truncate_history()

void LLMAgent::truncate_history ( const std::string & user_prompt)
protected

Remove oldest message pairs from the front until history fits within target_context_ratio.

Definition at line 193 of file LLM_agent.cpp.

Member Data Documentation

◆ ASSISTANT_ROLE

const std::string LLMAgent::ASSISTANT_ROLE = "assistant"

Definition at line 86 of file LLM_agent.h.

◆ USER_ROLE

const std::string LLMAgent::USER_ROLE = "user"

Definition at line 85 of file LLM_agent.h.


The documentation for this class was generated from the following files: LLM_agent.h and LLM_agent.cpp.