LlamaLib v2.0.2
Cross-platform library for local LLMs
Client for accessing LLM functionality locally or remotely.
#include <LLM_client.h>
Public Member Functions

LLMClient (LLMProvider *llm)
    Constructor for local LLM access.
LLMClient (const std::string &url, const int port, const std::string &API_key="", const int max_retries=5)
    Constructor for remote LLM access.
~LLMClient ()
    Destructor.
bool is_server_alive ()
void set_SSL (const char *SSL_cert)
    Configure SSL certificate for remote connections.
bool is_remote () const
    Check if this is a remote client.
std::string tokenize_json (const json &data) override
    Tokenize input (override).
std::string detokenize_json (const json &data) override
    Convert tokens back to text.
std::string embeddings_json (const json &data) override
    Generate embeddings with HTTP response support.
std::string completion_json (const json &data, CharArrayFn callback=nullptr, bool callbackWithJSON=true) override
    Generate text completion (override).
std::string apply_template_json (const json &data) override
    Apply a chat template to message data.
std::string slot_json (const json &data) override
    Manage slots with HTTP response support.
void cancel (int id_slot) override
    Cancel a running request (override).
int get_next_available_slot () override
    Get an available processing slot (override).
Public Member Functions inherited from LLMLocal

virtual std::string save_slot (int id_slot, const std::string &filepath)
    Save slot state to file.
virtual std::string load_slot (int id_slot, const std::string &filepath)
    Load slot state from file.

Public Member Functions inherited from LLM

virtual ~LLM ()=default
    Virtual destructor.
virtual std::vector<int> tokenize (const std::string &query)
    Tokenize text.
virtual std::string detokenize (const std::vector<int32_t> &tokens)
    Convert tokens to text.
virtual std::vector<float> embeddings (const std::string &query)
    Generate embeddings.
virtual void set_completion_params (json completion_params_)
    Set completion parameters.
virtual std::string get_completion_params ()
    Get current completion parameters.
virtual std::string completion (const std::string &prompt, CharArrayFn callback=nullptr, int id_slot=-1, bool return_response_json=false)
    Generate a completion.
virtual void set_grammar (std::string grammar_)
    Set grammar for constrained generation.
virtual std::string get_grammar ()
    Get current grammar specification.
virtual std::string apply_template (const json &messages)
    Apply a template to messages.
Additional Inherited Members

Static Public Member Functions inherited from LLM

static bool has_gpu_layers (const std::string &command)
    Check if command-line arguments specify GPU layers.
static std::string LLM_args_to_command (const std::string &model_path, int num_slots=1, int num_threads=-1, int num_GPU_layers=0, bool flash_attention=false, int context_size=4096, int batch_size=2048, bool embedding_only=false, const std::vector<std::string> &lora_paths={})
    Convert LLM parameters to command-line arguments.

Public Attributes inherited from LLM

int32_t n_keep = 0
    Number of tokens to keep from the beginning of the context.
std::string grammar = ""
    Grammar specification in GBNF format or JSON schema.
json completion_params
    JSON object containing completion parameters.

Protected Member Functions inherited from LLMLocal

virtual std::string slot (int id_slot, const std::string &action, const std::string &filepath)
    Perform a slot operation.
virtual json build_slot_json (int id_slot, const std::string &action, const std::string &filepath)
    Build JSON for slot operations.
virtual std::string parse_slot_json (const json &result)
    Parse a slot operation result.

Protected Member Functions inherited from LLM

virtual json build_apply_template_json (const json &messages)
    Build JSON for template application.
virtual std::string parse_apply_template_json (const json &result)
    Parse the template application result.
virtual json build_tokenize_json (const std::string &query)
    Build JSON for tokenization.
virtual std::vector<int> parse_tokenize_json (const json &result)
    Parse the tokenization result.
virtual json build_detokenize_json (const std::vector<int32_t> &tokens)
    Build JSON for detokenization.
virtual std::string parse_detokenize_json (const json &result)
    Parse the detokenization result.
virtual json build_embeddings_json (const std::string &query)
    Build JSON for embeddings generation.
virtual std::vector<float> parse_embeddings_json (const json &result)
    Parse the embeddings result.
virtual json build_completion_json (const std::string &prompt, int id_slot=-1)
    Build JSON for completion generation.
virtual std::string parse_completion_json (const json &result)
    Parse the completion result.
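As an illustrative sketch, the static helpers above might be combined as follows; the model path is a placeholder, and only a subset of the defaulted parameters is overridden:

```cpp
#include <LLM_client.h>

int main() {
    // Build a server argument string from high-level parameters.
    std::string command = LLM::LLM_args_to_command(
        "models/model.gguf",   // model_path (placeholder)
        /*num_slots=*/2,
        /*num_threads=*/-1,    // -1: let the library decide
        /*num_GPU_layers=*/99,
        /*flash_attention=*/true);

    // has_gpu_layers() inspects the generated arguments.
    bool uses_gpu = LLM::has_gpu_layers(command);
    (void)uses_gpu;
}
```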
Client for accessing LLM functionality locally or remotely.
Provides a unified interface that can connect to either local LLMProvider instances or remote LLM services via HTTP. Supports all standard LLM operations including completion, tokenization, embeddings, and slot management.
Definition at line 31 of file LLM_client.h.
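A minimal quick-start sketch, assuming an already-constructed LLMProvider for the local case and a server listening on a placeholder host/port for the remote case:

```cpp
#include <LLM_client.h>
#include <string>

void quick_start(LLMProvider *provider) {
    // Local: wrap an in-process provider.
    LLMClient local(provider);

    // Remote: connect to an LLM server over HTTP.
    LLMClient remote("localhost", 8080);

    if (remote.is_server_alive()) {
        // completion() is inherited from LLM; the default id_slot of -1
        // lets the client pick a slot.
        std::string reply = remote.completion("Hello!");
    }
}
```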
LLMClient::LLMClient (LLMProvider *llm)

Constructor for local LLM access.

Parameters
    llm    Pointer to local LLMProvider instance

Creates a client that directly accesses a local LLM provider.
Definition at line 226 of file LLM_client.cpp.
LLMClient::LLMClient (const std::string &url, const int port, const std::string &API_key = "", const int max_retries = 5)

Constructor for remote LLM access.

Parameters
    url            Server URL or hostname
    port           Server port number
    API_key        Optional API key
    max_retries    Maximum number of connection retry attempts

Creates a client that connects to a remote LLM server via HTTP.
Definition at line 229 of file LLM_client.cpp.
LLMClient::~LLMClient ()
Destructor.
Definition at line 258 of file LLM_client.cpp.
std::string LLMClient::apply_template_json (const json &data)  [override], [virtual]

Apply a chat template to message data.

Parameters
    data    JSON object containing messages to format

Applies a chat template to format conversation data into a prompt string.
Implements LLM.
Definition at line 345 of file LLM_client.cpp.
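For illustration, a plausible payload for apply_template_json; the OpenAI-style role/content field names are an assumption, not confirmed by this page:

```cpp
// Hypothetical payload shape; consult the chat-template documentation.
LLMClient client("localhost", 8080);  // placeholder endpoint
json data = {
    {"messages", {
        {{"role", "system"}, {"content", "You are a helpful assistant."}},
        {{"role", "user"},   {"content", "Hello!"}}
    }}
};
std::string prompt = client.apply_template_json(data);
```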
void LLMClient::cancel (int id_slot)  [override], [virtual]

Cancel a running request (override).

Parameters
    id_slot    ID of the slot whose request should be cancelled
Implements LLMLocal.
Definition at line 370 of file LLM_client.cpp.
std::string LLMClient::completion_json (const json &data, CharArrayFn callback = nullptr, bool callbackWithJSON = true)  [override], [virtual]

Generate text completion (override).

Parameters
    data              JSON object with prompt and parameters
    callback          Optional callback for streaming responses
    callbackWithJSON  Whether the callback receives JSON format
Implements LLM.
Definition at line 320 of file LLM_client.cpp.
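A hedged streaming sketch; the exact CharArrayFn signature and the "prompt" field name are assumptions inferred from the names shown above:

```cpp
#include <cstdio>

// Assumed callback shape: receives each streamed chunk as a C string.
void on_chunk(const char *chunk) {
    std::fputs(chunk, stdout);
}

void stream_completion(LLMClient &client) {
    json data = {{"prompt", "Write a haiku about llamas."}};  // assumed field name
    std::string full = client.completion_json(
        data, on_chunk,
        /*callbackWithJSON=*/false);  // deliver plain text, not JSON, to the callback
}
```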
std::string LLMClient::detokenize_json (const json &data)  [override], [virtual]

Convert tokens back to text.

Parameters
    data    JSON object containing token IDs

Converts a token sequence back to text.
Implements LLM.
Definition at line 296 of file LLM_client.cpp.
std::string LLMClient::embeddings_json (const json &data)  [override], [virtual]

Generate embeddings with HTTP response support.

Parameters
    data    JSON object containing the embedding request

Used internally for server-based embedding generation.
Implements LLM.
Definition at line 308 of file LLM_client.cpp.
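In most cases the higher-level embeddings() inherited from LLM is more convenient than the raw JSON variant; a sketch:

```cpp
#include <vector>
#include <string>

void embed_example(LLMClient &client) {
    std::vector<float> vec = client.embeddings("The quick brown fox");
    // embeddings_json() is the lower-level path, used when the response
    // has to be handled as raw JSON instead.
}
```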
int LLMClient::get_next_available_slot ()  [override], [virtual]

Get an available processing slot (override).
Implements LLMLocal.
Definition at line 338 of file LLM_client.cpp.
bool LLMClient::is_remote () const  [inline]
Check if this is a remote client.
Helper method to determine the client's connection type
Definition at line 60 of file LLM_client.h.
bool LLMClient::is_server_alive ()

Check whether the underlying LLM server is responding.
Definition at line 71 of file LLM_client.cpp.
void LLMClient::set_SSL (const char *SSL_cert)

Configure SSL certificate for remote connections.

Parameters
    SSL_cert    Path to SSL certificate file

Only applicable for remote clients. Sets up SSL verification.
Definition at line 272 of file LLM_client.cpp.
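A sketch of configuring SSL on a remote client; host, API key, and certificate path are placeholders:

```cpp
LLMClient client("example.com", 443, /*API_key=*/"my-api-key");
if (client.is_remote()) {
    client.set_SSL("certs/server.pem");  // placeholder certificate path
}
```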
std::string LLMClient::slot_json (const json &data)  [override], [virtual]

Manage slots with HTTP response support.

Parameters
    data    JSON object describing the slot operation

Used internally for server-based slot management.
Implements LLMLocal.
Definition at line 357 of file LLM_client.cpp.
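Slot state can be persisted through the inherited save_slot()/load_slot() helpers; a sketch with placeholder paths:

```cpp
void slot_example(LLMClient &client) {
    int slot = client.get_next_available_slot();
    client.completion("Remember: the password is swordfish.", nullptr, slot);
    client.save_slot(slot, "cache/slot0.bin");  // persist slot state to disk
    // ...later, restore the conversation state into the same slot:
    client.load_slot(slot, "cache/slot0.bin");
}
```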
std::string LLMClient::tokenize_json (const json &data)  [override], [virtual]

Tokenize input (override).

Parameters
    data    JSON object containing the text to tokenize
Implements LLM.
Definition at line 284 of file LLM_client.cpp.
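A tokenize/detokenize round trip via the higher-level methods inherited from LLM:

```cpp
#include <cstdint>
#include <vector>
#include <string>

void round_trip(LLMClient &client) {
    std::vector<int> ids = client.tokenize("Hello, llama!");
    std::string text = client.detokenize(
        std::vector<int32_t>(ids.begin(), ids.end()));
    // text should reproduce the input, modulo tokenizer normalization.
}
```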