LlamaLib  v2.0.2
Cross-platform library for local LLMs
LLMClient Class Reference

Client for accessing LLM functionality locally or remotely.

#include <LLM_client.h>

Inheritance diagram for LLMClient:

Public Member Functions

 LLMClient (LLMProvider *llm)
 Constructor for local LLM access.
 
 LLMClient (const std::string &url, const int port, const std::string &API_key="", const int max_retries=5)
 Constructor for remote LLM access.
 
 ~LLMClient ()
 Destructor.
 
bool is_server_alive ()
 Check whether the LLM server is reachable.
 
void set_SSL (const char *SSL_cert)
 Configure SSL certificate for remote connections.
 
bool is_remote () const
 Check if this is a remote client.
 
std::string tokenize_json (const json &data) override
 Tokenize input (override)
 
std::string detokenize_json (const json &data) override
 Convert tokens back to text.
 
std::string embeddings_json (const json &data) override
 Generate embeddings with HTTP response support.
 
std::string completion_json (const json &data, CharArrayFn callback=nullptr, bool callbackWithJSON=true) override
 Generate text completion (override)
 
std::string apply_template_json (const json &data) override
 Apply a chat template to message data.
 
std::string slot_json (const json &data) override
 Manage slots with HTTP response support.
 
void cancel (int id_slot) override
 Cancel running request (override)
 
int get_next_available_slot () override
 Get available processing slot (override)
 
- Public Member Functions inherited from LLMLocal
virtual std::string save_slot (int id_slot, const std::string &filepath)
 Save slot state to file.
 
virtual std::string load_slot (int id_slot, const std::string &filepath)
 Load slot state from file.
 
- Public Member Functions inherited from LLM
virtual ~LLM ()=default
 Virtual destructor.
 
virtual std::vector< int > tokenize (const std::string &query)
 Tokenize text.
 
virtual std::string detokenize (const std::vector< int32_t > &tokens)
 Convert tokens to text.
 
virtual std::vector< float > embeddings (const std::string &query)
 Generate embeddings.
 
virtual void set_completion_params (json completion_params_)
 Set completion parameters.
 
virtual std::string get_completion_params ()
 Get current completion parameters.
 
virtual std::string completion (const std::string &prompt, CharArrayFn callback=nullptr, int id_slot=-1, bool return_response_json=false)
 Generate completion.
 
virtual void set_grammar (std::string grammar_)
 Set grammar for constrained generation.
 
virtual std::string get_grammar ()
 Get current grammar specification.
 
virtual std::string apply_template (const json &messages)
 Apply template to messages.
 

Additional Inherited Members

- Static Public Member Functions inherited from LLM
static bool has_gpu_layers (const std::string &command)
 Check if command line arguments specify GPU layers.
 
static std::string LLM_args_to_command (const std::string &model_path, int num_slots=1, int num_threads=-1, int num_GPU_layers=0, bool flash_attention=false, int context_size=4096, int batch_size=2048, bool embedding_only=false, const std::vector< std::string > &lora_paths={})
 Convert LLM parameters to command line arguments.
 
- Public Attributes inherited from LLM
int32_t n_keep = 0
 Number of tokens to keep from the beginning of the context.
 
std::string grammar = ""
 Grammar specification in GBNF format or JSON schema.
 
json completion_params
 JSON object containing completion parameters.
 
- Protected Member Functions inherited from LLMLocal
virtual std::string slot (int id_slot, const std::string &action, const std::string &filepath)
 Perform slot operation.
 
virtual json build_slot_json (int id_slot, const std::string &action, const std::string &filepath)
 Build JSON for slot operations.
 
virtual std::string parse_slot_json (const json &result)
 Parse slot operation result.
 
- Protected Member Functions inherited from LLM
virtual json build_apply_template_json (const json &messages)
 Build JSON for template application.
 
virtual std::string parse_apply_template_json (const json &result)
 Parse template application result.
 
virtual json build_tokenize_json (const std::string &query)
 Build JSON for tokenization.
 
virtual std::vector< int > parse_tokenize_json (const json &result)
 Parse tokenization result.
 
virtual json build_detokenize_json (const std::vector< int32_t > &tokens)
 Build JSON for detokenization.
 
virtual std::string parse_detokenize_json (const json &result)
 Parse detokenization result.
 
virtual json build_embeddings_json (const std::string &query)
 Build JSON for embeddings generation.
 
virtual std::vector< float > parse_embeddings_json (const json &result)
 Parse embeddings result.
 
virtual json build_completion_json (const std::string &prompt, int id_slot=-1)
 Build JSON for completion generation.
 
virtual std::string parse_completion_json (const json &result)
 Parse completion result.
 

Detailed Description

Client for accessing LLM functionality locally or remotely.

Provides a unified interface that can connect to either local LLMProvider instances or remote LLM services via HTTP. Supports all standard LLM operations including completion, tokenization, embeddings, and slot management.

Definition at line 31 of file LLM_client.h.
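A minimal usage sketch of the remote path (hostname, port, and prompt are placeholders; assumes a reachable LlamaLib server):

```cpp
#include <LLM_client.h>  // LlamaLib client header
#include <iostream>

int main() {
    // Remote access: connect to an LLM server over HTTP.
    LLMClient client("localhost", 8080);

    if (!client.is_server_alive()) {
        std::cerr << "Server not reachable" << std::endl;
        return 1;
    }

    // Inherited from LLM: plain-string completion without manual JSON.
    std::string reply = client.completion("Hello, how are you?");
    std::cout << reply << std::endl;
    return 0;
}
```

For local access, construct with a pointer to an LLMProvider instance instead: `LLMClient client(&provider);`.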

Constructor & Destructor Documentation

◆ LLMClient() [1/2]

LLMClient::LLMClient ( LLMProvider * llm)

Constructor for local LLM access.

Parameters
llm: Pointer to the local LLMProvider instance

Creates a client that directly accesses a local LLM provider.

Definition at line 226 of file LLM_client.cpp.

◆ LLMClient() [2/2]

LLMClient::LLMClient ( const std::string & url,
const int port,
const std::string & API_key = "",
const int max_retries = 5 )

Constructor for remote LLM access.

Parameters
url: Server URL or hostname
port: Server port number
API_key: Optional API key for authentication
max_retries: Maximum number of connection retry attempts

Creates a client that connects to a remote LLM server via HTTP.

Definition at line 229 of file LLM_client.cpp.

◆ ~LLMClient()

LLMClient::~LLMClient ( )

Destructor.

Definition at line 258 of file LLM_client.cpp.

Member Function Documentation

◆ apply_template_json()

std::string LLMClient::apply_template_json ( const json & data)
override virtual

Apply a chat template to message data.

Parameters
data: JSON object containing messages to format
Returns
Formatted string with the template applied

Overrides the base-class method for applying chat templates to conversation data.

Implements LLM.

Definition at line 345 of file LLM_client.cpp.

◆ cancel()

void LLMClient::cancel ( int id_slot)
override virtual

Cancel running request (override)

Parameters
id_slot: ID of the slot whose running request should be cancelled

Implements LLMLocal.

Definition at line 370 of file LLM_client.cpp.

◆ completion_json()

std::string LLMClient::completion_json ( const json & data,
CharArrayFn callback = nullptr,
bool callbackWithJSON = true )
override virtual

Generate text completion (override)

Parameters
data: JSON object with prompt and parameters
callback: Optional callback for streaming responses
callbackWithJSON: Whether the callback receives JSON format
Returns
Generated completion text or JSON

Implements LLM.

Definition at line 320 of file LLM_client.cpp.
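A streaming sketch, assuming CharArrayFn is a plain function receiving each generated chunk as a C string, and that the request fields follow llama.cpp server naming (both are assumptions; adjust for your build):

```cpp
#include <LLM_client.h>
#include <iostream>

// Assumed CharArrayFn shape: a function receiving each streamed chunk.
static void on_chunk(const char *chunk) {
    std::cout << chunk << std::flush;  // print text as it arrives
}

int main() {
    LLMClient client("localhost", 8080);

    // Field names ("prompt", "n_predict") follow llama.cpp server conventions.
    json request = {{"prompt", "Write a haiku about the sea."},
                    {"n_predict", 64}};

    // callbackWithJSON=false so the callback receives raw text chunks.
    client.completion_json(request, on_chunk, false);
    return 0;
}
```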

◆ detokenize_json()

std::string LLMClient::detokenize_json ( const json & data)
override virtual

Convert tokens back to text.

Parameters
data: JSON object containing token IDs
Returns
JSON string containing the detokenized text

Overrides the base-class method for converting token sequences back to text.

Implements LLM.

Definition at line 296 of file LLM_client.cpp.

◆ embeddings_json()

std::string LLMClient::embeddings_json ( const json & data)
override virtual

Generate embeddings with HTTP response support.

Parameters
data: JSON object containing the embedding request
Returns
JSON string with embedding data

Used internally for server-based embedding generation.

Implements LLM.

Definition at line 308 of file LLM_client.cpp.
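This JSON variant is the server-facing path; the inherited embeddings() returns a std::vector&lt;float&gt; directly. A self-contained sketch of how such vectors are typically compared (cosine similarity; this helper is illustrative, not part of LlamaLib):

```cpp
#include <cmath>
#include <vector>

// Cosine similarity between two embedding vectors of equal length,
// e.g. as returned by LLM::embeddings() for two queries.
double cosine_similarity(const std::vector<float> &a,
                         const std::vector<float> &b) {
    double dot = 0.0, na = 0.0, nb = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];  // accumulate dot product
        na  += a[i] * a[i];  // squared norm of a
        nb  += b[i] * b[i];  // squared norm of b
    }
    return dot / (std::sqrt(na) * std::sqrt(nb));
}
```

Identical vectors score 1.0; orthogonal vectors score 0.0.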

◆ get_next_available_slot()

int LLMClient::get_next_available_slot ( )
override virtual

Get available processing slot (override)

Returns
Available slot ID or -1 if none available

Implements LLMLocal.

Definition at line 338 of file LLM_client.cpp.

◆ is_remote()

bool LLMClient::is_remote ( ) const
inline

Check if this is a remote client.

Returns
true if configured for remote access, false for local access

Helper method to determine the client's connection type.

Definition at line 60 of file LLM_client.h.

◆ is_server_alive()

bool LLMClient::is_server_alive ( )

Check whether the LLM server is reachable.

Definition at line 71 of file LLM_client.cpp.

◆ set_SSL()

void LLMClient::set_SSL ( const char * SSL_cert)

Configure SSL certificate for remote connections.

Parameters
SSL_cert: Path to the SSL certificate file

Only applicable for remote clients. Sets up SSL verification.

Definition at line 272 of file LLM_client.cpp.
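A sketch of remote setup with certificate verification (host, API key, and certificate path are placeholders):

```cpp
#include <LLM_client.h>

int main() {
    // HTTPS endpoint with an API key; all values are placeholders.
    LLMClient client("llm.example.com", 443, "my-api-key");
    client.set_SSL("/etc/ssl/certs/server.crt");  // enable SSL verification

    if (client.is_remote() && client.is_server_alive()) {
        // Safe to issue requests.
    }
    return 0;
}
```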

◆ slot_json()

std::string LLMClient::slot_json ( const json & data)
override virtual

Manage slots with HTTP response support.

Parameters
data: JSON object describing the slot operation
Returns
JSON response string

Used internally for server-based slot management.

Implements LLMLocal.

Definition at line 357 of file LLM_client.cpp.

◆ tokenize_json()

std::string LLMClient::tokenize_json ( const json & data)
override virtual

Tokenize input (override)

Parameters
data: JSON object containing the text to tokenize
Returns
JSON string with token data

Implements LLM.

Definition at line 284 of file LLM_client.cpp.
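The inherited tokenize()/detokenize() helpers wrap these JSON overrides; a round-trip sketch (exact round-trip output depends on the model's tokenizer):

```cpp
#include <LLM_client.h>
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
    LLMClient client("localhost", 8080);

    // Inherited from LLM: string-level wrappers around the *_json overrides.
    std::vector<int> tokens = client.tokenize("The quick brown fox");

    // detokenize() expects int32_t tokens; convert explicitly.
    std::vector<int32_t> ids(tokens.begin(), tokens.end());
    std::cout << client.detokenize(ids) << std::endl;  // typically the original text
    return 0;
}
```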


The documentation for this class was generated from the following files:

LLM_client.h
LLM_client.cpp