LlamaLib v2.0.2
Cross-platform library for local LLMs
Client for accessing LLM functionality locally or remotely.
#include <LLM_client.h>
Public Member Functions

LLMClient (LLMProvider *llm)
    Constructor for local LLM access.
LLMClient (const std::string &url, const int port, const std::string &API_key="", const int max_retries=5)
    Constructor for remote LLM access.
~LLMClient ()
    Destructor.
bool is_server_alive ()
void set_SSL (const char *SSL_cert)
    Configure SSL certificate for remote connections.
bool is_remote () const
    Check if this is a remote client.
std::string tokenize_json (const json &data) override
    Tokenize input (override).
std::string detokenize_json (const json &data) override
    Convert tokens back to text.
std::string embeddings_json (const json &data) override
    Generate embeddings with HTTP response support.
std::string completion_json (const json &data, CharArrayFn callback=nullptr, bool callbackWithJSON=true) override
    Generate text completion (override).
std::string apply_template_json (const json &data) override
    Apply a chat template to message data.
std::string slot_json (const json &data) override
    Manage slots with HTTP response support.
void cancel (int id_slot) override
    Cancel a running request (override).
int get_next_available_slot () override
    Get an available processing slot (override).
Public Member Functions inherited from LLMLocal

virtual std::string save_slot (int id_slot, const std::string &filepath)
    Save slot state to file.
virtual std::string load_slot (int id_slot, const std::string &filepath)
    Load slot state from file.

Public Member Functions inherited from LLM

virtual ~LLM ()=default
    Virtual destructor.
virtual std::vector<int> tokenize (const std::string &query)
    Tokenize text.
virtual std::string detokenize (const std::vector<int32_t> &tokens)
    Convert tokens to text.
virtual std::vector<float> embeddings (const std::string &query)
    Generate embeddings.
virtual void set_completion_params (json completion_params_)
    Set completion parameters.
virtual std::string get_completion_params ()
    Get current completion parameters.
virtual std::string completion (const std::string &prompt, CharArrayFn callback=nullptr, int id_slot=-1, bool return_response_json=false)
    Generate a completion.
virtual void set_grammar (std::string grammar_)
    Set grammar for constrained generation.
virtual std::string get_grammar ()
    Get current grammar specification.
virtual std::string apply_template (const json &messages)
    Apply a template to messages.
Additional Inherited Members

Static Public Member Functions inherited from LLM

static bool has_gpu_layers (const std::string &command)
    Check if command-line arguments specify GPU layers.
static std::string LLM_args_to_command (const std::string &model_path, int num_slots=1, int num_threads=-1, int num_GPU_layers=0, bool flash_attention=false, int context_size=4096, int batch_size=2048, bool embedding_only=false, const std::vector<std::string> &lora_paths={})
    Convert LLM parameters to command-line arguments.

Public Attributes inherited from LLM

int32_t n_keep = 0
    Number of tokens to keep from the beginning of the context.
std::string grammar = ""
    Grammar specification in GBNF format or JSON schema.
json completion_params
    JSON object containing completion parameters.

Protected Member Functions inherited from LLMLocal

virtual std::string slot (int id_slot, const std::string &action, const std::string &filepath)
    Perform a slot operation.
virtual json build_slot_json (int id_slot, const std::string &action, const std::string &filepath)
    Build JSON for slot operations.
virtual std::string parse_slot_json (const json &result)
    Parse a slot operation result.

Protected Member Functions inherited from LLM

virtual json build_apply_template_json (const json &messages)
    Build JSON for template application.
virtual std::string parse_apply_template_json (const json &result)
    Parse the template application result.
virtual json build_tokenize_json (const std::string &query)
    Build JSON for tokenization.
virtual std::vector<int> parse_tokenize_json (const json &result)
    Parse the tokenization result.
virtual json build_detokenize_json (const std::vector<int32_t> &tokens)
    Build JSON for detokenization.
virtual std::string parse_detokenize_json (const json &result)
    Parse the detokenization result.
virtual json build_embeddings_json (const std::string &query)
    Build JSON for embeddings generation.
virtual std::vector<float> parse_embeddings_json (const json &result)
    Parse the embeddings result.
virtual json build_completion_json (const std::string &prompt, int id_slot=-1)
    Build JSON for completion generation.
virtual std::string parse_completion_json (const json &result)
    Parse the completion result.
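As an illustrative sketch, the static helpers above might be combined as follows; the model path is a placeholder, and only a subset of the defaulted parameters is overridden:

```cpp
#include <LLM_client.h>

int main() {
    // Build a server argument string from high-level parameters.
    std::string command = LLM::LLM_args_to_command(
        "models/model.gguf",   // model_path (placeholder)
        /*num_slots=*/2,
        /*num_threads=*/-1,    // -1: let the library decide
        /*num_GPU_layers=*/99,
        /*flash_attention=*/true);

    // has_gpu_layers() inspects the generated arguments.
    bool uses_gpu = LLM::has_gpu_layers(command);
    (void)uses_gpu;
}
```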
Client for accessing LLM functionality locally or remotely.
Provides a unified interface that can connect to either local LLMProvider instances or remote LLM services via HTTP. Supports all standard LLM operations including completion, tokenization, embeddings, and slot management.
Definition at line 31 of file LLM_client.h.
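A minimal quick-start sketch, assuming an already-constructed LLMProvider for the local case and a server listening on a placeholder host/port for the remote case:

```cpp
#include <LLM_client.h>
#include <string>

void quick_start(LLMProvider *provider) {
    // Local: wrap an in-process provider.
    LLMClient local(provider);

    // Remote: connect to an LLM server over HTTP.
    LLMClient remote("localhost", 8080);

    if (remote.is_server_alive()) {
        // completion() is inherited from LLM; the default id_slot of -1
        // lets the client pick a slot.
        std::string reply = remote.completion("Hello!");
    }
}
```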
LLMClient::LLMClient (LLMProvider *llm)

Constructor for local LLM access.

Parameters
    llm    Pointer to local LLMProvider instance

Creates a client that directly accesses a local LLM provider.
Definition at line 226 of file LLM_client.cpp.
LLMClient::LLMClient (const std::string &url, const int port, const std::string &API_key = "", const int max_retries = 5)

Constructor for remote LLM access.

Parameters
    url            Server URL or hostname
    port           Server port number
    API_key        Optional API key
    max_retries    Maximum number of connection retry attempts

Creates a client that connects to a remote LLM server via HTTP.
Definition at line 229 of file LLM_client.cpp.
LLMClient::~LLMClient ()
Destructor.
Definition at line 258 of file LLM_client.cpp.
std::string LLMClient::apply_template_json (const json &data)  [override], [virtual]

Apply a chat template to message data.

Parameters
    data    JSON object containing messages to format

Applies a chat template to format conversation data into a prompt string.
Implements LLM.
Definition at line 345 of file LLM_client.cpp.
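For illustration, a plausible payload for apply_template_json; the OpenAI-style role/content field names are an assumption, not confirmed by this page:

```cpp
// Hypothetical payload shape; consult the chat-template documentation.
LLMClient client("localhost", 8080);  // placeholder endpoint
json data = {
    {"messages", {
        {{"role", "system"}, {"content", "You are a helpful assistant."}},
        {{"role", "user"},   {"content", "Hello!"}}
    }}
};
std::string prompt = client.apply_template_json(data);
```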
void LLMClient::cancel (int id_slot)  [override], [virtual]

Cancel a running request (override).

Parameters
    id_slot    ID of the slot whose request should be cancelled
Implements LLMLocal.
Definition at line 370 of file LLM_client.cpp.
std::string LLMClient::completion_json (const json &data, CharArrayFn callback = nullptr, bool callbackWithJSON = true)  [override], [virtual]

Generate text completion (override).

Parameters
    data              JSON object with prompt and parameters
    callback          Optional callback for streaming responses
    callbackWithJSON  Whether the callback receives JSON format
Implements LLM.
Definition at line 320 of file LLM_client.cpp.
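A hedged streaming sketch; the exact CharArrayFn signature and the "prompt" field name are assumptions inferred from the names shown above:

```cpp
#include <cstdio>

// Assumed callback shape: receives each streamed chunk as a C string.
void on_chunk(const char *chunk) {
    std::fputs(chunk, stdout);
}

void stream_completion(LLMClient &client) {
    json data = {{"prompt", "Write a haiku about llamas."}};  // assumed field name
    std::string full = client.completion_json(
        data, on_chunk,
        /*callbackWithJSON=*/false);  // deliver plain text, not JSON, to the callback
}
```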
std::string LLMClient::detokenize_json (const json &data)  [override], [virtual]

Convert tokens back to text.

Parameters
    data    JSON object containing token IDs

Converts a token sequence back to text.
Implements LLM.
Definition at line 296 of file LLM_client.cpp.
std::string LLMClient::embeddings_json (const json &data)  [override], [virtual]

Generate embeddings with HTTP response support.

Parameters
    data    JSON object containing the embedding request

Used internally for server-based embedding generation.
Implements LLM.
Definition at line 308 of file LLM_client.cpp.
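In most cases the higher-level embeddings() inherited from LLM is more convenient than the raw JSON variant; a sketch:

```cpp
#include <vector>
#include <string>

void embed_example(LLMClient &client) {
    std::vector<float> vec = client.embeddings("The quick brown fox");
    // embeddings_json() is the lower-level path, used when the response
    // has to be handled as raw JSON instead.
}
```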
int LLMClient::get_next_available_slot ()  [override], [virtual]

Get an available processing slot (override).
Implements LLMLocal.
Definition at line 338 of file LLM_client.cpp.
bool LLMClient::is_remote () const  [inline]
Check if this is a remote client.
Helper method to determine the client's connection type
Definition at line 60 of file LLM_client.h.
bool LLMClient::is_server_alive ()

Check whether the underlying LLM server is responding.
Definition at line 71 of file LLM_client.cpp.
void LLMClient::set_SSL (const char *SSL_cert)

Configure SSL certificate for remote connections.

Parameters
    SSL_cert    Path to SSL certificate file

Only applicable for remote clients. Sets up SSL verification.
Definition at line 272 of file LLM_client.cpp.
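A sketch of configuring SSL on a remote client; host, API key, and certificate path are placeholders:

```cpp
LLMClient client("example.com", 443, /*API_key=*/"my-api-key");
if (client.is_remote()) {
    client.set_SSL("certs/server.pem");  // placeholder certificate path
}
```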
std::string LLMClient::slot_json (const json &data)  [override], [virtual]

Manage slots with HTTP response support.

Parameters
    data    JSON object describing the slot operation

Used internally for server-based slot management.
Implements LLMLocal.
Definition at line 357 of file LLM_client.cpp.
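Slot state can be persisted through the inherited save_slot()/load_slot() helpers; a sketch with placeholder paths:

```cpp
void slot_example(LLMClient &client) {
    int slot = client.get_next_available_slot();
    client.completion("Remember: the password is swordfish.", nullptr, slot);
    client.save_slot(slot, "cache/slot0.bin");  // persist slot state to disk
    // ...later, restore the conversation state into the same slot:
    client.load_slot(slot, "cache/slot0.bin");
}
```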
std::string LLMClient::tokenize_json (const json &data)  [override], [virtual]

Tokenize input (override).

Parameters
    data    JSON object containing the text to tokenize
Implements LLM.
Definition at line 284 of file LLM_client.cpp.
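A tokenize/detokenize round trip via the higher-level methods inherited from LLM:

```cpp
#include <cstdint>
#include <vector>
#include <string>

void round_trip(LLMClient &client) {
    std::vector<int> ids = client.tokenize("Hello, llama!");
    std::string text = client.detokenize(
        std::vector<int32_t>(ids.begin(), ids.end()));
    // text should reproduce the input, modulo tokenizer normalization.
}
```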