Unity MonoBehaviour component that manages a local LLM server instance. Handles model loading, GPU acceleration, LORA adapters, and provides completion, tokenization, and embedding functionality.
| async void | Awake () |
| | Unity Awake method that initializes the LLM server. Sets up the model, starts the service, and handles GPU fallback if needed. |
| void | OnDestroy () |
| async Task | WaitUntilReady () |
| | Waits asynchronously until the LLM is ready to accept requests. |
| void | SetModel (string path) |
| | Sets the model file to use. Automatically configures context size and embedding settings. |
| void | SetReasoning (bool reasoning) |
| | Enable LLM reasoning ("thinking" mode) |
| void | SetEmbeddings (int embeddingLength, bool embeddingsOnly) |
| | Configure the LLM for embedding generation. |
| void | Register (LLMClient llmClient) |
| | Registers an LLMClient for slot management. |
| List< LoraIdScalePath > | ListLoras () |
| | Gets a list of loaded LORA adapters. |
| void | SetLora (string path, float weight=1f) |
| | Sets a single LORA adapter, replacing any existing ones. |
| void | AddLora (string path, float weight=1f) |
| | Adds a LORA adapter to the existing set. |
| void | RemoveLora (string path) |
| | Removes a specific LORA adapter. |
| void | RemoveLoras () |
| | Removes all LORA adapters. |
| void | SetLoraWeight (string path, float weight) |
| | Changes the weight of a specific LORA adapter. |
| void | SetLoraWeights (Dictionary< string, float > loraToWeight) |
| | Changes the weights of multiple LORA adapters. |
| void | SetSSLCertFromFile (string path) |
| | Sets the SSL certificate for secure server connections. |
| void | SetSSLKeyFromFile (string path) |
| | Sets the SSL private key for secure server connections. |
| void | Destroy () |
| | Stops and cleans up the LLM service. |
| static async Task< bool > | WaitUntilModelSetup (Action< float > downloadProgressCallback=null) |
| | Waits asynchronously until model setup is complete. |
| int | numThreads [get, set] |
| | Number of threads to use for processing (-1 = use all available threads) |
| int | numGPULayers [get, set] |
| | Number of model layers to offload to GPU (0 = CPU only) |
| int | parallelPrompts [get, set] |
| | Number of prompts that can be processed in parallel (-1 = auto-detect from clients) |
| int | contextSize [get, set] |
| | Size of the prompt context in tokens (0 = use model's default context size) |
| int | batchSize [get, set] |
| | Batch size for prompt processing (larger = more memory, potentially faster) |
| bool | flashAttention [get, set] |
| | Enable flash attention optimization (requires compatible model) |
| string | model [get, set] |
| | LLM model file path (.gguf format) |
| bool | reasoning [get, set] |
| | Enable LLM reasoning ("thinking" mode) |
| string | lora [get, set] |
| | LORA adapter model paths (.gguf format), separated by commas. |
| string | loraWeights [get, set] |
| | Weights for LORA adapters, separated by commas (default: 1.0 for each) |
| bool | remote [get, set] |
| | Enable remote server functionality to allow external connections. |
| int | port [get, set] |
| | Port to use for the remote LLM server. |
| string | APIKey [get, set] |
| | API key required for server access (leave empty to disable authentication) |
| string | SSLCert [get, set] |
| | SSL certificate for the remote LLM server. |
| string | SSLKey [get, set] |
| | SSL key for the remote LLM server. |
| bool | started = false [get] |
| | True if the LLM server has started and is ready to receive requests. |
| bool | failed = false [get] |
| | True if the LLM server failed to start. |
| static bool | modelSetupFailed = false [get] |
| | True if model setup failed during initialization. |
| static bool | modelSetupComplete = false [get] |
| | True if model setup completed (successfully or not) |
| LLMService | llmService [get] |
| | The underlying LLM service instance. |
| string | architecture [get] |
| | Model architecture name (e.g., llama, mistral) |
| bool | embeddingsOnly [get] |
| | True if this model only supports embeddings (no text generation) |
| int | embeddingLength [get] |
| | Number of dimensions in embedding vectors (0 if not an embedding model) |
Unity MonoBehaviour component that manages a local LLM server instance. Handles model loading, GPU acceleration, LORA adapters, and provides completion, tokenization, and embedding functionality.
Definition at line 20 of file LLM.cs.
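As a usage sketch, the component can also be created and configured from script. The GameObject is deactivated first so that Awake() (which starts the server) runs only after configuration; the model path and property values below are illustrative assumptions, not defaults.

```csharp
using UnityEngine;
using LLMUnity;

public class LLMSetup : MonoBehaviour
{
    void Start()
    {
        GameObject go = new GameObject("LLM");
        go.SetActive(false);                   // postpone Awake() until configured
        LLM llm = go.AddComponent<LLM>();
        llm.SetModel("models/my-model.gguf");  // hypothetical .gguf path
        llm.numGPULayers = 20;                 // offload 20 layers to the GPU
        llm.contextSize = 4096;                // 0 would use the model default
        go.SetActive(true);                    // Awake() now starts the server
    }
}
```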
◆ AddLora()
void LLMUnity.LLM.AddLora (string path, float weight = 1f)  [inline]
Adds a LORA adapter to the existing set.
- Parameters
-
| path | Path to LORA file (.gguf format) |
| weight | Adapter weight (default: 1.0) |
Definition at line 674 of file LLM.cs.
◆ Awake()
async void LLMUnity.LLM.Awake ()  [inline]
Unity Awake method that initializes the LLM server. Sets up the model, starts the service, and handles GPU fallback if needed.
Definition at line 349 of file LLM.cs.
◆ Destroy()
void LLMUnity.LLM.Destroy ()  [inline]
Stops and cleans up the LLM service.
Definition at line 821 of file LLM.cs.
◆ ListLoras()
List< LoraIdScalePath > LLMUnity.LLM.ListLoras ()  [inline]
Gets a list of loaded LORA adapters.
- Returns
- List of LORA adapter information
Definition at line 648 of file LLM.cs.
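A minimal sketch of inspecting the loaded adapters; `llm` is assumed to be an LLM component reference, and the LoraIdScalePath field names used in the log line are assumptions for illustration.

```csharp
foreach (LoraIdScalePath lora in llm.ListLoras())
{
    // path/scale field names are assumed, not confirmed by this reference
    Debug.Log($"LORA adapter: {lora.path} (scale {lora.scale})");
}
```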
◆ OnDestroy()
void LLMUnity.LLM.OnDestroy ()  [inline]
Unity OnDestroy method, called when the component is destroyed.
◆ Register()
void LLMUnity.LLM.Register (LLMClient llmClient)  [inline]
Registers an LLMClient for slot management.
- Parameters
-
| llmClient | Client to register |
Definition at line 634 of file LLM.cs.
◆ RemoveLora()
void LLMUnity.LLM.RemoveLora (string path)  [inline]
Removes a specific LORA adapter.
- Parameters
-
| path | Path to LORA file to remove |
Definition at line 685 of file LLM.cs.
◆ RemoveLoras()
void LLMUnity.LLM.RemoveLoras ()  [inline]
Removes all LORA adapters.
Definition at line 695 of file LLM.cs.
◆ SetEmbeddings()
void LLMUnity.LLM.SetEmbeddings (int embeddingLength, bool embeddingsOnly)  [inline]
Configure the LLM for embedding generation.
- Parameters
-
| embeddingLength | Number of embedding dimensions |
| embeddingsOnly | True if model only supports embeddings |
Definition at line 619 of file LLM.cs.
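For example, an embedding-only model could be configured as follows (`llm` is an LLM component reference; the dimension value is illustrative):

```csharp
// 384 dimensions is a common size for small embedding models (illustrative).
llm.SetEmbeddings(384, true);  // embeddingsOnly = true: no text generation
```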
◆ SetLora()
void LLMUnity.LLM.SetLora (string path, float weight = 1f)  [inline]
Sets a single LORA adapter, replacing any existing ones.
- Parameters
-
| path | Path to LORA file (.gguf format) |
| weight | Adapter weight (default: 1.0) |
Definition at line 662 of file LLM.cs.
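A sketch of combining SetLora and AddLora to stack adapters (`llm` is an LLM component reference; the paths and weights are illustrative):

```csharp
llm.SetLora("loras/style.gguf", 0.8f);   // replaces any existing adapters
llm.AddLora("loras/domain.gguf", 0.5f);  // stacks a second adapter on top
```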
◆ SetLoraWeight()
void LLMUnity.LLM.SetLoraWeight (string path, float weight)  [inline]
Changes the weight of a specific LORA adapter.
- Parameters
-
| path | Path to LORA file |
| weight | New weight value |
Definition at line 707 of file LLM.cs.
◆ SetLoraWeights()
void LLMUnity.LLM.SetLoraWeights (Dictionary< string, float > loraToWeight)  [inline]
Changes the weights of multiple LORA adapters.
- Parameters
-
| loraToWeight | Dictionary mapping LORA paths to weights |
Definition at line 718 of file LLM.cs.
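A sketch of re-weighting several adapters in one call (requires `using System.Collections.Generic;`; paths and weights are illustrative):

```csharp
llm.SetLoraWeights(new Dictionary<string, float>
{
    { "loras/style.gguf", 1.0f },
    { "loras/domain.gguf", 0.25f }
});
```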
◆ SetModel()
void LLMUnity.LLM.SetModel (string path)  [inline]
Sets the model file to use. Automatically configures context size and embedding settings.
- Parameters
-
| path | Path to the model file (.gguf format) |
Definition at line 577 of file LLM.cs.
◆ SetReasoning()
void LLMUnity.LLM.SetReasoning (bool reasoning)  [inline]
Enable LLM reasoning ("thinking" mode)
- Parameters
-
| reasoning | Whether to use LLM reasoning |
Definition at line 609 of file LLM.cs.
◆ SetSSLCertFromFile()
void LLMUnity.LLM.SetSSLCertFromFile (string path)  [inline]
Sets the SSL certificate for secure server connections.
- Parameters
-
| path | Path to SSL certificate file |
Definition at line 770 of file LLM.cs.
◆ SetSSLKeyFromFile()
void LLMUnity.LLM.SetSSLKeyFromFile (string path)  [inline]
Sets the SSL private key for secure server connections.
- Parameters
-
| path | Path to SSL private key file |
Definition at line 779 of file LLM.cs.
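A sketch of exposing the server remotely over TLS. The certificate and key paths, port, and API key are illustrative, and the remote settings are assumed to be applied before the server starts:

```csharp
llm.remote = true;                           // allow external connections
llm.port = 13333;                            // illustrative port
llm.SetSSLCertFromFile("certs/server.crt");  // illustrative certificate path
llm.SetSSLKeyFromFile("certs/server.key");   // illustrative key path
llm.APIKey = "my-secret-key";                // optional client authentication
```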
◆ WaitUntilModelSetup()
static async Task< bool > LLMUnity.LLM.WaitUntilModelSetup (Action< float > downloadProgressCallback = null)  [inline, static]
Waits asynchronously until model setup is complete.
- Parameters
-
| downloadProgressCallback | Optional callback for download progress updates |
- Returns
- True if setup succeeded, false if it failed
Definition at line 558 of file LLM.cs.
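A sketch of waiting on the static setup task with a progress callback, e.g. during a first-run model download:

```csharp
bool ok = await LLM.WaitUntilModelSetup(progress =>
    Debug.Log($"Model setup: {progress * 100f:F0}%"));
if (!ok) Debug.LogError("Model setup failed");
```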
◆ WaitUntilReady()
async Task LLMUnity.LLM.WaitUntilReady ()  [inline]
Waits asynchronously until the LLM is ready to accept requests.
- Returns
- Task that completes when LLM is ready
Definition at line 540 of file LLM.cs.
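A sketch of gating the first request on server readiness (`llm` is an LLM component reference, e.g. assigned in the Inspector):

```csharp
await llm.WaitUntilReady();
Debug.Log($"Server ready, architecture: {llm.architecture}");
```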
◆ advancedOptions
bool LLMUnity.LLM.advancedOptions = false
Show/hide advanced options in the inspector.
Definition at line 25 of file LLM.cs.
◆ dontDestroyOnLoad
bool LLMUnity.LLM.dontDestroyOnLoad = true
Persist this LLM GameObject across scene transitions.
Definition at line 89 of file LLM.cs.
◆ APIKey
string LLMUnity.LLM.APIKey  [get, set]
API key required for server access (leave empty to disable authentication)
Definition at line 247 of file LLM.cs.
◆ architecture
string LLMUnity.LLM.architecture  [get]
Model architecture name (e.g., llama, mistral)
Definition at line 302 of file LLM.cs.
◆ batchSize
int LLMUnity.LLM.batchSize  [get, set]
Batch size for prompt processing (larger = more memory, potentially faster)
Definition at line 157 of file LLM.cs.
◆ contextSize
int LLMUnity.LLM.contextSize  [get, set]
Size of the prompt context in tokens (0 = use model's default context size)
Definition at line 144 of file LLM.cs.
◆ embeddingLength
int LLMUnity.LLM.embeddingLength  [get]
Number of dimensions in embedding vectors (0 if not an embedding model)
Definition at line 310 of file LLM.cs.
◆ embeddingsOnly
bool LLMUnity.LLM.embeddingsOnly  [get]
True if this model only supports embeddings (no text generation)
Definition at line 306 of file LLM.cs.
◆ failed
bool LLMUnity.LLM.failed = false  [get]
True if the LLM server failed to start.
Definition at line 289 of file LLM.cs.
◆ flashAttention
bool LLMUnity.LLM.flashAttention  [get, set]
Enable flash attention optimization (requires compatible model)
Definition at line 170 of file LLM.cs.
◆ llmService
LLMService LLMUnity.LLM.llmService  [get]
The underlying LLM service instance.
Definition at line 298 of file LLM.cs.
◆ lora
string LLMUnity.LLM.lora  [get, set]
LORA adapter model paths (.gguf format), separated by commas.
Definition at line 195 of file LLM.cs.
◆ loraWeights
string LLMUnity.LLM.loraWeights  [get, set]
Weights for LORA adapters, separated by commas (default: 1.0 for each)
Definition at line 208 of file LLM.cs.
◆ model
string LLMUnity.LLM.model  [get, set]
LLM model file path (.gguf format)
Definition at line 181 of file LLM.cs.
◆ modelSetupComplete
static bool LLMUnity.LLM.modelSetupComplete = false  [get]
True if model setup completed (successfully or not)
Definition at line 295 of file LLM.cs.
◆ modelSetupFailed
static bool LLMUnity.LLM.modelSetupFailed = false  [get]
True if model setup failed during initialization.
Definition at line 292 of file LLM.cs.
◆ numGPULayers
int LLMUnity.LLM.numGPULayers  [get, set]
Number of model layers to offload to GPU (0 = CPU only)
Definition at line 118 of file LLM.cs.
◆ numThreads
int LLMUnity.LLM.numThreads  [get, set]
Number of threads to use for processing (-1 = use all available threads)
Definition at line 105 of file LLM.cs.
◆ parallelPrompts
int LLMUnity.LLM.parallelPrompts  [get, set]
Number of prompts that can be processed in parallel (-1 = auto-detect from clients)
Definition at line 131 of file LLM.cs.
◆ port
int LLMUnity.LLM.port  [get, set]
Port to use for the remote LLM server.
Definition at line 233 of file LLM.cs.
◆ reasoning
bool LLMUnity.LLM.reasoning  [get, set]
Enable LLM reasoning ('thinking' mode)
Definition at line 188 of file LLM.cs.
◆ remote
bool LLMUnity.LLM.remote  [get, set]
Enable remote server functionality to allow external connections.
Definition at line 221 of file LLM.cs.
◆ SSLCert
string LLMUnity.LLM.SSLCert  [get, set]
SSL certificate for the remote LLM server.
Definition at line 259 of file LLM.cs.
◆ SSLKey
string LLMUnity.LLM.SSLKey  [get, set]
SSL key for the remote LLM server.
Definition at line 271 of file LLM.cs.
◆ started
bool LLMUnity.LLM.started = false  [get]
True if the LLM server has started and is ready to receive requests.
Definition at line 286 of file LLM.cs.
The documentation for this class was generated from the following file:
- LLM.cs