LLM for Unity  v2.4.2
Create characters in Unity with LLMs!
LLMUnity.LLM Class Reference

Class implementing the LLM server.

Inheritance diagram for LLMUnity.LLM:

Public Member Functions

async void Awake ()
 The Unity Awake function that starts the LLM server.
 
async Task WaitUntilReady ()
 Waits until the LLM is ready.
 
void SetModel (string path)
 Sets the model used by the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in the .gguf format.
 
void SetLora (string path, float weight=1)
 Sets a LORA model to use in the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in the .gguf format.
 
void AddLora (string path, float weight=1)
 Adds a LORA model to use in the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in the .gguf format.
 
void RemoveLora (string path)
 Removes a LORA model from the LLM. Supported models are in the .gguf format.
 
void RemoveLoras ()
 Removes all LORA models from the LLM.
 
void SetLoraWeight (string path, float weight)
 Changes the weight (scale) of a LORA model in the LLM.
 
void SetLoraWeights (Dictionary< string, float > loraToWeight)
 Changes the weights (scales) of the LORA models in the LLM.
 
void UpdateLoras ()
 
void SetTemplate (string templateName, bool setDirty=true)
 Sets the chat template for the LLM.
 
void SetEmbeddings (int embeddingLength, bool embeddingsOnly)
 Sets the LLM embedding parameters.
 
void SetSSLCert (string path)
 Uses an SSL certificate for the LLM server.
 
void SetSSLKey (string path)
 Uses an SSL key for the LLM server.
 
string GetTemplate ()
 Returns the chat template of the LLM.
 
int Register (LLMCaller llmCaller)
 Registers a local LLMCaller object. This binds the LLMCaller "client" to a specific slot of the LLM.
 
void Update ()
 The Unity Update function. It is used to retrieve the LLM replies.
 
async Task< string > Tokenize (string json)
 Tokenises the provided query.
 
async Task< string > Detokenize (string json)
 Detokenises the provided query.
 
async Task< string > Embeddings (string json)
 Computes the embeddings of the provided query.
 
void ApplyLoras ()
 Sets the LORA scales; only works after the LLM service has started.
 
async Task< List< LoraWeightResult > > ListLoras ()
 Gets a list of the lora adapters.
 
async Task< string > Slot (string json)
 Saves or restores the state of a slot.
 
async Task< string > Completion (string json, Callback< string > streamCallback=null)
 Provides the chat and completion functionality of the LLM.
 
void CancelRequest (int id_slot)
 Cancels the requests in a specific slot of the LLM.
 
void Destroy ()
 Stops and destroys the LLM.
 
void OnDestroy ()
 The Unity OnDestroy function, called when the object is destroyed. It calls StopProcess to stop the LLM server.
 

Static Public Member Functions

static async Task< bool > WaitUntilModelSetup (Callback< float > downloadProgressCallback=null)
 Waits until the LLM models are downloaded and ready.
 

Public Attributes

bool advancedOptions = false
 show/hide advanced options in the GameObject
 
bool remote = false
 enable remote server functionality
 
int port = 13333
 port to use for the remote LLM server
 
int numThreads = -1
 number of threads to use (-1 = all)
 
int numGPULayers = 0
 number of model layers to offload to the GPU (0 = GPU not used). If the user's GPU is not supported, the LLM will fall back to the CPU
 
bool debug = false
 log the output of the LLM in the Unity Editor.
 
int parallelPrompts = -1
 number of prompts that can happen in parallel (-1 = number of LLMCaller objects)
 
bool dontDestroyOnLoad = true
 do not destroy the LLM GameObject when loading a new Scene.
 
int contextSize = 8192
 Size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses.
 
int batchSize = 512
 Batch size for prompt processing.
 
string model = ""
 LLM model to use (.gguf format)
 
string chatTemplate = ChatTemplate.DefaultTemplate
 Chat template for the model.
 
string lora = ""
 LORA models to use (.gguf format)
 
string loraWeights = ""
 the weights of the LORA models being used.
 
bool flashAttention = false
 enable use of flash attention
 
string APIKey
 API key to use for the server.
 
string SSLCertPath = ""
 path to the SSL certificate for the LLM server.
 
string SSLKeyPath = ""
 path to the SSL key for the LLM server.
 

Properties

bool started = false [get]
 Boolean set to true if the server has started and is ready to receive requests, false otherwise.
 
bool failed = false [get]
 Boolean set to true if the server has failed to start.
 
static bool modelSetupFailed = false [get]
 Boolean set to true if the models were not downloaded successfully.
 
static bool modelSetupComplete = false [get]
 Boolean set to true once the model setup has completed, false otherwise.
 

Detailed Description

Class implementing the LLM server.

Definition at line 18 of file LLM.cs.
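For orientation, here is a minimal usage sketch (assuming an LLM component already configured in the Inspector): the script waits for the server to come up, then checks the started and failed flags.

    using UnityEngine;
    using LLMUnity;

    public class LLMBootstrap : MonoBehaviour
    {
        public LLM llm; // assign the LLM component in the Inspector

        async void Start()
        {
            // Block until the server is up before issuing any requests.
            await llm.WaitUntilReady();
            Debug.Log($"LLM started={llm.started}, failed={llm.failed}");
        }
    }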

Constructor & Destructor Documentation

◆ LLM()

LLMUnity.LLM.LLM ( )
inline

Definition at line 109 of file LLM.cs.

Member Function Documentation

◆ AddLora()

void LLMUnity.LLM.AddLora ( string path,
float weight = 1 )
inline

Adds a LORA model to use in the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in the .gguf format.

Parameters
path: path to the LORA model to use (.gguf format)
weight: weight (scale) of the LORA model (default 1)

Definition at line 272 of file LLM.cs.


◆ ApplyLoras()

void LLMUnity.LLM.ApplyLoras ( )
inline

Sets the LORA scales; only works after the LLM service has started.

Definition at line 720 of file LLM.cs.
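A short sketch of adjusting a LORA scale at runtime; the adapter path is hypothetical and must match a model already added to the LLM.

    using LLMUnity;

    // Fade a LORA adapter in or out while the service is running.
    void SetStyleStrength(LLM llm, float weight)
    {
        llm.SetLoraWeight("loras/style.gguf", weight); // hypothetical adapter path
        llm.ApplyLoras(); // pushes the new scale to the running LLM service
    }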


◆ Awake()

async void LLMUnity.LLM.Awake ( )
inline

The Unity Awake function that starts the LLM server.

Definition at line 126 of file LLM.cs.

◆ CancelRequest()

void LLMUnity.LLM.CancelRequest ( int id_slot)
inline

Cancels the requests in a specific slot of the LLM.

Parameters
id_slot: slot of the LLM

Definition at line 797 of file LLM.cs.

◆ Completion()

async Task< string > LLMUnity.LLM.Completion ( string json,
Callback< string > streamCallback = null )
inline

Provides the chat and completion functionality of the LLM.

Parameters
json: json request containing the query
streamCallback: callback function to call with intermediate responses
Returns
completion result

Definition at line 779 of file LLM.cs.
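A streaming completion sketch. The request body is an assumption modelled on the llama.cpp server schema ("prompt", "n_predict"); the exact fields are defined by the underlying server.

    using UnityEngine;
    using LLMUnity;

    public class CompletionExample : MonoBehaviour
    {
        public LLM llm;

        public async void Ask()
        {
            // Assumed llama.cpp-style request body.
            string json = "{\"prompt\": \"Hello there!\", \"n_predict\": 64}";
            // The callback receives intermediate responses while tokens stream in.
            string result = await llm.Completion(json, partial => Debug.Log(partial));
            Debug.Log("Final: " + result);
        }
    }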

◆ Destroy()

void LLMUnity.LLM.Destroy ( )
inline

Stops and destroys the LLM.

Definition at line 807 of file LLM.cs.


◆ Detokenize()

async Task< string > LLMUnity.LLM.Detokenize ( string json)
inline

Detokenises the provided query.

Parameters
json: json request containing the query
Returns
detokenisation result

Definition at line 691 of file LLM.cs.

◆ Embeddings()

async Task< string > LLMUnity.LLM.Embeddings ( string json)
inline

Computes the embeddings of the provided query.

Parameters
json: json request containing the query
Returns
embeddings result

Definition at line 706 of file LLM.cs.
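A sketch of an embeddings request; the "content" field is an assumption modelled on the llama.cpp server schema.

    using System.Threading.Tasks;
    using LLMUnity;

    // Compute embeddings for a piece of text (assumed request schema).
    async Task<string> Embed(LLM llm, string text)
    {
        string json = "{\"content\": \"" + text + "\"}"; // naive escaping, for illustration only
        return await llm.Embeddings(json); // raw JSON containing the embedding vector
    }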

◆ GetTemplate()

string LLMUnity.LLM.GetTemplate ( )
inline

Returns the chat template of the LLM.

Returns
chat template of the LLM

Definition at line 399 of file LLM.cs.


◆ ListLoras()

async Task< List< LoraWeightResult > > LLMUnity.LLM.ListLoras ( )
inline

Gets a list of the lora adapters.

Returns
list of LORA adapters

Definition at line 745 of file LLM.cs.

◆ OnDestroy()

void LLMUnity.LLM.OnDestroy ( )
inline

The Unity OnDestroy function, called when the object is destroyed. It calls StopProcess to stop the LLM server.

Definition at line 842 of file LLM.cs.

◆ Register()

int LLMUnity.LLM.Register ( LLMCaller llmCaller)
inline

Registers a local LLMCaller object. This binds the LLMCaller "client" to a specific slot of the LLM.

Parameters
llmCaller: the LLMCaller object to register
Returns
the slot id assigned to the LLMCaller
Definition at line 569 of file LLM.cs.
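Register is normally invoked by the LLMCaller itself; the sketch below only illustrates how the returned slot id pairs with CancelRequest (assuming the return value is the assigned slot).

    using LLMUnity;

    // Bind a caller to a slot, then cancel whatever runs in that slot.
    void CancelCaller(LLM llm, LLMCaller caller)
    {
        int slot = llm.Register(caller); // assumed: returns the slot id for this "client"
        llm.CancelRequest(slot);
    }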


◆ RemoveLora()

void LLMUnity.LLM.RemoveLora ( string path)
inline

Removes a LORA model from the LLM. Supported models are in the .gguf format.

Parameters
path: path to the LORA model to remove (.gguf format)

Definition at line 284 of file LLM.cs.


◆ RemoveLoras()

void LLMUnity.LLM.RemoveLoras ( )
inline

Removes all LORA models from the LLM.

Definition at line 294 of file LLM.cs.

◆ SetEmbeddings()

void LLMUnity.LLM.SetEmbeddings ( int embeddingLength,
bool embeddingsOnly )
inline

Sets the LLM embedding parameters.

Parameters
embeddingLength: number of embedding dimensions
embeddingsOnly: if true, the LLM will be used only for embeddings

Definition at line 351 of file LLM.cs.
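For example, to use an LLM purely as an embedding server (the embedding dimension is model-specific; 384 is illustrative):

    llm.SetEmbeddings(384, true); // 384-dimensional embeddings, used only for embeddings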


◆ SetLora()

void LLMUnity.LLM.SetLora ( string path,
float weight = 1 )
inline

Sets a LORA model to use in the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in the .gguf format.

Parameters
path: path to the LORA model to use (.gguf format)
weight: weight (scale) of the LORA model (default 1)

Definition at line 259 of file LLM.cs.

◆ SetLoraWeight()

void LLMUnity.LLM.SetLoraWeight ( string path,
float weight )
inline

Changes the weight (scale) of a LORA model in the LLM.

Parameters
path: path of the LORA model to change (.gguf format)
weight: weight (scale) of the LORA model

Definition at line 306 of file LLM.cs.

◆ SetLoraWeights()

void LLMUnity.LLM.SetLoraWeights ( Dictionary< string, float > loraToWeight)
inline

Changes the weights (scales) of the LORA models in the LLM.

Parameters
loraToWeight: Dictionary (string, float) mapping the paths of the LORA models to the weights to set

Definition at line 317 of file LLM.cs.

◆ SetModel()

void LLMUnity.LLM.SetModel ( string path)
inline

Sets the model used by the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in the .gguf format.

Parameters
path: path to the model to use (.gguf format)

Definition at line 231 of file LLM.cs.

◆ SetSSLCert()

void LLMUnity.LLM.SetSSLCert ( string path)
inline

Uses an SSL certificate for the LLM server.

Parameters
path: the SSL certificate path

Definition at line 379 of file LLM.cs.

◆ SetSSLKey()

void LLMUnity.LLM.SetSSLKey ( string path)
inline

Uses an SSL key for the LLM server.

Parameters
path: the SSL key path

Definition at line 389 of file LLM.cs.
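The certificate and its key are set as a pair; a sketch with hypothetical paths (SSL applies to the remote server functionality).

    llm.SetSSLCert("certs/server.crt"); // hypothetical certificate path
    llm.SetSSLKey("certs/server.key");  // hypothetical key path
    llm.remote = true;                  // serve remote clients over SSL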

◆ SetTemplate()

void LLMUnity.LLM.SetTemplate ( string templateName,
bool setDirty = true )
inline

Sets the chat template for the LLM.

Parameters
templateName: the chat template to use. The available templates can be found in the ChatTemplate.templates.Keys array.

Definition at line 337 of file LLM.cs.
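A sketch that lists the available templates and selects one; "chatml" is illustrative and must be a key in ChatTemplate.templates.

    using UnityEngine;
    using LLMUnity;

    void PickTemplate(LLM llm)
    {
        // Enumerate the template names shipped with the package.
        foreach (string name in ChatTemplate.templates.Keys) Debug.Log(name);
        llm.SetTemplate("chatml"); // illustrative template name
    }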


◆ Slot()

async Task< string > LLMUnity.LLM.Slot ( string json)
inline

Saves or restores the state of a slot.

Parameters
json: json request containing the query
Returns
slot result

Definition at line 763 of file LLM.cs.

◆ Tokenize()

async Task< string > LLMUnity.LLM.Tokenize ( string json)
inline

Tokenises the provided query.

Parameters
json: json request containing the query
Returns
tokenisation result

Definition at line 676 of file LLM.cs.
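A round-trip sketch; the request bodies are assumptions modelled on the llama.cpp server schema ("content" for Tokenize, "tokens" for Detokenize), under which the Tokenize output can be fed straight back to Detokenize.

    using UnityEngine;
    using LLMUnity;

    async void RoundTrip(LLM llm)
    {
        string tokens = await llm.Tokenize("{\"content\": \"Hello world\"}");
        Debug.Log(tokens); // assumed shape: {"tokens":[...]}
        string text = await llm.Detokenize(tokens);
        Debug.Log(text);   // should recover the original text
    }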

◆ Update()

void LLMUnity.LLM.Update ( )
inline

The Unity Update function. It is used to retrieve the LLM replies.

Definition at line 603 of file LLM.cs.

◆ UpdateLoras()

void LLMUnity.LLM.UpdateLoras ( )
inline

Definition at line 324 of file LLM.cs.

◆ WaitUntilModelSetup()

static async Task< bool > LLMUnity.LLM.WaitUntilModelSetup ( Callback< float > downloadProgressCallback = null)
inlinestatic

Waits until the LLM models are downloaded and ready.

Parameters
downloadProgressCallback: function to call with the download progress (float)

Definition at line 161 of file LLM.cs.
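A sketch of gating startup on the model download, with a progress callback; the boolean result is assumed to be false when setup failed (compare modelSetupFailed).

    using UnityEngine;
    using LLMUnity;

    public class ModelSetupGate : MonoBehaviour
    {
        async void Start()
        {
            bool ok = await LLM.WaitUntilModelSetup(progress => Debug.Log($"Download: {progress:P0}"));
            if (!ok) Debug.LogError("Model setup failed"); // LLM.modelSetupFailed is also set
        }
    }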

◆ WaitUntilReady()

async Task LLMUnity.LLM.WaitUntilReady ( )
inline

Waits until the LLM is ready.

Definition at line 152 of file LLM.cs.

Member Data Documentation

◆ advancedOptions

bool LLMUnity.LLM.advancedOptions = false

show/hide advanced options in the GameObject

Definition at line 22 of file LLM.cs.

◆ APIKey

string LLMUnity.LLM.APIKey

API key to use for the server.

Definition at line 77 of file LLM.cs.

◆ batchSize

int LLMUnity.LLM.batchSize = 512

Batch size for prompt processing.

Definition at line 51 of file LLM.cs.

◆ chatTemplate

string LLMUnity.LLM.chatTemplate = ChatTemplate.DefaultTemplate

Chat template for the model.

Definition at line 65 of file LLM.cs.

◆ contextSize

int LLMUnity.LLM.contextSize = 8192

Size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses.

Definition at line 48 of file LLM.cs.

◆ debug

bool LLMUnity.LLM.debug = false

log the output of the LLM in the Unity Editor.

Definition at line 38 of file LLM.cs.

◆ dontDestroyOnLoad

bool LLMUnity.LLM.dontDestroyOnLoad = true

do not destroy the LLM GameObject when loading a new Scene.

Definition at line 44 of file LLM.cs.

◆ flashAttention

bool LLMUnity.LLM.flashAttention = false

enable use of flash attention

Definition at line 74 of file LLM.cs.

◆ lora

string LLMUnity.LLM.lora = ""

LORA models to use (.gguf format)

Definition at line 68 of file LLM.cs.

◆ loraWeights

string LLMUnity.LLM.loraWeights = ""

the weights of the LORA models being used.

Definition at line 71 of file LLM.cs.

◆ model

string LLMUnity.LLM.model = ""

LLM model to use (.gguf format)

Definition at line 62 of file LLM.cs.

◆ numGPULayers

int LLMUnity.LLM.numGPULayers = 0

number of model layers to offload to the GPU (0 = GPU not used). If the user's GPU is not supported, the LLM will fall back to the CPU

Definition at line 35 of file LLM.cs.

◆ numThreads

int LLMUnity.LLM.numThreads = -1

number of threads to use (-1 = all)

Definition at line 31 of file LLM.cs.

◆ parallelPrompts

int LLMUnity.LLM.parallelPrompts = -1

number of prompts that can happen in parallel (-1 = number of LLMCaller objects)

Definition at line 41 of file LLM.cs.

◆ port

int LLMUnity.LLM.port = 13333

port to use for the remote LLM server

Definition at line 28 of file LLM.cs.

◆ remote

bool LLMUnity.LLM.remote = false

enable remote server functionality

Definition at line 25 of file LLM.cs.

◆ SSLCertPath

string LLMUnity.LLM.SSLCertPath = ""

path to the SSL certificate for the LLM server.

Definition at line 82 of file LLM.cs.

◆ SSLKeyPath

string LLMUnity.LLM.SSLKeyPath = ""

path to the SSL key for the LLM server.

Definition at line 86 of file LLM.cs.

Property Documentation

◆ failed

bool LLMUnity.LLM.failed = false
get

Boolean set to true if the server has failed to start.

Definition at line 55 of file LLM.cs.

◆ modelSetupComplete

bool LLMUnity.LLM.modelSetupComplete = false
staticget

Boolean set to true once the model setup has completed, false otherwise.

Definition at line 59 of file LLM.cs.

◆ modelSetupFailed

bool LLMUnity.LLM.modelSetupFailed = false
staticget

Boolean set to true if the models were not downloaded successfully.

Definition at line 57 of file LLM.cs.

◆ started

bool LLMUnity.LLM.started = false
get

Boolean set to true if the server has started and is ready to receive requests, false otherwise.

Definition at line 53 of file LLM.cs.


The documentation for this class was generated from the following file:
LLM.cs