LLM for Unity  v2.3.0
Create characters in Unity with LLMs!
LLMUnity.LLM Class Reference

Class implementing the LLM server.

Inheritance diagram for LLMUnity.LLM: (diagram not shown)

Public Member Functions

async void Awake ()
 The Unity Awake function that starts the LLM server.
 
async Task WaitUntilReady ()
 Waits until the LLM is ready.
 
void SetModel (string path)
 Sets the model used by the LLM. The model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in .gguf format.
 
void SetLora (string path, float weight=1)
 Sets a LORA model to use in the LLM. The model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in .gguf format.
 
void AddLora (string path, float weight=1)
 Adds a LORA model to use in the LLM. The model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in .gguf format.
 
void RemoveLora (string path)
 Removes a LORA model from the LLM. Supported models are in .gguf format.
 
void RemoveLoras ()
 Removes all LORA models from the LLM.
 
void SetLoraWeight (string path, float weight)
 Changes the weight (scale) of a LORA model in the LLM.
 
void SetLoraWeights (Dictionary< string, float > loraToWeight)
 Changes the weights (scales) of the LORA models in the LLM.
 
void UpdateLoras ()
 
void SetTemplate (string templateName, bool setDirty=true)
 Sets the chat template for the LLM.
 
void SetEmbeddings (int embeddingLength, bool embeddingsOnly)
 Sets the LLM embedding parameters.
 
void SetSSLCert (string path)
 Uses an SSL certificate for the LLM server.
 
void SetSSLKey (string path)
 Uses an SSL key for the LLM server.
 
string GetTemplate ()
 Returns the chat template of the LLM.
 
int Register (LLMCaller llmCaller)
 Registers a local LLMCaller object, binding the LLMCaller "client" to a specific slot of the LLM.
 
void Update ()
 The Unity Update function. It is used to retrieve the LLM replies.
 
async Task< string > Tokenize (string json)
 Tokenises the provided query.
 
async Task< string > Detokenize (string json)
 Detokenises the provided query.
 
async Task< string > Embeddings (string json)
 Computes the embeddings of the provided query.
 
void ApplyLoras ()
 Sets the LORA scales; takes effect only after the LLM service has started.
 
async Task< List< LoraWeightResult > > ListLoras ()
 Gets a list of the LORA adapters.
 
async Task< string > Slot (string json)
 Saves or restores the state of a slot.
 
async Task< string > Completion (string json, Callback< string > streamCallback=null)
 Provides the chat and completion functionality of the LLM.
 
async Task SetBasePrompt (string base_prompt)
 
void CancelRequest (int id_slot)
 Cancels the requests in a specific slot of the LLM.
 
void Destroy ()
 Stops and destroys the LLM.
 
void OnDestroy ()
 The Unity OnDestroy function, called when the object is destroyed. The StopProcess function is called to stop the LLM server.
 

Static Public Member Functions

static async Task< bool > WaitUntilModelSetup (Callback< float > downloadProgressCallback=null)
 Waits until the LLM models are downloaded and ready.
 

Public Attributes

bool advancedOptions = false
 toggle to show/hide advanced options in the GameObject
 
bool remote = false
 toggle to enable remote server functionality
 
int port = 13333
 port to use for the LLM server
 
int numThreads = -1
 number of threads to use (-1 = all)
 
int numGPULayers = 0
 number of model layers to offload to the GPU (0 = GPU not used). Use a large number, e.g. >30, to utilise the GPU as much as possible. If the user's GPU is not supported, the LLM will fall back to the CPU
 
bool debug = false
 select to log the output of the LLM in the Unity Editor.
 
int parallelPrompts = -1
 number of prompts that can happen in parallel (-1 = number of LLMCaller objects)
 
bool dontDestroyOnLoad = true
 select to not destroy the LLM GameObject when loading a new Scene.
 
int contextSize = 8192
 Size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses.
 
int batchSize = 512
 Batch size for prompt processing.
 
string basePrompt = ""
 a base prompt to use for all LLMCaller objects
 
string model = ""
 the LLM model to use. Models with .gguf format are allowed.
 
string chatTemplate = ChatTemplate.DefaultTemplate
 Chat template used for the model.
 
string lora = ""
 the paths of the LORA models being used (relative to the Assets/StreamingAssets folder). Models with .gguf format are allowed.
 
string loraWeights = ""
 the weights of the LORA models being used.
 
bool flashAttention = false
 enable use of flash attention
 
string APIKey
 API key to use for the server (optional)
 
string SSLCertPath = ""
 path to the SSL certificate for the LLM server
 
string SSLKeyPath = ""
 path to the SSL key for the LLM server
 

Properties

bool started = false [get]
 Boolean set to true if the server has started and is ready to receive requests, false otherwise.
 
bool failed = false [get]
 Boolean set to true if the server has failed to start.
 
static bool modelSetupFailed = false [get]
 Boolean set to true if the models were not downloaded successfully.
 
static bool modelSetupComplete = false [get]
 Boolean set to true once the LLM model setup has completed, false otherwise.
 

Detailed Description

Class implementing the LLM server.

Definition at line 18 of file LLM.cs.
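
A minimal usage sketch (the script name and Inspector wiring are illustrative, not part of the API):

using UnityEngine;
using LLMUnity;

public class LLMStartupCheck : MonoBehaviour
{
    public LLM llm;  // assign the LLM component in the Inspector

    async void Start()
    {
        // Awake() launches the server; wait here until it accepts requests.
        await llm.WaitUntilReady();
        Debug.Log("LLM started: " + llm.started);
    }
}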

Constructor & Destructor Documentation

◆ LLM()

LLMUnity.LLM.LLM ( )
inline

Definition at line 99 of file LLM.cs.

Member Function Documentation

◆ AddLora()

void LLMUnity.LLM.AddLora ( string path,
float weight = 1 )
inline

Adds a LORA model to use in the LLM. The model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in .gguf format.

Parameters
path    path to the LORA model to use (.gguf format)
weight    weight (scale) of the LORA model (default 1)

Definition at line 260 of file LLM.cs.
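
A hedged sketch of a LORA workflow, given an LLM instance llm (the adapter paths are hypothetical):

// Add two adapters; each is copied to Assets/StreamingAssets.
llm.AddLora("loras/style.gguf", 0.8f);
llm.AddLora("loras/lore.gguf");  // default weight 1
// Adjust a weight later, or drop an adapter entirely.
llm.SetLoraWeight("loras/style.gguf", 0.5f);
llm.RemoveLora("loras/lore.gguf");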

◆ ApplyLoras()

void LLMUnity.LLM.ApplyLoras ( )
inline

Sets the LORA scales; takes effect only after the LLM service has started.

Definition at line 708 of file LLM.cs.

◆ Awake()

async void LLMUnity.LLM.Awake ( )
inline

The Unity Awake function that starts the LLM server.

Definition at line 116 of file LLM.cs.

◆ CancelRequest()

void LLMUnity.LLM.CancelRequest ( int id_slot)
inline

Cancels the requests in a specific slot of the LLM.

Parameters
id_slot    the slot of the LLM

Definition at line 791 of file LLM.cs.

◆ Completion()

async Task< string > LLMUnity.LLM.Completion ( string json,
Callback< string > streamCallback = null )
inline

Provides the chat and completion functionality of the LLM.

Parameters
json    json request containing the query
streamCallback    callback function to call with intermediate responses
Returns
completion result

Definition at line 766 of file LLM.cs.
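
A hedged sketch, inside an async method and given an LLM instance llm; the json fields ("prompt", "n_predict") follow the llama.cpp server completion format and are assumptions, not guaranteed by this API:

// Hypothetical llama.cpp-style completion request.
string request = "{\"prompt\": \"Hello, adventurer!\", \"n_predict\": 64}";
// Stream intermediate responses to the console, then log the final result.
string result = await llm.Completion(request, partial => Debug.Log(partial));
Debug.Log(result);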

◆ Destroy()

void LLMUnity.LLM.Destroy ( )
inline

Stops and destroys the LLM.

Definition at line 801 of file LLM.cs.

◆ Detokenize()

async Task< string > LLMUnity.LLM.Detokenize ( string json)
inline

Detokenises the provided query.

Parameters
json    json request containing the query
Returns
detokenisation result

Definition at line 679 of file LLM.cs.

◆ Embeddings()

async Task< string > LLMUnity.LLM.Embeddings ( string json)
inline

Computes the embeddings of the provided query.

Parameters
json    json request containing the query
Returns
embeddings result

Definition at line 694 of file LLM.cs.
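
A hedged sketch, inside an async method and given an LLM instance llm (the "content" field is an assumption following the llama.cpp server format):

string request = "{\"content\": \"The quick brown fox\"}";
string embeddings = await llm.Embeddings(request);
Debug.Log(embeddings);  // raw json containing the embedding vector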

◆ GetTemplate()

string LLMUnity.LLM.GetTemplate ( )
inline

Returns the chat template of the LLM.

Returns
chat template of the LLM

Definition at line 387 of file LLM.cs.

◆ ListLoras()

async Task< List< LoraWeightResult > > LLMUnity.LLM.ListLoras ( )
inline

Gets a list of the lora adapters.

Returns
list of LORA adapters

Definition at line 732 of file LLM.cs.

◆ OnDestroy()

void LLMUnity.LLM.OnDestroy ( )
inline

The Unity OnDestroy function, called when the object is destroyed. The StopProcess function is called to stop the LLM server.

Definition at line 836 of file LLM.cs.

◆ Register()

int LLMUnity.LLM.Register ( LLMCaller llmCaller)
inline

Registers a local LLMCaller object, binding the LLMCaller "client" to a specific slot of the LLM.

Parameters
llmCaller    the LLMCaller object to register
Returns
the slot id assigned to the LLMCaller

Definition at line 557 of file LLM.cs.

◆ RemoveLora()

void LLMUnity.LLM.RemoveLora ( string path)
inline

Removes a LORA model from the LLM. Supported models are in .gguf format.

Parameters
path    path to the LORA model to remove (.gguf format)

Definition at line 272 of file LLM.cs.

◆ RemoveLoras()

void LLMUnity.LLM.RemoveLoras ( )
inline

Removes all LORA models from the LLM.

Definition at line 282 of file LLM.cs.

◆ SetBasePrompt()

async Task LLMUnity.LLM.SetBasePrompt ( string base_prompt)
inline

Sets the base prompt to use for all LLMCaller objects.

Definition at line 780 of file LLM.cs.

◆ SetEmbeddings()

void LLMUnity.LLM.SetEmbeddings ( int embeddingLength,
bool embeddingsOnly )
inline

Set LLM Embedding parameters.

Parameters
embeddingLength    number of embedding dimensions
embeddingsOnly    if true, the LLM will be used only for embeddings

Definition at line 339 of file LLM.cs.
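
For example (1024 is a hypothetical value; the embedding length must match the model):

// Use the LLM exclusively as a 1024-dimensional embedding model.
llm.SetEmbeddings(1024, true);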

◆ SetLora()

void LLMUnity.LLM.SetLora ( string path,
float weight = 1 )
inline

Sets a LORA model to use in the LLM. The model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in .gguf format.

Parameters
path    path to the LORA model to use (.gguf format)
weight    weight (scale) of the LORA model (default 1)

Definition at line 247 of file LLM.cs.

◆ SetLoraWeight()

void LLMUnity.LLM.SetLoraWeight ( string path,
float weight )
inline

Changes the weight (scale) of a LORA model in the LLM.

Parameters
path    path of the LORA model to change (.gguf format)
weight    weight (scale) of the LORA model

Definition at line 294 of file LLM.cs.

◆ SetLoraWeights()

void LLMUnity.LLM.SetLoraWeights ( Dictionary< string, float > loraToWeight)
inline

Changes the weights (scales) of the LORA models in the LLM.

Parameters
loraToWeight    dictionary mapping LORA model paths (string) to the weights (float) to set

Definition at line 305 of file LLM.cs.
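
A hedged sketch, given an LLM instance llm (requires System.Collections.Generic; the paths are hypothetical):

var weights = new Dictionary<string, float>
{
    { "loras/style.gguf", 0.5f },
    { "loras/lore.gguf", 1.0f }
};
llm.SetLoraWeights(weights);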

◆ SetModel()

void LLMUnity.LLM.SetModel ( string path)
inline

Sets the model used by the LLM. The model is copied to the Assets/StreamingAssets folder so that it also works in builds. Supported models are in .gguf format.

Parameters
path    path to the model to use (.gguf format)

Definition at line 219 of file LLM.cs.

◆ SetSSLCert()

void LLMUnity.LLM.SetSSLCert ( string path)
inline

Uses an SSL certificate for the LLM server.

Parameters
path    the SSL certificate path

Definition at line 367 of file LLM.cs.

◆ SetSSLKey()

void LLMUnity.LLM.SetSSLKey ( string path)
inline

Uses an SSL key for the LLM server.

Parameters
path    the SSL key path

Definition at line 377 of file LLM.cs.
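
A hedged sketch for a remote server setup (the certificate and key paths are hypothetical):

llm.SetSSLCert("server.crt");
llm.SetSSLKey("server.key");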

◆ SetTemplate()

void LLMUnity.LLM.SetTemplate ( string templateName,
bool setDirty = true )
inline

Sets the chat template for the LLM.

Parameters
templateName    the chat template to use. The available templates can be found in the ChatTemplate.templates.Keys array

Definition at line 325 of file LLM.cs.
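
For example (assuming "chatml" is among ChatTemplate.templates.Keys):

llm.SetTemplate("chatml");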

◆ Slot()

async Task< string > LLMUnity.LLM.Slot ( string json)
inline

Saves or restores the state of a slot.

Parameters
json    json request containing the query
Returns
slot result

Definition at line 750 of file LLM.cs.

◆ Tokenize()

async Task< string > LLMUnity.LLM.Tokenize ( string json)
inline

Tokenises the provided query.

Parameters
json    json request containing the query
Returns
tokenisation result

Definition at line 664 of file LLM.cs.
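
A hedged sketch, inside an async method and given an LLM instance llm (the "content" field is an assumption following the llama.cpp server format):

string tokens = await llm.Tokenize("{\"content\": \"Hello world\"}");
Debug.Log(tokens);  // raw json, typically containing a "tokens" array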

◆ Update()

void LLMUnity.LLM.Update ( )
inline

The Unity Update function. It is used to retrieve the LLM replies.

Definition at line 591 of file LLM.cs.

◆ UpdateLoras()

void LLMUnity.LLM.UpdateLoras ( )
inline

Definition at line 312 of file LLM.cs.

◆ WaitUntilModelSetup()

static async Task< bool > LLMUnity.LLM.WaitUntilModelSetup ( Callback< float > downloadProgressCallback = null)
inlinestatic

Waits until the LLM models are downloaded and ready.

Parameters
downloadProgressCallback    function to call with the download progress (float)

Definition at line 152 of file LLM.cs.
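
A hedged sketch (assuming a lambda converts to the Callback<float> delegate):

async void Start()
{
    // Log download progress (0..1) and check whether setup succeeded.
    bool ok = await LLM.WaitUntilModelSetup(progress => Debug.Log($"Download: {progress:P0}"));
    if (!ok) Debug.LogError("Model setup failed");
}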

◆ WaitUntilReady()

async Task LLMUnity.LLM.WaitUntilReady ( )
inline

Waits until the LLM is ready.

Definition at line 143 of file LLM.cs.

Member Data Documentation

◆ advancedOptions

bool LLMUnity.LLM.advancedOptions = false

toggle to show/hide advanced options in the GameObject

Definition at line 21 of file LLM.cs.

◆ APIKey

string LLMUnity.LLM.APIKey

API key to use for the server (optional)

Definition at line 68 of file LLM.cs.

◆ basePrompt

string LLMUnity.LLM.basePrompt = ""

a base prompt to use for all LLMCaller objects

Definition at line 44 of file LLM.cs.

◆ batchSize

int LLMUnity.LLM.batchSize = 512

Batch size for prompt processing.

Definition at line 42 of file LLM.cs.

◆ chatTemplate

string LLMUnity.LLM.chatTemplate = ChatTemplate.DefaultTemplate

Chat template used for the model.

Definition at line 58 of file LLM.cs.

◆ contextSize

int LLMUnity.LLM.contextSize = 8192

Size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses.

Definition at line 40 of file LLM.cs.

◆ debug

bool LLMUnity.LLM.debug = false

select to log the output of the LLM in the Unity Editor.

Definition at line 33 of file LLM.cs.

◆ dontDestroyOnLoad

bool LLMUnity.LLM.dontDestroyOnLoad = true

select to not destroy the LLM GameObject when loading a new Scene.

Definition at line 37 of file LLM.cs.

◆ flashAttention

bool LLMUnity.LLM.flashAttention = false

enable use of flash attention

Definition at line 65 of file LLM.cs.

◆ lora

string LLMUnity.LLM.lora = ""

the paths of the LORA models being used (relative to the Assets/StreamingAssets folder). Models with .gguf format are allowed.

Definition at line 61 of file LLM.cs.

◆ loraWeights

string LLMUnity.LLM.loraWeights = ""

the weights of the LORA models being used.

Definition at line 63 of file LLM.cs.

◆ model

string LLMUnity.LLM.model = ""

the LLM model to use. Models with .gguf format are allowed.

Definition at line 56 of file LLM.cs.

◆ numGPULayers

int LLMUnity.LLM.numGPULayers = 0

number of model layers to offload to the GPU (0 = GPU not used). Use a large number, e.g. >30, to utilise the GPU as much as possible. If the user's GPU is not supported, the LLM will fall back to the CPU

Definition at line 31 of file LLM.cs.

◆ numThreads

int LLMUnity.LLM.numThreads = -1

number of threads to use (-1 = all)

Definition at line 27 of file LLM.cs.

◆ parallelPrompts

int LLMUnity.LLM.parallelPrompts = -1

number of prompts that can happen in parallel (-1 = number of LLMCaller objects)

Definition at line 35 of file LLM.cs.

◆ port

int LLMUnity.LLM.port = 13333

port to use for the LLM server

Definition at line 25 of file LLM.cs.

◆ remote

bool LLMUnity.LLM.remote = false

toggle to enable remote server functionality

Definition at line 23 of file LLM.cs.

◆ SSLCertPath

string LLMUnity.LLM.SSLCertPath = ""

path to the SSL certificate for the LLM server

Definition at line 72 of file LLM.cs.

◆ SSLKeyPath

string LLMUnity.LLM.SSLKeyPath = ""

path to the SSL key for the LLM server

Definition at line 76 of file LLM.cs.

Property Documentation

◆ failed

bool LLMUnity.LLM.failed = false
get

Boolean set to true if the server has failed to start.

Definition at line 48 of file LLM.cs.

◆ modelSetupComplete

bool LLMUnity.LLM.modelSetupComplete = false
staticget

Boolean set to true once the LLM model setup has completed, false otherwise.

Definition at line 52 of file LLM.cs.

◆ modelSetupFailed

bool LLMUnity.LLM.modelSetupFailed = false
staticget

Boolean set to true if the models were not downloaded successfully.

Definition at line 50 of file LLM.cs.

◆ started

bool LLMUnity.LLM.started = false
get

Boolean set to true if the server has started and is ready to receive requests, false otherwise.

Definition at line 46 of file LLM.cs.


The documentation for this class was generated from the following file:
LLM.cs