LLM for Unity  v2.2.5
Create characters in Unity with LLMs!
LLMUnity.LLM Class Reference

Class implementing the LLM server.

Public Member Functions

async void Awake ()
 The Unity Awake function that starts the LLM server. The server can be started asynchronously if the asynchronousStartup option is set.
async Task WaitUntilReady ()
void SetModel (string path)
 Sets the model used by the LLM. The model is copied to the Assets/StreamingAssets folder so that it is also available in the build. Supported models are in .gguf format.
void SetLora (string path, float weight=1)
 Sets a LORA model to use in the LLM. The model is copied to the Assets/StreamingAssets folder so that it is also available in the build. Supported models are in .gguf format.
void AddLora (string path, float weight=1)
 Adds a LORA model to use in the LLM. The model is copied to the Assets/StreamingAssets folder so that it is also available in the build. Supported models are in .gguf format.
void RemoveLora (string path)
 Removes a LORA model from the LLM. Supported models are in .gguf format.
void RemoveLoras ()
 Removes all LORA models from the LLM.
void SetLoraWeight (string path, float weight)
 Changes the weight (scale) of a LORA model in the LLM.
void SetLoraWeights (Dictionary< string, float > loraToWeight)
 Changes the weights (scales) of the LORA models in the LLM.
void UpdateLoras ()
void SetTemplate (string templateName, bool setDirty=true)
 Set the chat template for the LLM.
void SetSSLCert (string path)
 Use an SSL certificate for the LLM server.
void SetSSLKey (string path)
 Use an SSL key for the LLM server.
string GetTemplate ()
 Returns the chat template of the LLM.
int Register (LLMCharacter llmCharacter)
 Registers a local LLMCharacter object. This binds the LLMCharacter "client" to a specific slot of the LLM.
void Update ()
 The Unity Update function. It is used to retrieve the LLM replies.
async Task< string > Tokenize (string json)
 Tokenises the provided query.
async Task< string > Detokenize (string json)
 Detokenises the provided query.
async Task< string > Embeddings (string json)
 Computes the embeddings of the provided query.
void ApplyLoras ()
 Sets the lora scale; only works after the LLM service has started.
async Task< List< LoraWeightResult > > ListLoras ()
 Gets a list of the lora adapters.
async Task< string > Slot (string json)
 Saves or restores the state of a slot.
async Task< string > Completion (string json, Callback< string > streamCallback=null)
 Provides the chat and completion functionality of the LLM.
async Task SetBasePrompt (string base_prompt)
void CancelRequest (int id_slot)
 Cancels the requests in a specific slot of the LLM.
void Destroy ()
 Stops and destroys the LLM.
void OnDestroy ()
 The Unity OnDestroy function, called when the object is destroyed. It calls StopProcess to stop the LLM server.

Static Public Member Functions

static async Task< bool > WaitUntilModelSetup (Callback< float > downloadProgressCallback=null)
static string GetLLMManagerAsset (string path)
static string GetLLMManagerAssetEditor (string path)
static string GetLLMManagerAssetRuntime (string path)

Public Attributes

bool advancedOptions = false
 toggle to show/hide advanced options in the GameObject
bool remote = false
 toggle to enable remote server functionality
int port = 13333
 port to use for the LLM server
int numThreads = -1
 number of threads to use (-1 = all)
int numGPULayers = 0
 number of model layers to offload to the GPU (0 = GPU not used). Use a large number, e.g. >30, to utilise the GPU as much as possible. If the user's GPU is not supported, the LLM will fall back to the CPU.
bool debug = false
 select to log the output of the LLM in the Unity Editor.
int parallelPrompts = -1
 number of prompts that can happen in parallel (-1 = number of LLMCharacter objects)
bool dontDestroyOnLoad = true
 select to not destroy the LLM GameObject when loading a new Scene.
int contextSize = 8192
 Size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses.
int batchSize = 512
 Batch size for prompt processing.
string basePrompt = ""
 a base prompt to use as a base for all LLMCharacter objects
string model = ""
 the LLM model to use. Models with .gguf format are allowed.
string chatTemplate = ChatTemplate.DefaultTemplate
 Chat template used for the model.
string lora = ""
 the paths of the LORA models being used (relative to the Assets/StreamingAssets folder). Models with .gguf format are allowed.
string loraWeights = ""
 the weights of the LORA models being used.
bool flashAttention = false
 enable use of flash attention
string APIKey
 API key to use for the server (optional)
string SSLCertPath = ""
string SSLKeyPath = ""


Properties

bool started = false [get]
 Boolean set to true if the server has started and is ready to receive requests, false otherwise.
bool failed = false [get]
 Boolean set to true if the server has failed to start.
static bool modelSetupFailed = false [get]
 Boolean set to true if the models were not downloaded successfully.
static bool modelSetupComplete = false [get]
 Boolean set to true if the model setup has completed, false otherwise.

Detailed Description

Class implementing the LLM server.

Definition at line 18 of file LLM.cs.

Constructor & Destructor Documentation

◆ LLM()

LLMUnity.LLM.LLM ( )

Definition at line 97 of file LLM.cs.

Member Function Documentation

◆ AddLora()

void LLMUnity.LLM.AddLora ( string path,
float weight = 1 )

Adds a LORA model to use in the LLM. The model is copied to the Assets/StreamingAssets folder so that it is also available in the build. Supported models are in .gguf format.

Parameters
    path    path to the LORA model to use (.gguf format)

Definition at line 248 of file LLM.cs.

◆ ApplyLoras()

void LLMUnity.LLM.ApplyLoras ( )

Sets the lora scale; only works after the LLM service has started.

Returns
    switch result

Definition at line 681 of file LLM.cs.

◆ Awake()

async void LLMUnity.LLM.Awake ( )

The Unity Awake function that starts the LLM server. The server can be started asynchronously if the asynchronousStartup option is set.

Definition at line 115 of file LLM.cs.

◆ CancelRequest()

void LLMUnity.LLM.CancelRequest ( int id_slot)

Cancels the requests in a specific slot of the LLM.

Parameters
    id_slot    slot of the LLM

Definition at line 764 of file LLM.cs.

◆ Completion()

async Task< string > LLMUnity.LLM.Completion ( string json,
Callback< string > streamCallback = null )

Provides the chat and completion functionality of the LLM.

Parameters
    json              JSON request containing the query
    streamCallback    callback function to call with intermediate responses

Returns
    completion result

Definition at line 739 of file LLM.cs.
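
As a rough illustration of the signature above, the following Unity script waits for the server and then calls Completion with a streaming callback. It is only a sketch under stated assumptions: the component name CompletionExample is hypothetical, the JSON fields (prompt, n_predict, stream) are assumed to follow llama.cpp server conventions and are not confirmed by this reference, and Callback< string > is assumed to be a delegate type that a lambda can convert to.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical example component; requires Unity and the LLMUnity package.
public class CompletionExample : MonoBehaviour
{
    public LLM llm; // assign the LLM component in the Inspector

    async void Start()
    {
        // Block this coroutine-like method until the server reports it is ready.
        await llm.WaitUntilReady();

        // ASSUMPTION: field names modeled on llama.cpp server requests;
        // check the request format expected by your LLMUnity version.
        string json = "{\"prompt\": \"Hello\", \"n_predict\": 32, \"stream\": true}";

        // The optional callback receives intermediate responses as they arrive.
        string result = await llm.Completion(json, chunk => Debug.Log("partial: " + chunk));

        // The awaited value is the completion result returned by the server.
        Debug.Log("final: " + result);
    }
}
```

To try it, attach the script to a GameObject in a scene that contains an LLM component and assign that component to the llm field.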

◆ Destroy()

void LLMUnity.LLM.Destroy ( )

Stops and destroys the LLM.

Definition at line 774 of file LLM.cs.

◆ Detokenize()

async Task< string > LLMUnity.LLM.Detokenize ( string json)

Detokenises the provided query.

Parameters
    json    JSON request containing the query

Returns
    detokenisation result

Definition at line 652 of file LLM.cs.

◆ Embeddings()

async Task< string > LLMUnity.LLM.Embeddings ( string json)

Computes the embeddings of the provided query.

Parameters
    json    JSON request containing the query

Returns
    embeddings result

Definition at line 667 of file LLM.cs.

◆ GetLLMManagerAsset()

static string LLMUnity.LLM.GetLLMManagerAsset ( string path)

Definition at line 151 of file LLM.cs.

◆ GetLLMManagerAssetEditor()

static string LLMUnity.LLM.GetLLMManagerAssetEditor ( string path)

Definition at line 159 of file LLM.cs.

◆ GetLLMManagerAssetRuntime()

static string LLMUnity.LLM.GetLLMManagerAssetRuntime ( string path)

Definition at line 188 of file LLM.cs.

◆ GetTemplate()

string LLMUnity.LLM.GetTemplate ( )

Returns the chat template of the LLM.

Returns
    chat template of the LLM

Definition at line 361 of file LLM.cs.

◆ ListLoras()

async Task< List< LoraWeightResult > > LLMUnity.LLM.ListLoras ( )

Gets a list of the lora adapters.

Returns
    list of lora adapters

Definition at line 705 of file LLM.cs.

◆ OnDestroy()

void LLMUnity.LLM.OnDestroy ( )

The Unity OnDestroy function, called when the object is destroyed. It calls StopProcess to stop the LLM server.

Definition at line 809 of file LLM.cs.

◆ Register()

int LLMUnity.LLM.Register ( LLMCharacter llmCharacter)

Registers a local LLMCharacter object. This binds the LLMCharacter "client" to a specific slot of the LLM.


Definition at line 530 of file LLM.cs.

◆ RemoveLora()

void LLMUnity.LLM.RemoveLora ( string path)

Removes a LORA model from the LLM. Supported models are in .gguf format.

Parameters
    path    path to the LORA model to remove (.gguf format)

Definition at line 260 of file LLM.cs.

◆ RemoveLoras()

void LLMUnity.LLM.RemoveLoras ( )

Removes all LORA models from the LLM.

Definition at line 270 of file LLM.cs.

◆ SetBasePrompt()

async Task LLMUnity.LLM.SetBasePrompt ( string base_prompt)

Definition at line 753 of file LLM.cs.

◆ SetLora()

void LLMUnity.LLM.SetLora ( string path,
float weight = 1 )

Sets a LORA model to use in the LLM. The model is copied to the Assets/StreamingAssets folder so that it is also available in the build. Supported models are in .gguf format.

Parameters
    path    path to the LORA model to use (.gguf format)

Definition at line 235 of file LLM.cs.

◆ SetLoraWeight()

void LLMUnity.LLM.SetLoraWeight ( string path,
float weight )

Changes the weight (scale) of a LORA model in the LLM.

Parameters
    path      path of the LORA model to change (.gguf format)
    weight    weight of the LORA

Definition at line 282 of file LLM.cs.

◆ SetLoraWeights()

void LLMUnity.LLM.SetLoraWeights ( Dictionary< string, float > loraToWeight)

Changes the weights (scales) of the LORA models in the LLM.

Parameters
    loraToWeight    dictionary (string, float) mapping the paths of the LORA models to the weights to set

Definition at line 293 of file LLM.cs.
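
The LORA functions documented in this section can be combined roughly as in the sketch below. The component name LoraExample and both .gguf paths are hypothetical placeholders, and this reference does not state when each call is valid (Editor versus runtime), so treat the call order as an assumption rather than the library's prescribed workflow.

```csharp
using System.Collections.Generic;
using UnityEngine;
using LLMUnity;

// Hypothetical example component; requires Unity and the LLMUnity package.
public class LoraExample : MonoBehaviour
{
    public LLM llm; // assign the LLM component in the Inspector

    // Hypothetical paths, used only for illustration.
    const string StyleLora = "loras/style.gguf";
    const string LoreLora = "loras/lore.gguf";

    public void ConfigureLoras()
    {
        // Set one LORA with an explicit weight, then add a second one
        // (weight defaults to 1 according to the signature).
        llm.SetLora(StyleLora, 0.8f);
        llm.AddLora(LoreLora);

        // Change the weight of a single LORA.
        llm.SetLoraWeight(StyleLora, 0.5f);

        // Change several weights at once via a path -> weight dictionary.
        llm.SetLoraWeights(new Dictionary<string, float>
        {
            { StyleLora, 0.3f },
            { LoreLora, 0.7f }
        });

        // Remove one LORA, or call llm.RemoveLoras() to remove all of them.
        llm.RemoveLora(LoreLora);
    }
}
```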

◆ SetModel()

void LLMUnity.LLM.SetModel ( string path)

Sets the model used by the LLM. The model is copied to the Assets/StreamingAssets folder so that it is also available in the build. Supported models are in .gguf format.

Parameters
    path    path to the model to use (.gguf format)

Definition at line 208 of file LLM.cs.

◆ SetSSLCert()

void LLMUnity.LLM.SetSSLCert ( string path)

Use an SSL certificate for the LLM server.

Parameters
    path    the SSL certificate path

Definition at line 341 of file LLM.cs.

◆ SetSSLKey()

void LLMUnity.LLM.SetSSLKey ( string path)

Use an SSL key for the LLM server.

Parameters
    path    the SSL key path

Definition at line 351 of file LLM.cs.

◆ SetTemplate()

void LLMUnity.LLM.SetTemplate ( string templateName,
bool setDirty = true )

Set the chat template for the LLM.

Parameters
    templateName    the chat template to use. The available templates can be found in the ChatTemplate.templates.Keys array

Definition at line 313 of file LLM.cs.
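
A minimal sketch of selecting a template by name follows. The component name TemplateExample and the value of desiredTemplate are hypothetical. It also assumes that ChatTemplate.templates is publicly accessible and that its Keys collection enumerates strings, as the parameter description suggests.

```csharp
using System.Linq;
using UnityEngine;
using LLMUnity;

// Hypothetical example component; requires Unity and the LLMUnity package.
public class TemplateExample : MonoBehaviour
{
    public LLM llm; // assign the LLM component in the Inspector

    // Hypothetical template name; replace with one of the logged keys.
    public string desiredTemplate = "chatml";

    void Start()
    {
        // List the template names referenced by the SetTemplate documentation.
        foreach (string templateName in ChatTemplate.templates.Keys)
        {
            Debug.Log("available template: " + templateName);
        }

        // Only switch when the requested name is actually available.
        if (ChatTemplate.templates.Keys.Contains(desiredTemplate))
        {
            llm.SetTemplate(desiredTemplate);
        }

        // GetTemplate returns the chat template currently set on the LLM.
        Debug.Log("current template: " + llm.GetTemplate());
    }
}
```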

◆ Slot()

async Task< string > LLMUnity.LLM.Slot ( string json)

Saves or restores the state of a slot.

Parameters
    json    JSON request containing the query

Returns
    slot result

Definition at line 723 of file LLM.cs.

◆ Tokenize()

async Task< string > LLMUnity.LLM.Tokenize ( string json)

Tokenises the provided query.

Parameters
    json    JSON request containing the query

Returns
    tokenisation result

Definition at line 637 of file LLM.cs.

◆ Update()

void LLMUnity.LLM.Update ( )

The Unity Update function. It is used to retrieve the LLM replies.

Definition at line 564 of file LLM.cs.

◆ UpdateLoras()

void LLMUnity.LLM.UpdateLoras ( )

Definition at line 300 of file LLM.cs.

◆ WaitUntilModelSetup()

static async Task< bool > LLMUnity.LLM.WaitUntilModelSetup ( Callback< float > downloadProgressCallback = null)

Definition at line 144 of file LLM.cs.

◆ WaitUntilReady()

async Task LLMUnity.LLM.WaitUntilReady ( )

Definition at line 139 of file LLM.cs.
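
WaitUntilModelSetup and WaitUntilReady carry no description in this reference, so the sketch below relies only on their signatures and on the documented modelSetupFailed and started properties. The component name LLMStartupWatcher is hypothetical, the meaning of the bool returned by WaitUntilModelSetup is not assumed, and Callback< float > is assumed to be a delegate type that a lambda can convert to.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical example component; requires Unity and the LLMUnity package.
public class LLMStartupWatcher : MonoBehaviour
{
    public LLM llm; // assign the LLM component in the Inspector

    async void Start()
    {
        // Static call; the optional callback receives download progress values.
        await LLM.WaitUntilModelSetup(progress => Debug.Log("model setup progress: " + progress));

        // modelSetupFailed is documented as true if the models were not downloaded successfully.
        if (LLM.modelSetupFailed)
        {
            Debug.LogError("Model setup failed.");
            return;
        }

        // Await server startup. Whether this returns when startup fails
        // is not stated in this reference.
        await llm.WaitUntilReady();
        Debug.Log("LLM started: " + llm.started);
    }
}
```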

Member Data Documentation

◆ advancedOptions

bool LLMUnity.LLM.advancedOptions = false

toggle to show/hide advanced options in the GameObject

Definition at line 21 of file LLM.cs.

◆ APIKey

string LLMUnity.LLM.APIKey

API key to use for the server (optional)

Definition at line 68 of file LLM.cs.

◆ basePrompt

string LLMUnity.LLM.basePrompt = ""

a base prompt to use as a base for all LLMCharacter objects

Definition at line 44 of file LLM.cs.

◆ batchSize

int LLMUnity.LLM.batchSize = 512

Batch size for prompt processing.

Definition at line 42 of file LLM.cs.

◆ chatTemplate

string LLMUnity.LLM.chatTemplate = ChatTemplate.DefaultTemplate

Chat template used for the model.

Definition at line 58 of file LLM.cs.

◆ contextSize

int LLMUnity.LLM.contextSize = 8192

Size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses.

Definition at line 40 of file LLM.cs.

◆ debug

bool LLMUnity.LLM.debug = false

select to log the output of the LLM in the Unity Editor.

Definition at line 33 of file LLM.cs.

◆ dontDestroyOnLoad

bool LLMUnity.LLM.dontDestroyOnLoad = true

select to not destroy the LLM GameObject when loading a new Scene.

Definition at line 37 of file LLM.cs.

◆ flashAttention

bool LLMUnity.LLM.flashAttention = false

enable use of flash attention

Definition at line 65 of file LLM.cs.

◆ lora

string LLMUnity.LLM.lora = ""

the paths of the LORA models being used (relative to the Assets/StreamingAssets folder). Models with .gguf format are allowed.

Definition at line 61 of file LLM.cs.

◆ loraWeights

string LLMUnity.LLM.loraWeights = ""

the weights of the LORA models being used.

Definition at line 63 of file LLM.cs.

◆ model

string LLMUnity.LLM.model = ""

the LLM model to use. Models with .gguf format are allowed.

Definition at line 56 of file LLM.cs.

◆ numGPULayers

int LLMUnity.LLM.numGPULayers = 0

number of model layers to offload to the GPU (0 = GPU not used). Use a large number, e.g. >30, to utilise the GPU as much as possible. If the user's GPU is not supported, the LLM will fall back to the CPU.

Definition at line 31 of file LLM.cs.

◆ numThreads

int LLMUnity.LLM.numThreads = -1

number of threads to use (-1 = all)

Definition at line 27 of file LLM.cs.

◆ parallelPrompts

int LLMUnity.LLM.parallelPrompts = -1

number of prompts that can happen in parallel (-1 = number of LLMCharacter objects)

Definition at line 35 of file LLM.cs.

◆ port

int LLMUnity.LLM.port = 13333

port to use for the LLM server

Definition at line 25 of file LLM.cs.

◆ remote

bool LLMUnity.LLM.remote = false

toggle to enable remote server functionality

Definition at line 23 of file LLM.cs.

◆ SSLCertPath

string LLMUnity.LLM.SSLCertPath = ""

Definition at line 72 of file LLM.cs.

◆ SSLKeyPath

string LLMUnity.LLM.SSLKeyPath = ""

Definition at line 76 of file LLM.cs.

Property Documentation

◆ failed

bool LLMUnity.LLM.failed = false

Boolean set to true if the server has failed to start.

Definition at line 48 of file LLM.cs.

◆ modelSetupComplete

bool LLMUnity.LLM.modelSetupComplete = false

Boolean set to true if the model setup has completed, false otherwise.

Definition at line 52 of file LLM.cs.

◆ modelSetupFailed

bool LLMUnity.LLM.modelSetupFailed = false

Boolean set to true if the models were not downloaded successfully.

Definition at line 50 of file LLM.cs.

◆ started

bool LLMUnity.LLM.started = false

Boolean set to true if the server has started and is ready to receive requests, false otherwise.

Definition at line 46 of file LLM.cs.

The documentation for this class was generated from the following file: