LLM for Unity  v2.2.5
Create characters in Unity with LLMs!
LLMUnity.LLM Class Reference

Class implementing the LLM server.

Public Member Functions

async void Awake ()
 The Unity Awake function that starts the LLM server. The server can be started asynchronously if the asynchronousStartup option is set.
 
async Task WaitUntilReady ()
 
void SetModel (string path)
 Sets the model used by the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in the build. Supported models are in .gguf format.
 
void SetLora (string path, float weight=1)
 Sets a LORA model to use in the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in the build. Supported models are in .gguf format.
 
void AddLora (string path, float weight=1)
 Adds a LORA model to use in the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in the build. Supported models are in .gguf format.
 
void RemoveLora (string path)
 Removes a LORA model from the LLM. Supported models are in .gguf format.
 
void RemoveLoras ()
 Removes all LORA models from the LLM.
 
void SetLoraWeight (string path, float weight)
 Changes the weight (scale) of a LORA model in the LLM.
 
void SetLoraWeights (Dictionary< string, float > loraToWeight)
 Changes the weights (scale) of the LORA models in the LLM.
 
void UpdateLoras ()
 
void SetTemplate (string templateName, bool setDirty=true)
 Set the chat template for the LLM.
 
void SetSSLCert (string path)
 Use an SSL certificate for the LLM server.
 
void SetSSLKey (string path)
 Use an SSL key for the LLM server.
 
string GetTemplate ()
 Returns the chat template of the LLM.
 
int Register (LLMCharacter llmCharacter)
 Registers a local LLMCharacter object. This binds the LLMCharacter "client" to a specific slot of the LLM.
 
void Update ()
 The Unity Update function. It is used to retrieve the LLM replies.
 
async Task< string > Tokenize (string json)
 Tokenises the provided query.
 
async Task< string > Detokenize (string json)
 Detokenises the provided query.
 
async Task< string > Embeddings (string json)
 Computes the embeddings of the provided query.
 
void ApplyLoras ()
 Sets the LORA scale. This only works after the LLM service has started.
 
async Task< List< LoraWeightResult > > ListLoras ()
 Gets a list of the LORA adapters.
 
async Task< string > Slot (string json)
 Saves or restores the state of a slot.
 
async Task< string > Completion (string json, Callback< string > streamCallback=null)
 Provides the chat and completion functionality of the LLM.
 
async Task SetBasePrompt (string base_prompt)
 
void CancelRequest (int id_slot)
 Cancels the requests in a specific slot of the LLM.
 
void Destroy ()
 Stops and destroys the LLM.
 
void OnDestroy ()
 The Unity OnDestroy function, called when the object is destroyed. The function StopProcess is called to stop the LLM server.
 

Static Public Member Functions

static async Task< bool > WaitUntilModelSetup (Callback< float > downloadProgressCallback=null)
 
static string GetLLMManagerAsset (string path)
 
static string GetLLMManagerAssetEditor (string path)
 
static string GetLLMManagerAssetRuntime (string path)
 

Public Attributes

bool advancedOptions = false
 toggle to show/hide advanced options in the GameObject
 
bool remote = false
 toggle to enable remote server functionality
 
int port = 13333
 port to use for the LLM server
 
int numThreads = -1
 number of threads to use (-1 = all)
 
int numGPULayers = 0
 number of model layers to offload to the GPU (0 = GPU not used). Use a large number, e.g. >30, to utilise the GPU as much as possible. If the user's GPU is not supported, the LLM will fall back to the CPU.
 
bool debug = false
 select to log the output of the LLM in the Unity Editor.
 
int parallelPrompts = -1
 number of prompts that can happen in parallel (-1 = number of LLMCharacter objects)
 
bool dontDestroyOnLoad = true
 select to not destroy the LLM GameObject when loading a new Scene.
 
int contextSize = 8192
 Size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses.
 
int batchSize = 512
 Batch size for prompt processing.
 
string basePrompt = ""
 a base prompt to use as a base for all LLMCharacter objects
 
string model = ""
 the LLM model to use. Models with .gguf format are allowed.
 
string chatTemplate = ChatTemplate.DefaultTemplate
 Chat template used for the model.
 
string lora = ""
 the paths of the LORA models being used (relative to the Assets/StreamingAssets folder). Models with .gguf format are allowed.
 
string loraWeights = ""
 the weights of the LORA models being used.
 
bool flashAttention = false
 enable use of flash attention
 
string APIKey
 API key to use for the server (optional)
 
string SSLCertPath = ""
 
string SSLKeyPath = ""
 

Properties

bool started = false [get]
 Boolean set to true if the server has started and is ready to receive requests, false otherwise.
 
bool failed = false [get]
 Boolean set to true if the server has failed to start.
 
static bool modelSetupFailed = false [get]
 Boolean set to true if the models were not downloaded successfully.
 
static bool modelSetupComplete = false [get]
 Boolean set to true once the model setup has completed, false otherwise.
 

Detailed Description

Class implementing the LLM server.

Definition at line 18 of file LLM.cs.

Constructor & Destructor Documentation

◆ LLM()

LLMUnity.LLM.LLM ( )
inline

Definition at line 97 of file LLM.cs.

Member Function Documentation

◆ AddLora()

void LLMUnity.LLM.AddLora ( string path,
float weight = 1 )
inline

Adds a LORA model to use in the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in the build. Supported models are in .gguf format.

Parameters
path: path to the LORA model to use (.gguf format)
weight: weight (scale) of the LORA model

Definition at line 248 of file LLM.cs.

◆ ApplyLoras()

void LLMUnity.LLM.ApplyLoras ( )
inline

Sets the LORA scale. This only works after the LLM service has started.

Definition at line 681 of file LLM.cs.

◆ Awake()

async void LLMUnity.LLM.Awake ( )
inline

The Unity Awake function that starts the LLM server. The server can be started asynchronously if the asynchronousStartup option is set.

Definition at line 115 of file LLM.cs.

◆ CancelRequest()

void LLMUnity.LLM.CancelRequest ( int id_slot)
inline

Cancels the requests in a specific slot of the LLM.

Parameters
id_slot: slot of the LLM

Definition at line 764 of file LLM.cs.

◆ Completion()

async Task< string > LLMUnity.LLM.Completion ( string json,
Callback< string > streamCallback = null )
inline

Provides the chat and completion functionality of the LLM.

Parameters
json: JSON request containing the query
streamCallback: callback function to call with intermediate responses
Returns
completion result

Definition at line 739 of file LLM.cs.
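
The following is a minimal sketch of calling Completion from a Unity script, not a definitive usage pattern. The class name CompletionExample and the field llm are hypothetical. The request fields "prompt" and "n_predict" are an assumption about the JSON schema, which this page does not specify. The sketch also assumes that WaitUntilReady completes once the server has started or failed.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical example component; attach it to a GameObject and assign `llm` in the Inspector.
public class CompletionExample : MonoBehaviour
{
    public LLM llm;

    async void Start()
    {
        // Assumption: WaitUntilReady returns once the server is up (or has failed).
        await llm.WaitUntilReady();
        if (llm.failed)
        {
            Debug.LogError("The LLM server failed to start.");
            return;
        }

        // Assumed request shape; the JSON schema is not documented on this page.
        string request = "{\"prompt\": \"Hello, who are you?\", \"n_predict\": 32}";

        // The second argument is the optional streamCallback for intermediate responses.
        string result = await llm.Completion(request, chunk => Debug.Log(chunk));

        // The result is returned as a string; its exact format is not described here.
        Debug.Log(result);
    }
}
```

Passing null (the default) for streamCallback would skip the intermediate responses and only use the final completion result.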

◆ Destroy()

void LLMUnity.LLM.Destroy ( )
inline

Stops and destroys the LLM.

Definition at line 774 of file LLM.cs.

◆ Detokenize()

async Task< string > LLMUnity.LLM.Detokenize ( string json)
inline

Detokenises the provided query.

Parameters
json: JSON request containing the query
Returns
detokenisation result

Definition at line 652 of file LLM.cs.

◆ Embeddings()

async Task< string > LLMUnity.LLM.Embeddings ( string json)
inline

Computes the embeddings of the provided query.

Parameters
json: JSON request containing the query
Returns
embeddings result

Definition at line 667 of file LLM.cs.

◆ GetLLMManagerAsset()

static string LLMUnity.LLM.GetLLMManagerAsset ( string path)
inline static

Definition at line 151 of file LLM.cs.

◆ GetLLMManagerAssetEditor()

static string LLMUnity.LLM.GetLLMManagerAssetEditor ( string path)
inline static

Definition at line 159 of file LLM.cs.

◆ GetLLMManagerAssetRuntime()

static string LLMUnity.LLM.GetLLMManagerAssetRuntime ( string path)
inline static

Definition at line 188 of file LLM.cs.

◆ GetTemplate()

string LLMUnity.LLM.GetTemplate ( )
inline

Returns the chat template of the LLM.

Returns
chat template of the LLM

Definition at line 361 of file LLM.cs.

◆ ListLoras()

async Task< List< LoraWeightResult > > LLMUnity.LLM.ListLoras ( )
inline

Gets a list of the LORA adapters.

Returns
list of LORA adapters

Definition at line 705 of file LLM.cs.

◆ OnDestroy()

void LLMUnity.LLM.OnDestroy ( )
inline

The Unity OnDestroy function, called when the object is destroyed. The function StopProcess is called to stop the LLM server.

Definition at line 809 of file LLM.cs.

◆ Register()

int LLMUnity.LLM.Register ( LLMCharacter llmCharacter)
inline

Registers a local LLMCharacter object. This binds the LLMCharacter "client" to a specific slot of the LLM.

Parameters
llmCharacter: the LLMCharacter object to register
Returns

Definition at line 530 of file LLM.cs.

◆ RemoveLora()

void LLMUnity.LLM.RemoveLora ( string path)
inline

Removes a LORA model from the LLM. Supported models are in .gguf format.

Parameters
path: path to the LORA model to remove (.gguf format)

Definition at line 260 of file LLM.cs.

◆ RemoveLoras()

void LLMUnity.LLM.RemoveLoras ( )
inline

Removes all LORA models from the LLM.

Definition at line 270 of file LLM.cs.

◆ SetBasePrompt()

async Task LLMUnity.LLM.SetBasePrompt ( string base_prompt)
inline

Definition at line 753 of file LLM.cs.

◆ SetLora()

void LLMUnity.LLM.SetLora ( string path,
float weight = 1 )
inline

Sets a LORA model to use in the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in the build. Supported models are in .gguf format.

Parameters
path: path to the LORA model to use (.gguf format)
weight: weight (scale) of the LORA model

Definition at line 235 of file LLM.cs.

◆ SetLoraWeight()

void LLMUnity.LLM.SetLoraWeight ( string path,
float weight )
inline

Changes the weight (scale) of a LORA model in the LLM.

Parameters
path: path of the LORA model to change (.gguf format)
weight: weight of the LORA model

Definition at line 282 of file LLM.cs.

◆ SetLoraWeights()

void LLMUnity.LLM.SetLoraWeights ( Dictionary< string, float > loraToWeight)
inline

Changes the weights (scale) of the LORA models in the LLM.

Parameters
loraToWeight: Dictionary (string, float) mapping the paths of the LORA models to the weights to set

Definition at line 293 of file LLM.cs.
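
As an illustration of how the LORA methods fit together, here is a short sketch that only uses the signatures listed on this page. The class name LoraSetupExample, the field llm and both adapter paths are hypothetical. This page does not state which of these calls are valid before or after the server has started, so the timing of ConfigureLoras is left to the caller.

```csharp
using System.Collections.Generic;
using UnityEngine;
using LLMUnity;

// Hypothetical example component; assign `llm` in the Inspector.
public class LoraSetupExample : MonoBehaviour
{
    public LLM llm;

    // Hypothetical adapter paths, used only for illustration.
    const string StyleLora = "Loras/style.gguf";
    const string DialectLora = "Loras/dialect.gguf";

    public void ConfigureLoras()
    {
        // Start from a clean state, then add two adapters.
        llm.RemoveLoras();
        llm.AddLora(StyleLora, 0.8f);
        llm.AddLora(DialectLora); // weight defaults to 1

        // Change the weight of a single adapter.
        llm.SetLoraWeight(StyleLora, 0.5f);

        // Change several weights in one call.
        llm.SetLoraWeights(new Dictionary<string, float>
        {
            { StyleLora, 0.6f },
            { DialectLora, 0.4f }
        });

        // Remove one adapter again.
        llm.RemoveLora(DialectLora);
    }
}
```

SetLora takes the same arguments as AddLora; judging by the method names, it would be the choice when a single adapter should be used rather than one added to those already set.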

◆ SetModel()

void LLMUnity.LLM.SetModel ( string path)
inline

Sets the model used by the LLM. The provided model is copied to the Assets/StreamingAssets folder so that it also works in the build. Supported models are in .gguf format.

Parameters
path: path to the model to use (.gguf format)

Definition at line 208 of file LLM.cs.

◆ SetSSLCert()

void LLMUnity.LLM.SetSSLCert ( string path)
inline

Use an SSL certificate for the LLM server.

Parameters
path: the SSL certificate path

Definition at line 341 of file LLM.cs.

◆ SetSSLKey()

void LLMUnity.LLM.SetSSLKey ( string path)
inline

Use an SSL key for the LLM server.

Parameters
path: the SSL key path

Definition at line 351 of file LLM.cs.
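
A remote server setup with SSL could be sketched as follows. The class name RemoteServerExample, the field llm, the file paths and the API key value are all hypothetical. The sketch assumes that these settings must be in place before the server starts in Awake, which this page does not confirm.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical example component; assign `llm` in the Inspector.
public class RemoteServerExample : MonoBehaviour
{
    public LLM llm;

    public void ConfigureRemote()
    {
        // Enable remote server functionality on the documented default port.
        llm.remote = true;
        llm.port = 13333;

        // Optional API key; the value here is a placeholder.
        llm.APIKey = "replace-with-your-key";

        // Hypothetical certificate and key paths.
        llm.SetSSLCert("Certificates/server.crt");
        llm.SetSSLKey("Certificates/server.key");
    }
}
```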

◆ SetTemplate()

void LLMUnity.LLM.SetTemplate ( string templateName,
bool setDirty = true )
inline

Set the chat template for the LLM.

Parameters
templateName: the chat template to use. The available templates can be found in the ChatTemplate.templates.Keys array

Definition at line 313 of file LLM.cs.
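
Selecting a model and a chat template might look like the sketch below. The class name ModelConfigExample, the field llm and the model path are hypothetical. The template name "chatml" is an assumption; the valid names are those in ChatTemplate.templates.Keys, which the sketch logs first.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical example component; assign `llm` in the Inspector.
public class ModelConfigExample : MonoBehaviour
{
    public LLM llm;

    public void ConfigureModel()
    {
        // Hypothetical model path; the file is copied to Assets/StreamingAssets.
        llm.SetModel("Models/my-model.gguf");

        // List the template names that SetTemplate accepts.
        foreach (string name in ChatTemplate.templates.Keys)
        {
            Debug.Log("Available template: " + name);
        }

        // "chatml" is assumed to be one of the available names.
        llm.SetTemplate("chatml");
        Debug.Log("Current template: " + llm.GetTemplate());
    }
}
```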

◆ Slot()

async Task< string > LLMUnity.LLM.Slot ( string json)
inline

Saves or restores the state of a slot.

Parameters
json: JSON request containing the query
Returns
slot result

Definition at line 723 of file LLM.cs.

◆ Tokenize()

async Task< string > LLMUnity.LLM.Tokenize ( string json)
inline

Tokenises the provided query.

Parameters
json: JSON request containing the query
Returns
tokenisation result

Definition at line 637 of file LLM.cs.

◆ Update()

void LLMUnity.LLM.Update ( )
inline

The Unity Update function. It is used to retrieve the LLM replies.

Definition at line 564 of file LLM.cs.

◆ UpdateLoras()

void LLMUnity.LLM.UpdateLoras ( )
inline

Definition at line 300 of file LLM.cs.

◆ WaitUntilModelSetup()

static async Task< bool > LLMUnity.LLM.WaitUntilModelSetup ( Callback< float > downloadProgressCallback = null)
inline static

Definition at line 144 of file LLM.cs.
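
WaitUntilModelSetup has no description on this page, so the following sketch rests on assumptions: that the returned bool indicates whether the setup succeeded, and that the callback receives a progress value whose range is not specified. The class name ModelSetupProgressExample is hypothetical.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical example component that waits for the model setup before continuing.
public class ModelSetupProgressExample : MonoBehaviour
{
    async void Start()
    {
        // The lambda is passed as the optional downloadProgressCallback.
        bool ok = await LLM.WaitUntilModelSetup(
            progress => Debug.Log("Model setup progress: " + progress));

        // Assumption: a false result or modelSetupFailed signals an unsuccessful setup.
        if (!ok || LLM.modelSetupFailed)
        {
            Debug.LogError("Model setup did not complete successfully.");
            return;
        }

        Debug.Log("Model setup complete: " + LLM.modelSetupComplete);
    }
}
```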

◆ WaitUntilReady()

async Task LLMUnity.LLM.WaitUntilReady ( )
inline

Definition at line 139 of file LLM.cs.

Member Data Documentation

◆ advancedOptions

bool LLMUnity.LLM.advancedOptions = false

toggle to show/hide advanced options in the GameObject

Definition at line 21 of file LLM.cs.

◆ APIKey

string LLMUnity.LLM.APIKey

API key to use for the server (optional)

Definition at line 68 of file LLM.cs.

◆ basePrompt

string LLMUnity.LLM.basePrompt = ""

a base prompt to use as a base for all LLMCharacter objects

Definition at line 44 of file LLM.cs.

◆ batchSize

int LLMUnity.LLM.batchSize = 512

Batch size for prompt processing.

Definition at line 42 of file LLM.cs.

◆ chatTemplate

string LLMUnity.LLM.chatTemplate = ChatTemplate.DefaultTemplate

Chat template used for the model.

Definition at line 58 of file LLM.cs.

◆ contextSize

int LLMUnity.LLM.contextSize = 8192

Size of the prompt context (0 = context size of the model). This is the number of tokens the model can take as input when generating responses.

Definition at line 40 of file LLM.cs.

◆ debug

bool LLMUnity.LLM.debug = false

select to log the output of the LLM in the Unity Editor.

Definition at line 33 of file LLM.cs.

◆ dontDestroyOnLoad

bool LLMUnity.LLM.dontDestroyOnLoad = true

select to not destroy the LLM GameObject when loading a new Scene.

Definition at line 37 of file LLM.cs.

◆ flashAttention

bool LLMUnity.LLM.flashAttention = false

enable use of flash attention

Definition at line 65 of file LLM.cs.

◆ lora

string LLMUnity.LLM.lora = ""

the paths of the LORA models being used (relative to the Assets/StreamingAssets folder). Models with .gguf format are allowed.

Definition at line 61 of file LLM.cs.

◆ loraWeights

string LLMUnity.LLM.loraWeights = ""

the weights of the LORA models being used.

Definition at line 63 of file LLM.cs.

◆ model

string LLMUnity.LLM.model = ""

the LLM model to use. Models with .gguf format are allowed.

Definition at line 56 of file LLM.cs.

◆ numGPULayers

int LLMUnity.LLM.numGPULayers = 0

number of model layers to offload to the GPU (0 = GPU not used). Use a large number, e.g. >30, to utilise the GPU as much as possible. If the user's GPU is not supported, the LLM will fall back to the CPU.

Definition at line 31 of file LLM.cs.

◆ numThreads

int LLMUnity.LLM.numThreads = -1

number of threads to use (-1 = all)

Definition at line 27 of file LLM.cs.

◆ parallelPrompts

int LLMUnity.LLM.parallelPrompts = -1

number of prompts that can happen in parallel (-1 = number of LLMCharacter objects)

Definition at line 35 of file LLM.cs.

◆ port

int LLMUnity.LLM.port = 13333

port to use for the LLM server

Definition at line 25 of file LLM.cs.

◆ remote

bool LLMUnity.LLM.remote = false

toggle to enable remote server functionality

Definition at line 23 of file LLM.cs.

◆ SSLCertPath

string LLMUnity.LLM.SSLCertPath = ""

Definition at line 72 of file LLM.cs.

◆ SSLKeyPath

string LLMUnity.LLM.SSLKeyPath = ""

Definition at line 76 of file LLM.cs.

Property Documentation

◆ failed

bool LLMUnity.LLM.failed = false
get

Boolean set to true if the server has failed to start.

Definition at line 48 of file LLM.cs.

◆ modelSetupComplete

bool LLMUnity.LLM.modelSetupComplete = false
static get

Boolean set to true once the model setup has completed, false otherwise.

Definition at line 52 of file LLM.cs.

◆ modelSetupFailed

bool LLMUnity.LLM.modelSetupFailed = false
static get

Boolean set to true if the models were not downloaded successfully.

Definition at line 50 of file LLM.cs.

◆ started

bool LLMUnity.LLM.started = false
get

Boolean set to true if the server has started and is ready to receive requests, false otherwise.

Definition at line 46 of file LLM.cs.


The documentation for this class was generated from the following file:
LLM.cs