LLM for Unity  v2.4.1
Create characters in Unity with LLMs!
LLMUnity.LLMCharacter Class Reference

Class implementing the LLM characters.

Inherits LLMUnity.LLMCaller.

Public Member Functions

override void Awake ()
 The Unity Awake function that initializes the state before the application starts.
 
override bool IsValidLLM (LLM llmSet)
 Checks if an LLM is valid for the LLMCaller.
 
virtual string GetJsonSavePath (string filename)
 Returns the save path of the chat history based on the provided filename or relative path.
 
virtual string GetCacheSavePath (string filename)
 Returns the save path of the LLM cache based on the provided filename or relative path.
 
virtual void ClearChat ()
 Clear the chat of the LLMCharacter.
 
virtual void SetPrompt (string newPrompt, bool clearChat=true)
 Set the system prompt for the LLMCharacter.
 
virtual async Task LoadTemplate ()
 Loads the chat template of the LLMCharacter.
 
virtual async void SetGrammar (string path)
 Sets the grammar file of the LLMCharacter.
 
virtual void AddMessage (string role, string content)
 Adds a message to the chat history.
 
virtual void AddPlayerMessage (string content)
 Adds a player message to the chat history.
 
virtual void AddAIMessage (string content)
 Adds an AI message to the chat history.
 
virtual async Task< string > Chat (string query, Callback< string > callback=null, EmptyCallback completionCallback=null, bool addToHistory=true)
 Chat functionality of the LLM. It calls the LLM completion based on the provided query including the previous chat history. The function allows callbacks when the response is partially or fully received. The question is added to the history if specified.
 
virtual async Task< string > Complete (string prompt, Callback< string > callback=null, EmptyCallback completionCallback=null)
 Pure completion functionality of the LLM. It calls the LLM completion based solely on the provided prompt (no formatting by the chat template). The function allows callbacks when the response is partially or fully received.
 
virtual async Task Warmup (EmptyCallback completionCallback=null)
 Warms up the model by processing the prompt. The prompt processing will be cached (if cachePrompt=true), allowing for faster initialisation. The function allows a callback for when the prompt is processed and the response received.
 
virtual async Task< string > AskTemplate ()
 Asks the LLM for the chat template to use.
 
virtual async Task< string > Save (string filename)
 Saves the chat history and cache to the provided filename / relative path.
 
virtual async Task< string > Load (string filename)
 Load the chat history and cache from the provided filename / relative path.
 
- Public Member Functions inherited from LLMUnity.LLMCaller
virtual bool IsAutoAssignableLLM (LLM llmSet)
 Checks if an LLM can be auto-assigned if the LLM of the LLMCaller is null.
 
virtual void CancelRequests ()
 Cancel the ongoing requests e.g. Chat, Complete.
 
virtual async Task< List< int > > Tokenize (string query, Callback< List< int > > callback=null)
 Tokenises the provided query.
 
virtual async Task< string > Detokenize (List< int > tokens, Callback< string > callback=null)
 Detokenises the provided tokens to a string.
 
virtual async Task< List< float > > Embeddings (string query, Callback< List< float > > callback=null)
 Computes the embeddings of the provided input.
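
The inherited Tokenize and Detokenize calls can be combined into a round trip. The following is a minimal sketch that uses only the signatures listed above; the class name `TokenizeExample` and the `llmCharacter` field (assumed to be assigned in the Inspector) are hypothetical.

```csharp
using System.Collections.Generic;
using UnityEngine;
using LLMUnity;

// Hypothetical helper component; attach it to a GameObject and assign llmCharacter.
public class TokenizeExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;

    async void Start()
    {
        // Convert a string to token ids, then back to text.
        List<int> tokens = await llmCharacter.Tokenize("Hello there!");
        string text = await llmCharacter.Detokenize(tokens);
        Debug.Log($"{tokens.Count} tokens, detokenised text: {text}");
    }
}
```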
 

Public Attributes

string save = ""
 file to save the chat history. The file is saved only for Chat calls with addToHistory set to true. The file will be saved within the persistentDataPath directory (see https://docs.unity3d.com/ScriptReference/Application-persistentDataPath.html).
 
bool saveCache = false
 toggle to save the LLM cache. This speeds up the prompt calculation but also requires ~100MB of space per character.
 
bool debugPrompt = false
 select to log the constructed prompt in the Unity Editor.
 
int numPredict = 256
 number of tokens to predict (-1 = infinity, -2 = until context filled). This is the maximum number of tokens the model will predict. When N predict is reached the model will stop generating. This means words / sentences might not get finished if this is too low.
 
int slot = -1
 specify which slot of the server to use for computation (affects caching)
 
string grammar = null
 grammar file used for the LLM in .gbnf format (relative to the Assets/StreamingAssets folder)
 
bool cachePrompt = true
 option to cache the prompt as it is being created by the chat to avoid reprocessing the entire prompt every time (default: true)
 
int seed = 0
 seed for reproducibility. For random results every time set to -1.
 
float temperature = 0.2f
 LLM temperature, lower values give more deterministic answers. The temperature setting adjusts how random the generated responses are. Turning it up makes the generated choices more varied and unpredictable. Turning it down makes the generated responses more predictable and focused on the most likely options.
 
int topK = 40
 top-k sampling (0 = disabled). At each step of generation, sampling is restricted to the k most probable tokens. This value can help fine-tune the output and make it adhere to specific patterns or constraints.
 
float topP = 0.9f
 top-p sampling (1.0 = disabled). The top p value controls the cumulative probability of the candidate tokens: at each step, sampling is restricted to the smallest set of most probable tokens whose cumulative probability reaches this threshold (p). Lowering this value makes the output more focused; raising it allows more diverse output.
 
float minP = 0.05f
 minimum probability for a token to be used. The probability is defined relative to the probability of the most likely token.
 
float repeatPenalty = 1.1f
 control the repetition of token sequences in the generated text. The penalty is applied to repeated tokens.
 
float presencePenalty = 0f
 repeated token presence penalty (0.0 = disabled). Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
 
float frequencyPenalty = 0f
 repeated token frequency penalty (0.0 = disabled). Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
 
float tfsZ = 1f
 enable tail free sampling with parameter z (1.0 = disabled).
 
float typicalP = 1f
 enable locally typical sampling with parameter p (1.0 = disabled).
 
int repeatLastN = 64
 last n tokens to consider for penalizing repetition (0 = disabled, -1 = ctx-size).
 
bool penalizeNl = true
 penalize newline tokens when applying the repeat penalty.
 
string penaltyPrompt
 prompt for the purpose of the penalty evaluation. Can be either null, a string or an array of numbers representing tokens (null/"" = use original prompt)
 
int mirostat = 0
 enable Mirostat sampling, controlling perplexity during text generation (0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).
 
float mirostatTau = 5f
 set the Mirostat target entropy, parameter tau.
 
float mirostatEta = 0.1f
 set the Mirostat learning rate, parameter eta.
 
int nProbs = 0
 if greater than 0, the response also contains the probabilities of top N tokens for each generated token.
 
bool ignoreEos = false
 ignore end of stream token and continue generating.
 
int nKeep = -1
 number of tokens to retain from the prompt when the model runs out of context (-1 = LLMCharacter prompt tokens if setNKeepToPrompt is set to true).
 
List< string > stop = new List<string>()
 stopwords to stop the LLM in addition to the default stopwords from the chat template.
 
Dictionary< int, string > logitBias = null
 the logit bias option allows manually adjusting the likelihood of specific tokens appearing in the generated text. By providing a token ID and a positive or negative bias value, you can increase or decrease the probability of that token being generated.
 
bool stream = true
 option to receive the reply from the model as it is produced (recommended!). If it is not selected, the full reply from the model is received in one go
 
string playerName = "user"
 the name of the player
 
string AIName = "assistant"
 the name of the AI
 
string prompt = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions."
 a description of the AI role. This defines the LLMCharacter system prompt
 
bool setNKeepToPrompt = true
 option to set the number of tokens to retain from the prompt (nKeep) based on the LLMCharacter system prompt
 
List< ChatMessage > chat = new List<ChatMessage>()
 the chat history as list of chat messages
 
string grammarString
 the grammar to use
 
- Public Attributes inherited from LLMUnity.LLMCaller
bool advancedOptions = false
 toggle to show/hide advanced options in the GameObject
 
bool remote = false
 toggle to use remote LLM server or local LLM
 
string APIKey
 allows to use a server with API key
 
string host = "localhost"
 host to use for the LLM server
 
int port = 13333
 port to use for the LLM server
 
int numRetries = 10
 number of retries to use for the LLM server requests (-1 = infinite)
 

Additional Inherited Members

- Properties inherited from LLMUnity.LLMCaller
LLM llm [get, set]
 

Detailed Description

Class implementing the LLM characters.

Definition at line 18 of file LLMCharacter.cs.

Member Function Documentation

◆ AddAIMessage()

virtual void LLMUnity.LLMCharacter.AddAIMessage ( string content )
inline virtual

Adds an AI message to the chat history.

Parameters
content: message content

Definition at line 371 of file LLMCharacter.cs.

◆ AddMessage()

virtual void LLMUnity.LLMCharacter.AddMessage ( string role, string content )
inline virtual

Adds a message to the chat history.

Parameters
role: message role (e.g. playerName or AIName)
content: message content

Definition at line 352 of file LLMCharacter.cs.

◆ AddPlayerMessage()

virtual void LLMUnity.LLMCharacter.AddPlayerMessage ( string content )
inline virtual

Adds a player message to the chat history.

Parameters
content: message content

Definition at line 362 of file LLMCharacter.cs.

◆ AskTemplate()

virtual async Task< string > LLMUnity.LLMCharacter.AskTemplate ( )
inline virtual

Asks the LLM for the chat template to use.

Returns
the chat template of the LLM

Definition at line 524 of file LLMCharacter.cs.

◆ Awake()

override void LLMUnity.LLMCharacter.Awake ( )
inline virtual

The Unity Awake function that initializes the state before the application starts. The following actions are executed:

  • the corresponding LLM server is defined (if run locally)
  • the grammar is set based on the grammar file
  • the prompt and chat history are initialised
  • the chat template is constructed
  • the number of tokens to keep is set based on the system prompt (if setNKeepToPrompt=true)

Reimplemented from LLMUnity.LLMCaller.

Definition at line 128 of file LLMCharacter.cs.

◆ Chat()

virtual async Task< string > LLMUnity.LLMCharacter.Chat ( string query, Callback< string > callback = null, EmptyCallback completionCallback = null, bool addToHistory = true )
inline virtual

Chat functionality of the LLM. It calls the LLM completion based on the provided query including the previous chat history. The function allows callbacks when the response is partially or fully received. The question is added to the history if specified.

Parameters
query: user query
callback: callback function that receives the response as string
completionCallback: callback function called when the full response has been received
addToHistory: whether to add the user query to the chat history
Returns
the LLM response

Definition at line 430 of file LLMCharacter.cs.
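
A typical call passes one handler for partial replies and one for completion. The sketch below assumes that `Callback<string>` and `EmptyCallback` accept methods with the signatures `void (string)` and `void ()`; the class name `ChatExample`, the handler names, and the `llmCharacter` field (assigned in the Inspector) are hypothetical.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical component that forwards a question to an LLMCharacter.
public class ChatExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;

    // Receives the response text while it is being produced.
    void HandleReply(string reply)
    {
        Debug.Log(reply);
    }

    // Called once the full response has been received.
    void ReplyCompleted()
    {
        Debug.Log("The AI has finished replying.");
    }

    public void Ask(string question)
    {
        // addToHistory keeps its default (true), so the exchange joins the chat history.
        _ = llmCharacter.Chat(question, HandleReply, ReplyCompleted);
    }
}
```

The returned task can also be awaited to obtain the full response as a string instead of using the callbacks.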

◆ ClearChat()

virtual void LLMUnity.LLMCharacter.ClearChat ( )
inline virtual

Clear the chat of the LLMCharacter.

Definition at line 210 of file LLMCharacter.cs.

◆ Complete()

virtual async Task< string > LLMUnity.LLMCharacter.Complete ( string prompt, Callback< string > callback = null, EmptyCallback completionCallback = null )
inline virtual

Pure completion functionality of the LLM. It calls the LLM completion based solely on the provided prompt (no formatting by the chat template). The function allows callbacks when the response is partially or fully received.

Parameters
prompt: user query
callback: callback function that receives the response as string
completionCallback: callback function called when the full response has been received
Returns
the LLM response

Definition at line 483 of file LLMCharacter.cs.
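
Because Complete skips the chat template, it suits plain text continuation. A minimal sketch, awaiting the result rather than using callbacks; the class name `CompleteExample` and the `llmCharacter` field are hypothetical.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical component that asks the model to continue a raw prompt.
public class CompleteExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;

    async void Start()
    {
        // The prompt is sent as-is, without the chat template or chat history.
        string continuation = await llmCharacter.Complete("Once upon a time");
        Debug.Log(continuation);
    }
}
```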

◆ GetCacheSavePath()

virtual string LLMUnity.LLMCharacter.GetCacheSavePath ( string filename )
inline virtual

Returns the save path of the LLM cache based on the provided filename or relative path.

Parameters
filename: filename or relative path used for the save
Returns
save path

Definition at line 202 of file LLMCharacter.cs.

◆ GetJsonSavePath()

virtual string LLMUnity.LLMCharacter.GetJsonSavePath ( string filename )
inline virtual

Returns the save path of the chat history based on the provided filename or relative path.

Parameters
filename: filename or relative path used for the save
Returns
save path

Definition at line 192 of file LLMCharacter.cs.

◆ IsValidLLM()

override bool LLMUnity.LLMCharacter.IsValidLLM ( LLM llmSet )
inline virtual

Checks if an LLM is valid for the LLMCaller.

Parameters
llmSet: LLM object
Returns
bool specifying whether the LLM is valid

Reimplemented from LLMUnity.LLMCaller.

Definition at line 157 of file LLMCharacter.cs.

◆ Load()

virtual async Task< string > LLMUnity.LLMCharacter.Load ( string filename )
inline virtual

Load the chat history and cache from the provided filename / relative path.

Parameters
filename: filename / relative path to load the chat history from

Definition at line 568 of file LLMCharacter.cs.

◆ LoadTemplate()

virtual async Task LLMUnity.LLMCharacter.LoadTemplate ( )
inline virtual

Loads the chat template of the LLMCharacter.

Definition at line 271 of file LLMCharacter.cs.

◆ Save()

virtual async Task< string > LLMUnity.LLMCharacter.Save ( string filename )
inline virtual

Saves the chat history and cache to the provided filename / relative path.

Parameters
filename: filename / relative path to save the chat history

Definition at line 549 of file LLMCharacter.cs.
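
Save and Load are meant to be used as a pair with the same filename. The sketch below is one possible arrangement; the class name `SaveLoadExample`, the filename `session1`, and the `llmCharacter` field are hypothetical.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical component that persists and restores a conversation.
public class SaveLoadExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;
    const string SaveName = "session1"; // illustrative filename

    public async void SaveSession()
    {
        await llmCharacter.Save(SaveName);
        // GetJsonSavePath reports where the chat history file is located.
        Debug.Log("Chat history path: " + llmCharacter.GetJsonSavePath(SaveName));
    }

    public async void RestoreSession()
    {
        await llmCharacter.Load(SaveName);
    }
}
```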

◆ SetGrammar()

virtual async void LLMUnity.LLMCharacter.SetGrammar ( string path )
inline virtual

Sets the grammar file of the LLMCharacter.

Parameters
path: path to the grammar file

Definition at line 294 of file LLMCharacter.cs.

◆ SetPrompt()

virtual void LLMUnity.LLMCharacter.SetPrompt ( string newPrompt, bool clearChat = true )
inline virtual

Set the system prompt for the LLMCharacter.

Parameters
newPrompt: the system prompt
clearChat: whether to clear (true) or keep (false) the current chat history on top of the system prompt.

Definition at line 222 of file LLMCharacter.cs.
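
SetPrompt can be combined with AddPlayerMessage and AddAIMessage to start a character with a new persona and a seeded conversation. A minimal sketch; the class name `PersonaExample`, the prompt text, the messages, and the `llmCharacter` field are hypothetical.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical component that switches the character's persona.
public class PersonaExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;

    public void BecomeBlacksmith()
    {
        // Replace the system prompt and discard the previous chat history.
        llmCharacter.SetPrompt("You are a grumpy blacksmith in a medieval village.", clearChat: true);

        // Seed the new history with an example exchange.
        llmCharacter.AddPlayerMessage("Can you repair my sword?");
        llmCharacter.AddAIMessage("Hmph. Leave it on the anvil and come back tomorrow.");
    }
}
```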

◆ Warmup()

virtual async Task LLMUnity.LLMCharacter.Warmup ( EmptyCallback completionCallback = null )
inline virtual

Warms up the model by processing the prompt. The prompt processing will be cached (if cachePrompt=true), allowing for faster initialisation. The function allows a callback for when the prompt is processed and the response received.

The function calls the Chat function with a predefined query without adding it to history.

Parameters
completionCallback: callback function called when the full response has been received

Definition at line 506 of file LLMCharacter.cs.
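
Warmup is usually called once at startup so that the first player query does not pay the prompt-processing cost. A minimal sketch, assuming `EmptyCallback` accepts a parameterless `void` method; the class name `WarmupExample`, the handler name, and the `llmCharacter` field are hypothetical.

```csharp
using UnityEngine;
using LLMUnity;

// Hypothetical component that warms up the model when the scene starts.
public class WarmupExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;

    void Start()
    {
        Debug.Log("Warming up the model...");
        _ = llmCharacter.Warmup(OnWarmupDone);
    }

    // Called when the prompt has been processed.
    void OnWarmupDone()
    {
        Debug.Log("Model ready.");
    }
}
```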

Member Data Documentation

◆ AIName

string LLMUnity.LLMCharacter.AIName = "assistant"

the name of the AI

Definition at line 103 of file LLMCharacter.cs.

◆ cachePrompt

bool LLMUnity.LLMCharacter.cachePrompt = true

option to cache the prompt as it is being created by the chat to avoid reprocessing the entire prompt every time (default: true)

Definition at line 38 of file LLMCharacter.cs.

◆ chat

List<ChatMessage> LLMUnity.LLMCharacter.chat = new List<ChatMessage>()

the chat history as list of chat messages

Definition at line 109 of file LLMCharacter.cs.

◆ debugPrompt

bool LLMUnity.LLMCharacter.debugPrompt = false

select to log the constructed prompt in the Unity Editor.

Definition at line 27 of file LLMCharacter.cs.

◆ frequencyPenalty

float LLMUnity.LLMCharacter.frequencyPenalty = 0f

repeated token frequency penalty (0.0 = disabled). Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

Definition at line 65 of file LLMCharacter.cs.

◆ grammar

string LLMUnity.LLMCharacter.grammar = null

grammar file used for the LLM in .gbnf format (relative to the Assets/StreamingAssets folder)

Definition at line 36 of file LLMCharacter.cs.

◆ grammarString

string LLMUnity.LLMCharacter.grammarString

the grammar to use

Definition at line 111 of file LLMCharacter.cs.

◆ ignoreEos

bool LLMUnity.LLMCharacter.ignoreEos = false

ignore end of stream token and continue generating.

Definition at line 87 of file LLMCharacter.cs.

◆ logitBias

Dictionary<int, string> LLMUnity.LLMCharacter.logitBias = null

the logit bias option allows manually adjusting the likelihood of specific tokens appearing in the generated text. By providing a token ID and a positive or negative bias value, you can increase or decrease the probability of that token being generated.

Definition at line 95 of file LLMCharacter.cs.

◆ minP

float LLMUnity.LLMCharacter.minP = 0.05f

minimum probability for a token to be used. The probability is defined relative to the probability of the most likely token.

Definition at line 56 of file LLMCharacter.cs.
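
To make the relative threshold concrete, the following standalone sketch filters a toy distribution the way min-p is described above. It is an illustration only, not LLMUnity's or the backend's actual implementation; the class and method names are hypothetical.

```csharp
using System.Linq;

// Illustrative only: keeps tokens whose probability is at least
// minP times the probability of the most likely token.
public static class MinPSketch
{
    public static int[] Filter(double[] probs, double minP)
    {
        double threshold = minP * probs.Max();
        return Enumerable.Range(0, probs.Length)
                         .Where(i => probs[i] >= threshold)
                         .ToArray();
    }
}
```

With probabilities `{0.6, 0.3, 0.08, 0.02}` and `minP = 0.05`, the threshold is 0.03, so the first three tokens remain candidates and the last one is dropped.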

◆ mirostat

int LLMUnity.LLMCharacter.mirostat = 0

enable Mirostat sampling, controlling perplexity during text generation (0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).

Definition at line 79 of file LLMCharacter.cs.

◆ mirostatEta

float LLMUnity.LLMCharacter.mirostatEta = 0.1f

set the Mirostat learning rate, parameter eta.

Definition at line 83 of file LLMCharacter.cs.

◆ mirostatTau

float LLMUnity.LLMCharacter.mirostatTau = 5f

set the Mirostat target entropy, parameter tau.

Definition at line 81 of file LLMCharacter.cs.

◆ nKeep

int LLMUnity.LLMCharacter.nKeep = -1

number of tokens to retain from the prompt when the model runs out of context (-1 = LLMCharacter prompt tokens if setNKeepToPrompt is set to true).

Definition at line 90 of file LLMCharacter.cs.

◆ nProbs

int LLMUnity.LLMCharacter.nProbs = 0

if greater than 0, the response also contains the probabilities of top N tokens for each generated token.

Definition at line 85 of file LLMCharacter.cs.

◆ numPredict

int LLMUnity.LLMCharacter.numPredict = 256

number of tokens to predict (-1 = infinity, -2 = until context filled). This is the maximum number of tokens the model will predict. When N predict is reached the model will stop generating. This means words / sentences might not get finished if this is too low.

Definition at line 32 of file LLMCharacter.cs.

◆ penalizeNl

bool LLMUnity.LLMCharacter.penalizeNl = true

penalize newline tokens when applying the repeat penalty.

Definition at line 74 of file LLMCharacter.cs.

◆ penaltyPrompt

string LLMUnity.LLMCharacter.penaltyPrompt

prompt for the purpose of the penalty evaluation. Can be either null, a string or an array of numbers representing tokens (null/"" = use original prompt)

Definition at line 77 of file LLMCharacter.cs.

◆ playerName

string LLMUnity.LLMCharacter.playerName = "user"

the name of the player

Definition at line 101 of file LLMCharacter.cs.

◆ presencePenalty

float LLMUnity.LLMCharacter.presencePenalty = 0f

repeated token presence penalty (0.0 = disabled). Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

Definition at line 62 of file LLMCharacter.cs.

◆ prompt

string LLMUnity.LLMCharacter.prompt = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions."

a description of the AI role. This defines the LLMCharacter system prompt

Definition at line 105 of file LLMCharacter.cs.

◆ repeatLastN

int LLMUnity.LLMCharacter.repeatLastN = 64

last n tokens to consider for penalizing repetition (0 = disabled, -1 = ctx-size).

Definition at line 72 of file LLMCharacter.cs.

◆ repeatPenalty

float LLMUnity.LLMCharacter.repeatPenalty = 1.1f

control the repetition of token sequences in the generated text. The penalty is applied to repeated tokens.

Definition at line 59 of file LLMCharacter.cs.
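
One common way to apply such a penalty, shown here as an assumption rather than as a description of LLMUnity's internals, is to divide the positive logits and multiply the negative logits of recently seen tokens by the penalty. The sketch also illustrates how a repeatLastN-style window limits which tokens count as recent; treating -1 as "the whole history" is a simplification of "ctx-size". All names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: penalises tokens that occur in the last lastN history entries.
public static class RepeatPenaltySketch
{
    public static double[] Apply(double[] logits, IList<int> history, double penalty, int lastN)
    {
        double[] result = (double[])logits.Clone();
        if (lastN == 0) return result; // 0 = disabled

        // Negative lastN: consider the whole history (simplification of ctx-size).
        int window = lastN < 0 ? history.Count : Math.Min(lastN, history.Count);
        var recent = new HashSet<int>(history.Skip(history.Count - window));

        foreach (int token in recent)
        {
            // Both branches lower the token's score when penalty > 1.
            result[token] = result[token] > 0 ? result[token] / penalty
                                              : result[token] * penalty;
        }
        return result;
    }
}
```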

◆ save

string LLMUnity.LLMCharacter.save = ""

file to save the chat history. The file is saved only for Chat calls with addToHistory set to true. The file will be saved within the persistentDataPath directory (see https://docs.unity3d.com/ScriptReference/Application-persistentDataPath.html).

Definition at line 23 of file LLMCharacter.cs.

◆ saveCache

bool LLMUnity.LLMCharacter.saveCache = false

toggle to save the LLM cache. This speeds up the prompt calculation but also requires ~100MB of space per character.

Definition at line 25 of file LLMCharacter.cs.

◆ seed

int LLMUnity.LLMCharacter.seed = 0

seed for reproducibility. For random results every time set to -1.

Definition at line 40 of file LLMCharacter.cs.

◆ setNKeepToPrompt

bool LLMUnity.LLMCharacter.setNKeepToPrompt = true

option to set the number of tokens to retain from the prompt (nKeep) based on the LLMCharacter system prompt

Definition at line 107 of file LLMCharacter.cs.

◆ slot

int LLMUnity.LLMCharacter.slot = -1

specify which slot of the server to use for computation (affects caching)

Definition at line 34 of file LLMCharacter.cs.

◆ stop

List<string> LLMUnity.LLMCharacter.stop = new List<string>()

stopwords to stop the LLM in addition to the default stopwords from the chat template.

Definition at line 92 of file LLMCharacter.cs.

◆ stream

bool LLMUnity.LLMCharacter.stream = true

option to receive the reply from the model as it is produced (recommended!). If it is not selected, the full reply from the model is received in one go

Definition at line 99 of file LLMCharacter.cs.

◆ temperature

float LLMUnity.LLMCharacter.temperature = 0.2f

LLM temperature, lower values give more deterministic answers. The temperature setting adjusts how random the generated responses are. Turning it up makes the generated choices more varied and unpredictable. Turning it down makes the generated responses more predictable and focused on the most likely options.

Definition at line 45 of file LLMCharacter.cs.
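
The effect of temperature can be illustrated with a softmax over toy logits. This is a conceptual sketch, not the code LLMUnity runs; the class and method names are hypothetical.

```csharp
using System;
using System.Linq;

// Illustrative only: softmax with temperature scaling.
public static class TemperatureSketch
{
    public static double[] Softmax(double[] logits, double temperature)
    {
        if (temperature <= 0)
            throw new ArgumentOutOfRangeException(nameof(temperature));

        // Dividing by the temperature sharpens (T < 1) or flattens (T > 1) the distribution.
        double[] scaled = logits.Select(l => l / temperature).ToArray();
        double max = scaled.Max(); // subtracted for numerical stability
        double[] exps = scaled.Select(s => Math.Exp(s - max)).ToArray();
        double sum = exps.Sum();
        return exps.Select(e => e / sum).ToArray();
    }
}
```

For logits `{2.0, 1.0, 0.0}`, a temperature of 0.2 puts more than 99% of the probability on the first token, while a temperature of 2.0 leaves it at roughly 51%.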

◆ tfsZ

float LLMUnity.LLMCharacter.tfsZ = 1f

enable tail free sampling with parameter z (1.0 = disabled).

Definition at line 68 of file LLMCharacter.cs.

◆ topK

int LLMUnity.LLMCharacter.topK = 40

top-k sampling (0 = disabled). At each step of generation, sampling is restricted to the k most probable tokens. This value can help fine-tune the output and make it adhere to specific patterns or constraints.

Definition at line 48 of file LLMCharacter.cs.
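
As a conceptual illustration (not LLMUnity's implementation), top-k filtering amounts to ranking tokens by probability and keeping the first k. The class and method names below are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Illustrative only: returns the indices of the k most probable tokens.
public static class TopKSketch
{
    public static int[] Filter(double[] probs, int k)
    {
        IEnumerable<int> ranked = Enumerable.Range(0, probs.Length)
                                            .OrderByDescending(i => probs[i]);
        // k <= 0 is treated as "disabled": every token stays a candidate.
        return (k <= 0 ? ranked : ranked.Take(k)).ToArray();
    }
}
```

With probabilities `{0.1, 0.5, 0.3, 0.1}` and `k = 2`, only the tokens at indices 1 and 2 remain candidates.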

◆ topP

float LLMUnity.LLMCharacter.topP = 0.9f

top-p sampling (1.0 = disabled). The top p value controls the cumulative probability of the candidate tokens: at each step, sampling is restricted to the smallest set of most probable tokens whose cumulative probability reaches this threshold (p). Lowering this value makes the output more focused; raising it allows more diverse output.

Definition at line 53 of file LLMCharacter.cs.
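
Top-p (nucleus) sampling, as it is commonly defined, keeps the most probable tokens until their cumulative probability reaches p. The sketch below illustrates that definition on a toy distribution; it is not LLMUnity's implementation, and the names are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;

// Illustrative only: returns the smallest set of most probable tokens
// whose cumulative probability reaches p.
public static class TopPSketch
{
    public static int[] Filter(double[] probs, double p)
    {
        var kept = new List<int>();
        double cumulative = 0.0;
        foreach (int token in Enumerable.Range(0, probs.Length)
                                        .OrderByDescending(idx => probs[idx]))
        {
            kept.Add(token);
            cumulative += probs[token];
            if (cumulative >= p) break; // threshold reached, stop adding candidates
        }
        return kept.ToArray();
    }
}
```

With probabilities `{0.5, 0.3, 0.15, 0.05}` and `p = 0.9`, the first three tokens are kept (cumulative 0.95) and the least likely token is excluded.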

◆ typicalP

float LLMUnity.LLMCharacter.typicalP = 1f

enable locally typical sampling with parameter p (1.0 = disabled).

Definition at line 70 of file LLMCharacter.cs.


The documentation for this class was generated from the following file:

LLMCharacter.cs