LLM for Unity
v2.2.5
Create characters in Unity with LLMs!
Class implementing the LLM characters.
Public Member Functions
void | Awake () |
The Unity Awake function that initializes the state before the application starts. | |
virtual string | GetSavePath (string filename) |
virtual string | GetJsonSavePath (string filename) |
virtual string | GetCacheSavePath (string filename) |
void | SetPrompt (string newPrompt, bool clearChat=true) |
Set the system prompt for the LLMCharacter. | |
async Task | LoadTemplate () |
Load the chat template of the LLMCharacter. | |
async void | SetGrammar (string path) |
Set the grammar file of the LLMCharacter. | |
void | AddMessage (string role, string content) |
void | AddPlayerMessage (string content) |
void | AddAIMessage (string content) |
async Task< string > | Chat (string query, Callback< string > callback=null, EmptyCallback completionCallback=null, bool addToHistory=true) |
Chat functionality of the LLM. It calls the LLM completion based on the provided query including the previous chat history. The function allows callbacks when the response is partially or fully received. The question is added to the history if specified. | |
async Task< string > | Complete (string prompt, Callback< string > callback=null, EmptyCallback completionCallback=null) |
Pure completion functionality of the LLM. It calls the LLM completion based solely on the provided prompt (no formatting by the chat template). The function allows callbacks when the response is partially or fully received. | |
async Task | Warmup (EmptyCallback completionCallback=null) |
Warms up the model by processing the prompt. The prompt processing will be cached (if cachePrompt=true), allowing for faster initialisation. The function allows a callback for when the prompt is processed and the response is received. | |
async Task< string > | AskTemplate () |
Asks the LLM for the chat template to use. | |
async Task< List< int > > | Tokenize (string query, Callback< List< int > > callback=null) |
Tokenises the provided query. | |
async Task< string > | Detokenize (List< int > tokens, Callback< string > callback=null) |
Detokenises the provided tokens to a string. | |
async Task< List< float > > | Embeddings (string query, Callback< List< float > > callback=null) |
Computes the embeddings of the provided input. | |
virtual async Task< string > | Save (string filename) |
Saves the chat history and cache to the provided filename / relative path. | |
virtual async Task< string > | Load (string filename) |
Load the chat history and cache from the provided filename / relative path. | |
void | CancelRequests () |
Cancel the ongoing requests e.g. Chat, Complete. | |
Public Attributes
bool | advancedOptions = false |
toggle to show/hide advanced options in the GameObject | |
bool | remote = false |
toggle to use remote LLM server or local LLM | |
LLM | llm |
the LLM object to use | |
string | host = "localhost" |
host to use for the LLM server | |
int | port = 13333 |
port to use for the LLM server | |
int | numRetries = 10 |
number of retries to use for the LLM server requests (-1 = infinite) | |
string | APIKey |
allows the use of a server with an API key | |
string | save = "" |
file to save the chat history. The file is saved only for Chat calls with addToHistory set to true. The file will be saved within the persistentDataPath directory (see https://docs.unity3d.com/ScriptReference/Application-persistentDataPath.html). | |
bool | saveCache = false |
toggle to save the LLM cache. This speeds up the prompt calculation but also requires ~100MB of space per character. | |
bool | debugPrompt = false |
select to log the constructed prompt in the Unity Editor. | |
bool | stream = true |
option to receive the reply from the model as it is produced (recommended!). If it is not selected, the full reply from the model is received in one go | |
string | grammar = null |
grammar file used for the LLM in .gbnf format (relative to the Assets/StreamingAssets folder) | |
bool | cachePrompt = true |
option to cache the prompt as it is being created by the chat to avoid reprocessing the entire prompt every time (default: true) | |
int | slot = -1 |
specify which slot of the server to use for computation (affects caching) | |
int | seed = 0 |
seed for reproducibility. For random results every time set to -1. | |
int | numPredict = 256 |
number of tokens to predict (-1 = infinity, -2 = until context filled). This is the maximum number of tokens the model will predict. When numPredict is reached the model stops generating, which means words / sentences may be cut off if the value is too low. | |
float | temperature = 0.2f |
LLM temperature, lower values give more deterministic answers. The temperature setting adjusts how random the generated responses are. Turning it up makes the generated choices more varied and unpredictable. Turning it down makes the generated responses more predictable and focused on the most likely options. | |
int | topK = 40 |
top-k sampling (0 = disabled). The top-k value limits generation to the k most probable tokens at each step. This value can help fine-tune the output and make it adhere to specific patterns or constraints. | |
float | topP = 0.9f |
top-p sampling (1.0 = disabled). The top-p value controls the cumulative probability of the candidate tokens: the model considers tokens until this threshold (p) is reached. Lowering this value narrows the token choice and makes the output less diverse. | |
float | minP = 0.05f |
minimum probability for a token to be used. The probability is defined relative to the probability of the most likely token. | |
float | repeatPenalty = 1.1f |
control the repetition of token sequences in the generated text. The penalty is applied to repeated tokens. | |
float | presencePenalty = 0f |
repeated token presence penalty (0.0 = disabled). Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | |
float | frequencyPenalty = 0f |
repeated token frequency penalty (0.0 = disabled). Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | |
float | tfsZ = 1f |
enable tail free sampling with parameter z (1.0 = disabled). | |
float | typicalP = 1f |
enable locally typical sampling with parameter p (1.0 = disabled). | |
int | repeatLastN = 64 |
last n tokens to consider for penalizing repetition (0 = disabled, -1 = ctx-size). | |
bool | penalizeNl = true |
penalize newline tokens when applying the repeat penalty. | |
string | penaltyPrompt |
prompt for the purpose of the penalty evaluation. Can be either null, a string or an array of numbers representing tokens (null/"" = use original prompt) | |
int | mirostat = 0 |
enable Mirostat sampling, controlling perplexity during text generation (0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0). | |
float | mirostatTau = 5f |
set the Mirostat target entropy, parameter tau. | |
float | mirostatEta = 0.1f |
set the Mirostat learning rate, parameter eta. | |
int | nProbs = 0 |
if greater than 0, the response also contains the probabilities of top N tokens for each generated token. | |
bool | ignoreEos = false |
ignore end of stream token and continue generating. | |
int | nKeep = -1 |
number of tokens to retain from the prompt when the model runs out of context (-1 = LLMCharacter prompt tokens if setNKeepToPrompt is set to true). | |
List< string > | stop = new List<string>() |
stopwords to stop the LLM in addition to the default stopwords from the chat template. | |
Dictionary< int, string > | logitBias = null |
the logit bias option allows you to manually adjust the likelihood of specific tokens appearing in the generated text. By providing a token ID and a positive or negative bias value, you can increase or decrease the probability of that token being generated. | |
string | playerName = "user" |
the name of the player | |
string | AIName = "assistant" |
the name of the AI | |
string | prompt = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions." |
a description of the AI role. This defines the LLMCharacter system prompt | |
bool | setNKeepToPrompt = true |
option to set the number of tokens to retain from the prompt (nKeep) based on the LLMCharacter system prompt | |
Class implementing the LLM characters.
Definition at line 19 of file LLMCharacter.cs.
void LLMUnity.LLMCharacter.AddAIMessage (string content) |
Definition at line 414 of file LLMCharacter.cs.
void LLMUnity.LLMCharacter.AddMessage (string role, string content) |
Definition at line 403 of file LLMCharacter.cs.
void LLMUnity.LLMCharacter.AddPlayerMessage (string content) |
Definition at line 409 of file LLMCharacter.cs.
Asks the LLM for the chat template to use.
Definition at line 585 of file LLMCharacter.cs.
The Unity Awake function that initializes the state before the application starts.
Definition at line 143 of file LLMCharacter.cs.
Cancel the ongoing requests e.g. Chat, Complete.
Definition at line 726 of file LLMCharacter.cs.
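A minimal usage sketch (the llmCharacter field, the Escape key binding and the legacy Input Manager call are illustrative assumptions):

```csharp
using LLMUnity;
using UnityEngine;

public class CancelExample : MonoBehaviour
{
    // assign the LLMCharacter component in the Inspector (illustrative setup)
    public LLMCharacter llmCharacter;

    void Update()
    {
        // stop any in-flight Chat / Complete requests, e.g. when the player interrupts
        if (Input.GetKeyDown(KeyCode.Escape)) llmCharacter.CancelRequests();
    }
}
```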
Chat functionality of the LLM. It calls the LLM completion based on the provided query including the previous chat history. The function allows callbacks when the response is partially or fully received. The question is added to the history if specified.
query | user query |
callback | callback function that receives the response as string |
completionCallback | callback function called when the full response has been received |
addToHistory | whether to add the user query to the chat history |
Definition at line 491 of file LLMCharacter.cs.
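For illustration, a minimal sketch of calling Chat with a streaming callback (the llmCharacter field and the example query are assumptions, not part of the API):

```csharp
using LLMUnity;
using UnityEngine;

public class ChatExample : MonoBehaviour
{
    // assign the LLMCharacter component in the Inspector
    public LLMCharacter llmCharacter;

    async void Start()
    {
        // stream the reply as it is produced and get notified on completion
        string reply = await llmCharacter.Chat(
            "Hello, who are you?",
            partial => Debug.Log(partial),       // callback with the response so far
            () => Debug.Log("reply completed"),  // completionCallback
            addToHistory: true);
        Debug.Log(reply);
    }
}
```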
Pure completion functionality of the LLM. It calls the LLM completion based solely on the provided prompt (no formatting by the chat template). The function allows callbacks when the response is partially or fully received.
prompt | user query |
callback | callback function that receives the response as string |
completionCallback | callback function called when the full response has been received |
Definition at line 544 of file LLMCharacter.cs.
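A sketch of a raw completion call (the field name and prompt are illustrative):

```csharp
using LLMUnity;
using UnityEngine;

public class CompleteExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    async void Start()
    {
        // the prompt is sent as-is, without the chat template or history
        string completion = await llmCharacter.Complete(
            "The capital of France is",
            partial => Debug.Log(partial));
        Debug.Log(completion);
    }
}
```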
Detokenises the provided tokens to a string.
tokens | tokens to detokenise |
callback | callback function called with the result string |
Definition at line 611 of file LLMCharacter.cs.
Computes the embeddings of the provided input.
query | input to compute the embeddings for |
callback | callback function called with the resulting embeddings |
Definition at line 626 of file LLMCharacter.cs.
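A sketch of computing embeddings (the input string and field name are illustrative; an embedding-capable model is assumed):

```csharp
using System.Collections.Generic;
using LLMUnity;
using UnityEngine;

public class EmbeddingsExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    async void Start()
    {
        // compute the embedding vector of an input string
        List<float> embedding = await llmCharacter.Embeddings("a red apple");
        Debug.Log($"embedding dimension: {embedding.Count}");
    }
}
```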
virtual string LLMUnity.LLMCharacter.GetCacheSavePath (string filename) |
Definition at line 248 of file LLMCharacter.cs.
virtual string LLMUnity.LLMCharacter.GetJsonSavePath (string filename) |
Definition at line 243 of file LLMCharacter.cs.
virtual string LLMUnity.LLMCharacter.GetSavePath (string filename) |
Definition at line 238 of file LLMCharacter.cs.
Load the chat history and cache from the provided filename / relative path.
filename | filename / relative path to load the chat history from |
Definition at line 669 of file LLMCharacter.cs.
Saves the chat history and cache to the provided filename / relative path.
filename | filename / relative path to save the chat history |
Definition at line 650 of file LLMCharacter.cs.
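A sketch covering Save and Load together (the filename "player1_chat" is a placeholder):

```csharp
using LLMUnity;
using UnityEngine;

public class SaveLoadExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    public async void SaveChat()
    {
        // stored under Application.persistentDataPath (plus the cache if saveCache is enabled)
        await llmCharacter.Save("player1_chat");
    }

    public async void LoadChat()
    {
        await llmCharacter.Load("player1_chat");
    }
}
```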
Set the grammar file of the LLMCharacter.
path | path to the grammar file |
Definition at line 350 of file LLMCharacter.cs.
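A sketch of assigning a grammar file (the file name "json.gbnf" is a placeholder for a grammar placed under Assets/StreamingAssets):

```csharp
using LLMUnity;
using UnityEngine;

public class GrammarExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    void Start()
    {
        // "json.gbnf" is a placeholder grammar file in Assets/StreamingAssets
        llmCharacter.SetGrammar("json.gbnf");
    }
}
```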
Set the system prompt for the LLMCharacter.
newPrompt | the system prompt |
clearChat | whether to clear (true) or keep (false) the current chat history on top of the system prompt. |
Definition at line 279 of file LLMCharacter.cs.
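A sketch of changing the system prompt at runtime (the role description is illustrative):

```csharp
using LLMUnity;
using UnityEngine;

public class PromptExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    void Start()
    {
        // replace the system prompt and start a fresh conversation
        llmCharacter.SetPrompt(
            "You are a gruff blacksmith in a medieval village.",
            clearChat: true);
    }
}
```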
Tokenises the provided query.
query | query to tokenise |
callback | callback function called with the result tokens |
Definition at line 596 of file LLMCharacter.cs.
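A sketch of a Tokenize / Detokenize round trip (the field name and input text are illustrative):

```csharp
using System.Collections.Generic;
using LLMUnity;
using UnityEngine;

public class TokenizeExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    async void Start()
    {
        // text -> token ids -> text
        List<int> tokens = await llmCharacter.Tokenize("Hello world");
        Debug.Log($"token count: {tokens.Count}");
        string text = await llmCharacter.Detokenize(tokens);
        Debug.Log(text);
    }
}
```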
Warms up the model by processing the prompt. The prompt processing will be cached (if cachePrompt=true), allowing for faster initialisation. The function allows a callback for when the prompt is processed and the response is received.
The function calls the Chat function with a predefined query without adding it to history.
completionCallback | callback function called when the full response has been received |
query | user prompt used during the initialisation (not added to history) |
Definition at line 567 of file LLMCharacter.cs.
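A sketch of warming up the model during scene startup (the field name is illustrative):

```csharp
using LLMUnity;
using UnityEngine;

public class WarmupExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    async void Start()
    {
        // process the system prompt up front so the first Chat call responds faster
        await llmCharacter.Warmup(() => Debug.Log("warmup finished"));
    }
}
```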
toggle to show/hide advanced options in the GameObject
Definition at line 22 of file LLMCharacter.cs.
string LLMUnity.LLMCharacter.AIName = "assistant" |
the name of the AI
Definition at line 118 of file LLMCharacter.cs.
string LLMUnity.LLMCharacter.APIKey |
allows the use of a server with an API key
Definition at line 34 of file LLMCharacter.cs.
option to cache the prompt as it is being created by the chat to avoid reprocessing the entire prompt every time (default: true)
Definition at line 49 of file LLMCharacter.cs.
select to log the constructed prompt in the Unity Editor.
Definition at line 42 of file LLMCharacter.cs.
float LLMUnity.LLMCharacter.frequencyPenalty = 0f |
repeated token frequency penalty (0.0 = disabled). Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.
Definition at line 83 of file LLMCharacter.cs.
grammar file used for the LLM in .gbnf format (relative to the Assets/StreamingAssets folder)
Definition at line 47 of file LLMCharacter.cs.
string LLMUnity.LLMCharacter.host = "localhost" |
host to use for the LLM server
Definition at line 28 of file LLMCharacter.cs.
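The connection attributes are normally set in the Inspector; a sketch of configuring them from code (the address, port and retry count are placeholder values):

```csharp
using LLMUnity;
using UnityEngine;

public class RemoteSetupExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    void Awake()
    {
        // placeholder connection settings for a remote LLM server
        llmCharacter.remote = true;
        llmCharacter.host = "192.168.1.10";
        llmCharacter.port = 13333;
        llmCharacter.numRetries = 5;
        // llmCharacter.APIKey = "...";  // only if the server expects an API key
    }
}
```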
ignore end of stream token and continue generating.
Definition at line 105 of file LLMCharacter.cs.
LLM LLMUnity.LLMCharacter.llm |
the LLM object to use
Definition at line 26 of file LLMCharacter.cs.
Dictionary<int, string> LLMUnity.LLMCharacter.logitBias = null |
the logit bias option allows you to manually adjust the likelihood of specific tokens appearing in the generated text. By providing a token ID and a positive or negative bias value, you can increase or decrease the probability of that token being generated.
Definition at line 113 of file LLMCharacter.cs.
minimum probability for a token to be used. The probability is defined relative to the probability of the most likely token.
Definition at line 74 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.mirostat = 0 |
enable Mirostat sampling, controlling perplexity during text generation (0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0).
Definition at line 97 of file LLMCharacter.cs.
set the Mirostat learning rate, parameter eta.
Definition at line 101 of file LLMCharacter.cs.
set the Mirostat target entropy, parameter tau.
Definition at line 99 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.nKeep = -1 |
number of tokens to retain from the prompt when the model runs out of context (-1 = LLMCharacter prompt tokens if setNKeepToPrompt is set to true).
Definition at line 108 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.nProbs = 0 |
if greater than 0, the response also contains the probabilities of top N tokens for each generated token.
Definition at line 103 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.numPredict = 256 |
number of tokens to predict (-1 = infinity, -2 = until context filled). This is the maximum number of tokens the model will predict. When numPredict is reached the model stops generating, which means words / sentences may be cut off if the value is too low.
Definition at line 58 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.numRetries = 10 |
number of retries to use for the LLM server requests (-1 = infinite)
Definition at line 32 of file LLMCharacter.cs.
penalize newline tokens when applying the repeat penalty.
Definition at line 92 of file LLMCharacter.cs.
string LLMUnity.LLMCharacter.penaltyPrompt |
prompt for the purpose of the penalty evaluation. Can be either null, a string or an array of numbers representing tokens (null/"" = use original prompt)
Definition at line 95 of file LLMCharacter.cs.
string LLMUnity.LLMCharacter.playerName = "user" |
the name of the player
Definition at line 116 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.port = 13333 |
port to use for the LLM server
Definition at line 30 of file LLMCharacter.cs.
float LLMUnity.LLMCharacter.presencePenalty = 0f |
repeated token presence penalty (0.0 = disabled). Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.
Definition at line 80 of file LLMCharacter.cs.
string LLMUnity.LLMCharacter.prompt = "A chat between a curious human and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the human's questions." |
a description of the AI role. This defines the LLMCharacter system prompt
Definition at line 120 of file LLMCharacter.cs.
toggle to use remote LLM server or local LLM
Definition at line 24 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.repeatLastN = 64 |
last n tokens to consider for penalizing repetition (0 = disabled, -1 = ctx-size).
Definition at line 90 of file LLMCharacter.cs.
control the repetition of token sequences in the generated text. The penalty is applied to repeated tokens.
Definition at line 77 of file LLMCharacter.cs.
string LLMUnity.LLMCharacter.save = "" |
file to save the chat history. The file is saved only for Chat calls with addToHistory set to true. The file will be saved within the persistentDataPath directory (see https://docs.unity3d.com/ScriptReference/Application-persistentDataPath.html).
Definition at line 38 of file LLMCharacter.cs.
toggle to save the LLM cache. This speeds up the prompt calculation but also requires ~100MB of space per character.
Definition at line 40 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.seed = 0 |
seed for reproducibility. For random results every time set to -1.
Definition at line 53 of file LLMCharacter.cs.
option to set the number of tokens to retain from the prompt (nKeep) based on the LLMCharacter system prompt
Definition at line 122 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.slot = -1 |
specify which slot of the server to use for computation (affects caching)
Definition at line 51 of file LLMCharacter.cs.
stopwords to stop the LLM in addition to the default stopwords from the chat template.
Definition at line 110 of file LLMCharacter.cs.
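A sketch of adding a stopword from code (the "User:" string is illustrative):

```csharp
using LLMUnity;
using UnityEngine;

public class StopwordsExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    void Awake()
    {
        // stop generation when the model starts writing the player's turn
        // ("User:" is an illustrative stopword)
        llmCharacter.stop.Add("User:");
    }
}
```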
option to receive the reply from the model as it is produced (recommended!). If it is not selected, the full reply from the model is received in one go
Definition at line 45 of file LLMCharacter.cs.
LLM temperature, lower values give more deterministic answers. The temperature setting adjusts how random the generated responses are. Turning it up makes the generated choices more varied and unpredictable. Turning it down makes the generated responses more predictable and focused on the most likely options.
Definition at line 63 of file LLMCharacter.cs.
enable tail free sampling with parameter z (1.0 = disabled).
Definition at line 86 of file LLMCharacter.cs.
int LLMUnity.LLMCharacter.topK = 40 |
top-k sampling (0 = disabled). The top-k value limits generation to the k most probable tokens at each step. This value can help fine-tune the output and make it adhere to specific patterns or constraints.
Definition at line 66 of file LLMCharacter.cs.
top-p sampling (1.0 = disabled). The top-p value controls the cumulative probability of the candidate tokens: the model considers tokens until this threshold (p) is reached. Lowering this value narrows the token choice and makes the output less diverse.
Definition at line 71 of file LLMCharacter.cs.
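The sampling attributes (e.g. temperature, topK, topP, numPredict, seed) can also be adjusted from code; a sketch with illustrative values:

```csharp
using LLMUnity;
using UnityEngine;

public class SamplingExample : MonoBehaviour
{
    public LLMCharacter llmCharacter;  // assigned in the Inspector

    void Awake()
    {
        // illustrative values for a more creative configuration
        llmCharacter.temperature = 0.8f;
        llmCharacter.topK = 40;
        llmCharacter.topP = 0.95f;
        llmCharacter.numPredict = 512;
        llmCharacter.seed = -1;  // random results on every run
    }
}
```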
enable locally typical sampling with parameter p (1.0 = disabled).
Definition at line 88 of file LLMCharacter.cs.