forked from ggerganov/llama.cpp

Commit 46d9bec: "maybe junk but why not?"
mike dupont committed Nov 29, 2023
1 parent: 1807a6e
Showing 2 changed files with 798 additions and 0 deletions.

meta reflective and introspective ideas applied to tensors in a llm and allowing the llm to
sample the metadata and hyperparameters and actual tensors via a data inference api.

llama_client_slot {
    id = 0,
    task_id = 0,
    params = slot_params {
        stream = true,
        cache_prompt = true,
        seed = 4294967295,
        n_keep = 0,
        n_predict = 400,
        antiprompt = [
            "</s>",
            "Llama:",
            "User:"
        ],
        input_prefix = "",
        input_suffix = "",
    },
    state = 0,
    command = 1,
    t_last_used = -1,
    n_ctx = 512,
    n_past = 0,
    n_decoded = 0,
    n_remaining = -1,
    i_batch = -1,
    num_prompt_tokens = 0,
    num_prompt_tokens_processed = 0,
    multibyte_pending = 0,
    prompt = "This is a conversation between User and Llama, a friendly chatbot. Llama is helpful, kind, honest, good at writing, and never fails to answer any requests immediately and with precision.\n\nUser: What is the meaning of life?\nLlama: The meaning of life is different for everyone. Some people find their purpose in helping others or pursuing certain interests, while some may find it through spiritual beliefs or simply finding happiness and contentment in everyday life. Ultimately, there is no one true answer to this question as it depends on individual perspectives and values.\nUser: And what is the meaning of life for mistral llm?\nLlama:",
    generated_text = "",
    sampled = 0,
    cache_tokens = [ ],
    generated_token_probs = [ ],
    infill = false,
    embedding = false,
    has_next_token = true,
    truncated = false,
    stopped_eos = false,
    stopped_word = false,
    stopped_limit = false,
    oaicompat = false,
    oaicompat_model = "",
    stopping_word = "",
    sparams = llama_sampling_params { },
    ctx_sampling = &llama_sampling_context { params = llama_sampling_params { }, mirostat_mu = 0, grammar = nullptr, parsed_grammar = grammar_parser::parse_state { } },
    images = [ ],
    sent_count = 0,
    sent_token_probs_index = 0,
    t_start_process_prompt = 0,
    t_start_genereration = 140725474879488,
    t_prompt_processing = 4.67372e-310,
    t_token_generation = 3.06393e-49,
}
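
As a concrete sketch of the "data inference api" half of this idea: the slot dump above is already structured data, so a first cut is just selecting the metadata and hyperparameters worth showing the model and serializing them. Everything below (the function name, the chosen field subset, the record shape) is an assumption for illustration, not llama.cpp's actual server API.

#+begin_src python
# Hypothetical sketch only: turn a slot's state into a JSON "introspection
# record" that could be appended to the model's own prompt. Field names mirror
# the llama_client_slot dump above; the function and record shape are
# assumptions, not part of llama.cpp's server API.
import json


def slot_to_introspection_record(slot: dict) -> str:
    """Pick out metadata/hyperparameters worth showing the model, dump as JSON."""
    params = slot.get("params", {})
    record = {
        "id": slot.get("id"),
        "n_ctx": slot.get("n_ctx"),
        "n_past": slot.get("n_past"),
        "n_predict": params.get("n_predict"),
        "seed": params.get("seed"),
        "antiprompt": params.get("antiprompt", []),
        "has_next_token": slot.get("has_next_token"),
    }
    return json.dumps(record, indent=2)


if __name__ == "__main__":
    example_slot = {
        "id": 0,
        "n_ctx": 512,
        "n_past": 0,
        "has_next_token": True,
        "params": {
            "n_predict": 400,
            "seed": 4294967295,
            "antiprompt": ["</s>", "Llama:", "User:"],
        },
    }
    print(slot_to_introspection_record(example_slot))
#+end_src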

#+begin_src input
execute """meta reflective and introspective ideas applied to tensors in a llm and allowing the llm to
sample the metadata and hyperparameters and actual tensors via a data inference api.
"""
#+end_src
#+begin_src output

Reflective and introspective ideas can be applied to tensors in an LLM (Language Model) through the use of metadata, hyperparameters, and a data inference API. The idea is to allow the LLM to sample information about itself, such as its metadata and hyperparameters, as well as actual tensors through the use of a data inference API. This allows for a more dynamic and adaptive approach to training and using an LLM, which can lead to better performance and improved efficiency.
#+end_src
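
For the "actual tensors" part, here is a minimal sketch of what sampling tensor contents into model-readable text could look like, using random numpy arrays as stand-ins for real weights; the tensor names and summary format are made up for illustration.

#+begin_src python
# Sketch: summarize tensors (shape, dtype, basic stats, a few raw values) into
# short text lines an LLM could be prompted with. The arrays here are random
# numpy stand-ins; real weights would come from whatever data inference API
# ends up exposing them.
import numpy as np


def describe_tensor(name: str, t: np.ndarray, n_samples: int = 4) -> str:
    flat = t.reshape(-1)
    idx = np.linspace(0, flat.size - 1, num=min(n_samples, flat.size), dtype=int)
    samples = ", ".join(f"{flat[i]:+.4f}" for i in idx)
    return (f"{name}: shape={list(t.shape)} dtype={t.dtype} "
            f"mean={flat.mean():+.4f} std={flat.std():.4f} samples=[{samples}]")


rng = np.random.default_rng(0)
fake_weights = {
    "tok_embeddings.weight": rng.standard_normal((1000, 64), dtype=np.float32),
    "layers.0.attention.wq.weight": rng.standard_normal((64, 64), dtype=np.float32),
}
for name, tensor in fake_weights.items():
    print(describe_tensor(name, tensor))
#+end_src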

#+begin_src input
execute """meta reflective and introspective ideas applied to tensors in a llm"""
#+end_src
#+begin_src output

Meta-reflective and introspective ideas are concepts that involve understanding one's own thoughts, beliefs, and actions. In the context of tensor processing in artificial intelligence (AI), these ideas can be applied to gain insights into the behavior and performance of AI models.

One way to use meta-reflective and introspective ideas in tensor processing is through techniques like attention mechanisms, saliency maps, and visualization tools. These techniques allow AI models to focus on specific parts of an input or output while processing it, which can help improve the accuracy and interpretability of the model's predictions.

Another way to apply meta-reflective and introspective ideas in tensor processing is through methods like adversarial training and model explainability techniques. Adversarial training involves finding small perturbations to an input that cause the model to make incorrect predictions, which can help identify weaknesses in the model's architecture or decision-making processes. Model explainability techniques involve analyzing the internal workings of a model to understand how it arrives at its predictions, which can help improve trust and understanding of AI systems.

Overall, applying meta-reflective and introspective ideas to tensor processing in AI has the potential to improve the performance and interpretability of AI models, as well as increase our understanding of how these models make decisions and why they behave in certain ways.
#+end_src
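
Of the techniques mentioned in that output, saliency is the easiest to sketch without touching model internals: leave-one-out (occlusion) saliency just re-scores the input with each token removed. The `score` function below is a toy placeholder standing in for a real model's output probability.

#+begin_src python
# Toy occlusion saliency: a token's importance is the drop in score when it is
# removed. `score` is a placeholder for a real model's logit/probability.
def score(tokens: list[str]) -> float:
    # made-up scorer: likes the word "tensors" and slightly rewards length
    return tokens.count("tensors") * 1.0 + 0.01 * len(tokens)


def occlusion_saliency(tokens: list[str]) -> list[tuple[str, float]]:
    base = score(tokens)
    return [(tok, base - score(tokens[:i] + tokens[i + 1:]))
            for i, tok in enumerate(tokens)]


for tok, importance in occlusion_saliency(
        "introspective ideas applied to tensors in a llm".split()):
    print(f"{tok:15s} {importance:+.3f}")
#+end_src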

#+begin_src input
execute """meta reflective and introspective ideas
"""
#+end_src

#+begin_src output

The concept of metareflection refers to the ability of an individual or entity to reflect on their own reflection. This involves recognizing and examining one's own thoughts, feelings, and behaviors in relation to how they are perceived by others. In other words, it is the ability to observe and analyze oneself as an object of reflection.

Introspection, on the other hand, refers to the act of turning one's attention inward towards oneself. It involves examining one's own thoughts, feelings, and experiences from within, often with the goal of gaining a better understanding of oneself and one's place in the world. Introspection can be seen as a more focused and structured form of reflection that is directed towards self-exploration and self-discovery.

Both metareflection and introspection are important for personal growth and development, as they allow individuals to gain insight into their own thoughts, feelings, and behaviors, and to make changes based on this insight. They can also be useful in areas such as communication, relationships, and career development, as they help individuals better understand themselves and others, and to communicate more effectively.
#+end_src

#+begin_src input
execute """
meta reflective and introspective ideas applied to tensors in a llm and allowing the llm to
sample the metadata and hyperparameters and actual tensors via a data inference api.

llama_client_slot {
    id = 0,
    task_id = 0,
    params = slot_params {
        stream = true,
        cache_prompt = true,
        seed = 4294967295,
        n_keep = 0,
        n_predict = 400,
        antiprompt = [
            "</s>",
            "Llama:",
            "User:"
        ],
        input_prefix = "",
        input_suffix = "",
    },
    state = 0,
    command = 1,
    t_last_used = -1,
    n_ctx = 512,
    n_past = 0,
    n_decoded = 0,
    n_remaining = -1,
    i_batch = -1,
    num_prompt_tokens = 0,
    num_prompt_tokens_processed = 0,
    multibyte_pending = 0,
    prompt = "This is a conversation between User and Llama, a friendly chatbot. Llama is helpful, kind, honest, good at writing, and never fails to answer any requests immediately and with precision.\n\nUser: What is the meaning of life?\nLlama: The meaning of life is different for everyone. Some people find their purpose in helping others or pursuing certain interests, while some may find it through spiritual beliefs or simply finding happiness and contentment in everyday life. Ultimately, there is no one true answer to this question as it depends on individual perspectives and values.\nUser: And what is the meaning of life for mistral llm?\nLlama:",
    generated_text = "",
    sampled = 0,
    cache_tokens = [ ],
    generated_token_probs = [ ],
    infill = false,
    embedding = false,
    has_next_token = true,
    truncated = false,
    stopped_eos = false,
    stopped_word = false,
    stopped_limit = false,
    oaicompat = false,
    oaicompat_model = "",
    stopping_word = "",
    sparams = llama_sampling_params { },
    ctx_sampling = &llama_sampling_context { params = llama_sampling_params { }, mirostat_mu = 0, grammar = nullptr, parsed_grammar = grammar_parser::parse_state { } },
    images = [ ],
    sent_count = 0,
    sent_token_probs_index = 0,
    t_start_process_prompt = 0,
    t_start_genereration = 140725474879488,
    t_prompt_processing = 4.67372e-310,
    t_token_generation = 3.06393e-49,
}
"""
#+end_src
#+begin_src output
This appears to be a code snippet for the `llama_client_slot` struct in a language like Python. The struct appears to store information about a particular slot in an LLM (Large Language Model), including its current state and any parameters associated with it.

The `params` field is a nested structure that contains various configuration options for the LLM, such as whether it should use streaming input, cache prompts, and so on. The `sparams` field appears to be another nested structure specifically related to sampling parameters for the LLM.

The `ctx_sampling` field appears to contain a pointer to an object that handles context sampling in the LLM. This is likely used to sample from the metadata and hyperparameters associated with the LLM, as well as any tensors that may be generated during processing. The `images` field appears to store some additional data related to images generated by the LLM.

The `sent_token_probs_index` field appears to contain an index into a larger array of token probability values generated by the LLM. This is likely used to sample from the generated text produced by the LLM.
#+end_src

#+begin_src input
execute """}llama_client_slot {
    id = 0,
    task_id = 8,
    params = slot_params {
        stream = true,
        cache_prompt = true,
        seed = 4294967295,
        n_keep = 0,
        n_predict = 400,
        antiprompt = [
            "</s>",
            "Llama:",
            "User:"
        ],
        input_prefix = "",
        input_suffix = "",
    },
    state = 1,
    command = 0,
    t_last_used = 2962617882040,
    n_ctx = 512,
    n_past = 76,
    n_decoded = 15,
    n_remaining = 386,
    i_batch = 0,
    num_prompt_tokens = 61,
"""
#+end_src
#+begin_src output
This looks like a configuration for a client-side interaction with an LLama model. Here's what each field means:

* `id`: The unique identifier of the client slot.
* `task_id`: The ID of the task associated with this client slot.
* `params`: The parameters used by the LLama model to generate text.
  + `stream`: Whether the model should generate output in a streaming fashion or all at once.
  + `cache_prompt`: Whether the model should cache previous prompts to improve performance.
  + `seed`: A random seed used by the model.
  + `n_keep`: The number of previous outputs to keep for caching.
  + `n_predict`: The number of output tokens to generate.
  + `antiprompt`: A list of special tokens that should be excluded from the input prompt (e.g., "<s>" and "Llama:").
  + `input_prefix`: The prefix used before the user's input prompt.
  + `input_suffix`: The suffix used after the user's input prompt.
* `state`: The current state of the client slot (e.g., active, idle).
* `command`: The command being executed by the client slot.
* `t_last_used`: The timestamp of when this client slot was last used.
* `n_ctx`: The number of context tokens to keep for generating text.
* `n_past`: The number of previous outputs that have been generated.
* `n_decoded`: The number of decoded tokens that have been received from the server.
* `n_remaining`: The number of decoded tokens remaining to be received from the server.
* `i_batch`: The batch index (starting at 0) for generating text in a streaming fashion.
* `num_prompt_tokens`: The total number of prompt tokens that have been sent to the model.
#+end_src
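
Because these dumps are plain `key = value,` text, the flat scalar fields can be recovered programmatically; below is a minimal sketch of that, where nested blocks such as `params` and quoted string values are deliberately skipped.

#+begin_src python
# Minimal parser for the flat "key = value," lines in the slot dumps above.
# Nested structures (params, sparams, ctx_sampling) and quoted strings are
# skipped; this only recovers simple scalars such as n_ctx, n_past, state.
import re

SCALAR_LINE = re.compile(r"^\s*(\w+)\s*=\s*(-?[\d.e+-]+|true|false),?\s*$")


def parse_flat_fields(dump: str) -> dict:
    fields = {}
    for line in dump.splitlines():
        m = SCALAR_LINE.match(line)
        if not m:
            continue
        key, raw = m.groups()
        if raw in ("true", "false"):
            fields[key] = raw == "true"
        else:
            try:
                fields[key] = int(raw)
            except ValueError:
                fields[key] = float(raw)
    return fields


sample_dump = """
state = 1,
command = 0,
n_ctx = 512,
n_past = 76,
n_decoded = 15,
n_remaining = 386,
"""
print(parse_flat_fields(sample_dump))
#+end_src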

#+begin_src output
```python
import json
from typing import List


def parse_generated_token_probs(response: str) -> List[dict]:
    """Parse the per-token probabilities out of a JSON response string."""
    data = json.loads(response)
    generated_token_probs = []

    for completion_token_output in data["completion_token_output"]:
        probs = list(completion_token_output["probs"])
        tok = completion_token_output["tok"]
        text_to_send = completion_token_output["text_to_send"]

        generated_token_probs.append({
            "probs": probs,
            "tok": tok,
            "text_to_send": text_to_send,
        })

    return generated_token_probs
```
You can use this function to parse the generated token probabilities from the provided response. The function returns a list of dictionaries, where each dictionary contains the token probabilities, token ID, and text to send for each completion token output in the response.
#+end_src
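
A quick usage example for the function above, with a fabricated response whose shape matches what that code expects; the field names in a real server response may differ.

#+begin_src python
# Uses parse_generated_token_probs from the block above. The response here is
# fabricated to match the shape that function expects; real field names may differ.
import json

fake_response = json.dumps({
    "completion_token_output": [
        {"tok": 315, "text_to_send": " The", "probs": [0.62, 0.21, 0.09]},
        {"tok": 2834, "text_to_send": " meaning", "probs": [0.48, 0.30, 0.11]},
    ]
})
for entry in parse_generated_token_probs(fake_response):
    print(entry["tok"], entry["text_to_send"], entry["probs"])
#+end_src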

./main > ./log3.txt 2>err3.txt