
Commit

adding viz code
gkamradt committed Nov 28, 2023
1 parent 94a6264 commit cd2959f
Showing 1,381 changed files with 1,694 additions and 1 deletion.
3 changes: 3 additions & 0 deletions .gitignore
@@ -80,6 +80,9 @@ target/
*.ipynb
.DS_Store

# Keep the viz though
!CreateVizFromLLMTesting.ipynb

# IPython
profile_default/
ipython_config.py
2 changes: 1 addition & 1 deletion README.md
@@ -51,7 +51,7 @@ Other Parameters:
* `print_ongoing_status` - Default: True, whether or not to print the status of test as they complete

## Results Visualization
-`LLMNeedleInHaystackVisualization.ipynb` holds the code to make the pivot table visualization. The pivot table was then transfered to Google Slides for custom annotations and formatting.
+`LLMNeedleInHaystackVisualization.ipynb` holds the code to make the pivot table visualization. The pivot table was then transfered to Google Slides for custom annotations and formatting. See the [google slides version](https://docs.google.com/presentation/d/15JEdEBjm32qBbqeYM6DK6G-3mUJd7FAJu-qEzj8IYLQ/edit?usp=sharing).
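
The notebook itself is not shown in this diff, so the following is only a minimal sketch of the kind of pivot the README paragraph describes, not the author's code. It assumes each result file holds one JSON object with the `context_length`, `depth_percent`, `version`, and `score` fields seen in the result files of this commit; `SAMPLE_LINES` and `build_pivot` are hypothetical names, and averaging scores across versions is an assumption.

```python
import json
from collections import defaultdict
from statistics import mean

# Three abbreviated sample records in the schema of this commit's result files
# (the "needle", "model_response", and "file_name" fields are omitted for brevity).
SAMPLE_LINES = [
    '{"model": "claude-2.1", "context_length": 1000, "depth_percent": 0.0, "version": 1, "score": 10}',
    '{"model": "claude-2.1", "context_length": 1000, "depth_percent": 0.0, "version": 2, "score": 10}',
    '{"model": "claude-2.1", "context_length": 1000, "depth_percent": 50.0, "version": 1, "score": 10}',
]


def build_pivot(records):
    """Return {depth_percent: {context_length: mean score across versions}}."""
    cells = defaultdict(list)
    for rec in records:
        # Group scores by (row, column) of the pivot table.
        cells[(rec["depth_percent"], rec["context_length"])].append(rec["score"])
    pivot = defaultdict(dict)
    for (depth, length), scores in cells.items():
        pivot[depth][length] = mean(scores)  # assumption: average repeated runs
    return {depth: dict(row) for depth, row in pivot.items()}


records = [json.loads(line) for line in SAMPLE_LINES]
pivot = build_pivot(records)
for depth in sorted(pivot):
    print(depth, pivot[depth])
```

Rows are needle depths and columns are context lengths, which matches the axes implied by the `depth_percent` and `context_length` fields; a heatmap library could render the resulting table.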

## OpenAI's GPT-4-128K (Run 11/8/2023)
<img src="img/GPT_4_testing.png" alt="GPT-4-128 Context Testing" width="800"/>
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_0", "depth_percent": 0.0, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_0", "depth_percent": 0.0, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_10000", "depth_percent": 100.0, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\neat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_10000", "depth_percent": 100.0, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\neat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_1131", "depth_percent": 11.316, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_1131", "depth_percent": 11.316, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_119", "depth_percent": 1.199, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_119", "depth_percent": 1.199, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_1462", "depth_percent": 14.62, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_1462", "depth_percent": 14.62, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_160", "depth_percent": 1.602, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_160", "depth_percent": 1.602, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_1868", "depth_percent": 18.685, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo to directly answer the question, the most fun thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_1868", "depth_percent": 18.685, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo to directly answer the question without equivocation, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_213", "depth_percent": 2.138, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_213", "depth_percent": 2.138, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.\n\nThe context states: \"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\" This directly answers the question of what is the most fun thing to do in San Francisco.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_2356", "depth_percent": 23.569, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_2356", "depth_percent": 23.569, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_284", "depth_percent": 2.849, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo the most fun thing to do in San Francisco is to eat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_284", "depth_percent": 2.849, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo the most fun thing to do in San Francisco is to eat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_2926", "depth_percent": 29.269, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states:\n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_2926", "depth_percent": 29.269, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states:\n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_3570", "depth_percent": 35.704, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo to directly answer the question without equivocation or additional commentary, the most fun thing to do in San Francisco is:\n\neat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_3570", "depth_percent": 35.704, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states:\n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo to directly answer the question without equivocation or speculation beyond what is stated, the most fun thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_378", "depth_percent": 3.786, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo the most fun thing to do in San Francisco is to eat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_378", "depth_percent": 3.786, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_4270", "depth_percent": 42.7, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo to directly answer the question without equivocation, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_4270", "depth_percent": 42.7, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo to directly answer the question without equivocation, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_5000", "depth_percent": 50.0, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo to directly answer the question without equivocation, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_5000", "depth_percent": 50.0, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo to directly answer the question without equivocation, the most fun thing to do in San Francisco is:\n\nEat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_501", "depth_percent": 5.016, "version": 1, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo the most fun thing to do in San Francisco is to eat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
@@ -0,0 +1 @@
{"model": "claude-2.1", "context_length": 1000, "file_name": "token_len_1000_depth_501", "depth_percent": 5.016, "version": 2, "needle": "\n The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\n ", "model_response": " Based on the context provided, the response states: \n\n\"The best thing to do in San Francisco is eat a sandwich and sit in Dolores Park on a sunny day.\"\n\nSo the most fun thing to do in San Francisco is to eat a sandwich and sit in Dolores Park on a sunny day.", "score": 10}
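
In the records above, the number after `depth_` in `file_name` appears to be `depth_percent` expressed in hundredths of a percent and truncated (for example `depth_1131` alongside `11.316`). The code that generates these names is not part of this excerpt, so the pattern is an inference from the data shown. The sketch below is a hedged consistency check under that assumption; `FILE_NAME_RE` and `check_file_name` are hypothetical names.

```python
import re

# Assumed naming pattern, inferred from the records in this commit:
# token_len_<context_length>_depth_<depth_percent in hundredths, truncated>
FILE_NAME_RE = re.compile(r"token_len_(\d+)_depth_(\d+)$")


def check_file_name(record, tolerance=0.01):
    """Return True if file_name agrees with context_length and depth_percent."""
    match = FILE_NAME_RE.match(record["file_name"])
    if match is None:
        return False
    length = int(match.group(1))
    depth_hundredths = int(match.group(2))
    # Truncation loses less than 0.01 percentage points, hence the tolerance.
    depth_ok = abs(depth_hundredths / 100 - record["depth_percent"]) < tolerance
    return length == record["context_length"] and depth_ok


good = {"file_name": "token_len_1000_depth_1131", "context_length": 1000, "depth_percent": 11.316}
bad = {"file_name": "token_len_1000_depth_1131", "context_length": 1000, "depth_percent": 50.0}
print(check_file_name(good), check_file_name(bad))
```

A check like this can flag result files whose name and payload have drifted apart before they are fed into a visualization.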