
Commit

#1 - Update README.md
obriensystems authored Jan 17, 2025
1 parent 42b32af commit bb0e528
Showing 1 changed file with 7 additions and 10 deletions.
17 changes: 7 additions & 10 deletions README.md
@@ -145,27 +145,24 @@ mp: 0:2610744987 p: 1050 m: 0:966616035460 ms: 67696 dur: 182

## GPU
20250116: The GPU code is CPU-bound for thread processing: when another process holds the CPU at 100%, the GPU run slows by 4-10 times. I need to increase the number of threads sent to the GPU past 64k.
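
The thread counts quoted in the runs below relate to CUDA launch dimensions through a ceiling division of total threads by threads per block. The following is a minimal host-side sketch, assuming a one-dimensional grid; `blocksFor` is a hypothetical helper name for illustration, not a function from this repository.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the repository): number of blocks needed
// to cover `threads` work items with `threadsPerBlock` threads per block,
// assuming a one-dimensional launch.
inline std::uint64_t blocksFor(std::uint64_t threads, std::uint64_t threadsPerBlock) {
    assert(threadsPerBlock > 0);
    // Ceiling division so a partial final block is still counted.
    return (threads + threadsPerBlock - 1) / threadsPerBlock;
}
```

For example, `blocksFor(40960, 512)` gives 80, which matches the 80 blocks quoted for the 40960-thread RTX-4090 run, and `blocksFor(35840, 256)` gives 140 for the 7168*5 configuration. Going past 64k threads at 256 threads/block would mean more than `blocksFor(65536, 256)`, that is more than 256 blocks.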

### Multi Threaded : 40 bit run
#### 128 bit native
##### CUDA 12.6: CPP
### 128 bit native
#### CUDA 12.6: CPP
##### Multi Threaded : 42 bit run
- sec 13900K b 32 core RTX-4090 Ada gpu 0 single 16384 cores 50% GPU 24% TDP 35840 threads 256 threads/block no av - batch 20
##### Multi Threaded : 40 bit run
- 4232 sec 13900K b 32 core RTX-4090 Ada gpu 0 single 16384 cores 50% GPU 24% TDP 35840 threads 256 threads/block no av - batch 20
- 5347 sec 14900K c RTX-A6000 single 60% GPU 54% TDP .5g/48g - 35840k threads / 256 threads/block batch 20 no av
- 5508 sec 13900KS d RTX-A4500 gpu 1 single 3.2/5.9 GHz - 20 batch - no av
- sec P1Gen6 13800H RTX-3500 Ada mobile 5120 cores 60% GPU - 20 batch, 256 7168*5 threads - with path - no av
- sec P1Gen6 13800H RTX-3500 Ada mobile 5120 cores 60% GPU - 20 batch, 256 7168*5 threads - with path - no av

### Multi Threaded : 37 bit run
#### 128 bit native
##### CUDA 12.6: CPP
- 528 sec 13900K b 32 core RTX-4090 Ada single 16384 cores 50% GPU 24% TDP 35840 threads 256 threads/block no av - batch 20
- sec 13900KS d RTX-A4500 gpu 0 single 3.2/5.9 GHz - 20 batch - no av
- 641 sec 13900KS d RTX-A4500 gpu 1 single 3.2/5.9 GHz - 20 batch - no av
- 658 sec 14900K c RTX-A6000 single 55% GPU 45% TDP .5g/48g - 35840k threads / 256 threads/block no av - batch 20? (rerun)
- 799 sec P1Gen6 13800H RTX-3500 Ada mobile 5120 cores 60% GPU - 20 batch, 256 7168*5 threads - with path - no av
-
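
One way to read the 37 bit and 40 bit timings together is to compare the runtime ratio with the growth in search space. The sketch below assumes the search space scales as a power of two in the bit width; `searchSpaceFactor` and `runtimeRatio` are hypothetical helper names, not part of the repository.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: factor by which the search space grows when the
// run widens from `fromBits` to `toBits` (assumes power-of-two scaling).
inline std::uint64_t searchSpaceFactor(unsigned fromBits, unsigned toBits) {
    assert(toBits >= fromBits && toBits - fromBits < 64);
    return std::uint64_t{1} << (toBits - fromBits);
}

// Hypothetical helper: ratio of two measured wall-clock times in seconds.
inline double runtimeRatio(double longerSec, double shorterSec) {
    assert(shorterSec > 0.0);
    return longerSec / shorterSec;
}
```

With the figures listed above, `searchSpaceFactor(37, 40)` is 8, while `runtimeRatio(4232, 528)` is about 8.0 for the 13900K / RTX-4090 rows, `runtimeRatio(5347, 658)` is about 8.1 for the 14900K / RTX-A6000 rows, and `runtimeRatio(5508, 641)` is about 8.6 for the 13900KS / RTX-A4500 gpu 1 rows. That is consistent with runtime growing roughly linearly with the search space, though the A6000 37 bit row is marked for a rerun.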
### Multi Threaded : 32 bit run (search 0-(2^32-1) odd integer space)
#### 128 bit native
##### CUDA 12.6: CPP
- 14 sec 14900K c RTX-A6000 single 55% GPU 45% TDP .5g/48g - 32k threads / 512 threads/block
- 14 sec RTX-4090 Ada single 16384 cores 48% GPU 24% TDP 40960 threads 512 threads/block 80 blocks - batch 20
- 17 sec RTX-4090 Ada single 16384 cores 48% GPU 24% TDP 35840 threads 256/512 threads/block 160 blocks - batch 20
@@ -175,7 +172,7 @@ mp: 0:2610744987 p: 1050 m: 0:966616035460 ms: 67696 dur: 182
- 20 sec RTX-5000 TU104 16g mobile P17gen1
- 24 sec P1Gen6 13800H RTX-3500 Ada mobile 5120 cores 60% GPU - 20 batch, 256 7168*5 threads - with path - no av - cold start no thermal throttling
-
#### 64 bit native
### 64 bit native
Sec: 4 GlobalMax: 319804831 : 1414236446719942480 last search : 1073741825
- 9 sec 14900K c RTX-A6000 8/32c single 45% GPU 24% TDP .9g/48g - 32k threads / 256 threads/block
- 10 sec 13900K b RTX-4090 Ada single 45% GPU 22% TDP .9g/24g 32k threads / 256 threads/block