Help me with my use case plz #38
Comments
Sorry for the late reply. If I understand correctly, you want to get the top n most similar vectors within the same vector set. If so, follow the steps below:
3. Fill data
Note that the numbers 1 and 2 here are the vectors' ids inside simbase; you can set up a map between your own IDs and these ids.
You can then look up the inner id for your ID via the map (say it is 1234) and issue the command:
Hope the above instructions help.

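The ID map mentioned above could be kept on the client side. The following is only a minimal sketch: the `IdMap` class and its method names are hypothetical and not part of simbase, and it assumes inner ids are handed out sequentially starting at 1.

```python
class IdMap:
    """Two-way map between application IDs (e.g. "id_1") and integer inner ids.

    Hypothetical helper, not part of simbase.
    """

    def __init__(self):
        self._to_inner = {}
        self._to_outer = {}

    def inner(self, outer_id):
        """Return the inner id for outer_id, allocating the next integer if it is new."""
        if outer_id not in self._to_inner:
            inner_id = len(self._to_inner) + 1  # assumption: ids start at 1
            self._to_inner[outer_id] = inner_id
            self._to_outer[inner_id] = outer_id
        return self._to_inner[outer_id]

    def outer(self, inner_id):
        """Translate an inner id from a result back to the application ID."""
        return self._to_outer[inner_id]


ids = IdMap()
first = ids.inner("id_1")
second = ids.inner("id_2")
```

The inner id is what would be passed when adding or querying a vector, and `outer` translates the ids in a result back. In practice the map would need to be persisted somewhere so that it survives a restart.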
The first two steps are okay, and the others should be too, thank you! I want to perform mass insertion in redis, but something is going wrong. I saw the issues regarding redis but have not figured it out so far, so I would very much appreciate your help. The vectors have 300 dimensions, so I have I either run this batch file with

The pipe mode of the redis protocol is not implemented, so I think the commands below will work:
redis-cli vadd vector 1 8.748467856397232E-4 0.008283308086295923 0.014330921694636345 0.02630641683936119 ...
redis-cli vadd vector 2 0.032103515822779045 0.019140462851448155 0.035745080137117344 0.025860785591331394 ...

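Since each vector needs its own command, the command lines could be generated from the data file rather than typed by hand. This is a rough sketch only: `vadd_commands` is a hypothetical helper, it assumes one comma-separated vector per line, and the optional `port` argument simply adds redis-cli's `-p` flag.

```python
import shlex


def vadd_commands(lines, key="vector", start=1, port=None):
    """Yield one `redis-cli vadd ...` command string per comma-separated line."""
    prefix = ["redis-cli"]
    if port is not None:
        prefix += ["-p", str(port)]  # -p selects the server port in redis-cli
    for offset, line in enumerate(lines):
        components = [c.strip() for c in line.strip().split(",")]
        args = prefix + ["vadd", key, str(start + offset)] + components
        # Quote every argument so the line is safe to paste into a shell
        yield " ".join(shlex.quote(a) for a in args)


commands = list(vadd_commands(["0.1,0.2\n", "0.3,0.4\n"]))
```

Writing the resulting lines to a shell script runs one redis-cli process per vector, which is slow for large files. A single client connection is likely to be quicker.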
Or you can use Python, for example:
import redis
dest = redis.Redis(host='localhost', port=7654)
with open('csvdatafile.txt') as data:
    for idx, line in enumerate(data):
        # Drop the trailing newline, then split the line into components
        line = line.rstrip('\n')
        components = line.split(',')
        dest.execute_command('vadd', 'vector', idx, *components)

Great, the Python script helped. I changed it a bit, and it looks like this now:
import redis
dest = redis.Redis(host='localhost', port=7654)
with open('tmpFiles/t300.txt') as t300:
    for idx, line in enumerate(t300):
        line = line[:-1]
        b = line.split(' ')
        print("Setting vector dimensions (b300): " + dest.execute_command('bmk', 'b300', line))
        print("And the name (video) of vector set with b300 dimension: " + dest.execute_command('vmk', 'b300', 'video'))
        print("Setting recommender (video->video): " + dest.execute_command('rmk', 'video', 'video', 'cosinesq'))
with open('tmpFiles/batch2.txt') as data:
    for idx, line in enumerate(data):
        line = line[:-1]
        components = line.split(',')
        print("ID:" + str(idx+1) + ": " + dest.execute_command('vadd', 'video', idx+1, *components))
I executed it successfully, but after I try to get some vector:

Could you paste the error from the log file? It is in the log directory.

2017-02-27 19:24:35 INFO SimEngineImpl:313 - loading basis[b300]
2017-02-27 19:24:36 ERROR SimEngineImpl:56 - java.lang.ArrayIndexOutOfBoundsException
com.guokr.simbase.errors.SimException: java.lang.ArrayIndexOutOfBoundsException
at com.guokr.simbase.engine.SimBasis.bload(SimBasis.java:105)
at com.guokr.simbase.engine.SimEngineImpl$3.invoke(SimEngineImpl.java:322)
at com.guokr.simbase.engine.SimEngineImpl$AsyncSafeRunner.run(SimEngineImpl.java:54)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at gnu.trove.list.array.TFloatArrayList.toArray(TFloatArrayList.java:715)
at com.guokr.simbase.store.DenseVectorSet.get(DenseVectorSet.java:124)
at com.guokr.simbase.store.DenseVectorSet.get(DenseVectorSet.java:133)
at com.guokr.simbase.store.Recommendation.<init>(Recommendation.java:58)
at com.guokr.simbase.store.SerializerHelper$RecommendationSerializer.read(SerializerHelper.java:203)
at com.guokr.simbase.store.SerializerHelper.readR(SerializerHelper.java:300)
at com.guokr.simbase.store.SerializerHelper.readRecommendations(SerializerHelper.java:339)
at com.guokr.simbase.engine.SimBasis.bload(SimBasis.java:84)
... 5 more
2017-02-27 20:07:13 INFO SimEngineImpl:313 - loading basis[b300]
2017-02-27 20:07:13 ERROR SimEngineImpl:56 - java.lang.ArrayIndexOutOfBoundsException
com.guokr.simbase.errors.SimException: java.lang.ArrayIndexOutOfBoundsException
at com.guokr.simbase.engine.SimBasis.bload(SimBasis.java:105)
at com.guokr.simbase.engine.SimEngineImpl$3.invoke(SimEngineImpl.java:322)
at com.guokr.simbase.engine.SimEngineImpl$AsyncSafeRunner.run(SimEngineImpl.java:54)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at gnu.trove.list.array.TFloatArrayList.toArray(TFloatArrayList.java:715)
at com.guokr.simbase.store.DenseVectorSet.get(DenseVectorSet.java:124)
at com.guokr.simbase.store.DenseVectorSet.get(DenseVectorSet.java:133)
at com.guokr.simbase.store.Recommendation.<init>(Recommendation.java:58)
at com.guokr.simbase.store.SerializerHelper$RecommendationSerializer.read(SerializerHelper.java:203)
at com.guokr.simbase.store.SerializerHelper.readR(SerializerHelper.java:300)
at com.guokr.simbase.store.SerializerHelper.readRecommendations(SerializerHelper.java:339)
at com.guokr.simbase.engine.SimBasis.bload(SimBasis.java:84)
... 5 more

@mountain any guess? :) Could it be that 300 dimensions are too many for a basis, or that the floating-point values of the vectors are too long?

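One way to narrow this down is to rule out malformed input before inserting anything. The sketch below does not diagnose the exception in the log. It only checks that every line of the data file has the expected number of components and that each one parses as a float. `check_vectors` is a hypothetical helper, and the comma separator is an assumption taken from the scripts in this thread.

```python
def check_vectors(lines, expected_dim, sep=","):
    """Return a list of (line_number, problem) for lines that are not valid vectors."""
    problems = []
    for number, line in enumerate(lines, start=1):
        parts = line.rstrip("\r\n").split(sep)
        if len(parts) != expected_dim:
            problems.append(
                (number, "expected %d components, got %d" % (expected_dim, len(parts)))
            )
            continue
        for part in parts:
            try:
                float(part)
            except ValueError:
                problems.append((number, "not a number: %r" % part))
                break  # report only the first bad component per line
    return problems


sample = ["0.1,0.2,0.3\n", "0.1,0.2\n", "a,b,c\n"]
report = check_vectors(sample, expected_dim=3)
```

If the report is empty for the real file, the input at least matches the dimension of the basis and the problem lies elsewhere.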
I tried simbase with 10d vectors and it works, but as soon as I try vectors with more than 10 dimensions (e.g. 11d; I also tried 13d, 14d, 15d, 25d, 50d) I get the following error:
2017-03-04 20:43:03 INFO SimEngineImpl:385 - basis[b11] created
2017-03-04 20:43:03 INFO SimEngineImpl:460 - vectorset[video] created under basis[b11]
2017-03-04 20:43:03 INFO SimEngineImpl:727 - creating recommendation[video_video] with funcscore[cosinesq]
2017-03-04 20:43:03 INFO SimEngineImpl:740 - recommendation[video_video] created with funcscore[cosinesq]
2017-03-04 20:43:03 ERROR SimEngineImpl:56 -
java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at gnu.trove.list.array.TFloatArrayList.toArray(TFloatArrayList.java:715)
at com.guokr.simbase.store.DenseVectorSet.get(DenseVectorSet.java:124)
at com.guokr.simbase.store.DenseVectorSet.rescore(DenseVectorSet.java:276)
at com.guokr.simbase.store.Recommendation.processDenseChangedEvt(Recommendation.java:129)
at com.guokr.simbase.store.Recommendation.onVectorAdded(Recommendation.java:208)
at com.guokr.simbase.store.DenseVectorSet.add(DenseVectorSet.java:152)
at com.guokr.simbase.engine.SimBasis.vadd(SimBasis.java:153)
at com.guokr.simbase.engine.SimEngineImpl$14.invoke(SimEngineImpl.java:513)
at com.guokr.simbase.engine.SimEngineImpl$AsyncSafeRunner.run(SimEngineImpl.java:54)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Could you please help me with some starter code for my use case?
I want to store the following in a vector similarity db:
key: sentenceID
value: vector
Examples:
id_1 [0.06284283101558685, 0.046207964420318604, 0.0053909290581941605, ...]
id_2 [0.006631242576986551, 0.08234132081270218, -0.0787612572312355, ...]
And then I want the IDs of the top n vectors most similar to a given vector.
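To make the request concrete, here is a brute-force sketch in plain Python of what "top n similar" means. It is not how simbase computes recommendations. It assumes cosine similarity as the score, and it uses tiny made-up 2-d vectors in place of the real data.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_n(query, vectors, n):
    """Return the keys of the n stored vectors most similar to query."""
    scored = [(cosine(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:n]]


# Toy data for illustration only
vectors = {"id_1": [1.0, 0.0], "id_2": [0.0, 1.0], "id_3": [0.9, 0.1]}
nearest = top_n([1.0, 0.0], vectors, 2)
```

This scans every stored vector for each query, so it only suits small collections.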