ValueError: Expect x to be a non-empty array or dataset. #22

Open
emibarrod opened this issue Jul 23, 2021 · 5 comments

Comments

@emibarrod

emibarrod commented Jul 23, 2021

I am trying to create embeddings for some images I downloaded from Google. This is my folder structure:

[image: screenshot of the folder structure]

When I execute this:

image_embeddings.inference.write_tfrecord(image_folder="tmp/test_images",
                                          output_folder="tmp/test_tensors",
                                          num_shards=10)

image_embeddings.inference.run_inference(tfrecords_folder="tmp/test_tensors",
                                         output_folder="tmp/test_output",
                                         batch_size=1000)

[id_to_name2, name_to_id2, embeddings2] = image_embeddings.knn.read_embeddings("tmp/test_output")
index2 = image_embeddings.knn.build_index(embeddings2)

I get

ValueError: Expect x to be a non-empty array or dataset.

Although it fails, files are generated:

[image: screenshot of the generated files]

But if I try to search with that embedding in another index of images that I have,

results = image_embeddings.knn.search(another_index, id_to_name2, embeddings2[0], k=1)
results = [i for i in results if i[1]!=id_to_name2[p]]
image_embeddings.knn.display_results(JPEG_FOLDER, results)

I get:

KeyError: 36

I tried different numbers of shards and different batch sizes. None of them worked. What could the reason be?

Full traces:

ValueError                                Traceback (most recent call last)
<ipython-input-40-405d145f78be> in <module>()
     15 image_embeddings.inference.run_inference(tfrecords_folder="tmp/test_tensors",
     16                                          output_folder="tmp/test_output",
---> 17                                          batch_size=1000)
     18 
     19 [id_to_name2, name_to_id2, embeddings2] = image_embeddings.knn.read_embeddings("tmp/test_output")

3 frames
/usr/local/lib/python3.7/dist-packages/image_embeddings/inference/inference.py in run_inference(tfrecords_folder, output_folder, batch_size)
    154     Path(output_folder).mkdir(parents=True, exist_ok=True)
    155     model = EfficientNetB0(weights="imagenet", include_top=False, pooling="avg")
--> 156     tfrecords_to_write_embeddings(tfrecords_folder, output_folder, model, batch_size)

/usr/local/lib/python3.7/dist-packages/image_embeddings/inference/inference.py in tfrecords_to_write_embeddings(tfrecords_folder, output_folder, model, batch_size)
     90     for shard_id, tfrecord in enumerate(tfrecords):
     91         shard = read_tfrecord(tfrecord)
---> 92         embeddings = images_to_embeddings(model, shard, batch_size)
     93         print("")
     94         print("Shard " + str(shard_id) + " done after " + str(int(time.time() - start)) + "s")

/usr/local/lib/python3.7/dist-packages/image_embeddings/inference/inference.py in images_to_embeddings(model, dataset, batch_size)
    117 
    118 def images_to_embeddings(model, dataset, batch_size):
--> 119     return model.predict(dataset.batch(batch_size).map(lambda image_raw, image_name: image_raw), verbose=1)
    120 
    121 

/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/engine/training.py in predict(self, x, batch_size, verbose, steps, callbacks, max_queue_size, workers, use_multiprocessing)
   1740             callbacks.on_predict_batch_end(end_step, {'outputs': batch_outputs})
   1741       if batch_outputs is None:
-> 1742         raise ValueError('Expect x to be a non-empty array or dataset.')
   1743       callbacks.on_predict_end()
   1744     all_outputs = nest.map_structure_up_to(batch_outputs, concat, outputs)

ValueError: Expect x to be a non-empty array or dataset.

KeyError                                  Traceback (most recent call last)
<ipython-input-58-ad901d8c8c41> in <module>()
----> 1 results = image_embeddings.knn.search(index, id_to_name2, embeddings2[0], k=1)
      2 results = [i for i in results if i[1]!=id_to_name2[p]]
      3 image_embeddings.knn.display_results(JPEG_FOLDER, results)

1 frames
/usr/local/lib/python3.7/dist-packages/image_embeddings/knn/knn.py in search(index, id_to_name, emb, k)
     50 def search(index, id_to_name, emb, k=5):
     51     D, I = index.search(np.expand_dims(emb, 0), k)  # actual search
---> 52     return list(zip(D[0], [id_to_name[x] for x in I[0]]))
     53 
     54 

/usr/local/lib/python3.7/dist-packages/image_embeddings/knn/knn.py in <listcomp>(.0)
     50 def search(index, id_to_name, emb, k=5):
     51     D, I = index.search(np.expand_dims(emb, 0), k)  # actual search
---> 52     return list(zip(D[0], [id_to_name[x] for x in I[0]]))
     53 
     54 

KeyError: 36
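
A possible reading of this second error: `search` looks up every id returned by the index in `id_to_name`, so if the index was built from a different (larger) set of images than the mapping, an id such as 36 may simply not exist in the mapping. The snippet below is a minimal sketch with plain dictionaries to illustrate that mismatch. `safe_lookup` and the sample data are hypothetical and not part of image_embeddings.

```python
def safe_lookup(ids, id_to_name):
    """Map index ids to names and collect ids the mapping does not contain."""
    found, missing = [], []
    for i in ids:
        i = int(i)  # ids coming back from an index are often NumPy integers
        if i in id_to_name:
            found.append(id_to_name[i])
        else:
            missing.append(i)
    return found, missing


# Hypothetical data: the mapping covers 3 images, but the index returned id 36.
id_to_name_small = {0: "a.jpg", 1: "b.jpg", 2: "c.jpg"}
found, missing = safe_lookup([1, 36], id_to_name_small)
print(found, missing)
```

If `missing` is non-empty, the index and the mapping were most likely built from different embedding sets, and the lookup should use the `id_to_name` that belongs to the index being searched.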
@rom1504
Owner

rom1504 commented Jul 23, 2021

What is the stack trace of the first error?

@emibarrod
Author

What do you mean by stack trace? The full trace is the first one at the end of my post.

@rom1504
Owner

rom1504 commented Jul 23, 2021

I mean the trace for this error, raised when you compute the embeddings:

ValueError: Expect x to be a non-empty array or dataset.

The KeyError that follows is most likely a consequence of that first error.

@emibarrod
Author

Yeah, I understand. That's why there are two traces at the end of my issue: the first is the full trace of that first error, and the second is for the second error.

@rom1504
Owner

rom1504 commented Jul 23, 2021

Ah right, I get it.
Can you try with num_shards=1 and batch_size=1?
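
One possible reason this could help: the trace shows `model.predict` being called once per shard, and Keras raises this ValueError when the dataset it receives yields no batches. If there are fewer images than shards, some shards may end up empty. The snippet below is a minimal sketch for spotting such shards before running inference. It assumes the shards are uncompressed TFRecord files, where a file with no records is zero bytes long. `find_empty_shards` is a hypothetical helper, not part of image_embeddings.

```python
from pathlib import Path


def find_empty_shards(tfrecords_folder):
    """Return the sorted names of zero-byte files in the given folder."""
    folder = Path(tfrecords_folder)
    return sorted(
        p.name for p in folder.iterdir() if p.is_file() and p.stat().st_size == 0
    )


# Folder name taken from the issue; skipped if it does not exist locally.
shard_folder = Path("tmp/test_tensors")
if shard_folder.is_dir():
    print("Empty shards:", find_empty_shards(shard_folder))
```

If any shard shows up as empty, lowering `num_shards` so that every shard receives at least one image is worth trying.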
