k-means anchors ratios #308
Comments
@mnslarcher Good to see this. |
Thanks @Cli98 ! |
@mnslarcher lol, you claimed that your repo is faster than mine. I don't agree with that, since I see no reason to justify it. |
Hi @Cli98 :)! I'm saying this because you use the implementation of https://github.com/zhouyuangan/K-Means-Anchors. From that implementation I removed this [code snippet elided in this copy] by vectorizing the IoU calculation. From your notebook:
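(The snippet did not survive in this copy of the thread; judging from the benchmark comment further down, it is presumably the per-box loop from the zhouyuangan implementation:)

```python
# reconstructed from the benchmark comment below, where this loop is quoted in full
# (the function is simply called iou in the original implementation)
for row in range(rows):
    distances[row] = 1 - not_vectorized_iou(boxes[row], clusters)
```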
This is mine:
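(This is the line quoted again in the next comment:)

```python
nearest_centroids = np.argmax(iou(boxes, centroids), axis=1)
```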
This made the implementation faster, as one would expect. I have an option to run K-Means more than once, but if you set --num-runs=1 you should see some benefit. Obviously feel free to correct me, but this is the reason. |
Where is your vectorized iou code? Are you referring to this? `nearest_centroids = np.argmax(iou(boxes, centroids), axis=1)` |
In that line I'm using the vectorized iou. This is my implementation: [the `iou` function, quoted in full in the benchmark comment below]
This is the original one: [the `not_vectorized_iou` function, also quoted below]
As you can see, one returns an (n, k) array and the other a (k,) array. |
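(A quick way to see the difference in action, assuming the two functions quoted in full in the benchmark comment below:)

```python
import numpy as np

boxes = np.random.rand(5, 2) + 0.1    # n = 5 boxes as (width, height)
anchors = np.random.rand(3, 2) + 0.1  # k = 3 anchors as (width, height)

print(iou(boxes, anchors).shape)                    # (5, 3): all pairs in one call
print(not_vectorized_iou(boxes[0], anchors).shape)  # (3,): one box per call
```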
@mnslarcher I see. |
@Cli98, I did it but honestly I don't remember the exact result; maybe I will re-do it this weekend and post it here :)
Why do you say that we have ~50 bboxes? In my dataset I have 100k images with more or less 10 boxes each, so 1M bboxes, far more than 50 if I get your point (probably I misunderstood you). Clearly, if your dataset is small, the benefit is basically zero. In general the speed increase is not a crucial feature of my repo, it's only a plus; other than that, I added a couple of functionalities that were useful for me and I will continue to add more. I have not done anything extraordinary, but if anyone wants to try it, I will be happy to receive any feedback. |
Hi @Cli98, I did the speed test and the result, if I haven't made any mistake, is that my implementation is much faster (70x faster on COCO train 2017). I saw in a comment (that you later edited) that you didn't find any evidence of this, so I'll post the results here along with how to replicate them. It was a very quick experiment, so feel free to tell me if you spot any mistake; I didn't use timeit only because I didn't want to wait, but feel free to replace `%%time` with `%%timeit`. The results (mine vs. yours): for COCO, mine takes just 6.5s, yours 7min 24s, and both return the same result (mine is 70x faster in this experiment). Get the bboxes:

```python
import json
import numpy as np
import os
INSTANCES_PATH = "./annotations/instances_train2017.json"
if not os.path.exists(INSTANCES_PATH):
    !wget http://images.cocodataset.org/annotations/annotations_trainval2017.zip
    !unzip annotations_trainval2017.zip

with open(INSTANCES_PATH) as f:
    instances = json.load(f)

bboxes = np.array([i["bbox"][2:] for i in instances["annotations"] if np.prod(i["bbox"][2:]) > 0])
del instances
```

Define the functions (the `not_vectorized_*` ones are the original functions that you are also using in your notebook, the others are the ones that I defined for my repo):

```python
import logging
from datetime import datetime

logger = logging.getLogger(__name__)  # used by kmeans() below


def not_vectorized_iou(box, clusters):
    """
    Calculates the Intersection over Union (IoU) between a box and k clusters.

    :param box: tuple or array, shifted to the origin (i.e. width and height)
    :param clusters: numpy array of shape (k, 2) where k is the number of clusters
    :return: numpy array of shape (k,) where k is the number of clusters
    """
    x = np.minimum(clusters[:, 0], box[0])
    y = np.minimum(clusters[:, 1], box[1])
    if np.count_nonzero(x == 0) > 0 or np.count_nonzero(y == 0) > 0:
        raise ValueError("Box has no area")
    intersection = x * y
    box_area = box[0] * box[1]
    cluster_area = clusters[:, 0] * clusters[:, 1]
    iou_ = intersection / (box_area + cluster_area - intersection)
    return iou_

def not_vectorized_kmeans(boxes, k, dist=np.median):
    """
    Calculates k-means clustering with the Intersection over Union (IoU) metric.

    :param boxes: numpy array of shape (r, 2), where r is the number of rows
    :param k: number of clusters
    :param dist: distance function
    :return: numpy array of shape (k, 2)
    """
    rows = boxes.shape[0]
    distances = np.empty((rows, k))
    last_clusters = np.zeros((rows,))
    np.random.seed()
    # the Forgy method will fail if the whole array contains the same rows
    clusters = boxes[np.random.choice(rows, k, replace=False)]
    while True:
        for row in range(rows):
            distances[row] = 1 - not_vectorized_iou(boxes[row], clusters)
        nearest_clusters = np.argmin(distances, axis=1)
        if (last_clusters == nearest_clusters).all():
            break
        for cluster in range(k):
            clusters[cluster] = dist(boxes[nearest_clusters == cluster], axis=0)
        last_clusters = nearest_clusters
    return clusters

def iou(boxes, anchors):
    """Calculates the Intersection over Union (IoU) between a numpy array of
    n boxes and an array of k anchors.

    Arguments:
        boxes (array_like): array of shape (n, 2) of boxes' widths and heights.
        anchors (array_like): array of shape (k, 2) of anchors' widths and heights.

    Returns:
        A numpy array of shape (n, k) containing the IoU values for each
        combination of boxes and anchors.
    """
    # broadcasting (k, 1) against (n,) yields (k, n); the transpose is (n, k)
    intersection_width = np.minimum(anchors[:, [0]], boxes[:, 0]).T
    intersection_height = np.minimum(anchors[:, [1]], boxes[:, 1]).T

    if np.any(intersection_width == 0) or np.any(intersection_height == 0):
        raise ValueError("Some boxes have zero size.")

    intersection_area = intersection_width * intersection_height
    boxes_area = np.prod(boxes, axis=1, keepdims=True)        # (n, 1)
    anchors_area = np.prod(anchors, axis=1, keepdims=True).T  # (1, k)
    union_area = boxes_area + anchors_area - intersection_area
    return intersection_area / union_area

def kmeans(boxes, num_clusters=3, max_iter=300, seed=None, centroid_calc_fn=np.median):
    """Calculates K-Means clustering using the Intersection over Union (IoU) metric.

    Arguments:
        boxes (array_like): array of the bounding boxes' heights and widths,
            with shape (n, 2).
        num_clusters (int): the number of clusters to form as well as the
            number of centroids to generate.
        max_iter (int, optional): maximum number of iterations of the K-Means
            algorithm for a single run. Default: 300.
        seed (int, optional): random seed. Default: None.
        centroid_calc_fn (function, optional): function used for calculating
            centroids. Default: numpy.median.

    Returns:
        A numpy array of shape (num_clusters, 2).
    """
    np.random.seed(seed)
    last_nearest_centroids = np.ones(boxes.shape[0]) * -1
    # the Forgy method will fail if the whole array contains the same rows
    centroids = boxes[np.random.choice(boxes.shape[0], num_clusters, replace=False)]
    i = 0
    while True:
        if i >= max_iter:
            logger.warning(
                f"[{datetime.now().strftime('%m/%d %H:%M:%S')}] "
                "Maximum number of iterations reached, increase max_iter to do "
                f"more iterations (max_iter = {max_iter})"
            )
            break
        nearest_centroids = np.argmax(iou(boxes, centroids), axis=1)
        for centroid in range(num_clusters):
            centroids[centroid] = centroid_calc_fn(
                boxes[nearest_centroids == centroid], axis=0
            )
        if (nearest_centroids == last_nearest_centroids).all():
            break
        last_nearest_centroids = nearest_centroids
        i += 1
    return centroids
```

Run the experiment that you wish, e.g.:
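(The original timing cells were posted as screenshots; here is a minimal reconstruction, assuming the `bboxes` array built above. Since `%%time` is a cell magic, each call goes in its own notebook cell.)

```python
%%time
# the original per-box implementation
not_vectorized_kmeans(bboxes, k=3)
```

```python
%%time
# the vectorized implementation
kmeans(bboxes, num_clusters=3)
```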
Result: on COCO train 2017, `not_vectorized_kmeans` takes 7min 24s while `kmeans` takes 6.5s (the original timing outputs were screenshots). |
Apologies for the late reply. I'm referring to a single image instead, not all of them. Normally we do not conduct k-means experiments on all bounding boxes, given their large variation, not to mention the class imbalance issue. |
@mnslarcher I actually modified my comment and deleted the second sentence, but I have no idea why you can still see my previous statement. I actually ran my experiment on all code snippets and found that, averaging 1000 trials per 100 bboxes, they give similar performance (38 vs. 57 ms). But you've got a different way to time it, and I think it also makes sense. So I think that's the end of this discussion and I will try to do some optimization for my code XD |
Mmm, I'm not sure I get your point. I understand that maybe you don't want to run K-Means on all your bounding boxes if your dataset is not balanced (mine is balanced), but I don't understand the utility of running K-Means on just one image to decide anchors ratios. But it's late where I am and I'm tired, so I'll go to sleep now :)... |
"running K-Means on just one image to..." |
Actually, I think `sklearn.cluster.KMeans` is much faster, considering it's a C implementation. |
Probably, but what we call K-Means here is K-Means with IoU as the distance metric (it's the main contribution of this kind of repo), and to my current understanding (in the beginning I tried to see if it was possible to use sklearn) it is not straightforward to implement K-Means with IoU as the distance metric using sklearn. If someone knows how to do the same using that implementation of K-Means, I would be curious to see it and eventually update my code. The first problem is that that implementation doesn't let you choose a distance metric at all. Then there are other things you need to consider for this implementation; if you look at the k-means implementation of these repos you should be able to see it 😊. |
@mnslarcher
But yours is obviously much faster and more efficient, because sklearn uses all available CPUs while yours only takes one. And if the statistics are right, I think both of them give a similarly correct result. I think using IoU and euclidean distance is quite similar for this kind of situation, but yes, IoU is more straightforward. Just realized this is why IoU loss beats L1 loss in training bbox regression models. Your algorithm: [timing output elided] sklearn's algorithm: [timing output elided] |
Nice! Yes, I think it's true that one can get a decent approximation of K-Means with IoU using regular K-Means with euclidean distance; in both cases you end up with reasonable anchors ratios. |
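(A minimal sketch of that euclidean approximation, assuming scikit-learn is available. Normalizing each box to unit area so that the clustering captures aspect ratio rather than absolute size is my assumption, not something specified in the thread:)

```python
import numpy as np
from sklearn.cluster import KMeans

def euclidean_anchor_ratios(bboxes, num_clusters=3, seed=None):
    # Divide each (w, h) by sqrt(w * h) so every box has unit area and only
    # its aspect ratio remains; plain euclidean K-Means then clusters ratios.
    normalized = bboxes / np.sqrt(bboxes.prod(axis=1, keepdims=True))
    km = KMeans(n_clusters=num_clusters, random_state=seed).fit(normalized)
    return km.cluster_centers_
```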
@mnslarcher
The ratio of bboxes with similar anchors decreases by 1% after using the optimal anchors. Also, the reason why it sometimes doesn't help after the fitting is intriguing. |
It makes sense: in the end K-Means is not trying to maximize the coverage, it is doing something correlated but not exactly the same. In particular, it is possible that the default anchors have a lower avg IoU with the bounding boxes but an IoU > 0.5 more frequently, while the avg IoU of the K-Means anchors is probably higher but they sacrifice some bounding boxes for it. At least this is my current intuition. I think that default > k-means almost always happens when both already have a very good coverage, after removing the effect of the anchors sizes (like in the case that you reported). To give you an example of the limitations of K-Means: I have one detector that needs to detect electric poles. In almost all the images, for every pole, height >> width, but I also have some cases where the helicopter took the photo from a strange perspective and some pole's height << width. In my script you can increase [parameter elided in this copy]. My solution for these cases is to run the algorithm once with [settings elided]. I'll add the comparison with the default anchors in the output of the script, thanks! :) |
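(A sketch of the two quantities being contrasted above, average best IoU vs. coverage at IoU > 0.5, assuming the vectorized `iou()` from the benchmark comment:)

```python
import numpy as np

def avg_iou(bboxes, anchors):
    # mean over boxes of the best IoU any anchor achieves (roughly what K-Means optimizes)
    return iou(bboxes, anchors).max(axis=1).mean()

def coverage(bboxes, anchors, threshold=0.5):
    # fraction of boxes having at least one anchor with IoU above the threshold
    return (iou(bboxes, anchors).max(axis=1) > threshold).mean()
```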
Quick update: I did what @zylo117 suggested; now the output is something like this: [example output elided]
As you can see, I decided to report the number of bboxes without similar anchors instead of the number of bboxes with similar anchors, because it is easier to interpret when the anchors have a high coverage. |
@mnslarcher and @zylo117 I don't think this comparison should be suggested, as the result of k-means is not stable and highly depends on the distribution of the input data. To prove this, run the experiment multiple times: the avg IoU may not be the same, and thus it is highly possible that you may see the AP difference flip from -1 to 1. A totally invalid conclusion. So I do not think there is any way to solve this, even with the proposed solution discussed above. Let me know your thoughts. If you ask me, I will take the one that differs from the default, even if the performance drops, given that the difference is not too large, as the bbox distribution in a customized dataset must have some difference compared with the default. |
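(For what it's worth, a quick way to probe that instability, assuming the `kmeans()`/`iou()` functions and the `bboxes` array from the benchmark comment:)

```python
# rerun the clustering with different initializations and compare the avg IoU
for seed in range(5):
    anchors = kmeans(bboxes, num_clusters=3, seed=seed)
    print(f"seed={seed}: avg IoU={iou(bboxes, anchors).max(axis=1).mean():.4f}")
```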
It's still a trade-off here (number of anchors vs. AP) and thus not optimal. The better solution is to jump out of the box. |
I don't agree. In my implementation I have a [parameter; the inline snippet was lost in this copy].
This doesn't make sense to me. It's true that in most cases the default anchors are not the best ones, but there is no logical reason to think that some other boxes are better just because they are different (especially if you have a metric that is telling you the opposite). Following this logic, using default anchors + random noise would be better than using default anchors. There is a proper way to measure how good the anchors are: you can use a validation set (in this case I don't think we ever over-fit the training set, so an evaluation on the training set is also OK).
I don't understand what you mean by "jump out of the box". In any case I agree, it is not optimal, it is just a heuristic that I use to balance the limitations of K-Means. This is an example where this heuristic works: [3, 10] * 1000 [the rest of the example was lost in this copy]. In this case, with [details elided]... |
@mnslarcher Are we on the same page? I just pointed out my thought that the difference used to decide which result is more useful may not lead to a correct conclusion.
Interesting to see you get exactly the same result. If this holds, I admit that the decision holds. But k-means is not a stable algorithm, is it? Is the initialization stable? How could you produce the same result? By doing what? I did not see any part where you actually use a seed; have you done it any other way?
Let me ask you a question: for a given customized dataset, I run your algorithm and get the same results as the COCO default [(1, 1), (1.4, 0.7), (0.7, 1.4)], while I conduct EDA and find the average ratio < 1. If I get the k-means result [(1, 1), (0.7, 1.4)] and find it lower than the default, which should I choose? Noise can be present anywhere in this world; running k-means cannot kick all of it out.
Validation is a way, though not effective to me. |
Sorry @Cli98, I'm not sure I'm following you 100%; in any case, just to clarify: [the numbered list of clarifications was lost in this copy]
I'm not sure if I answered you, given that I'm not sure I understood your point. |
@mnslarcher So here is my point. I am not saying your k-means algorithm is incorrect. I'm just wondering if using the difference to measure the result of k-means really works: since the result is not stable, such a comparison may not really make sense. Just my thought. So I guess your point is that such randomness may not change the conclusion, as it is to some degree stable. This may not always hold from my perspective; it actually depends on the difficulty of the dataset. And we agree with each other on this. That's enough, I guess. "I'm reporting my empirical evidence resulting from running K-Means many times with different initializations and some different datasets" — I'm glad to see any reports in the near future. And this follows your concern #3. My point is: since you judge the result of k-means by AP, you may see the difference work. But how effective judging by AP is remains controversial to me. Let's see if your experiments can justify this. |
Hi,
If someone wants to give it a try: like https://github.com/Cli98/anchor_computation_tool, I also adapted the code of https://github.com/zhouyuangan/K-Means-Anchors to obtain anchors ratios using K-Means.
The main difference is that my implementation is faster (fewer for loops), takes as input a .json file (not an .xml) and has some additional settings (but the Changlin Li repo has more visualization).
https://github.com/mnslarcher/kmeans-anchors-ratios
If you try my code on COCO you will obtain anchors ratios in line with this repo's (but for your dataset they can be different, and this is why it's useful).
Any feedback is appreciated.
UPDATE: I added the commands to get the optimal anchors ratios for COCO; the returned ratios are [(0.6, 1.5), (1.0, 1.0), (1.4, 0.7)], very near to [(0.7, 1.4), (1.0, 1.0), (1.4, 0.7)] (the default values of this repo).
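(For context, a sketch of how raw K-Means centroids in (width, height) form can be turned into ratios like the ones above; normalizing each centroid to unit area is my assumption, consistent with the reported values, e.g. 1.4 × 0.7 ≈ 1:)

```python
import numpy as np

# centroids come from the kmeans() function quoted earlier in the thread
centroids = kmeans(bboxes, num_clusters=3)
ratios = centroids / np.sqrt(centroids.prod(axis=1, keepdims=True))
print(sorted(tuple(round(float(v), 1) for v in r) for r in ratios))
# e.g. [(0.6, 1.5), (1.0, 1.0), (1.4, 0.7)] on COCO train 2017
```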