Why is the distance set to 0? #1

scott198510 · 2022-05-03T02:35:32Z

    for i in range(0,15):
        max_index = np.argmax(distances)
        i1, i2 = np.unravel_index(max_index, distances.shape)
        distances[i1,i2] = 0.0

Over multiple iterations, the current maximum distance is set to 0. Why?


Lukas-Justen · 2022-06-25T16:35:32Z

@scott198510 I am not entirely sure why we did that. If I remember correctly, the issue was that some clusters still contained points that did not really fit a line segment because they were "far" away from the actual line. You can consider these points noise.

As a result, the maximum distances tended to be between these outliers and the far end of the line. I tried to make a small visualization that might help to understand the issue. The orange point is the outlier, and the red prototype lines are the longest ones. Discarding the top N lines (by setting their distances to 0) should make the resulting line look much better. In the end, you should be left with the green line.
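
To make the discarding step concrete, here is a minimal sketch, not the repository's actual code. It assumes `distances` is a pairwise Euclidean distance matrix over the points of one cluster; the function name `longest_pair_after_discarding` and the parameter `n_discard` are made up for illustration. Only the upper triangle is kept so that each pair appears once and zeroing one entry removes the whole pair.

```python
import numpy as np

def longest_pair_after_discarding(points, n_discard=15):
    """Zero out the n_discard largest pairwise distances, then return
    the indices of the longest remaining pair (hypothetical helper)."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances; np.triu keeps each pair only once.
    diffs = points[:, None, :] - points[None, :, :]
    distances = np.triu(np.linalg.norm(diffs, axis=-1))
    for _ in range(n_discard):
        # Same pattern as in the question: find the current maximum
        # and set it to 0 so the next iteration finds the next largest.
        max_index = np.argmax(distances)
        i1, i2 = np.unravel_index(max_index, distances.shape)
        distances[i1, i2] = 0.0
    i1, i2 = np.unravel_index(np.argmax(distances), distances.shape)
    return int(i1), int(i2)

# Ten points on a horizontal line plus one outlier far above it (index 10).
points = [(x, 0.0) for x in range(10)] + [(5.0, 50.0)]
with_outlier = longest_pair_after_discarding(points, n_discard=0)
without_outlier = longest_pair_after_discarding(points, n_discard=10)
```

In this toy example the outlier takes part in the ten longest pairs, so `n_discard` has to be at least 10 before the remaining maximum connects the two real ends of the line. A fixed N only works when it covers all of the outlier pairs.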

I guess there are more sophisticated ways to solve that kind of issue. For instance, you could use a linear or polynomial regression model to better approximate the line and ignore outliers. This is also mentioned in the future work section.
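
A rough sketch of the regression idea, assuming plain least squares via `np.polyfit` and a simple residual cutoff. The function name `fit_line_ignoring_outliers` and the factor `k` are made up for illustration, and nothing like this exists in the repository as far as this thread says.

```python
import numpy as np

def fit_line_ignoring_outliers(x, y, k=3.0):
    """Fit y = slope * x + intercept, drop points whose residual is more
    than k times the median residual, then refit (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # First pass: ordinary least squares on all points.
    slope, intercept = np.polyfit(x, y, deg=1)
    residuals = np.abs(y - (slope * x + intercept))
    # Keep points close to the first fit; the small constant avoids
    # rejecting everything when the first fit is already exact.
    keep = residuals <= k * np.median(residuals) + 1e-9
    # Second pass: refit on the inliers only.
    slope, intercept = np.polyfit(x[keep], y[keep], deg=1)
    return slope, intercept, keep

# Points on y = 2x + 1 plus one outlier at (5, 50).
x = list(range(10)) + [5]
y = [2 * xi + 1 for xi in range(10)] + [50]
slope, intercept, keep = fit_line_ignoring_outliers(x, y)
```

Note that regressing y on x cannot represent vertical segments, so a real implementation would probably need a different parameterization or a robust estimator.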
