@scott198510 I am not entirely sure why we did that. If I remember correctly, the issue was that some clusters still contained points that did not really fit a line segment because they were "far" away from the actual line. You can treat these points as noise.
As a result, the maximum distances tended to be between these outliers and the opposite end of the line. I tried to make a small visualization that might help to understand the issue. The orange point is the outlier, and the red prototype lines are the longest ones. By discarding the top N longest lines, the resulting line should look much better. In the end, you should be left with the green line.
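To make the idea concrete, here is a minimal sketch of "discard the top N longest candidate lines". It is not the repository's actual code: the function name `endpoints_after_discarding`, the parameter `n_discard`, and the sample cluster are all made up for illustration, and it assumes the candidate lines are simply all point pairs within one cluster.

```python
from itertools import combinations
from math import dist


def endpoints_after_discarding(points, n_discard):
    """Return the longest remaining point pair after skipping the n_discard longest ones.

    Hypothetical helper: candidate lines are all point pairs of the cluster.
    """
    # Rank every candidate line (point pair) from longest to shortest.
    pairs = sorted(
        combinations(points, 2),
        key=lambda pair: dist(pair[0], pair[1]),
        reverse=True,
    )
    if n_discard >= len(pairs):
        raise ValueError("n_discard must be smaller than the number of point pairs")
    return pairs[n_discard]


# Five points on y = 0 plus one noisy point at (10, 6) near the right end.
cluster = [(0.0, 0.0), (2.5, 0.0), (5.0, 0.0), (7.5, 0.0), (10.0, 0.0), (10.0, 6.0)]

with_outlier = endpoints_after_discarding(cluster, 0)     # longest pair ends at the outlier
without_outlier = endpoints_after_discarding(cluster, 1)  # longest pair after dropping one
print(with_outlier, without_outlier)
```

In this toy cluster a single discard is enough, because only one outlier pair is longer than the true line. With more outliers, or outliers closer to the line, N would have to be tuned, which is one reason this remains a workaround.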
I guess there are more sophisticated ways to solve that kind of issue. For instance, you could use a linear or polynomial regression model to better approximate the line and ignore outliers. This is also mentioned in the future work section.
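As a rough illustration of the regression idea (not something that exists in the project), the sketch below fits a straight line with ordinary least squares, drops points whose residual is well above the median residual, and refits once. The names `fit_line` and `fit_line_ignoring_outliers`, the `threshold` value, and the sample cluster are assumptions for the example. It also assumes the line is not vertical.

```python
from statistics import median


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all x values are equal; a vertical line cannot be fitted this way")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_line_ignoring_outliers(points, threshold=2.5, floor=1e-9):
    """Fit once, drop points with large residuals, then refit on the rest.

    threshold and floor are illustrative values, not tuned for real data.
    """
    slope, intercept = fit_line(points)
    residuals = [abs(y - (slope * x + intercept)) for x, y in points]
    # Points further than threshold * median residual count as outliers.
    cutoff = threshold * max(median(residuals), floor)
    inliers = [p for p, r in zip(points, residuals) if r <= cutoff]
    return fit_line(inliers), inliers


# Five points on y = 0 plus one noisy point at (10, 6).
cluster = [(0.0, 0.0), (2.5, 0.0), (5.0, 0.0), (7.5, 0.0), (10.0, 0.0), (10.0, 6.0)]
(slope, intercept), inliers = fit_line_ignoring_outliers(cluster)
print(slope, intercept, inliers)
```

A single trimming pass like this is the simplest variant. A more robust estimator would probably behave better on real data, but that is outside the scope of this thread.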
When using multiple iterations, the distance is set to 0. Why?