-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.Rmd
83 lines (66 loc) · 2.62 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
---
output: github_document
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
# FuzzyDBScan
This package implements fuzzy DBScan with fuzzy core and fuzzy border.
Therefore, it provides a method to initialize and run the algorithm and a
function to predict new data w.t.h. of `R6`.
The package is build upon the paper "Fuzzy Extensions of the DBScan algorithm"
from Dino Ienco and Gloria Bordogna.
The predict function assigns new data based on the same criteria as the algorithm itself.
However, the prediction function freezes the algorithm to preserve the trained
cluster structure and treats each new prediction object individually.
## Installation
You can install the development version of FuzzyDBScan from [GitHub](https://github.com/) with:
``` {r, results='hide', message=FALSE}
# install.packages("devtools")
devtools::install_github("henrifnk/FuzzyDBScan")
```
## Example
The following example shows how Fuzzy DBScan works with the `multishapes` data set
from the `factoextra` package.
We set the range of $\epsilon \in [0, 0.2]$.
Note that setting $\epsilon_{min} = 0$ implies that we expect fuzzieness through
the entire core.
The range of neighbors is set to the interval of $[3, 15]$ where $pts_{min} = 3$
means that we need at least three points to detect a fuzzy core point.
```{r example, message=FALSE}
library(factoextra)
dta = multishapes[, 1:2]
eps = c(0, 0.2)
pts = c(3, 15)
```
Next, we train the DBScan based on `dta`, `eps` and `pts`.
This is done by initializing the `R6` object.
`FuzzyDBScan` contains a scatterplot method, where the clusters (colours) and
fuzzieness (transparency) are plotted for any two features.
```{r}
library(FuzzyDBScan)
cl = FuzzyDBScan$new(dta, eps, pts)
cl$plot("x", "y")
```
`FuzzyDBScan` is equipped with a prediction method.
This method freezes the algorithm such that new data points are not used for
updating the cluster structure itself.<s
Each new data point is then assigned a cluster and fuzziness individually by the
same rules as during training.
```{r}
x <- seq(min(dta$x), max(dta$x), length.out = 50)
y <- seq(min(dta$y), max(dta$y), length.out = 50)
p_dta = expand.grid(x = x, y = y)
p = cl$predict(p_dta, FALSE)
ggplot(p, aes(x = p_dta[, 1], y = p_dta[, 2], colour = as.factor(cluster))) +
geom_point(alpha = p$dense)
```
## Reference
- Ienco, Dino, and Gloria Bordogna. [Fuzzy extensions of the DBScan clustering
algorithm](https://doi.org/10.1007/s00500-016-2435-0). Soft Computing 22.5 (2018): 1719-1730.