-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathREADME
118 lines (81 loc) · 2.4 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
PLCM is a frequent itemset mining algorithm based on the LCM algorithm.
##################
#### Download ####
##################
Latest version at: http://www.lamsade.dauphine.fr/~bnegrevergne/webpage/software/plcm/plcm_latest.tar.gz
(or on github https://github.com/bnegreve/plcm)
##################################
##### Reference publications #####
##################################
Discovering Closed Frequent Itemsets on Multicore:
Parallelizing Computations and Optimizing Memory Accesses.
Benjamin Negrevergne Alexandre Termier Jean-François Méhaut And
Takeaki Uno.
LCM ver. 2: Efficient Mining Algorithms for
Frequent/Closed/Maximal Itemsets
Takeaki Uno1, Masashi Kiyomi, Hiroki Arimura2
###################
##### Compile #####
###################
./configure
make
make install
### debug mode ###
./configure --enable-debug=yes
make
make install
#################
###### Run ######
#################
plcm <dataset file> <absolute threshold> <output file> [ -t number of threads ]
(If output is '-' then plcm dumps itemsets on the standard output.)
#######################
##### File format #####
#######################
### Input file ###
Dataset must be a single ASCII file.
Each line is a transaction.
Each transaction contains distinct items.
Transaction must be ordered.
Last line must be empty.
eg.
--- file test.dat ---
1 2 4 6
1 2 3 5 7
2 3
is a valid dataset.
### Output file ###
plcm output files have the following format:
* each line is a frequent closed itemset
* frequency is stored at the end of the line into brackets.
eg.
running plcm on test.dat dataset, with the following command
./plcm test.dat 2 out.dat -t 1
will generate the following file
--- file out.dat ---
2 (3)
1 2 (2)
3 2 (2)
You can change the output format by modifying the dumpItemset()
function in plcm.hpp and plcm.cpp
################
##### BUGs #####
################
Report bugs and/or comments at:
My FirstName is Benjamin
My LastName is Negrevergne
If you observe a bug, please compile in debug mode and report the error message if any.
(To compile in debug mode use ./configure --enable-debug=yes.)
######################
##### Developers #####
######################
To checkout the source tree:
git clone <plcm_git_repos> plcm
cd plcm
git submodule init
git submodule update
Compile with the standard procedure (see Compile section)
./configure
make
make install