Machine Learning exercises for ttHyy
First, check out the code:
git clone
cd ttHyyML
Then, set up the virtualenv:
lsetup root
lsetup python
virtualenv --python=python2.7 ve
source ve/bin/activate
Then, install the necessary packages:
pip install pip --upgrade
pip install theano keras h5py scikit-learn matplotlib tabulate
pip install --upgrade
If this is the first time you are using keras, you will want to change the backend to theano instead of the default tensorflow.
To do this, edit the appropriate line in ~/.keras/keras.json
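The backend change can also be scripted. The following is a minimal sketch, assuming the configuration file is JSON with a top-level "backend" key (which is how Keras stores it); the helper name set_keras_backend is hypothetical and not part of this repository.

```python
import json
import os


def set_keras_backend(config_path, backend="theano"):
    """Set the "backend" entry of a Keras JSON config, keeping all other keys."""
    config = {}
    if os.path.exists(config_path):
        with open(config_path) as handle:
            config = json.load(handle)
    config["backend"] = backend
    # Create the parent directory if the config has never been written before.
    config_dir = os.path.dirname(config_path)
    if config_dir and not os.path.isdir(config_dir):
        os.makedirs(config_dir)
    with open(config_path, "w") as handle:
        json.dump(config, handle, indent=4)
    return config
```

Calling set_keras_backend(os.path.expanduser("~/.keras/keras.json")) would then switch the backend while leaving the other settings in the file untouched.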
After setting up the environment for the first time, you can return to this setup by sourcing the repository's setup script, which is equivalent to:
lsetup root
source ve/bin/activate
export PATH="`pwd`:${PATH}"
The main steering macro will:
- load the data;
- split it into testing and training samples;
- train and test the neural net; and
- plot and save the ROC curve.
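To make the split and ROC steps above concrete, here is a minimal standard-library sketch. It is not the steering macro's code: the function names are illustrative, the neural-net training itself is omitted, and labels are assumed to be 1 for signal and 0 for background with at least one event of each kind.

```python
import random


def train_test_split(events, test_fraction=0.25, seed=0):
    """Shuffle a copy of the events and split it into (train, test) lists."""
    shuffled = list(events)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(test_fraction * len(shuffled)))
    return shuffled[n_test:], shuffled[:n_test]


def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) points of the ROC curve."""
    n_sig = sum(1 for y in labels if y == 1)
    n_bkg = len(labels) - n_sig
    # Scan thresholds from the highest score downwards.
    ranked = sorted(zip(scores, labels), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only once all events sharing this score are counted.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / float(n_bkg), tp / float(n_sig)))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y1 + y0) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

The classifier scores passed to roc_points would come from evaluating the trained network on the testing sample only, so that the curve reflects performance on events the network has not seen.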
First, put the input ROOT files into directories called inputs_leptonic and inputs_hadronic.
It is suggested to use symbolic links to the public directories where the ntuples are located rather than copying them all to the working directory.
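As an illustration of the linking step, the sketch below creates such a link from Python on a POSIX system. The helper name link_inputs is hypothetical, and the location of the public ntuple directories is not given here, so it must be supplied by the caller.

```python
import os


def link_inputs(public_dir, link_name):
    """Create a symbolic link called link_name that points at public_dir.

    If link_name is already a symbolic link it is left alone and its
    current target is returned instead.
    """
    if os.path.islink(link_name):
        return os.readlink(link_name)
    os.symlink(public_dir, link_name)
    return public_dir
```

It would be called once per channel, for example with link_name set to "inputs_leptonic" and then "inputs_hadronic". If a real directory with that name already exists, os.symlink raises an error rather than overwriting it.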
The steering macro has a few options:
- -c, --channel, which allows you to change which channel you want to train on (leptonic or hadronic);
- --cat, --categorical, which allows you to train a categorical model instead of a simple binary selector (not currently supported);
- -s, --signal, which allows you to restrict the number of signal events to use relative to the number of background events, to prevent overtraining on the signal;
- -n, --name, which changes the name of the saved ROC curve; and
- --save, which will save the weights of the neural network to an HDF5 file in the models directory.
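For orientation, an option set like this one could be declared with argparse roughly as follows. This is a sketch, not the macro's actual parser: the flag names come from the list above, while the defaults, the float type for --signal, and the function name build_parser are assumptions.

```python
import argparse


def build_parser():
    """Declare the training options described above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Train a ttHyy classifier.")
    parser.add_argument("-c", "--channel", choices=["leptonic", "hadronic"],
                        default="leptonic",  # assumed default
                        help="channel to train on")
    parser.add_argument("--cat", "--categorical", dest="categorical",
                        action="store_true",
                        help="train a categorical model (not currently supported)")
    parser.add_argument("-s", "--signal", type=float, default=None,  # assumed type
                        help="signal events to use, relative to background")
    parser.add_argument("-n", "--name", default="roc",  # assumed default
                        help="name of the saved ROC curve")
    parser.add_argument("--save", action="store_true",
                        help="save the network weights to an HDF5 file")
    return parser
```

With such a parser, build_parser().parse_args(["-c", "hadronic", "--save"]) would select the hadronic channel and request that the weights be saved.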
To apply the trained weights to an ntuple, a separate macro is provided.
It has the following options:
- -c, --channel, which is the same as above;
- -m, --model, the path to the input weights file;
- -i, --input, the name of the input file (with the inputs directory and .root stripped!);
- -t, --tree, the name of the input tree; and
- -n, --name, the name appended to the output file.
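Because the input option expects a bare name rather than a path, a small helper can be handy for converting between the two. The sketch below is hypothetical: it assumes the inputs directory is the per-channel inputs_leptonic or inputs_hadronic directory described earlier, and neither function exists in the repository.

```python
import os


def input_name(path):
    """Strip the directory and a trailing .root extension from a file path."""
    base = os.path.basename(path)
    stem, ext = os.path.splitext(base)
    return stem if ext == ".root" else base


def input_path(channel, name):
    """Rebuild the file path from a channel and a bare input name (assumed layout)."""
    return os.path.join("inputs_" + channel, name + ".root")
```

For a file stored as inputs_leptonic/sample.root (a made-up file name), input_name gives "sample", which is the form to pass to the input option.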
Outputs are saved in the output directory.
The models used in the analysis can be found in ttHyy/
Currently, only a shallow model is used.
Deeper models will be added as the need for additional neural network complexity arises.