An implementation of several transfer learning algorithms

ashtonwebster/tl_algs

Transfer Learning Algorithm Library

Wikipedia defines transfer learning as "a research problem in machine learning that focuses on storing knowledge gained while solving one problem and applying it to a different but related problem." In practice, this means training on data from two or more domains and extending what is learned from these source domains to predictions on a new, previously unseen target domain. This library implements several transfer learning algorithms.

Implemented Algorithms

  • Burak: Use the k closest source instances to each target instance (a.k.a. NN filtering) [1]
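  • Peters: Each source (training) instance selects its closest target instance; each selected target instance then keeps the closest source instance that chose it [2]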
  • TrBagg: Bootstrap sample to create many learners, then filter [3]
  • Transfer Naive Bayes (TNB): Modification of Naive Bayes using gravitational weighting [4]
  • Transfer Component Analysis (TCA): Use a projection method to bring source and target domains to a common space [5]
  • Improved Transfer Component Analysis (TCA+): Similar to TCA, but uses normalization before applying the projection [6]
  • Baselines:
    • Hybrid (all available source AND target training instances)
    • All available target training
    • All available source domain training
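As an illustration of the NN filtering idea behind Burak, here is a minimal sketch in plain Python. It is not the tl_algs implementation: the function name `burak_filter`, its parameters, and the toy data are hypothetical, and Euclidean distance is assumed as the distance measure.

```python
import math

def burak_filter(source_X, target_X, k=10):
    """Return sorted indices of source rows that are among the k nearest
    (Euclidean) neighbours of at least one target row.

    Hypothetical helper for illustration; not part of the tl_algs API.
    """
    selected = set()
    for t in target_X:
        # Distance from this target instance to every source instance.
        dists = [(math.dist(t, s), i) for i, s in enumerate(source_X)]
        # Keep the indices of the k closest source instances.
        selected.update(i for _, i in sorted(dists)[:k])
    return sorted(selected)

# Toy data: four source instances, two target instances.
source_X = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [9.0, 9.0]]
target_X = [[0.9, 1.2], [5.2, 4.8]]

kept = burak_filter(source_X, target_X, k=1)
print(kept)  # indices of the source rows retained for training
```

The filtered source rows would then be used to train any ordinary classifier, so the filter itself stays independent of the learner.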

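The gravitational weighting used by Transfer Naive Bayes can be sketched as follows, based on the description in [4]: a source instance is weighted by how many of its attributes fall inside the range observed in the target data, using w = s / (k - s + 1)^2, where k is the number of attributes and s the number of in-range attributes. This is an assumption-laden sketch, not the tl_algs code; `tnb_weights` and the toy data are hypothetical.

```python
def tnb_weights(source_X, target_X):
    """Weight each source row by its similarity to the target data.

    Hypothetical helper for illustration; not part of the tl_algs API.
    """
    n_attrs = len(target_X[0])
    # Per-attribute minimum and maximum over the target instances.
    lows = [min(row[j] for row in target_X) for j in range(n_attrs)]
    highs = [max(row[j] for row in target_X) for j in range(n_attrs)]

    weights = []
    for row in source_X:
        # s: number of attributes that fall inside the target range.
        s = sum(1 for j in range(n_attrs) if lows[j] <= row[j] <= highs[j])
        # Gravitation-style weight: grows quickly as s approaches n_attrs.
        weights.append(s / (n_attrs - s + 1) ** 2)
    return weights

target_X = [[1.0, 10.0, 0.0], [3.0, 20.0, 1.0]]
source_X = [[2.0, 15.0, 0.5],   # all three attributes in range
            [2.0, 50.0, 5.0],   # one attribute in range
            [9.0, 99.0, 9.0]]   # no attribute in range

weights = tnb_weights(source_X, target_X)
print(weights)
```

In a weighted Naive Bayes, these values would scale each source instance's contribution to the class priors and conditional probabilities.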
Installing

For best results, install Anaconda. This will (a) ensure you have all of the necessary dependencies and (b) allow you to view the example as a Jupyter notebook.

# clone the TCA dependency
> git clone https://github.com/ashtonwebster/transferlearning
> cd transferlearning/code/python
# install the dependency in editable mode
> pip install -e .
> cd ../../../
# clone this repository
> git clone https://github.com/ashtonwebster/tl_algs.git
> cd tl_algs
# install the package
> python setup.py install

Alternatively, install in editable (development) mode so that local changes to the source take effect immediately in the installed package:

pip install -e .

Now you can use tl_algs in any file by adding the following line at the top of the file:

import tl_algs
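To confirm that the installation worked before running the notebook, a quick check along these lines may help. It uses only the standard library; the helper name `is_installed` is hypothetical and not part of tl_algs.

```python
import importlib.util

def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None

# Reports whether the tl_algs package is visible to this interpreter.
print("tl_algs importable:", is_installed("tl_algs"))
```

If this prints False, the package was probably installed into a different Python environment than the one running the check.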

Example

See examples/example1.ipynb or examples/example1.html for example usage.

References

[1] Turhan, B., Menzies, T., Bener, A. B., & Di Stefano, J. (2009). On the relative value of cross-company and within-company data for defect prediction. Empirical Software Engineering, 14(5), 540–578. https://doi.org/10.1007/s10664-008-9103-7

[2] Peters, F., Menzies, T., & Marcus, A. (2013). Better cross company defect prediction. In IEEE International Working Conference on Mining Software Repositories (pp. 409–418). https://doi.org/10.1109/MSR.2013.6624057

[3] Kamishima, T., Hamasaki, M., & Akaho, S. (2009). TrBagg: A simple transfer learning method and its application to personalization in collaborative tagging. Proceedings - IEEE International Conference on Data Mining, ICDM, 219–228. https://doi.org/10.1109/ICDM.2009.9

[4] Ma, Y., Luo, G., Zeng, X., & Chen, A. (2012). Transfer learning for cross-company software defect prediction. Information and Software Technology, 54(3), 248–256. https://doi.org/10.1016/j.infsof.2011.09.007

[5] Pan, S. J., Tsang, I. W., Kwok, J. T., & Yang, Q. (2011). Domain Adaptation via Transfer Component Analysis. IEEE Transactions on Neural Networks, 22(2), 199–210. https://doi.org/10.1109/TNN.2010.2091281

[6] Nam, J., Pan, S. J., & Kim, S. (2013). Transfer defect learning. In Proceedings - International Conference on Software Engineering (pp. 382–391). https://doi.org/10.1109/ICSE.2013.6606584
