Abstract: Developments for Multivariate Data Analysis (TMVA)
Provide a short (one paragraph) description of your proposal.
Description: The Toolkit for Multivariate Data Analysis (TMVA) is a machine-learning framework integrated into the ROOT software framework. It contains ML packages for classification and regression that high-energy physicists frequently use in searches for new particles, for example in the discovery of the Higgs boson. Recently TMVA has been undergoing a significant makeover in performance, features and functionality.
Requirements: A strong C++ background is desired; strong machine-learning knowledge is a plus.
Mentors: Lorenzo Moneta, Sergei Gleyzer
Describe your application in detail. Provide some background, describe the work that you are expecting to do in the time leading to the GSoC start.
Development Plan
1) Rewrite TMVA to remove static and global variables, preparing the code for parallelization. This includes redesigning the TMVA methods that currently need static variables, for example to run minimization with Minuit.
2) Write a new method for deep learning with GPU support, initially for two-class classification. The suggested backend is Theano (which supports GPUs through CUDA and OpenCL, and CPUs with threads); other packages can be evaluated.
3) Write a system to run multiple train/test methods in TMVA::Factory in parallel using threads.
4) Write the code needed to integrate the ROC plots with the new method, so that they are easy to use in notebooks.
5) Help process pull requests for TMVA on GitHub:
- https://github.com/root-mirror/root/pull/121
- https://github.com/root-mirror/root/pull/104
- https://github.com/root-mirror/root/pull/100
- https://github.com/root-mirror/root/pull/53
6) Write Doxygen documentation for all classes.
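To illustrate items 1 and 3 of the plan, here is a minimal sketch of how several methods could be trained concurrently once each method keeps its state in its own instance rather than in static or global variables. It is not TMVA code: `ToyMethod`, `ParseJobs` and `TrainAllParallel` are hypothetical names invented for this example, and the `Jobs=` option parsing is only an assumption about how a job count might be passed as an option string.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a booked method; NOT a TMVA class.
// All state lives in the instance, so two instances can train concurrently.
struct ToyMethod {
    std::string name;
    int nEvents;
    long checksum;
    bool trained;

    ToyMethod(std::string n, int events)
        : name(std::move(n)), nEvents(events), checksum(0), trained(false) {}

    void Train() {
        // Placeholder workload standing in for real training.
        long sum = 0;
        for (int i = 0; i < nEvents; ++i) sum += i;
        checksum = sum;
        trained = true;
    }
};

// Hypothetical helper: extract N from an option string such as "Jobs=4".
// Returns 1 (sequential) when no "Jobs=" key is present.
inline unsigned ParseJobs(const std::string& options) {
    const std::string key = "Jobs=";
    const std::size_t pos = options.find(key);
    if (pos == std::string::npos) return 1;
    return static_cast<unsigned>(std::stoul(options.substr(pos + key.size())));
}

// Train every method using at most `jobs` worker threads.
// Each worker claims the next untrained index from a shared atomic counter,
// so no two threads ever touch the same ToyMethod.
inline void TrainAllParallel(std::vector<ToyMethod>& methods, unsigned jobs) {
    assert(jobs > 0);
    std::atomic<std::size_t> next{0};

    auto worker = [&methods, &next]() {
        for (;;) {
            const std::size_t i = next.fetch_add(1);
            if (i >= methods.size()) return;
            methods[i].Train();
        }
    };

    const std::size_t nThreads =
        std::min<std::size_t>(jobs, methods.size());
    std::vector<std::thread> pool;
    for (std::size_t t = 0; t < nThreads; ++t) pool.emplace_back(worker);
    for (auto& th : pool) th.join();
}
```

A caller would fill a `std::vector<ToyMethod>` and invoke `TrainAllParallel(methods, ParseJobs("Jobs=4"))`. The sketch only works because `Train()` writes to per-instance members; if it touched a shared static variable, the threads would race, which is why the removal of statics comes first in the plan.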
What do you expect as a deliverable for your project? Please try to be as precise as you can (e.g. a ready to deploy package (or a patch) ABC implementing XYZ feature tested on Linux/Mac/Windows)
Source code in a repository that lets you use TMVA with the following features:
1) factory->TrainAllMethods("Jobs=4"); // parallelized to train 4 methods at the same time.
2) factory->TestAllMethods("Jobs=4"); // parallelized to test 4 methods at the same time.
3)