How can I use event-wise weights in the TMVA factory?

  • Thread starter ChrisVer
  • Tags
    Root
In summary, the thread discusses how to use global and event-wise weights in the TMVA factory. Global weights are plain numbers passed when a tree is added, while event-wise weights are set through weight-expression methods that refer to variables in the tree. The thread also covers accessing elements of a tree, weighting background and signal events differently, and the TMVA source code and possible mistakes in calling these methods.
  • #1
ChrisVer
I have one question (simple I hope).
In the TMVA factory one can set global weights by typing:

C:
// global event weights (see below for setting event-wise weights)
Double_t signalWeight = 1.0;
Double_t backgroundWeight = 1.0;

These are then used by name (signalWeight, backgroundWeight) later in the macro.

For event-wise weights, the macro uses factory->SetBackgroundWeightExpression and related methods:

C:
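// Same weight expression for both classes:
factory->SetWeightExpression("EventWeight");
// Or set each class separately (redundant after the call above):
factory->SetSignalWeightExpression("EventWeight");
factory->SetBackgroundWeightExpression("EventWeight");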

// This would set individual event weights (the variables defined in the
// expression need to exist in the original TTree)
// for signal : factory->SetSignalWeightExpression("weight1*weight2");
// for background: factory->SetBackgroundWeightExpression("weight1*weight2");

How do I use those weights later on? Should I put this whole command in the place where "signalWeight" would be used?

Thanks. (https://hep.pa.msu.edu/wiki/pub/AtlasSingleTop/AnalysisVersion1405006Check/TMVAnalysis_test3.C)
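
A minimal sketch of how the two kinds of weights are usually understood to combine, assuming the global tree weight is multiplied by the value of the weight expression for each event. It is plain C++ with no ROOT dependency; `Event` and `effectiveWeights` are hypothetical names used only for illustration, and `weight1`, `weight2` mirror the branches in the "weight1*weight2" expression quoted above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one tree entry; weight1 and weight2 mirror the
// branch names used in the "weight1*weight2" expression.
struct Event {
    double weight1;
    double weight2;
};

// Returns the weight each event would carry, ASSUMING the global tree weight
// multiplies the per-event value of the weight expression.
std::vector<double> effectiveWeights(double globalTreeWeight,
                                     const std::vector<Event>& events) {
    assert(globalTreeWeight >= 0.0);  // negative global weights are outside this sketch
    std::vector<double> weights;
    weights.reserve(events.size());
    for (const Event& ev : events) {
        weights.push_back(globalTreeWeight * ev.weight1 * ev.weight2);
    }
    return weights;
}
```

Under that assumption, the expression is declared once and never has to be typed where signalWeight or backgroundWeight is passed; the global number simply rescales every event of that tree.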
 
  • #2
Well, you use elements of the tree, so you should be able to access them like everything else in the tree if you have to: GetEntry, GetEvent, or whichever method you prefer, together with a branch address set somewhere.
 
  • #3
mfb said:
together with a branch address set somewhere.

What do you mean by that?
 
  • #4
My tree contains the histogram of weights; let's say it is called "weight".
If I try to write something like:
C:
//BckgTraining is a string
factory->AddBackgroundTree(BckgTraining,
                           factory->SetBackgroundWeightExpression("weight") , "Training");

I get an error because AddBackgroundTree expects a Double_t as its second argument.
Or do you mean that I could write:
weight->GetEntry?
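
A minimal sketch of the call order that avoids this error, assuming the weight-expression setter returns nothing and therefore has to be a statement of its own. `FactorySketch` and `configureBackground` are hypothetical stand-ins written in plain C++; they only mirror the shape of the two calls and are not TMVA's implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the factory; only the shape of the two calls
// matters here.
struct FactorySketch {
    std::string backgroundWeightExpression;  // evaluated per event later on
    double backgroundTreeWeight = 1.0;       // one number for the whole tree

    // Returns void, so it cannot be passed where a number is expected.
    void SetBackgroundWeightExpression(const std::string& expression) {
        backgroundWeightExpression = expression;
    }

    // The second argument is a plain number, matching the reported error.
    void AddBackgroundTree(const std::string& treeName, double weight) {
        assert(!treeName.empty());
        backgroundTreeWeight = weight;
    }
};

// Registers the tree with a global weight of 1.0 and names the per-event
// weight branch in a separate statement.
FactorySketch configureBackground(const std::string& treeName,
                                  const std::string& weightBranch) {
    FactorySketch factory;
    factory.AddBackgroundTree(treeName, 1.0);
    factory.SetBackgroundWeightExpression(weightBranch);
    return factory;
}
```

Calling `configureBackground("TMVAInputTree", "weight")` leaves the global weight at 1.0 and records "weight" as the per-event expression, which is the separation the two real methods appear to be designed for.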
 
  • #5
How do you access things in your tree in general?
There are multiple ways to do that.

I don't understand what your code example is supposed to do. The set method will probably return an integer as a success message, and should be called exactly once to set up the readout.
 
  • #6
It's pretty much the same as in the example of the link (https://hep.pa.msu.edu/wiki/pub/AtlasSingleTop/AnalysisVersion1405006Check/TMVAnalysis_test3.C)

My tree is in a ROOT file; I open the file and use the commands:
C:
TTree *signalTraining = (TTree*)inputTraining->Get("TMVAInputTree");
TTree *backgroundTraining = (TTree*)inputTraining->Get("TMVAInputTree");

TTree *signalTesting = (TTree*)inputTesting->Get("TMVAInputTree");
TTree *backgroundTesting = (TTree*)inputTesting->Get("TMVAInputTree");
The signal/bckg variable distributions are registered with AddVariable() or AddSpectator().

The thing is that when I read about event-by-event weighting, the example gives that command with factory...

What I want to do: the tree holds the signal and background events together with the weight histogram I want to use. I want to run the training/testing with the background weighted to the signal. In that case I could use a global signal weight:
C:
Double_t sig_weight = 1.0;
But I cannot use the global background weight 1.0... Instead I want to use the weight histogram values.

C:
factory->AddSignalTree( signalTraining, sig_weight, "Training" );
factory->AddSignalTree( signalTesting, sig_weight, "Test" );

factory->AddBackgroundTree( backgroundTraining, backgroundWeight, "Training" );
factory->AddBackgroundTree( backgroundTesting, backgroundWeight, "Test" );
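
One common way to obtain such per-event background weights is sketched here, assuming the weight histogram is the bin-by-bin ratio of the signal to the background distribution in one variable (Pt is assumed here) with equal-width bins. `binWeights` and `eventWeight` are hypothetical helper names in plain C++, not part of ROOT or TMVA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-bin weight = signal count / background count, so that the weighted
// background histogram reproduces the signal histogram. Bins without
// background get weight 0 because there is nothing to reweight there.
std::vector<double> binWeights(const std::vector<double>& signalCounts,
                               const std::vector<double>& backgroundCounts) {
    assert(signalCounts.size() == backgroundCounts.size());
    std::vector<double> weights(signalCounts.size(), 0.0);
    for (std::size_t i = 0; i < signalCounts.size(); ++i) {
        if (backgroundCounts[i] > 0.0) {
            weights[i] = signalCounts[i] / backgroundCounts[i];
        }
    }
    return weights;
}

// Looks up the weight for one event in equal-width bins on [ptMin, ptMax).
// Events outside the range get weight 0 in this sketch.
double eventWeight(double pt, double ptMin, double ptMax,
                   const std::vector<double>& weights) {
    assert(ptMax > ptMin && !weights.empty());
    if (pt < ptMin || pt >= ptMax) return 0.0;
    const double binWidth = (ptMax - ptMin) / static_cast<double>(weights.size());
    std::size_t bin = static_cast<std::size_t>((pt - ptMin) / binWidth);
    if (bin >= weights.size()) bin = weights.size() - 1;  // guard against rounding
    return weights[bin];
}
```

The value returned by `eventWeight` is what would be stored per event in a branch such as "weight", so that a weight expression can refer to it by name.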
 
  • #7
Looking at the source, SetBackgroundWeightExpression sets some variable somewhere. There are so many AddTree methods that I don't see where something is actually done (probably in line 399, but where is this hidden?).

The background weight is optional - what happens if you just omit it?
What happens if you plug in a number, but call SetBackgroundWeightExpression before (compared to not calling it)?
 
  • #8
mfb said:
The background weight is optional - what happens if you just omit it?

I don't think I want to omit it... If I do, there will be some differences between the variables because they are correlated with the momenta. In particular, I obtained the weight by trying to make the background and signal Pt distributions look the same.
I tried that with a simple number (with that call and a background weight of 1.0) and, OK, my BDT ran normally, but I didn't get what I set out to get (it didn't work). I know it didn't because, if it had, the resulting Pt variable distributions would look similar, which wasn't the case.
mfb said:
Looking at the source, SetBackgroundWeightExpression sets some variable somewhere. There are so many AddTree methods that I don't see where something is actually done (probably in line 399, but where is this hidden?).

One leads to the other:
AddBackgroundTree() (line 431) leads to AddTree (line 367), and that leads to the AddTree of line 384. I am not sure what line 399 does...
 
  • #9
ChrisVer said:
I don't think I want to omit it... If I do, there will be some differences between the variables because they are correlated with the momenta. In particular, I obtained the weight by trying to make the background and signal Pt distributions look the same.
Omit it in the call of the method, not in terms of physics. Ideally it takes the weight you declared before.
ChrisVer said:
I tried that with a simple number (with that call and a background weight of 1.0) and, OK, my BDT ran normally, but I didn't get what I set out to get (it didn't work). I know it didn't because, if it had, the resulting Pt variable distributions would look similar, which wasn't the case.
Was the output different from the code where you just used a weight of 1 without the other method before?
 
  • #10
mfb said:
Was the output different from the code where you just used a weight of 1 without the other method before?
Hmmm... I just tried it and no, they are different... If I understood how that method works, I could probably figure out the error.
 

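What are weights in the ROOT TMVA factory?

TMVA is ROOT's toolkit for multivariate data analysis, and TMVA::Factory is the class that steers the training, testing, and evaluation of classification and regression methods. "Weights" can mean two things in this context: the event weights supplied with the input trees (a global weight per tree, optionally combined with a per-event weight expression), and the weight files that TMVA writes out after training a method.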

How do I use the TMVA factory?

Load the TMVA library and create a TMVA::Factory object. Then add the input variables, register the signal and background trees for training and testing, and book the methods you want to use. Finally, train the booked methods; the resulting weight files are saved for later application.

What is the purpose of the TMVA factory?

The factory provides a single interface for creating, training, and evaluating several multivariate methods on the same input data, so that their performance can be compared directly.

Which machine learning algorithms are available in TMVA?

TMVA offers decision trees and boosted decision trees, neural networks, and support vector machines, among other methods. Decision trees can be combined through ensemble techniques such as boosting and bagging.

Can I use TMVA for both classification and regression tasks?

Yes. TMVA supports both classification, which predicts discrete class labels, and regression, which predicts continuous numerical values.
