ClasTool is a ROOT based package for analyzing CLAS data. It consists of a set of C++ classes in a number of libraries which you can use to build your own CLAS analysis program or script. The intention of this package is to provide easier access to the data. To do this, ClasTool allows you to connect to a ROOT based DST file (which you can create from a BOS file using the included WriteRootDST program), or an NT10 style HBOOK file. Other input formats (eg1 dst, etc.) are possible if an appropriate interface is written.

WARNING: This is a "work in progress", and will remain so in perpetuity.
The author of this code does not make any warranties that use of this code will lead to a Nobel Prize, nor in fact that it will work at all. Any damage to your brain or computer is due to the amount of coffee you consume, not due to the use of this project.


Index

  1. Getting Started.
    1. Setting up ROOT.
    2. Setting up your environment.
    3. Getting the code.
    4. Making the code.
  2. Creating a root DST file.
  3. Starting your own Analysis.
  4. Code Documentation.
  5. Making Contributions.
  6. Adding new data classes.
  7. Tricks and Tips.


Getting Started

To use this code you will need a standard Linux setup with the basic GNU tools. It may work fine on other systems but I have not tested this. The following steps will get you going.

Setting up ROOT.

If you are at JLab, then ROOT will already be installed for you and all you need to do is set up a few environment variables. If you are not at JLab you will have to get a copy of the ROOT package. ROOT runs a website where you can get the code and documentation: root.cern.ch.

To set up your ROOT environment at JLab type: "use root" at the command line. If you are not at JLab, use the script that comes with your ROOT distribution. At a minimum, make sure that the ROOTSYS and LD_LIBRARY_PATH variables are set properly. (If you don't know what this means, talk to your local Linux guru.)

Setting up the environment.

This code depends on several environment variables being set properly so that it can find some of the sub-parts. You also need the standard Linux MySQL development package (devel) and a CLAS software release (the latter is only needed to read BOS files). The following table lists all the environment variables you need to set:

Environment Variables

CLASTOOL
    Example setting: `pwd`
    Notes: Location of the ClasTool package, usually somewhere in your home directory.
ROOTSYS
    Example setting: /usr/local/root
    Notes: Location of the ROOT package. At JLab: "use root"
PATH
    Example setting: $ROOTSYS/bin:$PATH
    Notes: Location of the ROOT binaries. At JLab: "use root"
MYSQL_INCLUDE
    Example setting: /usr/include/mysql/
    Notes: Location of MySQL includes. See scripts/find_mysql to find this.
MYSQL_LIB
    Example setting: /usr/lib/mysql
    Notes: Location of MySQL libraries. See scripts/find_mysql to find this.
OS_NAME
    Example setting: LinuxRHEL3
    Notes: Extended name of the OS (CLAS default). This is set with the usual CLAS startup script. If you are not at JLab and only use one type of system, you can just set it to "Linux".
LD_LIBRARY_PATH
    Example setting: $ROOTSYS/lib:$CLASTOOL/slib/$OS_NAME
    Notes: Tells the system where to look for shared libraries, which helps ROOT to find your class libraries.
CVSROOT
    Example setting: /group/clas/clas_cvs (at JLab) or username@cvs.jlab.org:/group/clas/clas_cvs (elsewhere)
    Notes: Tells CVS where the code repository is.
CVS_RSH
    Example setting: ssh
    Notes: When not at JLab, this is crucial to get access.

You can put these environment definitions in a little script that runs when you log in. If you use tcsh, this is the .tcshrc script.
I use "bash", so for me it is called .bash_profile. If you want my bash scripts, look in ~holtrop/.bash_profile and ~holtrop/bin/scripts/00functions.sh
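
As a rough illustration, the table above could translate into a login-script fragment like the following. This is only a sketch for a bash-style shell, not a copy of the scripts mentioned above. The checkout location $HOME/packages/ClasTool is an assumption, the other paths are the example values from the table, and the CVSROOT line is left commented out because it needs your own user name.

```shell
# Sketch of a ClasTool environment block for ~/.bash_profile.
# Every path is an example value; adjust each one for your own system.
export CLASTOOL="${CLASTOOL:-$HOME/packages/ClasTool}"   # assumed checkout location
export ROOTSYS="${ROOTSYS:-/usr/local/root}"
export PATH="$ROOTSYS/bin:$PATH"
export MYSQL_INCLUDE="${MYSQL_INCLUDE:-/usr/include/mysql/}"
export MYSQL_LIB="${MYSQL_LIB:-/usr/lib/mysql}"
export OS_NAME="${OS_NAME:-Linux}"                       # "Linux" is fine off site
# Prepend the ROOT and ClasTool library directories, keeping any existing value.
export LD_LIBRARY_PATH="$ROOTSYS/lib:$CLASTOOL/slib/$OS_NAME${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export CVS_RSH=ssh
# Off site, also set CVSROOT with your own user name, for example:
# export CVSROOT=username@cvs.jlab.org:/group/clas/clas_cvs
```

Variables that are already set (for instance by "use root" at JLab) are left alone, so the fragment only fills in what is missing.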

Getting the code.

A quick primer on using the Concurrent Versions System (CVS). To get the real deal with all the powerful gory glory, look at the CLAS Offline Software Page, the CVS Manual and the CVS Bubbles pages.

After setting up the environment (see above) you should be able to type:
csh> cvs checkout packages/ClasTool
and get a bunch of information written to your screen. You will then find a directory called "packages" which includes a directory called "ClasTool". This is your own personal copy of the code. If you have already "checked out" a copy and want to update it to the latest version, type the following in the packages/ClasTool directory:
csh> cvs update -A

Notes:
You can "browse" the entire CVS code tree through a web interface, the CVS BROWSER. This will tell you what was updated and when.
Once the code is checked out, you can move the directory, and "cvs update" and other commands will still work.
Before you "check in" any changes you made, read the "Making Contributions" section.
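
The choice between a first checkout and an update can be wrapped in a small helper. The following is a sketch only: the function name get_clastool and the DRY_RUN switch are made up for this illustration, and it assumes you run it from the directory that holds (or will hold) "packages". It relies on the fact that a CVS working copy contains a CVS subdirectory.

```shell
# get_clastool (hypothetical helper): check out ClasTool, or update an
# existing working copy. Set DRY_RUN=1 to print the cvs command instead
# of running it.
get_clastool() {
    if [ -d packages/ClasTool/CVS ]; then
        # A CVS/ subdirectory marks an existing working copy: update it.
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "cd packages/ClasTool && cvs update -A"
        else
            (cd packages/ClasTool && cvs update -A)
        fi
    else
        # No working copy yet: do a fresh checkout.
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "cvs checkout packages/ClasTool"
        else
            cvs checkout packages/ClasTool
        fi
    fi
}
```

Running it once with DRY_RUN=1 shows which of the two commands it would pick before anything touches your copy of the code.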

Making the code.

Once you have the code checked out and your environment set up, you should be able to just type "make" in the CLASTOOL directory. I say "should" because this is a work in progress and errors can easily creep in. Send me an email if there are still problems after you have done a "cvs update -A".

The code will make a set of libraries in $CLASTOOL/slib/$OS_NAME and a binary in $CLASTOOL/bin/$OS_NAME.

Other important make features:

A huge issue when modifying virtual classes is that your virtual class and the inheriting class can get "out of synch". This is bad: it will cause the wrong routine to be called when you use one of the virtual methods. Be really careful when coding virtuals, and always recompile everything, just to be sure.

To clean up the directory after all the code is made (i.e. remove object and dictionary files) type "make clean". To clean up really well, i.e. also remove the libraries and executables, type "make distclean". This can be useful to make absolutely sure that all code is recreated with the next make command. This is important when you change operating system, root version or other system properties.

To make proper dependency files, type "make dep". This will create a dependency list in the file Make_depends, which is automatically included into the Makefile. A dependency file tells the make program which files need to be checked to see if the library is out of date and needs to be remade. Usually dependency files are created and updated automatically, but sometimes you may want to force a remake: just delete the Make_depends file and run "make dep".

To make the html documentation: type "make docs" in the top directory. The HTML documentation is made with the use of the THtml class, which is part of ROOT. To do this in a smooth way, each directory has a ROOT script called HTML_Load.C with instructions that tell the main document script (in the html directory) which libraries to load and for which classes to make documentation. If you add a new class you will need to update or create the HTML_Load.C file, or else no documentation is generated and nobody will know about your work.

Sometimes "make distclean" is needed to force all the subtle changes through the entire system. This would be the case if for instance the documentation is not updated the way you expected.
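
Put together, a "rebuild everything from scratch" pass using the make targets described in this section might look like the sketch below. The function name clastool_rebuild, the run helper and the DRY_RUN switch are invented for this example, and the use of "make -C" assumes GNU make.

```shell
# run (hypothetical helper): execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# clastool_rebuild (hypothetical helper): force a complete rebuild of ClasTool.
clastool_rebuild() {
    if [ -z "${CLASTOOL:-}" ]; then
        echo "CLASTOOL is not set" >&2
        return 1
    fi
    run make -C "$CLASTOOL" distclean &&      # remove objects, libraries, executables
    run rm -f "$CLASTOOL/Make_depends" &&     # drop the old dependency list
    run make -C "$CLASTOOL" dep &&            # regenerate Make_depends
    run make -C "$CLASTOOL"                   # rebuild everything
}
```

The steps are chained with && so that a failure in one stops the rest, which keeps a broken "make dep" from being hidden by the output of the final build.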


Creating a ROOT DST file.

After you have built the code, you will hopefully find an executable called WriteRootDST in your "bin directory" ($CLASTOOL/bin/$OS_NAME). To learn how to use this code (and see all the latest features not listed here) type:
csh> $CLASTOOL/bin/$OS_NAME/WriteRootDST -h
The output may be something like:
bin/LinuxRH9/WriteRootDst: error while loading shared libraries: libCore.so: cannot open shared object file: No such file or directory
In which case you did not set up the ROOT environment properly. The "correct" response would have been a list of options for the program.
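
A quick way to catch this before running anything is to check the two things the loader needs. The helper below is a sketch, with the name check_root_env chosen for this example. It assumes the ROOT libraries live in $ROOTSYS/lib, as in the environment table, which may not hold for every ROOT installation.

```shell
# check_root_env (hypothetical helper): report whether the dynamic loader
# can be expected to find libCore.so. Returns 0 when $ROOTSYS/lib contains
# libCore.so and $ROOTSYS/lib is listed in LD_LIBRARY_PATH, 1 otherwise.
check_root_env() {
    if [ -z "${ROOTSYS:-}" ]; then
        echo "ROOTSYS is not set"
        return 1
    fi
    if [ ! -f "$ROOTSYS/lib/libCore.so" ]; then
        echo "libCore.so not found in $ROOTSYS/lib"
        return 1
    fi
    # Wrap both sides in colons so only whole path entries can match.
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$ROOTSYS/lib:"*)
            echo "ROOT environment looks OK"
            return 0
            ;;
        *)
            echo "$ROOTSYS/lib is not on LD_LIBRARY_PATH"
            return 1
            ;;
    esac
}
```

Calling check_root_env before WriteRootDST turns the loader error into a message that says which variable to fix.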

The program WriteRootDST will convert a BOS file that was "cooked" with "recsis" (or "user_ana" or "a1c" ) into a ROOT based Data Summary File. The structure of this file is fairly similar to the "NT10" hbook based ntuple, with some additions. An example commandline for this code would be:

csh> WriteRootDST -o /work/disk1/holtrop/eg2_dst_43220-43223.root /cache/mss/clas/eg2/production/Pass1/BOS/clas_04322{0,1,2,3}.*

Usually these RootDST files will already have been made and will be located on the Silo.

Starting your own analysis.

You now have access to some of the RootDST files and you have the libraries compiled and ready. How do you now use these tools to build your own analysis? There are a lot of different approaches, each with their own advantages and disadvantages. Which one you choose will depend on the project you are working on and your own personal style. I list a few of them here; there are some simple examples in the "examples" directory.

I. Ignore the Libraries.

You can make ROOT scripts that do not need the libraries we just built, and instead open the ROOT file directly and work with the data in it. I would only recommend this if you are planning to use the scripts on computers where the libraries are not available or too difficult to build (e.g. Windows), and your scripts do relatively simple things.

When you do this, you will note a number of warnings about "dictionary not available" when you open the file. This is not a problem. You can now browse the file with "TBrowser", and make simple plots by clicking on variables. You can also use the TTree->Draw() method to make slightly more complicated plots, or create a framework script with the TTree->MakeCode or TTree->MakeClass methods. See the ROOT documentation.

II. Use a set of scripts with the libraries.

For anything with a slight bit of sophistication I would recommend at least loading the libraries and then using scripts to do the analysis. The libraries allow you to open up multiple files at once and "chain" (read one after the other) through them, so you can gain in statistics. You can make quite a complicated analysis with this method. The examples in the "example" directory can help you get started.

The advantage of this method is that you can get going quickly and use all the pre-programmed tricks of the ClasTool library. The disadvantage is that it will give you a slower analysis code than you would get if you wrote a program. The good news is that a set of scripts can be migrated to a program rather easily, since both the script and the program are in C++.

III. Write classes in scripts.

One step up from method II, the ROOT interpreter allows you to define new classes in a script. You then load this script into ROOT and use the classes you just defined. This allows you to make a very sophisticated analysis package. If you load your MyScript.C script with ".L MyScript.C++" (with a ++ after the name), ROOT will first compile your script and then load the compiled code, so it will execute as fast as compiled code.

The advantage is that this gives you a very flexible analysis environment, without sacrificing speed. The code you write can very easily be converted to a compiled set of classes and all the functionality of C++ is available to you. The disadvantage is that you need to understand C++ a little better than in method II, while the compiler warnings and errors you get are not quite as easily understood as with a "normal" compiled class. Also, you cannot easily make this into a "standalone" code that can run on a batch farm node. (This may not be a problem, since the farm will run ROOT scripts as well.)

IV. Write a full analysis program.

In this case, you write a set of classes that either use or extend (derive from) the classes in ClasTool. The whole thing compiles into a new library that can be used for a standalone program or loaded into ROOT. This is the most powerful way to do analysis, but also the most complicated. One large advantage is that writing code rather than scripts may force you to document what you did more precisely and keep better track of the various versions of the analysis code. You may also be more careful with how you implement things like cuts that can change from time to time.

I recommend this for everyone who wants to do serious analysis. It is more work up front (learning how to set it all up) but more of the code will be reusable. Even within this method of working there are a lot of design choices you need to make, such as: do you derive from the ClasTool classes or use them as members? Are you going to override some methods?


Code Documentation.

All the code for ClasTool should be "self-documented" with comments embedded in the code itself. The ROOT documentation system extracts some of these comments and formats them into html pages that you can read in a web browser. I say "should" because like everyone else I am often also lazy about keeping all these comments up to date. Sorry.

The documentation should be created in the "html" directory of the ClasTool package. If it is not there, "make docs" will make a new set of documentation for you. Point your browser to the "index.html" file and you can go from there.
An online copy is available on my server: ClasTool Documentation. However, this may not be in synch with the exact version you are using. Over time I will update this document, and if there is sufficient interest, write more detailed HowTo documentation.


Making contributions.

If you think that you have made a useful contribution to ClasTool, please email me (holtrop@jlab.org) and let me know. I will then try to incorporate the contribution into the main libraries. If you make many improvements, we can discuss how you can add things directly using CVS. I would like to remain as a "gatekeeper" of this code, since I have found that otherwise it can get unwieldy rather quickly.

The easiest contributions would be bug-fixes and small extensions. In such a case, just email me the modified code. If you have a large extension of the code, we can talk about how to implement it. Also if you have improvements in the documentation, please email them to me!


Adding new Data Classes.

You may want to add other data to your RootDST that is available in a BOS bank but not in the current implementation of ClasTool. How can this be done?

First, check the definition of the BOS bank in packages/bankdefs and decide what information you will need to copy into the RootDST. Each bank in the bankdefs file has a class associated with it defined in the clasbanks.h and clasbanks.cc files, which are in packages/inc_derived and packages/bankdefs. You then need to follow these steps:

  1. You need to add a new class in ClasBanks that defines the container for your bank. To write this, use the packages/bankdefs definition of the bank. You must create a TXXXXClass.cc, a TXXXXClass.h and a TXXXXClassLinkDef.h file.
  2. Write a Fill_NEWBANK.cc file to fill this new bank from the BOS contents. This resides in the FillBank directory. (The Makefile there will pick up the new banks.)
  3. Include the header from 1) into this file at Bank Include Files.
  4. Add a TClonesArray for the bank, a TBranch and a counter. Then go to the storing section and make sure they get stored (Just copy what was done for other banks.) Also add the Branch to Initialize_Branches (again, like other banks.) and Init_Clones and Clear_Clones.
    At this point you will be able to write the bank into the ROOT DST.
  5. You probably also want to READ the bank. You need to modify the DSTReader and add the banks there (just like all the other banks) to be read from the file and make it available and show up in a PrintEventStatus().

Hopefully all this worked and you can now use the new data relatively transparently in your analysis codes.


Tricks and Tips.

Did you know?