JobManager Documentation

Last updated on Monday, April 10, 2000 at 11:57:02 by Nigel
(Beta testing)

The job manager is an object that lets the user run a macro over multiple files in a somewhat machine-independent way. The basic idea is to have a single analysis macro that can run anywhere, independent of the machine it runs on. The files that change are the template set. The run list and template sets are specific to each macro. How it works at present is outlined below. There are two ways to set up the job manager:

1) With a template and a list of run numbers, which the job manager reads in and expands into a complete list of files to be used.

2) With an ASCII file that contains the complete information about file names etc.

A more detailed description follows:

Files to be supplied by User

Three files must be supplied by the user when using the template/run number list method (analysis macro, template and run list). Two files must be supplied when using the complete file list method (analysis macro and Complete File Set List).
    1) Analysis Macro: This is the macro that will perform your analysis. Its general form is
void analysis_macro_name(Char_t *JobManagerFileName)
{
   TPhModulecontext *mc=new TPhModulecontext();
   TPhObjectManager *om=new TPhObjectManager();
   TPhSuperModule *sm=new TPhSuperModule();

   // Read the job manager object back from the file it was written to
   TFile f(JobManagerFileName);
   TPhJobManager *jm=(TPhJobManager*)f.Get("JobManager");
   f.Close();

   sm->SetContext(mc);
   sm->StartAnalysis();

   // jm->Next() sets up the names for the next file set in the list, so that
   // GetRunTimeName(ReferenceName) returns the correct file name to use here.
   while (jm->Next())
   {
      // Example
      TPhDST *df=new TPhDataFileEng99(jm->GetRunTimeName("RawData0"));
      mc->SetRun(df->GetRun());

      sm->StartRun();
      while (event=df->GetNextEvent())
      {
         sm->Process(event);
      }

      sm->EndRun();
      // om is a pointer, so it must be dereferenced with ->
      om->WriteToFile(jm->GetRunTimeName("HitData0"));
   }
   sm->EndAnalysis();
}

NOTICE: The job manager allows you to run over multiple runs etc.
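
The loop in the macro relies on only two calls: Next(), which advances to the next file set and stops the loop when the list is exhausted, and GetRunTimeName(), which turns a reference name into the file name for the current file set. The standalone sketch below imitates that contract so it can be tried outside phat. MiniJobManager, FileSet and AddFileSet are hypothetical names invented for this illustration; they are not part of PHAT and say nothing about how TPhJobManager is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One file set: reference name (e.g. "RawData0") -> run-time file name.
typedef std::map<std::string, std::string> FileSet;

// Hypothetical stand-in, NOT TPhJobManager: it only mimics the
// Next()/GetRunTimeName() contract that the analysis macro depends on.
class MiniJobManager {
public:
    MiniJobManager() : fNext(0), fCurrent(0), fStarted(false) {}

    void AddFileSet(const FileSet &fs) { fSets.push_back(fs); }

    // Advance to the next file set; returns false once the list is exhausted.
    bool Next()
    {
        if (fNext >= fSets.size())
            return false;
        fCurrent = fNext++;
        fStarted = true;
        return true;
    }

    // Run-time name for a reference name in the current file set.
    // Returns an empty string if the reference name is unknown.
    std::string GetRunTimeName(const std::string &refName) const
    {
        assert(fStarted); // Next() must have succeeded at least once
        FileSet::const_iterator it = fSets[fCurrent].find(refName);
        return it == fSets[fCurrent].end() ? std::string() : it->second;
    }

private:
    std::vector<FileSet> fSets;
    std::size_t fNext;    // index of the file set the next call to Next() selects
    std::size_t fCurrent; // index of the file set currently selected
    bool fStarted;        // true after the first successful Next()
};
```

Because the macro only ever asks for names through reference names such as "RawData0", the same macro can be pointed at a different list of file sets without being edited.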

    2) Template: The template is a file that tells the job manager, for each file set (i.e. the list of files used during each jm->Next() call), what the reference names are (e.g. "RawData0", "HitData0") and what file template each one corresponds to (e.g. INPUT from HPSS location /phobsink/engrun/data/PhoRaw######s###0.root).
Example: [image wpe14.jpg, not reproduced here; an example template file is template.dat in $PHATHOME/phatloop/macros]

Note: You have the flexibility of defining your own class to supply your own personalized input if desired, as long as that class derives from TPhDescriptorBase, which has the virtual functions necessary for the job manager to work correctly (TPhFileDescriptor derives from this class).
    3) Run List: The Run List is the set of numbers (usually run numbers) that are substituted into the templates to replace their special characters (e.g. 001045s000 replaces ######s###) for each file set. Each line of the run list corresponds to one file set, so the number of lines gives the total number of file sets.
Example: see runnumberlist.dat in $PHATHOME/phatloop/macros.

Note: You have the flexibility of defining your own class to look for your own special sequence and replace it with the numbers/characters you input. Your class must derive from TPhTemplateLookupBase and override its virtual functions for the job manager to work correctly (this is what TPhRunIdTemplate does).
The special sequence (here ######s###) and what it is replaced with are defined within the TPhRunIdTemplate object. If you want another special sequence you must define a new class.
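
The substitution itself can be pictured with the standalone sketch below. It is not the TPhRunIdTemplate code: ExpandRunTemplate is a hypothetical helper written for this illustration, and it assumes that the run number is zero-padded to six digits and the sequence number to three, as the 001045s000 example suggests.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of PHAT: replaces the first occurrence of the
// special sequence "######s###" in a file template with <run>s<sequence>.
// Assumption: run is zero-padded to six digits, sequence to three digits.
std::string ExpandRunTemplate(const std::string &fileTemplate, int run, int sequence)
{
    static const std::string kSpecial = "######s###";
    assert(run >= 0 && run <= 999999);        // must fit in six digits
    assert(sequence >= 0 && sequence <= 999); // must fit in three digits

    const std::string::size_type pos = fileTemplate.find(kSpecial);
    if (pos == std::string::npos)
        return fileTemplate; // no special sequence: leave the name unchanged

    // Build the replacement text, e.g. "001045s000"
    char id[16];
    std::snprintf(id, sizeof(id), "%06ds%03d", run, sequence);

    std::string result = fileTemplate;
    result.replace(pos, kSpecial.size(), id);
    return result;
}
```

Under these assumptions, calling ExpandRunTemplate with the template /phobsink/engrun/data/PhoRaw######s###0.root, run 1045 and sequence 0 gives /phobsink/engrun/data/PhoRaw001045s0000.root. A template without the special sequence is returned unchanged.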
Note: You can make the templates and run lists by hand (this will be improved in future), or copy the examples and modify them to your macro's requirements. There also exists a perl script that makes a run list of all the good pedestal or calibration files from one run number to another (see sigproc/macros/getrundatafile.pl). It works on rcas only, although a slight modification can make it work elsewhere.

4) Complete File Set List: This ASCII file combines 2 and 3 and lets the user input the exact file names etc. that they want, so this method can be used when the input names follow no standard pattern (e.g. MC files).

Making the Job Manager from the Template and Run list files

With the above three files, job managers can be made to work inside your macro via the following procedure, inside a phat session:

The next step is to write out the job manager object(s) that are going to be used in the analysis macro. This depends on where you are going to run.

Making the JobManager from the Complete File Set List

In a phat session, type the following (given that you have already created your Complete File Set List):

TPhJobManager *jm=new TPhJobManager("jm","jm");
jm->ReadInCompleteDescriptionListAndMakeAbsListFromASCIIFile(FileName);

This reads in the Complete File Set List and automatically converts it into the abstract list of file sets (see above).

The next step is to write out the job manager object(s) that are going to be used in the analysis macro. This depends on where you are going to run.

Running locally (i.e. on your own machine)

This makes a job manager with filename LocalJobManagerFileName, which can then be used in the analysis macro by passing its complete name (including directory path) to the macro. An example macro, make_local_jobmanager.C, is available in $PHATHOME/phatloop/macros.

Running on CRS

With CRS set up correctly (i.e. with the correct controller script etc., see Instructions how a user can use CRS), just type the following with these arguments:

jm->PrepareJobsForCRS(NumberOfFileSetPerNode,ControlScriptFullFileName,JobFilesSubmitDirectory,LogFileDirectory,
ErrorFileDirectory,JobManagerDirectory)

This makes one or more job managers with filename JobManager_Date{UNIXDATE}_Time{UNIXTIME}_Set#.root (in JobManagerDirectory) and the corresponding job files Date{UNIXDATE}_Time{UNIXTIME}_Set#.jf (in JobFilesSubmitDirectory), each of which passes the correct job manager file as the argument of the analysis macro. ControlScriptFullFileName is the full name of the controller script used to run the analysis macro (typically /phobos/u/{USERNAME}/phobreco/input/scripts/controller_script). LogFileDirectory and ErrorFileDirectory are the directories to which standard output and standard error are directed (typically /phobos/u/{USERNAME}/phobreco/output/log or err). NumberOfFileSetPerNode is the number of file sets you want processed on each node you are allocated. For example, if you have an analysis with 15 file sets and have been allocated 5 nodes, you would probably want 3 file sets per node.
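
The arithmetic behind choosing NumberOfFileSetPerNode can be sketched as follows. Both functions are hypothetical helpers written for this illustration and are not part of PHAT. The grouping into consecutive blocks, one block per Set#, is an assumption about how the file sets end up distributed, not a description of what PrepareJobsForCRS does internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: smallest number of file sets per node that lets all
// file sets fit on the allocated nodes (ceiling division).
std::size_t FileSetsPerNode(std::size_t numberOfFileSets, std::size_t nodes)
{
    assert(nodes > 0);
    return (numberOfFileSets + nodes - 1) / nodes;
}

// Hypothetical helper: splits file-set indices 0..numberOfFileSets-1 into
// consecutive groups of at most fileSetsPerNode entries, one group per job.
// Assumption: jobs receive consecutive file sets in run-list order.
std::vector<std::vector<std::size_t> > GroupFileSets(std::size_t numberOfFileSets,
                                                     std::size_t fileSetsPerNode)
{
    assert(fileSetsPerNode > 0);
    std::vector<std::vector<std::size_t> > groups;
    for (std::size_t i = 0; i < numberOfFileSets; ++i) {
        if (i % fileSetsPerNode == 0)
            groups.push_back(std::vector<std::size_t>()); // start a new job
        groups.back().push_back(i);
    }
    return groups;
}
```

For the case in the text, FileSetsPerNode(15, 5) is 3 and GroupFileSets(15, 3) produces 5 groups of 3 file sets each. When the numbers do not divide evenly, the last group is simply shorter than the others.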

To then run on CRS, you have to submit the jobs (as phobreco) from the job files directory by typing
submit_job.pl -l {USERNAME} -n {Nodes Allocated} .

Note: You will have to do this from an rcas node, with the appropriate CRS directory structure set up.

Class Description + Macros

Class Name	Description
TPhJobManager	Effectively a hash list that contains file sets. Used to manage the input and output files of macros in a somewhat machine-independent way.
TPhDescriptorBase	Base class of the objects stored in the job manager's hash list. It has the virtual functions necessary to convert "Ref Name" to "RunTimeName".
TPhFileDescriptor	Specific implementation of TPhDescriptorBase, used to contain the abstract information needed for file names.
TPhTemplateLookupBase	Base class that has virtual functions to allow a special sequence of characters in a string to be searched for and replaced within that string.
TPhRunIdTemplate	Specific implementation of TPhTemplateLookupBase that converts ######s### in a string to <Run#>s<Seq#>.

$PHATHOME/phatloop/macros
make_crs_jobmanager.C, make_local_jobmanager.C, template.dat (example template file), runnumberlist.dat (example run list), filesetlist.dat (example of complete file set list)
$PHATHOME/macros
getrundatafile.pl (accesses the db on rcas and returns ped/cal runs between certain run limits, see file for specifics)
$PHATHOME/sigproc/macros
pedcalc_sm_jm.C (pedcalc converted to use the job manager, rather than inputting run number and sequence)

Troubleshooting

This code is still in its infancy. With time, and with effort on the users' behalf to add the features they want, it is hoped that this system will accomplish its goals.
If you have problems, please let Nigel know.