
Opened 9 years ago

Last modified 6 years ago

#457 assigned enhancement

Full combination of output from split runs and re-entrant histogramming

Reported by: Frank Siegert
Owned by: Andy Buckley
Priority: blocker
Milestone: 2.Y.0 -- re-entrant histogramming
Component: Analysis
Version:
Keywords:
Cc:

Description

One of the most requested features recently has been the ability to combine the output of separate Rivet runs over split event samples into what a single run over all events would have produced. Because analyses in Rivet need flexibility in how they fill and finalise their output, this is not trivial. It becomes particularly tricky as soon as the final output histograms are generated in the finalise method from intermediate histograms or numbers (e.g. "sum of weights that passed cuts" counters). To solve this problem properly, while keeping the solution automatic and hidden from the analysis author, I suggest the following:

  • All (intermediate) histograms/numbers relevant for finalise are registered by some name (instead of as normal member variables)
  • Those intermediate histograms/numbers are stored by their name in output files at the end of a Rivet run
    • maybe with a Rivet option to disable this, to get smaller files in cases where the outputs are not to be combined
    • with some kind of separate type or flag to hide them from plotting tools
  • The Analysis class gains an "Analysis::fill(input (from) file)" method which fills the intermediate objects from the written files
  • The Analysis class gains a central "Analysis::add(Analysis)" method which combines all the elementary objects that are registered with an analysis
    • e.g. add bin heights for histograms, add sumOfPassedWeights counters
    • If necessary this could be made virtual for very weird analyses to re-implement(?) -- but I don't see the need right now
  • An external "afterburner" tool can use these fill and combine methods for each analysis it finds in its combinable input files, and then run the finalize() method which will use the previously combined elementary objects to build its fancy and very specific final output
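
To make the proposal concrete, here is a minimal sketch of the "register by name, add, then finalise once" idea. It is not Rivet code: RawHisto, AnalysisState, the free function finalize() and the object names "pt" and "sumOfPassedWeights" are all hypothetical stand-ins, and reading the objects back from a file is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an un-normalised histogram: one sum of weights per bin.
struct RawHisto {
    std::vector<double> sumW;
};

// Hypothetical stand-in for the named intermediate objects of one analysis.
struct AnalysisState {
    std::map<std::string, RawHisto> histos;  // intermediate histograms, by name
    std::map<std::string, double> counters;  // e.g. "sumOfPassedWeights"

    // Combine another run's intermediate objects into this one, name by name.
    void add(const AnalysisState& other) {
        for (const auto& kv : other.histos) {
            RawHisto& mine = histos[kv.first];
            // A histogram seen for the first time starts from empty bins.
            if (mine.sumW.empty()) mine.sumW.assign(kv.second.sumW.size(), 0.0);
            assert(mine.sumW.size() == kv.second.sumW.size());
            for (std::size_t i = 0; i < kv.second.sumW.size(); ++i) {
                mine.sumW[i] += kv.second.sumW[i];
            }
        }
        for (const auto& kv : other.counters) {
            counters[kv.first] += kv.second;
        }
    }
};

// Analysis-specific logic, run once on the combined state:
// here, normalise the "pt" histogram by the passed-weights counter.
std::vector<double> finalize(const AnalysisState& state) {
    const double norm = state.counters.at("sumOfPassedWeights");
    std::vector<double> out;
    for (double w : state.histos.at("pt").sumW) {
        out.push_back(w / norm);
    }
    return out;
}
```

An afterburner along these lines would build one AnalysisState per input file, call add() to merge them, and only then call finalize(), so the normalisation uses the counter summed over all runs.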

This assumes that we can define a general combine method for each elementary object. Right now I can only imagine histograms (profile histos are basically two normal histos which are divided in the end, cf. Profile1D->binHeight(), right?) and sumOfWeights counters as such intermediate objects, and they are trivial to combine. All other "complicated" and analysis-specific logic will remain in the finalise() method, just as it is now.
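
The profile histogram case can be sketched the same way, assuming a profile really is just two per-bin sums that are divided at the end. RawProfile and its members are hypothetical names, not the actual Profile1D interface, and a real profile would presumably also carry the squared sums needed for errors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical profile histogram: two "normal" histograms kept side by side.
struct RawProfile {
    std::vector<double> sumW;   // per-bin sum of weights
    std::vector<double> sumWY;  // per-bin sum of weight * y

    // Combining two runs adds both sums bin by bin; nothing is divided yet.
    void add(const RawProfile& other) {
        assert(sumW.size() == other.sumW.size());
        assert(sumWY.size() == other.sumWY.size());
        for (std::size_t i = 0; i < sumW.size(); ++i) {
            sumW[i] += other.sumW[i];
            sumWY[i] += other.sumWY[i];
        }
    }

    // The division happens only when the bin height is requested.
    double binHeight(std::size_t i) const {
        return sumWY[i] / sumW[i];
    }
};
```

For example, a bin with sumW = 1, sumWY = 2 (mean 2) combined with a bin with sumW = 3, sumWY = 12 (mean 4) gives sumW = 4, sumWY = 14 and a height of 3.5. Averaging the two per-run heights would give 3 instead, which is why the raw sums need to be stored rather than the finished heights.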

Change History (4)

comment:1 Changed 8 years ago by Andy Buckley

Milestone: 2.0.0 → 2.1.0

comment:2 Changed 7 years ago by Andy Buckley

Milestone: 2.1.0 → 2.2.0

Version 2.0.0 can do this for the majority of situations. Really awkward objects will require a factorized finalize / reloadable histo state system, which is planned for 2.2.0 and beyond.

comment:3 Changed 6 years ago by Andy Buckley

Milestone: 2.2.0 -- jets, tagging, cuts → 2.Y.0 -- re-entrant histogramming
Status: new → assigned
Summary: Combination of output from split runs → Full combination of output from split runs and re-entrant histogramming

comment:4 Changed 6 years ago by Frank Siegert

I recently realised that the first bullet point already exists (AnalysisObjectMap Analysis::_analysisobjects), so it should be straightforward to implement the other steps without having to modify lots of analyses or break compatibility. There are still a few points that need discussion, but it should be possible to spec this out at the next dev meeting or sprint.
