Dumping Events to ROOT Ntuples
The dump-events CLI command writes processed events back into a flat ROOT ntuple.
This is useful when you want to save events after applying preprocessors and selectors,
store intermediate analysis results, share processed data in a standard ROOT format, or read events
back later without reprocessing.
Overview
After processing your events with preprocessors and selectors, you may want to save the modified events
to disk for later analysis. The dump-events tool writes your processed events to a new ROOT file in
a flat ntuple structure, preserving:
Event-level attributes
All particle collections (e.g., genmuons, segments, digis)
Particle-to-particle relationships (e.g., matched_segments, matched_tps)
The dumped files are standard ROOT files that can be read using the provided DumpedEventReader class
or directly with ROOT tools.
How It Works
Storage Strategy for Particle References
When particles contain references to other particles (e.g., matched_segments, matched_tps),
the dumper stores them as indices with type information for later reconstruction:
For list references:
{ptype}_{attr}_indices: Vector of indices of referenced particles
{ptype}_{attr}_type: Particle type name (for reconstruction)
For single references:
{ptype}_{attr}_index: Index of referenced particle
{ptype}_{attr}_type: Particle type name
For example, if a genmuon has matched_segments, it will be stored as:
genmuons_matched_segments_indices: vector<vector<int>> containing segment indices
genmuons_matched_segments_type: vector<string> containing "Segment"
This approach allows full reconstruction of particle relationships when reading the dumped ntuple.
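The index-plus-type scheme described above can be illustrated with a small round trip. This is a minimal sketch, not the dumper's actual implementation: the Segment and GenMuon classes, the flatten_refs and rebuild_refs helpers, the use of an empty string for particles without references, and the assumption that a stored index equals the position in the target collection are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Hypothetical stand-in for a segment particle."""
    index: int
    wh: int


@dataclass
class GenMuon:
    """Hypothetical stand-in for a generator-level muon."""
    index: int
    pt: float
    matched_segments: list = field(default_factory=list)


def flatten_refs(particles, attr):
    """Encode list references as per-particle index lists plus a type name."""
    indices, types = [], []
    for particle in particles:
        refs = getattr(particle, attr)
        indices.append([ref.index for ref in refs])
        # Assumption: an empty string marks a particle without references.
        types.append(type(refs[0]).__name__ if refs else "")
    return indices, types


def rebuild_refs(particles, attr, indices, types, collections):
    """Re-attach references by looking each index up in the typed collection."""
    for particle, idx_list, type_name in zip(particles, indices, types):
        target = collections.get(type_name, [])
        # Assumption: a stored index is the position in the target collection.
        setattr(particle, attr, [target[i] for i in idx_list])


segments = [Segment(index=0, wh=-1), Segment(index=1, wh=2)]
muons = [GenMuon(0, 25.0, [segments[1]]), GenMuon(1, 10.0)]

# Flatten: one index list and one type name per genmuon.
indices, types = flatten_refs(muons, "matched_segments")

# Simulate reading back: particles start without references.
restored = [GenMuon(m.index, m.pt) for m in muons]
rebuild_refs(restored, "matched_segments", indices, types, {"Segment": segments})
```

Storing the type name next to the indices is what lets the reading side pick the right collection to index into, so the flat file never has to hold nested objects.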
Usage
Command Line Interface
Basic usage to dump all events:
dtpr dump-events -i /path/to/input.root -o ./results
Dump a specific number of events:
dtpr dump-events -i /path/to/input.root -o ./results --maxevents 1000
Dump with custom tree name and tag:
dtpr dump-events -i /path/to/input.root -o ./results --tree-name MyEvents -t _processed
Dump only specific particle types:
dtpr dump-events -i /path/to/input.root -o ./results --particle-types genmuons segments
Use custom configuration:
dtpr dump-events -i /path/to/input.root -o ./results -cf my_config.yaml
Python API
You can also use the dump function directly in your Python scripts:
from dtpr.analysis.dump_events import dump_events
# Dump all particle types
dump_events(
    inpath="/path/to/input.root",
    outfolder="./results",
    tag="_processed",
    maxfiles=1,
    maxevents=-1,  # -1 means all events
    tree_name="EVENTS",
    particle_types=None,  # None means all types
)
# Dump specific particle types only
dump_events(
    inpath="/path/to/input.root",
    outfolder="./results",
    tag="_genmuons_only",
    maxfiles=-1,
    maxevents=1000,
    tree_name="EVENTS",
    particle_types=["genmuons", "segments"],
)
Reading Dumped Events
Using DumpedEventReader
The DumpedEventReader class reconstructs the Event and Particle objects from dumped ntuples,
including particle-to-particle relationships:
from dtpr.utils.dumped_ntuple_reader import DumpedEventReader
# Method 1: Using context manager (recommended)
with DumpedEventReader("dumped_events.root", tree_name="EVENTS") as reader:
    # Get number of events
    print(f"Total events: {len(reader)}")
    # Access by index
    event = reader[0]
    print(event)
    # Slice events
    events_10_to_20 = reader[10:20]
    # Iterate over events
    for event in reader:
        print(event)
        # Access particles
        for genmuon in event.genmuons:
            print(f"GenMuon pt: {genmuon.pt}")
            # Access matched particles (relationships are reconstructed!)
            if hasattr(genmuon, 'matched_segments'):
                print(f" Matched segments: {len(genmuon.matched_segments)}")
                for seg in genmuon.matched_segments:
                    print(f" Segment at wh={seg.wh}, st={seg.st}, sc={seg.sc}")
# Method 2: Manual management
reader = DumpedEventReader("dumped_events.root")
event = reader.read_event(5)  # Read specific event
reader.close()
Using ROOT Directly
You can also read the dumped files with standard ROOT tools:
import ROOT as r
f = r.TFile("dumped_events.root")
tree = f.Get("EVENTS")
for i, entry in enumerate(tree):
    if i >= 10:
        break
    print(f"Event {i}:")
    print(f" Event number: {entry.number}")
    print(f" Number of genmuons: {entry.n_genmuons}")
    # Access particle attributes
    for j in range(entry.n_genmuons):
        print(f" GenMuon {j}: pt={entry.genmuons_pt[j]}, eta={entry.genmuons_eta[j]}")
        # Access matched particle indices
        if hasattr(entry, 'genmuons_matched_segments_indices'):
            matched_indices = list(entry.genmuons_matched_segments_indices[j])
            print(f" Matched segment indices: {matched_indices}")
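When reading with ROOT directly, the matched indices are only positions; turning them into segment properties means indexing into the segment branches of the same entry. The sketch below shows that lookup with a hypothetical helper, resolve_matches. It assumes a segments_wh branch exists (following the {ptype}_<attr> naming pattern) and that each stored index is a position in that branch. A SimpleNamespace stands in for a tree entry so the example runs without ROOT.

```python
from types import SimpleNamespace


def resolve_matches(entry, ptype, attr, target, target_attr):
    """Return, for each particle, the target attribute of its referenced particles."""
    indices = getattr(entry, f"{ptype}_{attr}_indices")
    values = getattr(entry, f"{target}_{target_attr}")
    # Assumption: each stored index is a position in the target branch.
    return [[values[i] for i in per_particle] for per_particle in indices]


# Stand-in for one tree entry: two genmuons, three segments.
entry = SimpleNamespace(
    genmuons_matched_segments_indices=[[0, 2], []],
    segments_wh=[-2, 0, 1],
)

wheels = resolve_matches(entry, "genmuons", "matched_segments", "segments", "wh")
print(wheels)  # [[-2, 1], []]
```

The same function should work on an entry from the loop above, since it only relies on attribute access by branch name.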
Example Workflow
Step 1: Process and Dump Events
from dtpr.base.config import RUN_CONFIG
from dtpr.analysis.dump_events import dump_events
# Configure your analysis
RUN_CONFIG.change_config_file("my_config.yaml")
# This configuration includes preprocessors that add matched_segments
dump_events(
    inpath="/data/dt_ntuples/",
    outfolder="./processed_events",
    tag="_with_matching",
    maxfiles=-1,
    maxevents=-1,
)
Step 2: Read Back and Analyze
from dtpr.utils.dumped_ntuple_reader import DumpedEventReader
from dtpr.utils.functions import color_msg
with DumpedEventReader("processed_events/dumped_events_with_matching.root") as reader:
    matched_count = 0
    unmatched_count = 0
    for event in reader:
        for genmuon in event.genmuons:
            if hasattr(genmuon, 'matched_segments') and len(genmuon.matched_segments) > 0:
                matched_count += 1
            else:
                unmatched_count += 1
    color_msg(f"Matched genmuons: {matched_count}", "green")
    color_msg(f"Unmatched genmuons: {unmatched_count}", "yellow")
Branch Structure in Dumped Files
Event-level branches
index: Event index
number: Event number
Other event attributes from your Event class
Particle-level branches
For each particle type (e.g., “genmuons”):
n_genmuons: Number of genmuons (int)
genmuons_<attr>: Particle attribute vectors
  Simple types: vector<int>, vector<double>, vector<bool>
  Nested vectors: vector<vector<T>>
  Strings: vector<string>
Particle reference branches
{ptype}_{attr}_indices: Vector of vectors of indices
{ptype}_{attr}_type: Vector of particle type names
{ptype}_{attr}_index: Vector of single indices (for non-list references)
Example for genmuons with matched segments:
n_genmuons : int
genmuons_pt : vector<double>
genmuons_eta : vector<double>
genmuons_matched_segments_indices : vector<vector<int>>
genmuons_matched_segments_type : vector<string>
Tips and Best Practices
Memory Management: For large datasets, process in batches using --maxevents
dtpr dump-events -i input.root -o ./batch1 --maxevents 10000
Selective Dumping: Only dump particle types you need to save space
dtpr dump-events -i input.root -o ./results --particle-types genmuons segments
Verification: Always check the first few events after dumping
from dtpr.utils.dumped_ntuple_reader import DumpedEventReader
with DumpedEventReader("dumped_events.root") as reader:
    print(reader[0])  # Print first event to verify structure
Compatibility: Dumped files are standard ROOT files and can be read by any ROOT-based tool
Limitations
String attributes are stored but may have encoding limitations
Complex nested structures (particles containing particles containing particles) may need special handling
Circular references are not supported
Troubleshooting
Problem: “Cannot find tree ‘EVENTS’ in file”
Check the tree name with the --tree-name option
Verify that the file was created successfully
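One way to check which tree names a file actually contains is to list its top-level keys. The following is a minimal sketch with a hypothetical helper, list_tree_names, which is not part of dtpr. It relies only on the GetListOfKeys, GetName and GetClassName methods of ROOT's TFile and TKey, and assumes the dump is stored as a plain TTree. Stand-in objects are used so the example runs without ROOT installed.

```python
from types import SimpleNamespace


def list_tree_names(tfile):
    """Return the names of TTree objects stored at the top level of a file."""
    names = [
        key.GetName()
        for key in tfile.GetListOfKeys()
        if key.GetClassName() == "TTree"
    ]
    # A tree written in several cycles appears more than once; keep one entry.
    return list(dict.fromkeys(names))


def _fake_key(name, class_name):
    """Stand-in mimicking the two TKey methods used above."""
    return SimpleNamespace(GetName=lambda: name, GetClassName=lambda: class_name)


# Stand-in for an opened file; with ROOT this would be ROOT.TFile.Open(path).
fake_file = SimpleNamespace(
    GetListOfKeys=lambda: [
        _fake_key("EVENTS", "TTree"),
        _fake_key("cutflow", "TH1F"),
        _fake_key("EVENTS", "TTree"),
    ]
)

tree_names = list_tree_names(fake_file)
print(tree_names)  # ['EVENTS']
```

If the name passed to the reader is not in the returned list, the file was probably dumped with a different --tree-name.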
Problem: Particle references not reconstructed
Ensure both _indices and _type branches exist
Check that particle type names match
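The first check can be automated from a list of branch names. The sketch below uses a hypothetical helper, missing_type_branches, that reports every _indices branch lacking its _type companion. It is a name-based heuristic only, and it does not cover single references stored as _index. With ROOT, the branch names could be collected with [b.GetName() for b in tree.GetListOfBranches()].

```python
def missing_type_branches(branch_names):
    """List the *_type branches that should accompany *_indices branches but are absent."""
    names = set(branch_names)
    suffix = "_indices"
    missing = []
    for name in names:
        if name.endswith(suffix):
            companion = name[: -len(suffix)] + "_type"
            if companion not in names:
                missing.append(companion)
    return sorted(missing)


# Example branch list: matched_tps has indices but no type branch.
branches = [
    "n_genmuons",
    "genmuons_pt",
    "genmuons_matched_segments_indices",
    "genmuons_matched_segments_type",
    "genmuons_matched_tps_indices",
]

missing = missing_type_branches(branches)
print(missing)  # ['genmuons_matched_tps_type']
```

An empty result means every list reference has both branches, in which case a mismatch in particle type names is the more likely cause.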
Problem: Memory issues with large files
Use --maxevents to process in smaller batches
Use --particle-types to dump only needed particles