Working with Workspaces¶
This tutorial covers how to work with PyHS3 workspaces - loading, exploring, and understanding their structure.
What is a Workspace?¶
A Workspace is the main container in PyHS3 that holds all the components needed to define a statistical model:
Distributions: Probability distributions (Gaussian, Poisson, etc.)
Functions: Mathematical functions that compute parameter values
Domains: Parameter space constraints and bounds
Parameter Points: Named sets of parameter values
Metadata: Version information and documentation
Loading a Workspace¶
You can create a workspace from a dictionary or load it from a JSON file:
import pyhs3
# From a dictionary
workspace_data = {
"metadata": {"hs3_version": "0.2"},
"distributions": [
{
"name": "signal",
"type": "gaussian_dist",
"x": "obs",
"mean": "mu",
"sigma": "sigma",
}
],
"parameter_points": [
{
"name": "nominal",
"parameters": [
{"name": "obs", "value": 0.0},
{"name": "mu", "value": 0.0},
{"name": "sigma", "value": 1.0},
],
}
],
"domains": [
{
"name": "physics_region",
"type": "product_domain",
"axes": [
{"name": "obs", "min": -5.0, "max": 5.0},
{"name": "mu", "min": -2.0, "max": 2.0},
{"name": "sigma", "min": 0.1, "max": 3.0},
],
}
],
}
ws = pyhs3.Workspace(**workspace_data)
# From a JSON file
# ws = pyhs3.Workspace.load("my_model.json")
Exploring Workspace Contents¶
Once you have a workspace, you can explore its contents:
>>> import pyhs3
>>> workspace_data = {
... "metadata": {"hs3_version": "0.2"},
... "distributions": [
... {
... "name": "signal",
... "type": "gaussian_dist",
... "x": "obs",
... "mean": "mu",
... "sigma": "sigma",
... }
... ],
... "parameter_points": [
... {
... "name": "nominal",
... "parameters": [
... {"name": "obs", "value": 0.0},
... {"name": "mu", "value": 0.0},
... {"name": "sigma", "value": 1.0},
... ],
... }
... ],
... "domains": [
... {
... "name": "physics_region",
... "type": "product_domain",
... "axes": [
... {"name": "obs", "min": -5.0, "max": 5.0},
... {"name": "mu", "min": -2.0, "max": 2.0},
... {"name": "sigma", "min": 0.1, "max": 3.0},
... ],
... }
... ],
... }
>>> ws = pyhs3.Workspace(**workspace_data)
>>> # Print workspace structure
>>> print(f"Workspace contains:")
Workspace contains:
>>> print(f"- {len(ws.distributions)} distributions")
- 1 distributions
>>> print(f"- {len(ws.functions)} functions")
- 0 functions
>>> print(f"- {len(ws.domains)} domains")
- 1 domains
>>> print(f"- {len(ws.parameter_points)} parameter sets")
- 1 parameter sets
# Access distributions
print("\\nDistributions:")
for dist in ws.distributions:
print(f" {dist.name} ({dist.type})")
print(f" Parameters: {list(dist.parameters.values())}")
# Access parameter sets
print("\\nParameter sets:")
for param_set in ws.parameter_points:
print(f" {param_set.name}:")
for param in param_set.parameters:
print(f" {param.name} = {param.value}")
# Access domains
print("\\nDomains:")
for domain in ws.domains:
print(f" {domain.name}:")
for axis in domain.axes:
print(f" {axis.name}: [{axis.min}, {axis.max}]")
Understanding Workspace Structure¶
The workspace follows a hierarchical structure:
---
config:
darkMode: 'true'
theme: forest
---
%%{
init: {
'theme': 'forest',
'themeVariables': {
'primaryColor': '#fefefe',
'lineColor': '#aaa'
}
}
}%%
classDiagram
class Workspace {
+metadata: Metadata
+distributions: list[Distribution]
+functions: list[Function]
+domains: list[Domain]
+parameter_points: list[ParameterSet]
+data: optional
+likelihoods: optional
+analyses: optional
}
class Metadata {
+hs3_version: str
+authors: optional[list]
+description: optional[str]
}
class Distribution {
+name: str
+type: str
+parameters: dict
}
class Function {
+name: str
+type: str
+parameters: dict
}
class Domain {
+name: str
+type: str
+axes: list[Axis]
}
class ParameterSet {
+name: str
+parameters: list[ParameterPoint]
}
Workspace --> Metadata : contains
Workspace --> Distribution : contains
Workspace --> Function : contains
Workspace --> Domain : contains
Workspace --> ParameterSet : contains
Creating Models from Workspaces¶
The main purpose of a workspace is to create models that you can evaluate:
# Create a model using specific domain and parameter set
model = ws.model(domain="physics_region", parameter_set="nominal")
# Or use defaults (index 0)
model = ws.model()
# Evaluate the model
result = model.pdf("signal", obs=0.5, mu=0.0, sigma=1.0)
print(f"PDF value: {result}")
Example: Complete Physics Model¶
Here’s a more realistic example of a workspace for a physics analysis:
physics_model = {
"metadata": {
"hs3_version": "0.2",
"authors": ["Physics Analysis Team"],
"description": "Signal + background model for Higgs search",
},
"distributions": [
{
"name": "signal",
"type": "gaussian_dist",
"x": "mass",
"mean": "higgs_mass",
"sigma": "resolution",
},
{
"name": "background",
"type": "generic_dist",
"x": "mass",
"expression": "exp(-mass/lifetime) / norm",
},
],
"functions": [
{
"name": "total_events",
"type": "sum",
"summands": ["signal_yield", "background_yield"],
}
],
"parameter_points": [
{
"name": "best_fit",
"parameters": [
{"name": "higgs_mass", "value": 125.0},
{"name": "resolution", "value": 2.5},
{"name": "signal_yield", "value": 100.0},
{"name": "background_yield", "value": 1000.0},
{"name": "lifetime", "value": 50.0},
{"name": "norm", "value": 1.0},
],
}
],
"domains": [
{
"name": "search_window",
"type": "product_domain",
"axes": [
{"name": "mass", "min": 110.0, "max": 140.0},
{"name": "higgs_mass", "min": 120.0, "max": 130.0},
{"name": "resolution", "min": 1.0, "max": 5.0},
{"name": "signal_yield", "min": 0.0, "max": 500.0},
{"name": "background_yield", "min": 100.0, "max": 5000.0},
],
}
],
}
physics_ws = pyhs3.Workspace(**physics_model)
physics_model = physics_ws.model()
# Evaluate signal and background separately
signal_pdf = physics_model.pdf("signal", mass=125.0, higgs_mass=125.0, resolution=2.5)
background_pdf = physics_model.pdf("background", mass=125.0, lifetime=50.0, norm=1.0)
print(f"Signal PDF at 125 GeV: {signal_pdf}")
print(f"Background PDF at 125 GeV: {background_pdf}")