Counterfactual Scenarios

The counterfactual scenario engine enables what-if analysis by generating paired baseline and counterfactual datasets using a causal directed acyclic graph (DAG).

Causal DAG

The default DAG contains 17 financial process nodes:

| Node | Description |
|------|-------------|
| gdp_growth | GDP growth rate |
| interest_rates | Market interest rates |
| consumer_confidence | Consumer confidence index |
| transaction_volume | Overall transaction volume |
| revenue | Company revenue |
| expenses | Operating expenses |
| fraud_rate | Rate of fraudulent transactions |
| control_effectiveness | Internal control effectiveness |
| audit_risk | Audit risk level |
| misstatement_risk | Financial misstatement risk |
| detection_rate | Fraud detection rate |
| approval_threshold | Approval threshold amount |
| sod_violations | Segregation of duties violations |
| late_postings | Late posting frequency |
| period_end_adjustments | Period-end adjustment volume |
| intercompany_volume | Intercompany transaction volume |
| cash_flow | Operating cash flow |
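
To illustrate how a causal DAG over these nodes can be evaluated, the sketch below orders a small subset of them so that every node comes after its causes. The edges are hypothetical and chosen only for illustration; the source does not list the actual edges of the default DAG. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+).

```python
from graphlib import TopologicalSorter

# HYPOTHETICAL edges: each node maps to the set of nodes that cause it.
# The real default DAG's edges are not documented here.
PARENTS = {
    "consumer_confidence": {"gdp_growth", "interest_rates"},
    "transaction_volume": {"consumer_confidence"},
    "revenue": {"transaction_volume"},
    "cash_flow": {"revenue", "expenses"},
}

# static_order() yields every node after all of its predecessors.
# Nodes that appear only as parents (root causes) are included automatically.
evaluation_order = list(TopologicalSorter(PARENTS).static_order())

for node in evaluation_order:
    print(node)
```

Evaluating nodes in this order guarantees that a node's inputs are already computed when its transfer function runs. A cycle in the graph would raise `graphlib.CycleError`, which is one way to enforce the "acyclic" requirement.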

Transfer Functions

Eight transfer function types model the causal relationship between a parent node and its child:

| Function | Description |
|----------|-------------|
| Linear | y = strength * x + offset |
| Exponential | y = strength * e^(rate * x) |
| Logistic | S-curve saturation |
| InverseLogistic | Inverse S-curve |
| Step | Binary threshold |
| Threshold | Activation above/below value |
| Decay | Exponential decay |
| Piecewise | Multi-segment linear |
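
As a rough illustration of what some of these function types compute, here is a minimal Python sketch. Only the Linear and Exponential formulas come from the table; the parameterizations of `logistic`, `step`, `decay`, and `piecewise` (including every parameter name) are assumptions made for this example and may differ from the engine's own definitions.

```python
import math

def linear(x, strength=1.0, offset=0.0):
    # From the table: y = strength * x + offset
    return strength * x + offset

def exponential(x, strength=1.0, rate=1.0):
    # From the table: y = strength * e^(rate * x)
    return strength * math.exp(rate * x)

def logistic(x, capacity=1.0, steepness=1.0, midpoint=0.0):
    # ASSUMED form: standard S-curve that saturates at `capacity`.
    return capacity / (1.0 + math.exp(-steepness * (x - midpoint)))

def step(x, threshold=0.0, low=0.0, high=1.0):
    # ASSUMED form: binary output that switches at `threshold`.
    return high if x >= threshold else low

def decay(x, initial=1.0, rate=1.0):
    # ASSUMED form: exponential decay from `initial`.
    return initial * math.exp(-rate * x)

def piecewise(x, points):
    # ASSUMED form: linear interpolation between sorted (x, y) breakpoints,
    # clamped to the first and last y values outside the covered range.
    if x <= points[0][0]:
        return points[0][1]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return points[-1][1]
```

For example, `linear(2, strength=3, offset=1)` evaluates to `7`, and `piecewise(0.5, [(0, 0), (1, 2), (3, 2)])` evaluates to `1.0`.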

DAG Presets

| Preset | Nodes | Description |
|--------|-------|-------------|
| minimal | 6 | Core accounting relationships only |
| financial_process | 12 | Includes document flows and period close |
| full | 17 | Complete causal graph |

Interventions

Interventions modify node values to create counterfactual scenarios:

scenarios:
  - name: "recession_impact"
    interventions:
      - type: ParameterShift
        target_node: gdp_growth
        magnitude: -0.03
        timing: immediate
      - type: MacroShock
        target_node: interest_rates
        magnitude: 0.02
        timing: gradual
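
The following sketch shows, conceptually, how an intervention on one node can yield a baseline and counterfactual pair. It is not the engine's implementation. It assumes a hypothetical three-node chain with linear transfer functions, treats `magnitude` as an additive shift on the target node, and models only `immediate` timing; all coefficients and the starting value are made up for illustration.

```python
# HYPOTHETICAL chain: child -> (parent, strength, offset) for a linear transfer.
EDGES = {
    "consumer_confidence": ("gdp_growth", 10.0, 0.5),
    "transaction_volume": ("consumer_confidence", 1000.0, 0.0),
}
# Evaluation order in which parents always precede children.
ORDER = ["gdp_growth", "consumer_confidence", "transaction_volume"]

def propagate(exogenous, interventions=()):
    """Compute all node values, applying interventions before propagation."""
    values = dict(exogenous)
    for item in interventions:
        # ASSUMPTION: magnitude is added to the target node's value.
        values[item["target_node"]] += item["magnitude"]
    for node in ORDER:
        if node in EDGES:
            parent, strength, offset = EDGES[node]
            values[node] = strength * values[parent] + offset
    return values

recession = [
    {"type": "ParameterShift", "target_node": "gdp_growth",
     "magnitude": -0.03, "timing": "immediate"},
]

baseline = propagate({"gdp_growth": 0.02})
counterfactual = propagate({"gdp_growth": 0.02}, recession)

for node in ORDER:
    print(node, baseline[node], counterfactual[node])
```

With these made-up coefficients, shifting `gdp_growth` from 0.02 to about -0.01 lowers `transaction_volume` from about 700 to about 400. Both runs share the same inputs apart from the intervention, which is what makes the pair comparable.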

ConfigMutator Constraints

The ConfigMutator applies interventions while preserving data integrity:

  • preserve_accounting_identity – Assets = Liabilities + Equity
  • preserve_document_chains – PO -> GR -> Invoice -> Payment integrity
  • preserve_period_close – Fiscal period boundaries maintained
  • preserve_balance_coherence – Trial balance consistency

CLI Usage

# List available scenarios
datasynth-data scenario list

# Generate baseline + counterfactual pair
datasynth-data scenario generate --config config.yaml --scenario recession_impact --output ./output

# Compute diff between baseline and counterfactual
datasynth-data scenario diff --baseline ./output/baseline --counterfactual ./output/counterfactual
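
Conceptually, a diff between the two runs compares the same metric in each dataset. The sketch below is a hypothetical illustration of that idea in Python and does not reproduce the output format of `datasynth-data scenario diff`; the metric names and values are invented.

```python
def diff_metrics(baseline, counterfactual):
    """Return absolute and relative change for metrics present in both runs."""
    result = {}
    for key in sorted(baseline.keys() & counterfactual.keys()):
        base, cf = baseline[key], counterfactual[key]
        delta = cf - base
        # Relative change is undefined when the baseline value is zero.
        relative = delta / base if base != 0 else None
        result[key] = {
            "baseline": base,
            "counterfactual": cf,
            "delta": delta,
            "relative": relative,
        }
    return result

# Invented example values, for illustration only.
report = diff_metrics(
    {"revenue": 100.0, "fraud_rate": 0.0},
    {"revenue": 80.0, "fraud_rate": 0.01},
)
for metric, row in report.items():
    print(metric, row["delta"], row["relative"])
```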

Python Usage

from datasynth_py.config import blueprints

config = blueprints.retail_small()
config = blueprints.with_scenarios(config, template="fraud_detection", with_interventions=True)