PESMaker Architecture
PESMaker is organized around replaceable workflow stages. The stage
implementations live in domain packages, while pesmaker.workflow is now an
orchestration and compatibility layer.
Core objects
- PESMakerConfig: validated project configuration.
- WorkflowConfig: optional compatibility override for next; normal configs omit it and let artifacts plus YAML sections determine the flow.
- StageResult: small return object for setup, submit, collect, and training stages.
- JSON Lines manifests: persistent file records for generated structures, sampling jobs, selected frames, SCF jobs, and the active training job.
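The JSON Lines manifests mentioned above store one JSON object per line, so records can be appended without rewriting the file. The following is a minimal sketch of that pattern using only the standard library. The helper names append_record and read_records and the record keys are hypothetical and are not taken from the PESMaker code.

```python
import json
from pathlib import Path
from typing import Any, Dict, Iterator


def append_record(manifest: Path, record: Dict[str, Any]) -> None:
    """Append one record as a single JSON line (hypothetical helper)."""
    manifest.parent.mkdir(parents=True, exist_ok=True)
    with manifest.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")


def read_records(manifest: Path) -> Iterator[Dict[str, Any]]:
    """Yield records in file order; a missing manifest yields nothing."""
    if not manifest.exists():
        return
    with manifest.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # tolerate blank lines
                yield json.loads(line)
```

Appending line by line keeps earlier records untouched, which is one reason this format suits stage outputs that grow over time.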
Module Boundaries
- pesmaker.generators.structures: supercells, surfaces, defects, perturbations, generated-structure manifests, and generation summaries.
- pesmaker.samplers.gpumd: GPUMD sampling folders, run.in, potential copy, and sampling submit scripts.
- pesmaker.samplers.lammps_mace: LAMMPS-MACE sampling folders, data.in, user LAMMPS input rendering, and sampling submit scripts.
- pesmaker.samplers: sampling-engine dispatcher used by CLI and next.
- pesmaker.samplers.selection: descriptors, farthest point selection, and diagnostic plots.
- pesmaker.parsers.ase: ASE-backed frame reading and extxyz writing.
- pesmaker.parsers.vasp: VASP output readers used by collection.
- pesmaker.labelers.vasp: VASP SCF folders, POSCAR normalization, INCAR, POTCAR assembly, and SCF warnings.
- pesmaker.jobs.resources: CPU/GPU and VASP parallel-resource decisions.
- pesmaker.jobs.scripts: submit-script template rendering and normalization.
- pesmaker.jobs.submit: dry-run or real submission of prepared submit.sh files.
- pesmaker.dataset.extxyz: labeled-output collection into extxyz datasets.
- pesmaker.trainers.nep: NEP and generic training input setup.
- pesmaker.trainers.layout: shared training output paths, two-step state, and training manifest locations.
- pesmaker.plot: plotting command package. pesmaker.plot.commands is the CLI registry, engine-specific plots live in modules such as pesmaker.plot.nep, and shared result/style helpers stay in pesmaker.plot.result and pesmaker.plot.style.
- pesmaker.workflow.next: artifact-driven smart-next state machine.
- pesmaker.workflow.plan: artifact path and file-presence checks.
- pesmaker.workflow.state: .pesmaker/<project>/next_state.json.
pesmaker.workflow.stages and pesmaker.workflow.generate remain
backward-compatible re-export modules for older imports.
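The dispatcher role of pesmaker.samplers listed above can be pictured as a small registry that maps an engine name to its setup function. The sketch below is illustrative only. The registry, the decorator, the "engine" and "workdir" config keys, and the returned strings are assumptions made for this example and are not the package's actual API.

```python
from typing import Callable, Dict

# Hypothetical registry: engine name -> setup function.
_SAMPLERS: Dict[str, Callable[[dict], str]] = {}


def register_sampler(name: str):
    """Decorator that records a setup function under an engine name."""
    def decorator(func: Callable[[dict], str]) -> Callable[[dict], str]:
        _SAMPLERS[name] = func
        return func
    return decorator


@register_sampler("gpumd")
def setup_gpumd(config: dict) -> str:
    # Placeholder body standing in for GPUMD folder setup.
    return f"gpumd sampling prepared in {config['workdir']}"


@register_sampler("lammps_mace")
def setup_lammps_mace(config: dict) -> str:
    # Placeholder body standing in for LAMMPS-MACE folder setup.
    return f"lammps_mace sampling prepared in {config['workdir']}"


def dispatch_sampling(config: dict) -> str:
    """Pick the engine-specific setup function named in the config."""
    engine = config["engine"]
    try:
        setup = _SAMPLERS[engine]
    except KeyError:
        known = ", ".join(sorted(_SAMPLERS))
        raise ValueError(
            f"unknown sampling engine {engine!r}; known: {known}"
        ) from None
    return setup(config)
```

With this shape, adding an engine means adding one module and one registration, while callers such as a CLI keep using a single entry point.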
Stage Interfaces
Each concrete stage exposes a small Python function such as
generate_structures(config), setup_sampling(config),
setup_labeling(config), submit_jobs(config, stage=..., dry_run=...),
collect_labeled_dataset(config), or setup_training(config). The CLI and
next orchestration call these functions directly.
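As a rough illustration of this interface style, the sketch below defines stand-in stage functions with the call shapes named above and shows how a caller can invoke them directly. The StageResult fields, the dictionary-based config, the placeholder bodies, and the run_setup_stage helper are assumptions for this example; only the function names and the stage/dry_run keywords come from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StageResult:
    """Assumed shape: the stage name plus a human-readable summary."""
    stage: str
    message: str


def generate_structures(config: dict) -> StageResult:
    # Placeholder body for illustration.
    return StageResult("generate", f"structures for {config['project']}")


def setup_sampling(config: dict) -> StageResult:
    # Placeholder body for illustration.
    return StageResult("sampling", f"sampling folders for {config['project']}")


def submit_jobs(config: dict, *, stage: str, dry_run: bool) -> StageResult:
    # A dry run only reports what would be submitted.
    action = "would submit" if dry_run else "submitted"
    return StageResult("submit", f"{action} {stage} jobs for {config['project']}")


# A CLI command and an orchestrator can share the same plain callables.
SETUP_STAGES: Dict[str, Callable[[dict], StageResult]] = {
    "generate": generate_structures,
    "sampling": setup_sampling,
}


def run_setup_stage(name: str, config: dict) -> StageResult:
    """Look up a setup stage by name and call it directly."""
    return SETUP_STAGES[name](config)
```

Because every stage is an ordinary function that takes the config and returns a small result object, a stage can be swapped or tested in isolation without going through the orchestration layer.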
State Model
Stage data stays file-backed:
- generated structures: generated/manifest.jsonl;
- sampling jobs: sampling/sampling_manifest.jsonl;
- selected frames: selected/manifest.jsonl;
- SCF jobs: labeling/labeling_manifest.jsonl;
- active training job: training/training_manifest.jsonl;
- follow-up config template after generation-only runs: run.next.yaml;
- smart-next dry-run gates: .pesmaker/<project>/next_state.json.
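Because the state is file-backed, a next-step decision can in principle be derived from which manifests already exist. The sketch below shows that idea under simplifying assumptions: the manifest paths are copied from the list above, but the stage names, their ordering, and the next_stage helper are hypothetical, and the actual logic in pesmaker.workflow.next may consider more than file presence.

```python
from pathlib import Path
from typing import Optional

# Manifest paths are taken from the list above; the stage names and
# their ordering are assumptions made for this sketch.
STAGE_ARTIFACTS = [
    ("generate", "generated/manifest.jsonl"),
    ("sample", "sampling/sampling_manifest.jsonl"),
    ("select", "selected/manifest.jsonl"),
    ("label", "labeling/labeling_manifest.jsonl"),
    ("train", "training/training_manifest.jsonl"),
]


def next_stage(project_root: Path) -> Optional[str]:
    """Return the first stage whose manifest is missing, or None if all exist."""
    for stage, relative_path in STAGE_ARTIFACTS:
        if not (project_root / relative_path).is_file():
            return stage
    return None
```

A check like this needs no database: deleting or restoring a manifest is enough to change what the workflow would do next.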
A database service remains optional and is not required for the current workflow.
Dependency policy
The base package should stay light. Heavy scientific tools should be optional extras or external executables:
- base: configuration, stage setup, file manifests, CLI;
- atomistic extra: ASE and pymatgen;
- workflow extras: jobflow or AiiDA only if a user chooses those integrations;
- engines: VASP, CP2K, GPUMD, LAMMPS, MACE as external programs.
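One common way to keep a base package light is to import heavy libraries only when a feature needs them and to probe for external programs on PATH instead of bundling them. The sketch below illustrates both checks with the standard library. The helper names require_optional and engine_available are hypothetical, and the wording of the error message is an assumption rather than PESMaker's actual behavior.

```python
import importlib
import shutil
from types import ModuleType


def require_optional(module_name: str, extra: str) -> ModuleType:
    """Import an optional dependency or fail with a hint about the extra."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{module_name!r} is required for this feature; "
            f"install the optional {extra!r} extra to enable it"
        ) from exc


def engine_available(executable: str) -> bool:
    """Report whether an external program can be found on PATH."""
    return shutil.which(executable) is not None
```

A caller would invoke require_optional("ase", "atomistic") only inside the code path that reads frames, so users who never touch that path do not need ASE installed.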