Command Line Interface
Batchie provides a suite of command line utilities that allow users to script running the pipeline end to end.
train_model
Train the provided model by iteratively calling its #step method, conditioned on the provided data. Collect the model parameters and save them to a file.
usage: train_model [-h] [-v] [-P] --data DATA --model MODEL
[--model-param KEY=VALUE] --output OUTPUT
[--n-samples N_SAMPLES] [--n-burnin N_BURNIN] [--thin THIN]
[--n-chains N_CHAINS] [--chain-index CHAIN_INDEX]
[--seed SEED]
Named Arguments
- -v, --verbose
Enable verbose logging
Default: False
- -P, --progress
Enable progress bar
Default: False
- --data
A batchie Screen object in hdf5 format.
- --model
Fully qualified name of the BayesianModel class to use.
- --model-param
Model parameters
- --output
Output file to save learned model parameters to.
- --n-samples
Number of samples to save from the posterior distribution
Default: 100
- --n-burnin
Number of steps to iterate before samples are saved
Default: 1000
- --thin
Thinning factor for samples after burn-in is complete. A value of 2 means every second sample is saved, etc.
Default: 10
- --n-chains
Number of parallel instances of this model that are being run. This information is used for RNG seed initialization.
Default: 1
- --chain-index
Index of this model in the series of parallel model runs.
Default: 0
- --seed
Seed to use for PRNG.
Default: 0
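One plausible reading of --n-burnin, --thin, and --n-samples is: discard the first n_burnin steps, then keep every thin-th step until n_samples samples have been collected. The sketch below works out which step numbers would be saved under that assumption. The helper name saved_steps is hypothetical, and the exact off-by-one convention used by train_model is not documented here.

```python
def saved_steps(n_samples, n_burnin, thin):
    """Return the 1-based step numbers at which a sample would be saved.

    Assumption: steps 1..n_burnin are discarded, then every `thin`-th
    step after burn-in is kept until `n_samples` samples exist.
    """
    return [n_burnin + thin * (i + 1) for i in range(n_samples)]


# Defaults taken from the argument list above.
steps = saved_steps(n_samples=100, n_burnin=1000, thin=10)
total_steps = steps[-1]  # last saved step == total iterations needed

print(steps[:3])
print(total_steps)
```

Under this reading, the default settings imply 1000 + 10 * 100 = 2000 calls to the model's step method per chain.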
evaluate_model
This is a utility for evaluating model performance by predicting over an observed ‘test screen’.
usage: evaluate_model [-h] [-v] [-P] --screen SCREEN --thetas THETAS
[THETAS ...] --output OUTPUT [--seed SEED]
Named Arguments
- -v, --verbose
Enable verbose logging
Default: False
- -P, --progress
Enable progress bar
Default: False
- --screen
A batchie Screen in hdf5 format with all plates observed.
- --thetas
A batchie ThetaHolder in hdf5 format.
- --output
Output ModelEvaluation object in h5 format.
- --seed
Seed to use for PRNG.
Default: 0
calculate_distance_matrix
calculate_distance_matrix.py
usage: calculate_distance_matrix [-h] [-v] [-P] --data DATA --thetas THETAS
[THETAS ...] --distance-metric
DISTANCE_METRIC
[--distance-metric-param KEY=VALUE]
--n-chunks N_CHUNKS --chunk-index CHUNK_INDEX
--output OUTPUT
Named Arguments
- -v, --verbose
Enable verbose logging
Default: False
- -P, --progress
Enable progress bar
Default: False
- --data
A batchie Screen in hdf5 format.
- --thetas
A batchie ThetaHolder in hdf5 format.
- --distance-metric
Fully qualified name of the DistanceMetric class to use.
- --distance-metric-param
Distance metric parameters
- --n-chunks
Number of chunks to split the distance matrix calculation into.
- --chunk-index
Which of the n chunks to calculate in this invocation.
- --output
Output batchie ChunkedDistanceMatrix in hdf5 format.
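Because --n-chunks and --chunk-index are both required, the distance matrix is presumably computed with one invocation per chunk. The sketch below builds those invocations as argument lists. The file names, the class name my_package.MyDistanceMetric, and the one-output-file-per-chunk layout are placeholders, and it assumes chunk indices run from 0 to n_chunks - 1, which is not stated in the option descriptions.

```python
import shlex


def distance_matrix_commands(data, thetas, metric, n_chunks):
    """Build one calculate_distance_matrix argv list per chunk.

    Assumes 0-based chunk indices; output file names are hypothetical.
    """
    commands = []
    for chunk_index in range(n_chunks):
        commands.append([
            "calculate_distance_matrix",
            "--data", data,
            "--thetas", *thetas,  # --thetas accepts one or more files
            "--distance-metric", metric,
            "--n-chunks", str(n_chunks),
            "--chunk-index", str(chunk_index),
            "--output", f"distance_matrix_{chunk_index}.h5",
        ])
    return commands


# Placeholder inputs for illustration only.
commands = distance_matrix_commands(
    data="screen.h5",
    thetas=["thetas_0.h5", "thetas_1.h5"],
    metric="my_package.MyDistanceMetric",
    n_chunks=4,
)
for argv in commands:
    print(shlex.join(argv))
```

Each list can be handed to subprocess.run(argv, check=True) or submitted as a separate job on a cluster scheduler.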
calculate_scores
calculate_scores.py
usage: calculate_scores [-h] [-v] [-P] --data DATA --thetas THETAS
[THETAS ...] --distance-matrix DISTANCE_MATRIX
[DISTANCE_MATRIX ...] [--n-chunks N_CHUNKS]
[--chunk-index CHUNK_INDEX] --scorer SCORER
[--scorer-param KEY=VALUE]
[--batch-plate-ids BATCH_PLATE_IDS [BATCH_PLATE_IDS ...]]
--output OUTPUT [--seed SEED]
Named Arguments
- -v, --verbose
Enable verbose logging
Default: False
- -P, --progress
Enable progress bar
Default: False
- --data
A batchie Screen object in hdf5 format.
- --thetas
A batchie ThetaHolder object in hdf5 format.
- --distance-matrix
A batchie ChunkedDistanceMatrix object in hdf5 format.
- --n-chunks
Number of chunks to parallelize scoring over.
Default: 1
- --chunk-index
Which of the n chunks to calculate in this invocation.
Default: 0
- --scorer
Fully qualified name of the scorer class to use.
- --scorer-param
Scorer parameters
- --batch-plate-ids
The plate(s) already selected as part of this batch.
Default: []
- --output
Location of output h5 file where scores will be saved.
- --seed
Seed to use for PRNG.
Default: 0
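Several commands accept KEY=VALUE options such as --scorer-param, --model-param, and --policy-param. How Batchie parses and type-converts these values is not documented here. The following is only a sketch of one common way to collect such pairs with argparse, assuming the option may be repeated and keeping every value as a string; the parameter names alpha and mode are invented for the example.

```python
import argparse


def parse_key_value(text):
    """Split 'KEY=VALUE' into a (key, value) tuple of strings."""
    key, sep, value = text.partition("=")
    if not sep or not key:
        raise argparse.ArgumentTypeError(f"expected KEY=VALUE, got {text!r}")
    return key, value


parser = argparse.ArgumentParser(prog="example")
parser.add_argument(
    "--scorer-param",
    action="append",      # collect every occurrence of the flag
    default=[],
    type=parse_key_value,
    metavar="KEY=VALUE",
)

# Hypothetical parameter names, for illustration only.
args = parser.parse_args(
    ["--scorer-param", "alpha=0.5", "--scorer-param", "mode=fast"]
)
scorer_params = dict(args.scorer_param)
print(scorer_params)
```

Values stay as strings in this sketch; a real tool would typically convert them to numbers or booleans before passing them to the class constructor.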
select_next_plate
select_next_plate.py
usage: select_next_plate [-h] [-v] [-P] --data DATA --scores SCORES
[SCORES ...] [--policy POLICY]
[--policy-param KEY=VALUE] --output OUTPUT
[--seed SEED]
[--batch-plate-id BATCH_PLATE_ID [BATCH_PLATE_ID ...]]
Named Arguments
- -v, --verbose
Enable verbose logging
Default: False
- -P, --progress
Enable progress bar
Default: False
- --data
A batchie Screen object in hdf5 format.
- --scores
One or more ChunkedScoresHolder objects in hdf5 format.
- --policy
Fully qualified name of the PlatePolicy class to use.
- --policy-param
Policy parameters
- --output
Location of the output file that will contain the selected plate id.
- --seed
Seed to use for PRNG.
Default: 0
- --batch-plate-id
The plate(s) currently selected in the batch.
Default: []
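Options such as --policy, --model, --scorer, and --distance-metric take a fully qualified class name. A generic way to resolve such a name in Python is sketched below with importlib. Whether Batchie resolves names exactly this way is an assumption, the helper name load_class is hypothetical, and collections.OrderedDict merely stands in for a real policy class.

```python
import importlib


def load_class(qualified_name):
    """Resolve 'package.module.ClassName' to the class object."""
    module_name, _, class_name = qualified_name.rpartition(".")
    if not module_name:
        raise ValueError(
            f"expected a fully qualified name, got {qualified_name!r}"
        )
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# A standard-library class is used as a stand-in for a PlatePolicy subclass.
cls = load_class("collections.OrderedDict")
print(cls.__name__)
```

With this approach the class must be importable from the environment in which the command runs, so a custom class needs to be installed or on the Python path.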
reveal_plate
This is a utility for revealing plates in a retrospective simulation.
usage: reveal_plate [-h] [-v] [-P] --screen SCREEN --output OUTPUT --plate-id
PLATE_ID [PLATE_ID ...]
Named Arguments
- -v, --verbose
Enable verbose logging
Default: False
- -P, --progress
Enable progress bar
Default: False
- --screen
A batchie Screen in hdf5 format with some plates observed.
- --output
Where to save the screen with the next plate revealed.
- --plate-id
The plate(s) to reveal.
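Judging from the inputs and outputs listed for each command, one round of a retrospective simulation appears to chain train_model, calculate_distance_matrix, calculate_scores, select_next_plate, and reveal_plate. The sketch below lays out that inferred order as argument lists using only the required options. All file names and class names are placeholders, and because the format of the select_next_plate output file is not documented here, the plate id passed to reveal_plate is supplied by the caller rather than read from that file.

```python
def one_round(screen, plate_id):
    """Return the argv lists for one inferred simulation round.

    The ordering is an assumption based on each command's inputs/outputs.
    """
    thetas = "thetas.h5"
    distance_matrix = "distance_matrix.h5"
    scores = "scores.h5"
    return [
        ["train_model", "--data", screen,
         "--model", "my_package.MyModel", "--output", thetas],
        ["calculate_distance_matrix", "--data", screen, "--thetas", thetas,
         "--distance-metric", "my_package.MyDistanceMetric",
         "--n-chunks", "1", "--chunk-index", "0",
         "--output", distance_matrix],
        ["calculate_scores", "--data", screen, "--thetas", thetas,
         "--distance-matrix", distance_matrix,
         "--scorer", "my_package.MyScorer", "--output", scores],
        ["select_next_plate", "--data", screen, "--scores", scores,
         "--output", "selected_plate.txt"],
        ["reveal_plate", "--screen", screen, "--plate-id", str(plate_id),
         "--output", "screen_next.h5"],
    ]


# Placeholder screen file and plate id.
steps = one_round("screen.h5", plate_id=7)
for argv in steps:
    print(" ".join(argv))
```

Each list can be executed in order with subprocess.run(argv, check=True), feeding screen_next.h5 back in as the screen for the following round.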
extract_screen_metadata
extract_screen_metadata.py
usage: extract_screen_metadata [-h] [-v] [-P] --screen SCREEN --output OUTPUT
Named Arguments
- -v, --verbose
Enable verbose logging
Default: False
- -P, --progress
Enable progress bar
Default: False
- --screen
A batchie Screen in hdf5 format.
- --output
Output json file to save metadata to.
prepare_retrospective_simulation
This is a utility for revealing plates in a retrospective simulation, calculating the prediction error on the unrevealed plates thus far, and saving the results.
usage: prepare_retrospective_simulation [-h] [-v] [-P] --data DATA
--training-output TRAINING_OUTPUT
--test-output TEST_OUTPUT
[--initial-plate-generator INITIAL_PLATE_GENERATOR]
[--initial-plate-generator-param KEY=VALUE]
[--plate-generator PLATE_GENERATOR]
[--plate-generator-param KEY=VALUE]
[--plate-smoother PLATE_SMOOTHER]
[--plate-smoother-param KEY=VALUE]
[--holdout-fraction HOLDOUT_FRACTION]
[--seed SEED]
Named Arguments
- -v, --verbose
Enable verbose logging
Default: False
- -P, --progress
Enable progress bar
Default: False
- --data
A batchie Screen in hdf5 format.
- --training-output
Output training set batchie Screen in hdf5 format.
- --test-output
Output test set batchie Screen in hdf5 format.
- --initial-plate-generator
Fully qualified name of the InitialRetrospectivePlateGenerator class to use.
- --initial-plate-generator-param
Initial plate generator parameters
- --plate-generator
Fully qualified name of the RetrospectivePlateGenerator class to use.
- --plate-generator-param
Plate generator parameters
- --plate-smoother
Fully qualified name of the RetrospectivePlateSmoother class to use.
- --plate-smoother-param
Plate smoother parameters
- --holdout-fraction
Fraction of data to hold out for testing (the proportion of experiments assigned to the test set in the train/test split).
Default: 0.1
- --seed
Seed to use for PRNG.
Default: 0