Checklist package

Submodules

checklist.abstract_test module

class checklist.abstract_test.AbstractTest(data, expect, labels=None, meta=None, agg_fn='all', templates=None, print_first=None, name=None, capability=None, description=None)

    Bases: abc.ABC

    example_list_and_indices(n=None, seed=None)

        Returns
            tuple(list, list) – First list is a list of examples

        For example, let's say we have two testcases [...]

        Also updates n_idxs if n is not None to indicate which testcases [...]
        Also updates self.result_indexes with the second list.

    fail_idxs()

    filtered_idxs()

    form_test_info(name=None, description=None, capability=None)

    form_testcases(n_per_testcase=3)

    static from_file(file)

    get_stats()

    print(xs, preds, confs, expect_results, labels=None, meta=None, format_example_fn=None, nsamples=3)

    print_stats()

    recover_example_list_and_indices()
        Recovers a previously computed example_list_and_indices

    run(predict_and_confidence_fn, overwrite=False, verbose=True, n=None, seed=None)

        Parameters
            predict_and_confidence_fn (function) – Takes as input a list of examples. Outputs a tuple (predictions, confidences)
            overwrite (bool) – If False, raise exception if results already exist
            verbose (bool) – If True, print extra information
            n (int) – If not None, number of samples to draw
            seed (int) – Seed to use if n is not None

    run_from_file(path, file_format=None, format_fn=None, ignore_header=False, overwrite=False)
        Update self.results (run tests) from a prediction file

        Parameters
            file_format (string) – None, or one of 'pred_only', 'softmax', 'binary_conf', 'pred_and_conf', 'pred_and_softmax', 'squad'
                softmax: each line has prediction probabilities separated by spaces
                binary_conf: each line has the prediction probability of class 1 (binary)
                pred_and_conf: each line has a prediction and a confidence value, separated by a space
                pred_and_softmax: each line has a prediction and all softmax probabilities, separated by a space
            format_fn (function) – If not None, function that reads a line in the input file and outputs a tuple of (prediction, confidence)
            ignore_header (bool) – If True, skip first line in the file

    run_from_preds_confs(preds, confs, overwrite=False)
        Update self.results (run tests) from list of predictions and confidences

    Sets and updates expectation function

        Parameters
            expect (function) – Expectation function, takes an AbstractTest (self) as parameter. See expect.py for details

    summary(n=3, print_fn=None, format_example_fn=None, n_per_testcase=3)
        Print stats and example failures

        Parameters
            n (int) – number of example failures to show
            print_fn (function) – If not None, use this to print a failed test case. Arguments: (xs, preds, confs, expect_results, labels=None, meta=None)
            format_example_fn (function) – If not None, use this to print a failed example within a test case. Arguments: (x, pred, conf, label=None, meta=None)
            n_per_testcase (int) – Maximum number of examples to show for each test case

    to_raw_examples(file_format=None, format_fn=None, n=None, seed=None, new_sample=True)
        Flattens all test examples into a single list

        Parameters
            file_format (string, must be one of 'jsonl', 'tsv', or None) – None just calls str(x) for each example in self.data
            format_fn (function or None) – If not None, call this function to format each example in self.data
            new_sample (bool) – If False, will rely on a previous sample and ignore the 'n' and 'seed' parameters
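
The run method expects a predict_and_confidence_fn that takes a list of examples and outputs a tuple (predictions, confidences). The following is a minimal sketch of a function with that shape, not taken from the checklist source. The keyword "model", the POSITIVE_WORDS vocabulary, and the choice of one probability per class as the confidence are all assumptions made for illustration.

```python
from typing import List, Tuple

# Toy vocabulary, purely illustrative (not part of checklist).
POSITIVE_WORDS = {"good", "great", "love"}


def predict_and_confidence(examples: List[str]) -> Tuple[List[int], List[List[float]]]:
    """Toy stand-in for a model: returns (predictions, confidences) for a batch."""
    preds, confs = [], []
    for text in examples:
        hits = sum(word in POSITIVE_WORDS for word in text.lower().split())
        p_pos = 0.9 if hits else 0.2  # made-up probabilities
        # Assumption: a confidence is a list with one probability per class.
        confs.append([1.0 - p_pos, p_pos])
        preds.append(int(p_pos > 0.5))
    return preds, confs


preds, confs = predict_and_confidence(["I love this", "It was bad"])
print(preds)  # [1, 0]
```

With the package installed, a function of this shape would be the first argument to run, for example test.run(predict_and_confidence), where test is a hypothetical AbstractTest instance.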
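
The line formats listed for run_from_file can be read with a small parser that returns the same (prediction, confidence) tuple a custom format_fn is expected to produce. This is a sketch under stated assumptions, not the library's implementation: parse_prediction_line is a hypothetical name, the prediction for 'softmax' is assumed to be the argmax index, and 'binary_conf' is assumed to use a 0.5 threshold.

```python
from typing import List, Tuple, Union

Confidence = Union[float, List[float]]


def parse_prediction_line(line: str, file_format: str) -> Tuple[Union[int, str], Confidence]:
    """Hypothetical parser: one line of a prediction file -> (prediction, confidence)."""
    fields = line.split()
    if file_format == "softmax":
        # Each line has prediction probabilities separated by spaces.
        probs = [float(f) for f in fields]
        return probs.index(max(probs)), probs  # assumption: prediction = argmax
    if file_format == "binary_conf":
        # Each line has the prediction probability of class 1.
        p1 = float(fields[0])
        return int(p1 > 0.5), [1.0 - p1, p1]  # assumption: 0.5 threshold
    if file_format == "pred_and_conf":
        # Each line has a prediction and a confidence value.
        pred, conf = fields
        return pred, float(conf)
    if file_format == "pred_and_softmax":
        # Each line has a prediction followed by all softmax probabilities.
        pred, *probs = fields
        return pred, [float(p) for p in probs]
    raise ValueError(f"unsupported file_format: {file_format!r}")


print(parse_prediction_line("0.1 0.7 0.2", "softmax"))  # (1, [0.1, 0.7, 0.2])
print(parse_prediction_line("2 0.85", "pred_and_conf"))  # ('2', 0.85)
```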
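
example_list_and_indices returns a tuple of two lists, the first being a flat list of examples. The sketch below illustrates one way such a pair could be built. It assumes the second list records, for each example, the index of the testcase it came from, and that n and seed select a random subset of testcases. flatten_testcases is a hypothetical helper, not the library's code.

```python
import random
from typing import List, Optional, Sequence, Tuple


def flatten_testcases(
    testcases: Sequence[Sequence[str]],
    n: Optional[int] = None,
    seed: Optional[int] = None,
) -> Tuple[List[str], List[int]]:
    """Flatten testcases into (examples, indices); indices[i] is the testcase of examples[i]."""
    chosen = list(range(len(testcases)))
    if n is not None:
        # Assumption: n limits the number of testcases, sampled reproducibly via seed.
        rng = random.Random(seed)
        chosen = sorted(rng.sample(chosen, min(n, len(chosen))))
    examples: List[str] = []
    indices: List[int] = []
    for idx in chosen:
        for example in testcases[idx]:
            examples.append(example)
            indices.append(idx)
    return examples, indices


examples, indices = flatten_testcases([["a", "b", "c"], ["d", "e"]])
print(examples)  # ['a', 'b', 'c', 'd', 'e']
print(indices)   # [0, 0, 0, 1, 1]
```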
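
The formatting options of to_raw_examples can be sketched as follows. The documented behaviour covered here is that file_format=None calls str(x) on each example and that format_fn, when given, formats each example. Everything else is an assumption: flatten_and_format is a hypothetical name, the data is assumed to be a list of testcases that are themselves lists, 'jsonl' is assumed to mean one JSON document per example, and 'tsv' is assumed to join the fields of a tuple or list with tabs.

```python
import json
from typing import Any, Callable, List, Optional, Sequence


def flatten_and_format(
    data: Sequence[Sequence[Any]],
    file_format: Optional[str] = None,
    format_fn: Optional[Callable[[Any], str]] = None,
) -> List[str]:
    """Flatten all examples into one list of strings (hypothetical helper)."""
    if file_format not in (None, "jsonl", "tsv"):
        raise ValueError("file_format must be 'jsonl', 'tsv', or None")

    def default_format(x: Any) -> str:
        if file_format == "jsonl":
            return json.dumps(x)  # assumption: one JSON document per example
        if file_format == "tsv" and isinstance(x, (list, tuple)):
            return "\t".join(str(field) for field in x)  # assumption: tab-joined fields
        return str(x)  # documented behaviour for file_format=None

    # Assumption: a user-supplied format_fn takes precedence over file_format.
    fmt = format_fn if format_fn is not None else default_format
    return [fmt(x) for testcase in data for x in testcase]


print(flatten_and_format([["a", "b"], ["c"]]))  # ['a', 'b', 'c']
```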
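
summary accepts a format_example_fn with the arguments (x, pred, conf, label=None, meta=None). A minimal sketch of such a formatter is shown below. It assumes that conf is a single float and that the function returns the string to display. Neither is stated in the reference above, so adjust to whatever your predictions actually contain.

```python
from typing import Any, Optional


def format_example(
    x: Any,
    pred: Any,
    conf: float,
    label: Optional[Any] = None,
    meta: Optional[Any] = None,
) -> str:
    """Render one failed example as a single line (illustrative formatter)."""
    # Assumption: conf is a scalar confidence for the predicted class.
    line = f"{conf:.2f} pred={pred} | {x}"
    if label is not None:
        line += f" (label={label})"
    return line


print(format_example("bad movie", 1, 0.934, label=0))  # 0.93 pred=1 | bad movie (label=0)
```

A function of this shape would be passed as summary(format_example_fn=format_example).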