sciunit.scores package

Submodules

sciunit.scores.base module

Base class for SciUnit scores.

class sciunit.scores.base.ErrorScore(score, related_data=None)[source]

Bases: sciunit.scores.base.Score

A score returned when an error occurs during testing.

__module__ = 'sciunit.scores.base'
__str__()[source]

Return str(self).

_describe()[source]
property norm_score

A floating point version of the score used for sorting. If normalized = True, this must be in the range 0.0 to 1.0, where larger is better (used for sorting and coloring tables).

property summary

Summarize the performance of a model on a test.

class sciunit.scores.base.Score(score, related_data=None)[source]

Bases: sciunit.base.SciUnit

Abstract base class for scores.

__eq__(other)[source]

Return self==value.

__ge__(other)[source]

Return self>=value.

__gt__(other)[source]

Return self>value.

__hash__ = None
__init__(score, related_data=None)[source]

Abstract base class for scores.

Args:

score (int, float, bool): A raw value to wrap in a Score class.
related_data (dict, optional): Artifacts to store with the score.

__le__(other)[source]

Return self<=value.

__lt__(other)[source]

Return self<value.

__module__ = 'sciunit.scores.base'
__ne__(other)[source]

Return self!=value.

__repr__()[source]

Return repr(self).

__str__()[source]

Return str(self).

_allowed_types = None

List of allowed types for the score argument

_allowed_types_message = 'Score of type %s is not an instance of one of the allowed types: %s'

Error message when score argument is not one of these types

_best = None

The best possible score of this type

_check_score(score)[source]

A method for each Score subclass to impose additional constraints on the score, e.g. the range of the allowed score

_describe()[source]
_description = ''

A description of this score, i.e. how to interpret it. Provided in the score definition

_raw = None

A raw number arising in a test’s compute_score, used to determine this score. Can be set for reporting a raw value determined in Test.compute_score before any transformation, e.g. by a Converter

_worst = None

The worst possible score of this type

check_score(score)[source]
color(value=None)[source]

Turn the score into an RGB color tuple of three 8-bit integers.

classmethod compute(observation, prediction)[source]

Compute whether the observation equals the prediction.

describe(quiet=False)[source]
describe_from_docstring()[source]
description = ''

A description of this score, i.e. how to interpret it. For the user to set in bind_score

classmethod extract_mean_or_value(obs_or_pred, key=None)[source]

Extracts the mean, value, or user-provided key from an observation or prediction dictionary.

classmethod extract_means_or_values(observation, prediction, key=None)[source]

Extracts the mean, value, or user-provided key from the observation and prediction dictionaries.
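The extraction helpers above can be sketched as follows. This is an illustrative reimplementation of the documented behavior, not the library's actual code: given a dictionary such as `{'mean': 5.0, 'std': 1.0}`, return the `mean` entry; given `{'value': 5.0}`, return `value`; a user-provided key takes precedence.

```python
# Hypothetical sketch of extract_mean_or_value's documented behavior.
# The fallback order (key, then 'mean', then 'value') is an assumption.
def extract_mean_or_value(obs_or_pred, key=None):
    if not isinstance(obs_or_pred, dict):
        return obs_or_pred  # already a bare number
    for k in ([key] if key else []) + ['mean', 'value']:
        if k in obs_or_pred:
            return obs_or_pred[k]
    raise KeyError("No 'mean', 'value', or user-provided key found")
```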

get_raw()[source]
property log10_norm_score

The logarithm base 10 of the norm_score. This is useful for guaranteeing convexity in an error surface

property log2_norm_score

The logarithm base 2 of the norm_score. This is useful for guaranteeing convexity in an error surface

property log_norm_score

The natural logarithm of the norm_score. This is useful for guaranteeing convexity in an error surface

model = None

The model judged. Set automatically by Test.judge.

property norm_score

A floating point version of the score used for sorting. If normalized = True, this must be in the range 0.0 to 1.0, where larger is better (used for sorting and coloring tables).

property raw
related_data = None

Data specific to the result of a test run on a model.

score = None

The score itself.

property score_type
set_raw(raw)[source]
summarize()[source]
property summary

Summarize the performance of a model on a test.

test = None

The test taken. Set automatically by Test.judge.

classmethod value_color(value)[source]
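The subclassing pattern implied by the hooks above (`_allowed_types`, `_check_score`, `norm_score`) can be sketched with a toy stand-in class. This does not inherit from `sciunit.scores.base.Score`; it only mirrors the documented contract, and the class name and range check are illustrative.

```python
# A minimal sketch of the Score subclass pattern, assuming only the
# documented hooks. ToyScore is a hypothetical stand-in, not a real
# sciunit class.
class ToyScore:
    _allowed_types = (float,)
    _description = 'A toy score in [0, 1], where larger is better.'

    def __init__(self, score, related_data=None):
        self._check_score(score)
        self.score = score
        self.related_data = related_data or {}

    def _check_score(self, score):
        # Subclasses impose additional constraints, e.g. an allowed range.
        if not isinstance(score, self._allowed_types):
            raise TypeError('Score of type %s is not allowed' % type(score))
        if not 0.0 <= score <= 1.0:
            raise ValueError('Score must be in [0, 1]')

    @property
    def norm_score(self):
        # Already in [0, 1] with larger better, so return the score itself.
        return self.score
```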

sciunit.scores.collections module

SciUnit score collections, such as arrays and matrices.

These collections allow scores to be organized and visualized by model, test, or both.

class sciunit.scores.collections.ScoreArray(tests_or_models, scores=None, weights=None)[source]

Bases: pandas.core.series.Series, sciunit.base.SciUnit, sciunit.base.TestWeighted

Represents an array of scores derived from a test suite.

Extends the pandas Series such that items are either models subject to a test or tests taken by a model. Also displays and computes score summaries in sciunit-specific ways.

Can be used like this, assuming n tests and m models:

>>> sm[test]
(score_1, ..., score_m)
>>> sm[model]
(score_1, ..., score_n)
__getattr__(name)[source]

After regular attribute access, try looking up the name. This allows simpler access to columns for interactive use.

__getitem__(item)[source]
__init__(tests_or_models, scores=None, weights=None)[source]

Initialize self. See help(type(self)) for accurate signature.

__module__ = 'sciunit.scores.collections'
_data
_name
check_tests_and_models(tests_or_models)[source]
direct_attrs = ['score', 'norm_scores', 'related_data']
get_by_name(name)[source]
mean()[source]

Compute a total score for each model over all the tests.

Uses the norm_score attribute, since otherwise direct comparison across different kinds of scores would not be possible.

property norm_scores

Return the norm_score for each test.

stature(test_or_model)[source]

Compute the relative rank of a model on a test.

Rank is against other models that were asked to take the test.
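The relative-rank idea can be sketched as follows: order all models that took the test by `norm_score`, larger being better, and report a 1-based rank. The helper name and the tie-handling are illustrative, not the library's implementation.

```python
# Hypothetical sketch of stature: rank one model's norm_score against
# those of all models that took the same test (1 = best).
def stature(norm_scores, model_name):
    ordered = sorted(norm_scores, key=norm_scores.get, reverse=True)
    return ordered.index(model_name) + 1
```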

class sciunit.scores.collections.ScoreMatrix(tests, models, scores=None, weights=None, transpose=False)[source]

Bases: pandas.core.frame.DataFrame, sciunit.base.SciUnit, sciunit.base.TestWeighted

Represents a matrix of scores derived from a test suite. Extends the pandas DataFrame such that tests are columns and models are the index. Also displays and computes score summaries in sciunit-specific ways.

Can be used like this, assuming n tests and m models:

>>> sm[test]
(score_1, ..., score_m)
>>> sm[model]
(score_1, ..., score_n)
property T

Get transpose of this ScoreMatrix.

__getattr__(name)[source]

After regular attribute access, try looking up the name. This allows simpler access to columns for interactive use.

__getitem__(item)[source]
__init__(tests, models, scores=None, weights=None, transpose=False)[source]

Initialize self. See help(type(self)) for accurate signature.

__module__ = 'sciunit.scores.collections'
annotate(df, html, show_mean, colorize)[source]
annotate_body(soup, df, show_mean)[source]
annotate_body_cell(cell, df, show_mean, i, j)[source]
annotate_header_cell(cell, df, show_mean, i, j)[source]
annotate_headers(soup, df, show_mean)[source]
annotate_mean(cell, df, i)[source]
check_tests_models_scores(tests, models, scores)[source]
direct_attrs = ['score', 'norm_scores', 'related_data']
dynamify(table_id)[source]
get_by_name(name)[source]
get_group(x)[source]
get_model(model)[source]
get_test(test)[source]
property norm_scores
show_mean = False
sortable = False
stature(test, model)[source]

Computes the relative rank of a model on a test compared to other models that were asked to take the test.

to_html(show_mean=None, sortable=None, colorize=True, *args, **kwargs)[source]

Extend the pandas built-in to_html method for rendering a DataFrame, and use it to render a ScoreMatrix.

sciunit.scores.collections_m2m module

Score collections for direct comparison of models against other models.

class sciunit.scores.collections_m2m.ScoreArrayM2M(test, models, scores)[source]

Bases: pandas.core.series.Series

Represents an array of scores derived from TestM2M. Extends the pandas Series such that items are either models subject to a test or the test itself.

__getattr__(name)[source]

After regular attribute access, try looking up the name. This allows simpler access to columns for interactive use.

__getitem__(item)[source]
__init__(test, models, scores)[source]

Initialize self. See help(type(self)) for accurate signature.

__module__ = 'sciunit.scores.collections_m2m'
_data
_name
get_by_name(name)[source]
property norm_scores
class sciunit.scores.collections_m2m.ScoreMatrixM2M(test, models, scores)[source]

Bases: pandas.core.frame.DataFrame

Represents a matrix of scores derived from TestM2M. Extends the pandas DataFrame such that models (and the observation) serve as both the columns and the index.

__getattr__(name)[source]

After regular attribute access, try looking up the name. This allows simpler access to columns for interactive use.

__getitem__(item)[source]
__init__(test, models, scores)[source]

Initialize self. See help(type(self)) for accurate signature.

__module__ = 'sciunit.scores.collections_m2m'
get_by_name(name)[source]
get_group(x)[source]
property norm_scores

sciunit.scores.complete module

Score types for tests that completed successfully.

These include various representations of goodness-of-fit.

class sciunit.scores.complete.BooleanScore(score, related_data=None)[source]

Bases: sciunit.scores.base.Score

A boolean score, which must be True or False.

__module__ = 'sciunit.scores.complete'
__str__()[source]

Return str(self).

_allowed_types = (<class 'bool'>,)
_description = 'True if the observation and prediction were sufficiently similar; False otherwise'
classmethod compute(observation, prediction)[source]

Compute whether the observation equals the prediction.

property norm_score

Return 1.0 for a True score and 0.0 for a False one.
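The documented BooleanScore behavior is simple enough to sketch directly: `compute` checks whether the observation equals the prediction, and `norm_score` maps True to 1.0 and False to 0.0. These are standalone illustrative functions, not the class's methods.

```python
# Sketch of the documented BooleanScore semantics.
def boolean_compute(observation, prediction) -> bool:
    # True if the observation equals the prediction.
    return observation == prediction

def boolean_norm_score(score: bool) -> float:
    # True -> 1.0, False -> 0.0, per the norm_score docstring.
    return 1.0 if score else 0.0
```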

class sciunit.scores.complete.CohenDScore(score, related_data=None)[source]

Bases: sciunit.scores.complete.ZScore

A Cohen’s D score.

A float indicating difference between two means normalized by the pooled standard deviation.

__module__ = 'sciunit.scores.complete'
__str__()[source]

Return str(self).

_description = "The Cohen's D between the prediction and the observation"
classmethod compute(observation, prediction)[source]

Compute a Cohen’s D from an observation and a prediction.
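Per the class description, the score is the difference between two means normalized by the pooled standard deviation. A sketch consistent with that description, assuming equal sample sizes for the pooling formula (the library's exact pooling may differ):

```python
import math

# Illustrative Cohen's D: difference of means over the pooled standard
# deviation, with the equal-sample-size pooling formula assumed.
def cohen_d(obs_mean, obs_std, pred_mean, pred_std):
    pooled_std = math.sqrt((obs_std ** 2 + pred_std ** 2) / 2.0)
    return (pred_mean - obs_mean) / pooled_std
```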

class sciunit.scores.complete.FloatScore(score, related_data=None)[source]

Bases: sciunit.scores.base.Score

A float score.

A float with any value.

__module__ = 'sciunit.scores.complete'
__str__()[source]

Return str(self).

_allowed_types = (<class 'float'>, <class 'quantities.quantity.Quantity'>)
_check_score(score)[source]

A method for each Score subclass to impose additional constraints on the score, e.g. the range of the allowed score

_description = 'There is no canonical mapping between this score type and a measure of agreement between the observation and the prediction'
classmethod compute_ssd(observation, prediction)[source]

Compute sum-squared diff between observation and prediction.
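The sum-squared difference described above can be sketched for plain sequences of observed and predicted values (the real method also accepts quantities with units):

```python
# Sketch of a sum-squared difference between paired observed and
# predicted values.
def compute_ssd(observation, prediction):
    return sum((o - p) ** 2 for o, p in zip(observation, prediction))
```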

class sciunit.scores.complete.PercentScore(score, related_data=None)[source]

Bases: sciunit.scores.base.Score

A percent score.

A float in the range [0.0, 100.0], where higher is better.

__module__ = 'sciunit.scores.complete'
__str__()[source]

Return str(self).

_check_score(score)[source]

A method for each Score subclass to impose additional constraints on the score, e.g. the range of the allowed score

_description = '100.0 is considered perfect agreement between the observation and the prediction. 0.0 is the worst possible agreement'
property norm_score

Return 1.0 for a percent score of 100, and 0.0 for 0.
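The documented mapping is a linear rescale: a percent in [0.0, 100.0] normalizes to [0.0, 1.0]. A sketch (the range check mirrors the constraint implied by `_check_score`):

```python
# Sketch of the documented PercentScore normalization.
def percent_norm_score(percent: float) -> float:
    if not 0.0 <= percent <= 100.0:
        raise ValueError('Percent score must be in [0.0, 100.0]')
    return percent / 100.0
```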

class sciunit.scores.complete.RandomScore(score, related_data=None)[source]

Bases: sciunit.scores.base.Score

A random score in [0,1].

This has no scientific value and should only be used for debugging purposes. For example, one might assign a random score under some error condition to move forward with an application that requires a numeric score, and use the presence of a RandomScore in the output as an indication of an internal error.

__module__ = 'sciunit.scores.complete'
__str__()[source]

Return str(self).

_allowed_types = (<class 'float'>,)
_description = 'A random number in [0,1] with no relation to the prediction or the observation'
class sciunit.scores.complete.RatioScore(score, related_data=None)[source]

Bases: sciunit.scores.base.Score

A ratio of two numbers.

Usually the prediction divided by the observation.

__module__ = 'sciunit.scores.complete'
__str__()[source]

Return str(self).

_allowed_types = (<class 'float'>,)
_best = 1.0
_check_score(score)[source]

A method for each Score subclass to impose additional constraints on the score, e.g. the range of the allowed score

_description = 'The ratio between the prediction and the observation'
classmethod compute(observation, prediction, key=None)[source]

Compute a ratio from an observation and a prediction.

property norm_score

Return 1.0 for a ratio of 1, falling to 0.0 for extremely small or large values.
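One mapping with the documented properties (equals 1.0 at a ratio of 1, falls toward 0.0 for very small or very large ratios) is shown below. This is an illustrative choice; the library's exact formula may differ.

```python
# Illustrative ratio normalization: symmetric in ratio and 1/ratio,
# peaking at 1.0 when the ratio is exactly 1.
def ratio_norm_score(ratio: float) -> float:
    if ratio <= 0:
        raise ValueError('Ratio must be positive')
    return 2.0 / (ratio + 1.0 / ratio)
```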

class sciunit.scores.complete.ZScore(score, related_data=None)[source]

Bases: sciunit.scores.base.Score

A Z score.

A float indicating standardized difference from a reference mean.

__module__ = 'sciunit.scores.complete'
__str__()[source]

Return str(self).

_allowed_types = (<class 'float'>,)
_best = 0.0
_description = 'The difference between the means of the observation and prediction divided by the standard deviation of the observation'
_worst = inf
classmethod compute(observation, prediction)[source]

Compute a z-score from an observation and a prediction.

property norm_score

Return the normalized score.

Equals 1.0 for a z-score of 0, falling to 0.0 for extremely positive or negative values.
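Per the `_description` above, z is the difference between the prediction's mean and the observation's mean, divided by the observation's standard deviation. The normalization shown here, a Gaussian-shaped function of z, is one illustrative mapping with the documented shape (1.0 at z = 0, falling to 0.0 for extreme z); the library's exact mapping may differ.

```python
import math

# z-score per the documented formula: standardized difference of means.
def z_score(observation, prediction):
    return (prediction['mean'] - observation['mean']) / observation['std']

# One Gaussian-shaped normalization with the documented shape (assumed).
def z_norm_score(z: float) -> float:
    return math.exp(-z ** 2 / 2.0)
```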

sciunit.scores.incomplete module

Score types for tests that did not complete successfully.

These include details about the various possible reasons that a particular combination of model and test could not be completed.

class sciunit.scores.incomplete.InsufficientDataScore(score, related_data=None)[source]

Bases: sciunit.scores.incomplete.NoneScore

A score returned when the model or test data is insufficient to score the test.

__module__ = 'sciunit.scores.incomplete'
description = 'Insufficient Data'
class sciunit.scores.incomplete.NAScore(score, related_data=None)[source]

Bases: sciunit.scores.incomplete.NoneScore

A N/A (not applicable) score.

Indicates that the model doesn’t have the capabilities that the test requires.

__module__ = 'sciunit.scores.incomplete'
description = 'N/A'
class sciunit.scores.incomplete.NoneScore(score, related_data=None)[source]

Bases: sciunit.scores.base.Score

A None score.

Usually indicates that the model has not been checked to see if it has the capabilities required by the test.

__init__(score, related_data=None)[source]

Abstract base class for scores.

Args:

score (int, float, bool): A raw value to wrap in a Score class.
related_data (dict, optional): Artifacts to store with the score.

__module__ = 'sciunit.scores.incomplete'
__str__()[source]

Return str(self).

property norm_score

A floating point version of the score used for sorting. If normalized = True, this must be in the range 0.0 to 1.0, where larger is better (used for sorting and coloring tables).

class sciunit.scores.incomplete.TBDScore(score, related_data=None)[source]

Bases: sciunit.scores.incomplete.NoneScore

A TBD (to be determined) score. Indicates that the model has capabilities required by the test but has not yet taken it.

__module__ = 'sciunit.scores.incomplete'
description = 'None'

Module contents

Contains classes for different representations of test scores.

It also contains score collections such as arrays and matrices.