class documentation

class TestHistogramOptimBinNums:


Provide test coverage for the supplied estimators of the optimal number of bins.
Method test_empty Undocumented
Method test_incorrect_methods Check that a ValueError is raised when an unknown string is passed in
Method test_limited_variance Check that when the IQR is 0 but the variance is nonzero, the Sturges value is returned rather than the FD value
Method test_novariance Check that the methods handle data with no variance; primarily for Scott and FD, as the SD and IQR are both 0 in this case
Method test_outlier Check FD, Scott and Doane with outliers
Method test_scott_vs_stone Verify that Scott's rule and Stone's rule converge for normally distributed data
Method test_signed_integer_data Undocumented
Method test_simple Straightforward testing with a mixture of linspace data (for consistency); all test values have been precomputed and shouldn't change
Method test_simple_range Straightforward testing with a mixture of linspace data (for consistency), adding a third mixture that is then completely ignored; all test values have been precomputed and shouldn't change
Method test_simple_weighted Check that weighted data raises a TypeError
Method test_small Smaller datasets can cause issues with the data-adaptive methods, especially the FD method; all bin numbers have been precalculated
def test_empty(self):

Undocumented

def test_incorrect_methods(self):
Check that a ValueError is raised when an unknown string is passed in.
def test_limited_variance(self):
Check that when the IQR is 0 but the variance is nonzero, the Sturges value is returned rather than the FD value.
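A minimal sketch of the behaviour being tested (not the test itself), using np.histogram_bin_edges on mostly-constant data; the exact values chosen here are an illustrative assumption:

```python
import numpy as np

# Mostly-constant data: the IQR is 0, but the few extreme
# values give the data a nonzero variance.
data = np.ones(1000)
data[:3] = 0
data[-4:] = 100

n_auto = len(np.histogram_bin_edges(data, bins="auto")) - 1
n_sturges = len(np.histogram_bin_edges(data, bins="sturges")) - 1
n_fd = len(np.histogram_bin_edges(data, bins="fd")) - 1

# With a zero IQR the FD bin width is 0, so 'auto' falls
# back to the Sturges estimate instead of the FD estimate.
print(n_auto, n_sturges, n_fd)
```

Here FD alone collapses to a single bin, while 'auto' matches the Sturges count.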
def test_novariance(self):
Check that methods handle no variance in data Primarily for Scott and FD as the SD and IQR are both 0 in this case
def test_outlier(self):

Check the FD, Scott and Doane with outliers.

The FD estimator chooses a smaller bin width since it is less affected by outliers. Because the range is so (artificially) large, this means more bins, most of which will be empty, but the bins covering the data of interest are usually unaffected. The Scott estimator is more affected and returns fewer bins, despite most of the variance being in one region of the data. The Doane estimator lies somewhere between the other two.
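The contrast described above can be sketched as follows; the data set (a tight normal bulk plus one far outlier) is an illustrative assumption, not the data used by the test:

```python
import numpy as np

rng = np.random.default_rng(0)
# Tight normal bulk plus a single far outlier that inflates the range.
data = np.concatenate([rng.normal(0.0, 1.0, 1000), [1000.0]])

n_fd = len(np.histogram_bin_edges(data, bins="fd")) - 1
n_scott = len(np.histogram_bin_edges(data, bins="scott")) - 1
n_doane = len(np.histogram_bin_edges(data, bins="doane")) - 1

# FD's width comes from the IQR, which ignores the outlier, so the
# width stays small and the huge range yields many (mostly empty) bins.
# Scott's width is inflated by the outlier's effect on the standard
# deviation, so it returns far fewer bins.
print(n_fd, n_scott, n_doane)
```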

def test_scott_vs_stone(self):
Verify that Scott's rule and Stone's rule converge for normally distributed data.
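A rough sketch of the comparison (the sample size and seed are assumptions for illustration): for large normally distributed samples the two rules should give similar bin counts.

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.normal(0.0, 1.0, 10_000)

# Scott's rule is closed-form; Stone's rule minimizes a
# cross-validation risk estimate over candidate bin counts.
n_scott = len(np.histogram_bin_edges(data, bins="scott")) - 1
n_stone = len(np.histogram_bin_edges(data, bins="stone")) - 1
print(n_scott, n_stone)
```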
@pytest.mark.parametrize('bins', ['auto', 'fd', 'doane', 'scott', 'stone', 'rice', 'sturges'])
def test_signed_integer_data(self, bins):

Undocumented

def test_simple(self):
Straightforward testing with a mixture of linspace data (for consistency). All test values have been precomputed and shouldn't change.
def test_simple_range(self):
Straightforward testing with a mixture of linspace data (for consistency), adding a third mixture that is then completely ignored. All test values have been precomputed and shouldn't change.
def test_simple_weighted(self):
Check that weighted data raises a TypeError
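A minimal sketch of the rejected combination; the particular data and weights are illustrative assumptions:

```python
import numpy as np

data = np.arange(10)
weights = np.full(10, 0.5)

# String bin estimators do not account for weights, so NumPy rejects
# the combination with a TypeError rather than returning a misleading
# bin count.
try:
    np.histogram(data, bins="auto", weights=weights)
    raised = False
except TypeError:
    raised = True
print(raised)
```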
def test_small(self):
Smaller datasets can cause issues with the data-adaptive methods, especially the FD method. All bin numbers have been precalculated.