DataSet

class minkit.DataSet(data, pars, weights=None)[source]

Bases: minkit.DataObject

Definition of an unbinned data set to evaluate PDFs.

Parameters:

Attributes Summary

aop Object to do operations on arrays.
backend Backend interface.
data_pars Data parameters associated to this sample.
ndim Number of dimensions.
values Values of the data set.
weights Weights of the sample.

Methods Summary

from_ndarray(arr, arg[, weights, backend]) Build the class from a single array.
from_records(arr, data_pars[, weights, backend]) Build the class from a numpy.ndarray object.
get(index) Get the values given an index.
make_binned([bins]) Make a binned version of this sample.
merge(samples[, maximum]) Merge many samples into one.
subset(arg[, rescale_weights]) Get a subset of this data set.
to_backend(backend) Initialize this class in a different backend.
to_records() Convert this class into a numpy.ndarray object.

Attributes Documentation

aop

Object to do operations on arrays.

Type:ArrayOperations
backend

Backend interface.

Type:Backend
data_pars

Data parameters associated to this sample.

Type:Registry(Parameter)
ndim

Number of dimensions.

Type:int
values

Values of the data set.

Type:darray
weights

Weights of the sample.

Type:darray or None

Methods Documentation

classmethod from_ndarray(arr, arg, weights=None, backend=None)[source]

Build the class from a single array.

Parameters:
  • arr (numpy.ndarray) – array of data.
  • arg (Registry(Parameter)) – if arr only contains one set of values, it must be a single data parameter. Otherwise a collection of parameters.
  • weights (numpy.ndarray or None) – possible weights to use.
  • backend (Backend or None) – backend where the data set is built.
classmethod from_records(arr, data_pars, weights=None, backend=None)[source]

Build the class from a numpy.ndarray object.

Parameters:
Raises:

RuntimeError – If a parameter is not found in the input array.
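
As an illustration of the kind of input involved, the sketch below builds a NumPy structured array and checks that the requested names exist as fields. It assumes, without confirmation from this page, that the fields of arr are matched to data parameters by name. The helper select_fields is hypothetical and is not part of minkit.

```python
import numpy as np

# A structured array with one named field per data parameter.
records = np.array(
    [(0.1, 1.5), (0.4, 2.5), (0.9, 3.5)],
    dtype=[("x", "f8"), ("y", "f8")],
)


def select_fields(arr, names):
    """Hypothetical helper: return one column per name, or raise
    RuntimeError if a name is not a field of the input array."""
    missing = [n for n in names if n not in arr.dtype.names]
    if missing:
        raise RuntimeError(f"Parameters not found in the input array: {missing}")
    return {n: arr[n] for n in names}


columns = select_fields(records, ["x", "y"])
```

Asking for a name that is not a field, such as "z", raises RuntimeError in this sketch, mirroring the error documented above.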

get(index)[source]

Get the values given an index.

Parameters:index (int) – index to process.
Returns:Values at the index.
Return type:numpy.ndarray
make_binned(bins=100)[source]

Make a binned version of this sample.

Parameters:bins (int or tuple(int, ..)) – number of bins per data parameter.
Returns:Binned data sample.
Return type:BinnedDataSet
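
To make the meaning of a per-parameter bins argument concrete, the following NumPy-only sketch bins a two-dimensional sample with a different number of bins along each dimension. It uses numpy.histogramdd as a stand-in and is not a description of how make_binned is implemented; the sample and the ranges are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# 1000 entries with two data parameters, both uniform in [0, 1).
sample = rng.uniform(0.0, 1.0, size=(1000, 2))

# One bin count per data parameter: 4 bins for the first, 2 for the second.
counts, edges = np.histogramdd(
    sample, bins=(4, 2), range=[(0.0, 1.0), (0.0, 1.0)]
)
```

Here counts has shape (4, 2), and each element of edges holds the bin boundaries for one dimension.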
classmethod merge(samples, maximum=None)[source]

Merge many samples into one. If maximum is specified and the merged sample exceeds it, the last elements are dropped.

Parameters:
  • samples (tuple(DataSet)) – samples to merge.
  • maximum (int or None) – maximum number of entries for the final sample.
Returns:

Merged sample.

Return type:

DataSet

Raises:

RuntimeError – If the samples have different parameters or if only some of them have weights.

Warning: If maximum is specified, the last elements corresponding to the last samples might be dropped.
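
The effect of maximum can be pictured with plain arrays. The sketch below concatenates the inputs in order and truncates from the end. It is a simplified stand-in and not minkit's implementation: merge_arrays is a hypothetical helper, and the checks on parameters and weights described above are omitted.

```python
import numpy as np


def merge_arrays(samples, maximum=None):
    """Hypothetical helper: concatenate arrays in order and keep at most
    `maximum` entries, dropping from the end."""
    merged = np.concatenate(samples)
    if maximum is not None:
        merged = merged[:maximum]
    return merged


a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0])

merged = merge_arrays([a, b], maximum=4)  # → array([1., 2., 3., 4.])
```

Only entries from the last input are lost here, which is the situation the warning describes.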
subset(arg, rescale_weights=False)[source]

Get a subset of this data set. If arg is a string, it is interpreted as a range; if it is a barray, it is interpreted as a mask array. If rescale_weights is True, the weights are rescaled so that their statistical weight in minimization processes is proportional to the event weights:

\[\omega^\prime_i = \omega_i \times \frac{\sum_{j = 0}^n \omega_j}{\sum_{j = 0}^n \omega_j^2}\]
Parameters:
  • arg (str or barray) – argument to create the subset.
  • rescale_weights (bool) – whether to rescale the sample weights.
Returns:

New data set.

Return type:

DataSet
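
The rescaling formula above can be checked numerically with NumPy alone. The sketch below applies the factor sum(w) / sum(w^2) to an array of weights; rescale is a hypothetical helper written for this example and is not the library's code.

```python
import numpy as np


def rescale(weights):
    """Hypothetical helper: multiply each weight by sum(w) / sum(w**2)."""
    w = np.asarray(weights, dtype=float)
    return w * w.sum() / (w ** 2).sum()


w = np.array([0.5, 1.0, 2.0, 0.5])
w_new = rescale(w)

# After rescaling, the sum of the weights equals the sum of their squares.
same_sums = bool(np.isclose(w_new.sum(), (w_new ** 2).sum()))
```

A sample of unit weights is left unchanged by this factor, since both sums are then equal to the number of entries.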

to_backend(backend)[source]

Initialize this class in a different backend.

Parameters:backend (Backend) – new backend.
Returns:This class in the new backend.
to_records()[source]

Convert this class into a numpy.ndarray object.

Returns:This object as a numpy.ndarray object.
Return type:numpy.ndarray