ADLStream.data.ClassificationStreamGenerator
Classification stream generator.
This class is used for generating streams for classification problems.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
stream | inherits ADLStream.data.stream.BaseStream | Stream source to be fed to the ADLStream framework. | required |
label_index | int or list | The column index/indices of the target label. Defaults to -1. | [-1] |
one_hot_labels | list or None | Possible label values if one-hot encoding must be done. If None, the target value is not one-hot encoded. Defaults to None. | None |
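The effect of the two optional parameters can be sketched with plain Python lists, without ADLStream or scikit-learn. The helper `split_message` is hypothetical and only mimics the behaviour described here. It assumes a single label column when one-hot encoding is requested, and it assumes the one-hot columns follow the sorted order of the label values, which is how scikit-learn's `OneHotEncoder` orders the categories it learns during `fit`.

```python
def split_message(message, label_index=(-1,), one_hot_labels=None):
    """Hypothetical stand-in for the generator's preprocessing step."""
    x = list(message)  # work on a copy so the caller's list is left untouched
    y = [x.pop(i) for i in label_index]
    if one_hot_labels:
        # Assumption: one column per label value, in sorted order.
        categories = sorted(set(one_hot_labels))
        y = [1.0 if y[0] == c else 0.0 for c in categories]
    return x, y


# Raw label: the last column becomes the target.
print(split_message([5.1, 3.5, "setosa"]))

# One-hot label over three possible classes.
classes = ["setosa", "versicolor", "virginica"]
print(split_message([5.1, 3.5, "setosa"], one_hot_labels=classes))
```

With `one_hot_labels` left as None the target keeps its raw value; otherwise it becomes a vector with one position per possible label.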
Source code in ADLStream/data/classification_generator.py
class ClassificationStreamGenerator(BaseStreamGenerator):
    """Classification stream generator.

    This class is used for generating streams for classification problems.

    Arguments:
        stream (inherits ADLStream.data.stream.BaseStream):
            Stream source to be feed to the ADLStream framework.
        label_index (int or list, optional): The column index/indices of the target
            label.
            Defaults to -1.
        one_hot_labels (list or None, optional): Possible label values if one-hot
            encoding must be done. If None, the target value is not one-hot encoded.
            Defaults to None.
    """

    def __init__(self, stream, label_index=[-1], one_hot_labels=None, **kwargs):
        super().__init__(stream, **kwargs)
        self.label_index = label_index if type(label_index) is list else [label_index]
        self.labels = one_hot_labels
        self.one_hot_encoder = None
        if self.labels:
            self.one_hot_encoder = OneHotEncoder()
            self.one_hot_encoder.fit(np.asarray(self.labels).reshape(-1, 1))

    def preprocess(self, message):
        x = message
        y = [message.pop(i) for i in self.label_index]
        if self.labels:
            y = self.one_hot_encoder.transform([y]).toarray()
            y = list(y[0])
        return x, y
preprocess(self, message)
The function that contains the logic to transform a stream message into model input and target data (x, y).
Both outputs, x and y, can be None, which means they should not be added to the context.
The target data y can be delayed. Although x and y are sent at the same time, that does not mean y is the target value corresponding to x. However, input data and target data must be sent in order: y_i is the target value of x_i, so the first target sent (y_0) corresponds to the first input sent (x_0).
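The ordering rule for delayed targets can be illustrated with a small standalone sketch. The `consume` function and the two containers below are hypothetical and are not part of ADLStream; they only show how inputs and targets that arrive at different times can still be matched by arrival order.

```python
from collections import deque

inputs = deque()  # inputs still waiting for their target
pairs = []        # matched (x_i, y_i) pairs


def consume(x, y):
    """Hypothetical consumer that pairs inputs and targets by arrival order."""
    if x is not None:
        inputs.append(x)
    if y is not None:
        # The oldest unmatched input is the one this target belongs to.
        pairs.append((inputs.popleft(), y))


# Targets arrive one step late: the y sent with x_1 belongs to x_0.
consume([0.1, 0.2], None)   # x_0, no target available yet
consume([0.3, 0.4], ["a"])  # x_1 sent together with y_0
consume(None, ["b"])        # y_1 alone, no new input

print(pairs)
```

Each target is matched to the oldest input that has not yet received one, so the i-th target always lands on the i-th input regardless of when it was sent.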
Parameters:
Name | Type | Description | Default |
---|---|---|---|
message | list | Message received from the stream. | required |
Exceptions:
Type | Description |
---|---|
NotImplementedError | This is an abstract method which should be implemented. |
Returns:
Type | Description |
---|---|
x (list) | Instance of model's input data. |
y (list) | Instance of model's target data. |
Source code in ADLStream/data/classification_generator.py
def preprocess(self, message):
    x = message
    y = [message.pop(i) for i in self.label_index]
    if self.labels:
        y = self.one_hot_encoder.transform([y]).toarray()
        y = list(y[0])
    return x, y
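As the listing above is written, each index in `label_index` is popped from a list that has already been shortened by the previous pops, so positive indices after the first one refer to shifted positions. The sketch below reproduces only that list comprehension on plain lists; `pop_labels` is a hypothetical helper, not an ADLStream function.

```python
def pop_labels(message, label_index):
    """Hypothetical helper reproducing the sequential pops on a copy."""
    x = list(message)
    y = [x.pop(i) for i in label_index]
    return x, y


# Negative indices counted from the end behave as expected here.
print(pop_labels(["f0", "f1", "t0", "t1"], [-2, -1]))

# Leading labels: after pop(0) the second label has moved to position 0,
# so index 1 removes a feature instead of the second label.
print(pop_labels(["t0", "t1", "f0", "f1"], [0, 1]))

# Accounting for the shift selects both leading labels.
print(pop_labels(["t0", "t1", "f0", "f1"], [0, 0]))
```

The listing also assigns `x = message` without copying, so the message passed to `preprocess` is itself modified by the pops.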