Deep Learning Fundamentals
Lesson 3: Model Evaluation and Hyperparameter Tuning
Objective: In this lesson, you'll learn how to evaluate your deep learning models, tune their hyperparameters, and apply techniques that improve performance.
1. Model Evaluation Metrics
Evaluating model performance involves using various metrics depending on the type of problem (classification or regression).
a. Classification Metrics:
- Accuracy: The proportion of correctly classified instances.
  from sklearn.metrics import accuracy_score
  accuracy = accuracy_score(y_true, y_pred)
- Precision, Recall, and F1-Score: Metrics that show how well the model performs on each class.
  from sklearn.metrics import precision_score, recall_score, f1_score
  precision = precision_score(y_true, y_pred, average='weighted')
  recall = recall_score(y_true, y_pred, average='weighted')
  f1 = f1_score(y_true, y_pred, average='weighted')
- Confusion Matrix: A matrix showing true classes against predicted classes.
  from sklearn.metrics import confusion_matrix
  conf_matrix = confusion_matrix(y_true, y_pred)
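For a per-class view that combines precision, recall, and F1 in one table, scikit-learn's classification_report is convenient. The labels below are purely illustrative:
  from sklearn.metrics import classification_report, confusion_matrix

  y_true = [0, 1, 1, 2, 2, 2]   # illustrative ground-truth labels
  y_pred = [0, 1, 2, 2, 2, 1]   # illustrative predictions
  print(confusion_matrix(y_true, y_pred))       # rows = true class, columns = predicted class
  print(classification_report(y_true, y_pred))  # precision, recall, and F1 per class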
b. Regression Metrics:
- Mean Absolute Error (MAE): The average of the absolute errors.
  from sklearn.metrics import mean_absolute_error
  mae = mean_absolute_error(y_true, y_pred)
- Mean Squared Error (MSE) and Root Mean Squared Error (RMSE): The average squared error; RMSE is its square root, expressed in the same units as the target.
  from sklearn.metrics import mean_squared_error
  mse = mean_squared_error(y_true, y_pred)
  rmse = mse ** 0.5
- R-squared (R²): The proportion of variance in the target explained by the model.
  from sklearn.metrics import r2_score
  r2 = r2_score(y_true, y_pred)
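A tiny worked example with made-up numbers shows how the three metrics relate:
  from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

  y_true = [3.0, 5.0, 2.5]   # illustrative targets
  y_pred = [2.5, 5.0, 4.0]   # illustrative predictions
  mae = mean_absolute_error(y_true, y_pred)   # (0.5 + 0.0 + 1.5) / 3 ≈ 0.67
  mse = mean_squared_error(y_true, y_pred)    # (0.25 + 0.0 + 2.25) / 3 ≈ 0.83
  rmse = mse ** 0.5                           # ≈ 0.91, in the same units as the target
  r2 = r2_score(y_true, y_pred)               # 1 - 2.5 / 3.5 ≈ 0.29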
2. Hyperparameter Tuning
Hyperparameters are settings chosen before training rather than learned from the data, and they control how the model learns. Tuning them can significantly impact the model's effectiveness.
a. Common Hyperparameters:
- Learning Rate: Determines the size of the steps during optimization.
- Batch Size: The number of samples processed before updating the model.
- Number of Epochs: The number of times the model is trained on the entire dataset.
- Number of Layers and Units: The architecture of the network, including the number of hidden layers and neurons.
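In Keras, these hyperparameters live in the optimizer, the architecture, and the fit call. A minimal sketch with illustrative values, assuming X_train and y_train are already defined:
  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import Dense
  from tensorflow.keras.optimizers import Adam

  model = Sequential([
      Dense(64, activation='relu', input_shape=(20,)),   # number of layers and units
      Dense(1, activation='sigmoid')
  ])
  model.compile(optimizer=Adam(learning_rate=0.001),     # learning rate
                loss='binary_crossentropy', metrics=['accuracy'])
  model.fit(X_train, y_train, batch_size=32, epochs=20)  # batch size and number of epochs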
b. Techniques for Hyperparameter Tuning:
- Grid Search: Exhaustively evaluates every combination in a predefined grid of hyperparameters.
  from sklearn.model_selection import GridSearchCV
  param_grid = {'learning_rate': [0.001, 0.01, 0.1], 'batch_size': [16, 32, 64]}
  grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=3)
  grid_search.fit(X_train, y_train)
- Random Search: Randomly samples combinations from a range of hyperparameters.
  from sklearn.model_selection import RandomizedSearchCV
  param_dist = {'learning_rate': [0.001, 0.01, 0.1], 'batch_size': [16, 32, 64]}
  random_search = RandomizedSearchCV(estimator=model, param_distributions=param_dist, n_iter=10, cv=3)
  random_search.fit(X_train, y_train)
- Bayesian Optimization: Builds a probabilistic model of the objective to choose promising hyperparameters.
  from skopt import BayesSearchCV
  param_space = {'learning_rate': (1e-5, 1e-1, 'log-uniform'), 'batch_size': (16, 128)}
  bayes_search = BayesSearchCV(estimator=model, search_spaces=param_space, n_iter=50, cv=3)
  bayes_search.fit(X_train, y_train)
Note: these scikit-learn-style searches expect a scikit-learn-compatible estimator, so a plain Keras model must first be wrapped (see the exercise in Section 4).
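Whichever search is used, the fitted object exposes the winning configuration and its cross-validated score (shown here for grid_search; random_search and bayes_search behave the same):
  print(grid_search.best_params_)   # e.g. {'batch_size': 32, 'learning_rate': 0.01}
  print(grid_search.best_score_)    # mean cross-validated score of the best configuration
  best_model = grid_search.best_estimator_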
3. Techniques for Improving Model Performance
a. Regularization:
- L1 and L2 Regularization: Adds a penalty on the layer weights to the loss function to discourage overfitting.
  from tensorflow.keras.layers import Dense
  from tensorflow.keras.regularizers import l2
  model.add(Dense(64, activation='relu', kernel_regularizer=l2(0.01)))
- Dropout: Randomly drops neurons during training to prevent overfitting. Both techniques are often combined, as in the sketch below.
  from tensorflow.keras.layers import Dropout
  model.add(Dropout(0.5))
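A small sketch of a model that uses both (layer sizes are illustrative):
  from tensorflow.keras.models import Sequential
  from tensorflow.keras.layers import Dense, Dropout
  from tensorflow.keras.regularizers import l2

  model = Sequential([
      Dense(128, activation='relu', input_shape=(20,), kernel_regularizer=l2(0.01)),
      Dropout(0.5),   # half of the activations are randomly zeroed each training step
      Dense(64, activation='relu', kernel_regularizer=l2(0.01)),
      Dropout(0.3),
      Dense(1, activation='sigmoid')
  ])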
b. Data Augmentation:
- Image Augmentation: Applies transformations to images to artificially increase the dataset size.
  from tensorflow.keras.preprocessing.image import ImageDataGenerator
  datagen = ImageDataGenerator(rotation_range=40, width_shift_range=0.2, height_shift_range=0.2)
  datagen.fit(X_train)
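The augmented batches are usually fed to training through datagen.flow; here X_val and y_val are assumed to be a held-out validation split:
  model.fit(datagen.flow(X_train, y_train, batch_size=32),
            epochs=10,
            validation_data=(X_val, y_val))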
c. Learning Rate Scheduling:
- Adjusting Learning Rate: Reducing the learning rate when validation loss stops improving can help the model fine-tune its weights.
  from tensorflow.keras.callbacks import ReduceLROnPlateau
  lr_scheduler = ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=3)
  # validation data is required because the callback monitors 'val_loss'
  model.fit(X_train, y_train, validation_split=0.2, callbacks=[lr_scheduler])
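Fixed schedules are another option. A minimal sketch using Keras's LearningRateScheduler callback with an assumed step-decay rule (halving every 10 epochs is an arbitrary illustrative choice):
  from tensorflow.keras.callbacks import LearningRateScheduler

  def step_decay(epoch, lr):
      # Halve the learning rate every 10 epochs; otherwise keep it unchanged.
      if epoch > 0 and epoch % 10 == 0:
          return lr * 0.5
      return lr

  scheduler = LearningRateScheduler(step_decay)
  model.fit(X_train, y_train, validation_split=0.2, callbacks=[scheduler])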
d. Early Stopping:
- Prevent Overfitting: Stop training when the model's performance stops improving on a validation set.
  from tensorflow.keras.callbacks import EarlyStopping
  # validation data is required because the callback monitors 'val_loss';
  # restore_best_weights rolls the model back to its best epoch
  early_stopping = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
  model.fit(X_train, y_train, validation_split=0.2, callbacks=[early_stopping])
4. Hands-On Exercise
Task: Tune hyperparameters and evaluate the performance of a CNN on the CIFAR-10 dataset.
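- Load and Prepare the Data (a minimal sketch; the steps below assume X_train, y_train, X_test, and y_test exist):
from tensorflow.keras.datasets import cifar10
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X_train, X_test = X_train / 255.0, X_test / 255.0   # scale pixel values to [0, 1]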
- Define and Train the Initial Model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=64, validation_split=0.2)
- Perform Grid Search for Hyperparameter Tuning:
from sklearn.model_selection import GridSearchCV
param_grid = {
'batch_size': [32, 64],
'epochs': [10, 20]
}
# Note: GridSearchCV does not support Keras models directly, so the model must be wrapped with KerasClassifier or KerasRegressor; one possible wiring is sketched below.
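A minimal sketch of that wiring, assuming the scikeras package (a scikit-learn wrapper for Keras) is installed; the build_model helper is introduced here only for illustration:
from scikeras.wrappers import KerasClassifier

def build_model():
    # Rebuild the CNN defined above; the search needs a fresh model for every cross-validation fit.
    m = Sequential([
        Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        MaxPooling2D((2, 2)),
        Conv2D(64, (3, 3), activation='relu'),
        Flatten(),
        Dense(64, activation='relu'),
        Dense(10, activation='softmax')
    ])
    m.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return m

keras_clf = KerasClassifier(model=build_model, verbose=0)
grid_search = GridSearchCV(estimator=keras_clf, param_grid=param_grid, cv=3)
grid_search.fit(X_train, y_train.ravel())   # labels are flattened from shape (n, 1) to (n,)
print(grid_search.best_params_, grid_search.best_score_)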
- Evaluate and Compare Model Performance:
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print(f"Test accuracy: {test_accuracy}")
5. Summary and Next Steps
In this lesson, we covered:
- Model evaluation metrics for classification and regression.
- Techniques for hyperparameter tuning, including grid search, random search, and Bayesian optimization.
- Methods to improve model performance, such as regularization, data augmentation, and learning rate scheduling.
Next Lesson Preview: In Lesson 4, we will delve into advanced topics such as transfer learning, model ensembling, and working with large-scale datasets.