Deep Learning Fundamentals

Lesson 4: Advanced Topics: Transfer Learning and Model Ensembling

Objective: In this lesson, you'll explore advanced techniques to enhance model performance and efficiency, focusing on transfer learning and model ensembling.


1. Transfer Learning

Transfer learning involves leveraging a pre-trained model on a new but related problem. This approach is particularly useful when you have limited data for your specific task.

a. Why Transfer Learning?

  • Pre-trained Models: Save time and computational resources by using models pre-trained on large datasets.
  • Improved Performance: Transfer learning can boost performance on small datasets by leveraging features learned from larger datasets.

b. Using Pre-trained Models in TensorFlow/Keras:

TensorFlow and Keras provide several pre-trained models like VGG16, ResNet, and Inception.

Example: Fine-tuning VGG16 for Image Classification

import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten

# Load VGG16 with pre-trained weights and exclude the top layer
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the base model layers
for layer in base_model.layers:
    layer.trainable = False

# Add custom classification layers on top
x = base_model.output
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)

# Create the new model
model = Model(inputs=base_model.input, outputs=predictions)

# Compile and train the model
# (X_train and y_train are assumed to be preprocessed 224x224 RGB images and their integer labels)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

c. Fine-tuning:

After the initial training pass, you can unfreeze some of the base model's layers and retrain, typically with a lower learning rate, so the pre-trained weights are adapted to your specific task without being destroyed.

# Unfreeze the last four layers of the base model
for layer in base_model.layers[-4:]:
    layer.trainable = True

# Recompile with a lower learning rate so the pre-trained weights are only nudged, then retrain
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32, validation_split=0.2)

2. Model Ensembling

Model ensembling involves combining predictions from multiple models to improve accuracy and robustness. Common techniques include bagging, boosting, and stacking.

a. Bagging:

  • Bootstrap Aggregating: Reduces variance by training multiple models on different subsets of the training data and averaging their predictions.

Example: Bagging with Random Forest

from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

data = load_iris()
X, y = data.data, data.target

model = RandomForestClassifier(n_estimators=100)
model.fit(X, y)
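
A random forest is itself a bagged ensemble of decision trees. For bagging an arbitrary estimator, scikit-learn also offers BaggingClassifier; here is a minimal sketch reusing the iris X and y loaded above (the decision-tree base estimator and the choice of 50 estimators are illustrative):

from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Each tree is fit on a bootstrap sample of the data; predictions are combined by voting
bagged_model = BaggingClassifier(estimator=DecisionTreeClassifier(),  # named 'base_estimator' in scikit-learn < 1.2
                                 n_estimators=50)
bagged_model.fit(X, y)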

b. Boosting:

  • Adaptive Boosting (AdaBoost): Sequentially trains models, each correcting the errors of its predecessor.

Example: AdaBoost with Decision Trees

from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

model = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),  # 'estimator' was called 'base_estimator' before scikit-learn 1.2
                           n_estimators=50)
model.fit(X, y)

c. Stacking:

  • Stacked Generalization: Combines predictions from multiple models (base models) using a meta-model to make final predictions.

Example: Stacking with Scikit-Learn

from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

base_models = [
    ('svc', SVC(probability=True)),
    ('lr', LogisticRegression())
]
model = StackingClassifier(estimators=base_models, final_estimator=LogisticRegression())
model.fit(X, y)

3. Handling Large-Scale Datasets

When working with large-scale datasets, consider the following strategies:

a. Data Preprocessing:

  • Data Sharding: Split large datasets into smaller, manageable chunks (a tf.data sketch follows the generator example below).
  • Efficient Loading: Use data generators to load data in batches.

Example: Data Generators in Keras

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rescale=1./255)
train_generator = datagen.flow_from_directory('train_data/', target_size=(150, 150), batch_size=32, class_mode='binary')
model.fit(train_generator, epochs=10)
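
For the data-sharding bullet above, one option is the tf.data API, where each worker keeps only its own slice of the input files. A minimal sketch, assuming the dataset has been exported as TFRecord shards (the file pattern, shard count, and worker index below are hypothetical):

import tensorflow as tf

# List the input files deterministically, then keep only every 4th file for worker 0,
# so each of 4 workers reads a disjoint chunk of the dataset
files = tf.data.Dataset.list_files('train_data/shard-*.tfrecord', shuffle=False)
dataset = files.shard(num_shards=4, index=0)

# Read the records in parallel, then batch and prefetch
dataset = dataset.interleave(tf.data.TFRecordDataset, cycle_length=4)
dataset = dataset.batch(32).prefetch(tf.data.AUTOTUNE)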

b. Distributed Training:

  • Multi-GPU Training: Use multiple GPUs to speed up training (see the sketch after this list).
  • Cloud Platforms: Leverage cloud services like AWS or Google Cloud for distributed computing.
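
As a sketch of multi-GPU training, a Keras model can be built inside tf.distribute.MirroredStrategy, which replicates the model across the GPUs on one machine and splits each batch between them. Here, build_model() is a placeholder for your own model-building code and train_generator is assumed to be defined as in the example above:

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
print('Number of replicas:', strategy.num_replicas_in_sync)

with strategy.scope():
    # Variables created inside the scope are mirrored across all visible GPUs
    model = build_model()  # placeholder for your own model definition
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

# Training proceeds as usual; each batch is split across the replicas
model.fit(train_generator, epochs=10)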

c. Model Optimization:

  • Model Pruning: Reduce the size of the model by removing less important weights.
  • Quantization: Convert model weights to lower precision to improve inference speed (see the sketch below).
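
As a sketch of quantization, TensorFlow Lite's post-training quantization converts a trained Keras model's weights to lower precision. Here, model refers to the trained model from earlier and the output filename is arbitrary:

import tensorflow as tf

# Convert the trained Keras model to TensorFlow Lite with default optimizations,
# which quantizes the weights to reduce model size and speed up inference
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

with open('model_quantized.tflite', 'wb') as f:
    f.write(tflite_model)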

4. Hands-On Exercise

Task: Implement transfer learning and model ensembling on the CIFAR-10 dataset.

  1. Fine-Tune a Pre-trained Model (VGG16):
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Prepare data (CIFAR-10 images are assumed to have been exported as class subdirectories
# under cifar10/train/ and cifar10/val/)
datagen = ImageDataGenerator(rescale=1./255)
train_generator = datagen.flow_from_directory('cifar10/train/', target_size=(224, 224), batch_size=32, class_mode='sparse')
val_generator = datagen.flow_from_directory('cifar10/val/', target_size=(224, 224), batch_size=32, class_mode='sparse')

# Load VGG16, freeze its convolutional base, and train new classification layers on top
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base_model.layers:
    layer.trainable = False
x = base_model.output
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_generator, epochs=5, validation_data=val_generator)
  2. Combine Models Using Stacking:
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Define base models for stacking
base_models = [
    ('svc', SVC(probability=True)),
    ('lr', LogisticRegression())
]
stacked_model = StackingClassifier(estimators=base_models, final_estimator=LogisticRegression())

# X_train and y_train here are feature vectors and labels (see the feature-extraction sketch below)
stacked_model.fit(X_train, y_train)
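
The stacking step expects feature vectors and labels rather than raw images. One way to obtain them, not prescribed by the exercise but consistent with step 1, is to use the fine-tuned network as a feature extractor; a minimal sketch:

# Build a feature extractor from the penultimate (256-unit Dense) layer of the fine-tuned model
feature_extractor = tf.keras.Model(inputs=model.input, outputs=model.layers[-2].output)

# Use a non-shuffled generator so the extracted features stay aligned with the labels
feat_generator = datagen.flow_from_directory('cifar10/train/', target_size=(224, 224),
                                             batch_size=32, class_mode='sparse', shuffle=False)
X_train = feature_extractor.predict(feat_generator)
y_train = feat_generator.classes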

5. Summary and Next Steps

In this lesson, we covered:

  • Transfer learning, including using and fine-tuning pre-trained models.
  • Model ensembling techniques like bagging, boosting, and stacking.
  • Strategies for handling large-scale datasets, including distributed training and model optimization.

Next Lesson Preview: In Lesson 5, we'll explore techniques for deploying deep learning models, including saving and loading models, and deploying to cloud platforms.