Deep Learning Fundamentals
Lesson 4: Advanced Topics: Transfer Learning and Model Ensembling
Objective: In this lesson, you'll explore advanced techniques to enhance model performance and efficiency, focusing on transfer learning and model ensembling.
1. Transfer Learning
Transfer learning reuses a model pre-trained on one task as the starting point for a new but related task. This approach is particularly useful when you have limited data for the new task.
a. Why Transfer Learning?
- Pre-trained Models: Save time and computational resources by using models pre-trained on large datasets.
- Improved Performance: Transfer learning can boost performance on small datasets by leveraging features learned from larger datasets.
b. Using Pre-trained Models in TensorFlow/Keras:
TensorFlow and Keras provide a range of pre-trained models, such as VGG16, ResNet, and Inception, through the tf.keras.applications module.
Example: Fine-tuning VGG16 for Image Classification
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten
# Load VGG16 with pre-trained weights and exclude the top layer
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze the base model layers
for layer in base_model.layers:
    layer.trainable = False
# Add custom classification layers on top
x = base_model.output
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
# Create the new model
model = Model(inputs=base_model.input, outputs=predictions)
# Compile and train the model (X_train and y_train are assumed to be prepared image arrays and integer labels)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32, validation_split=0.2)
c. Fine-tuning:
After the new layers have been trained, you can unfreeze some of the top layers of the base model and retrain at a low learning rate to adapt the pre-trained features to your task. Note that changing a layer's trainable flag only takes effect after recompiling the model.
# Unfreeze the last four layers of the base model
for layer in base_model.layers[-4:]:
    layer.trainable = True
# Recompile with a low learning rate so fine-tuning doesn't overwrite the pre-trained weights
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5, batch_size=32, validation_split=0.2)
2. Model Ensembling
Model ensembling involves combining predictions from multiple models to improve accuracy and robustness. Common techniques include bagging, boosting, and stacking.
a. Bagging:
- Bootstrap Aggregating: Reduces variance by training multiple models on bootstrap samples (random subsets drawn with replacement) of the training data and averaging their predictions.
Example: Bagging with Random Forest
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
data = load_iris()
X, y = data.data, data.target
model = RandomForestClassifier(n_estimators=100)
model.fit(X, y)
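Random Forest is bagging applied to decision trees, with extra randomness in the features considered at each split. For bagging an arbitrary estimator, scikit-learn also provides BaggingClassifier. A minimal sketch, reusing X and y from above (the estimator parameter name assumes scikit-learn >= 1.2; older versions call it base_estimator):
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
# Train 50 trees, each on a bootstrap sample, and aggregate their votes
bagged = BaggingClassifier(estimator=DecisionTreeClassifier(), n_estimators=50)
bagged.fit(X, y)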
b. Boosting:
- Adaptive Boosting (AdaBoost): Trains models sequentially, with each new model focusing on the examples its predecessors misclassified by increasing their weights.
Example: AdaBoost with Decision Trees
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
# scikit-learn >= 1.2 uses the `estimator` argument (formerly `base_estimator`)
model = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1), n_estimators=50)
model.fit(X, y)
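Because boosting builds its ensemble one estimator at a time, you can watch accuracy evolve as rounds are added. scikit-learn exposes the intermediate predictions via staged_predict; a quick sketch on the training data from above:
from sklearn.metrics import accuracy_score
# Accuracy of the ensemble after each boosting round
for i, y_pred in enumerate(model.staged_predict(X), start=1):
    print(f"{i} estimators: accuracy = {accuracy_score(y, y_pred):.3f}")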
c. Stacking:
- Stacked Generalization: Combines predictions from multiple models (base models) using a meta-model to make final predictions.
Example: Stacking with Scikit-Learn
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
base_models = [
    ('svc', SVC(probability=True)),
    ('lr', LogisticRegression())
]
model = StackingClassifier(estimators=base_models, final_estimator=LogisticRegression())
model.fit(X, y)
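To check whether stacking actually helps, compare the cross-validated accuracy of the stacked model against its base models (again using the iris X and y from above):
from sklearn.model_selection import cross_val_score
# 5-fold cross-validation for each base model and the stacked ensemble
for name, clf in base_models + [('stacked', model)]:
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")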
3. Handling Large-Scale Datasets
When working with large-scale datasets, consider the following strategies:
a. Data Preprocessing:
- Data Sharding: Split large datasets into smaller, manageable chunks (see the tf.data sketch after the Keras example below).
- Efficient Loading: Use data generators to load data in batches.
Example: Data Generators in Keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(rescale=1./255)
train_generator = datagen.flow_from_directory('train_data/', target_size=(150, 150), batch_size=32, class_mode='binary')
model.fit(train_generator, epochs=10)
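For the data-sharding point above, the tf.data API can stream records from many small files so the full dataset never needs to fit in memory. A minimal sketch, assuming the data has already been written to TFRecord shards matching data/shard-*.tfrecord (the file pattern and omitted record-parsing step are illustrative):
import tensorflow as tf
# Interleave reads across shards, then batch and prefetch for throughput
files = tf.data.Dataset.list_files('data/shard-*.tfrecord')
dataset = files.interleave(tf.data.TFRecordDataset, num_parallel_calls=tf.data.AUTOTUNE)
dataset = dataset.batch(32).prefetch(tf.data.AUTOTUNE)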
b. Distributed Training:
- Multi-GPU Training: Use multiple GPUs to speed up training (a MirroredStrategy sketch follows this list).
- Cloud Platforms: Leverage cloud services like AWS or Google Cloud for distributed computing.
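For single-machine multi-GPU training, TensorFlow provides tf.distribute.MirroredStrategy, which replicates the model across available GPUs and averages gradients; the main change needed is to build and compile the model inside the strategy scope. A minimal sketch (the toy architecture is illustrative):
import tensorflow as tf
strategy = tf.distribute.MirroredStrategy()
print(f"Replicas in sync: {strategy.num_replicas_in_sync}")
with strategy.scope():
    # Variables created inside the scope are mirrored across GPUs
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation='relu', input_shape=(784,)),
        tf.keras.layers.Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# model.fit(...) then trains with gradients synchronized across replicas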
c. Model Optimization:
- Model Pruning: Reduce the size of the model by removing less important weights.
- Quantization: Convert model weights to lower precision to speed up inference and shrink the model (a TensorFlow Lite sketch follows).
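For quantization, TensorFlow Lite's converter supports post-training quantization via a single optimization flag. A minimal sketch, assuming model is any trained Keras model:
import tensorflow as tf
# Convert a trained Keras model to TFLite with default (dynamic-range) quantization
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)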
4. Hands-On Exercise
Task: Implement transfer learning and model ensembling on the CIFAR-10 dataset.
- Fine-Tune a Pre-trained Model (VGG16):
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Prepare data (assumes CIFAR-10 images have been exported to class subdirectories under cifar10/train/ and cifar10/val/)
datagen = ImageDataGenerator(rescale=1./255)
train_generator = datagen.flow_from_directory('cifar10/train/', target_size=(224, 224), batch_size=32, class_mode='sparse')
val_generator = datagen.flow_from_directory('cifar10/val/', target_size=(224, 224), batch_size=32, class_mode='sparse')
# Load and fine-tune VGG16
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
for layer in base_model.layers:
    layer.trainable = False
x = base_model.output
x = Flatten()(x)
x = Dense(256, activation='relu')(x)
predictions = Dense(10, activation='softmax')(x)
model = tf.keras.Model(inputs=base_model.input, outputs=predictions)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(train_generator, epochs=5, validation_data=val_generator)
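If you'd rather not export CIFAR-10 to directories, the dataset can also be loaded directly from tf.keras.datasets and resized to VGG16's 224x224 input inside a tf.data pipeline. A minimal sketch:
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.cifar10.load_data()
def preprocess(image, label):
    # Rescale to [0, 1] and resize 32x32 CIFAR images to VGG16's input size
    image = tf.image.resize(tf.cast(image, tf.float32) / 255.0, (224, 224))
    return image, label
train_ds = (tf.data.Dataset.from_tensor_slices((X_train, y_train))
            .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(32).prefetch(tf.data.AUTOTUNE))
# model.fit(train_ds, epochs=5) then trains on the resized images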
- Combine Models Using Stacking:
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
# Define base models
base_models = [
    ('svc', SVC(probability=True)),
    ('lr', LogisticRegression())
]
stacked_model = StackingClassifier(estimators=base_models, final_estimator=LogisticRegression())
# X_train and y_train must be flat feature vectors and labels (see the feature-extraction sketch below)
stacked_model.fit(X_train, y_train)
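The scikit-learn estimators above work on flat feature vectors, not raw images. One way to obtain them is to run the images through the frozen VGG16 base and flatten the output; a sketch, where X_images is an illustrative array of rescaled images with shape (N, 224, 224, 3):
# Extract convolutional features with the frozen base, then flatten for scikit-learn
features = base_model.predict(X_images)
X_train = features.reshape(len(features), -1)  # (N, 7*7*512) for VGG16 at 224x224
In practice you would likely reduce these roughly 25,000-dimensional vectors (for example with global pooling or PCA) before fitting an SVC, which scales poorly with feature count.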
5. Summary and Next Steps
In this lesson, we covered:
- Transfer learning, including using and fine-tuning pre-trained models.
- Model ensembling techniques like bagging, boosting, and stacking.
- Strategies for handling large-scale datasets, including distributed training and model optimization.
Next Lesson Preview: In Lesson 5, we'll explore techniques for deploying deep learning models, including saving and loading models, and deploying to cloud platforms.