Week 1 / Day 3: Logistic Regression Implementation from Scratch

Learning Objectives

Today's focus was on implementing logistic regression from scratch, understanding the mathematical foundations of classification algorithms, and applying advanced machine learning concepts including regularization and multi-class classification.

Technical Implementation

Data Preparation and Loading

The session began with loading the preprocessed wine dataset from Day 2, with a fallback that reloads and standardizes the original dataset if the preprocessed files aren't available:

import numpy as np

try:
    # Load scaled features and target saved on Day 2
    X_scaled = np.load('week1_ml/data/X_scaled.npy')
    y = np.load('week1_ml/data/y.npy')
    print("Successfully loaded preprocessed data!")
except FileNotFoundError:
    # Fallback: reload and standardize the original wine dataset
    from sklearn.datasets import load_wine
    from sklearn.preprocessing import StandardScaler

    wine = load_wine()
    X = wine.data
    y = wine.target

    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)

The dataset was split into training (142 samples) and testing (36 samples) sets with stratification to maintain class balance across the three wine types.
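
A minimal sketch of how that stratified split can be reproduced with scikit-learn; the test_size of 0.2 and the random_state of 42 are assumptions inferred from the reported 142/36 sample counts and the seed used elsewhere in the notebook:

from sklearn.model_selection import train_test_split

# Stratified split keeps the three class proportions the same in train and test.
# test_size=0.2 on the 178-sample wine dataset yields the reported 142/36 split.
X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y,
    test_size=0.2,
    stratify=y,
    random_state=42
)
print(X_train.shape, X_test.shape)  # (142, 13), (36, 13)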

Core Algorithm Implementation

We implemented a comprehensive LogisticRegression class from scratch with the following key components:

1. Multi-class Classification Support

  • Softmax Function: Implemented for handling multiple classes
  • One-hot Encoding: Converted target labels for loss computation
  • Cross-entropy Loss: Calculated with regularization support

2. Advanced Features

  • L1/L2 Regularization: Configurable regularization types and strengths
  • Gradient Descent Optimization: Customizable learning rate and iterations
  • Loss Tracking: Monitored training progress throughout iterations

3. Mathematical Implementation

def _softmax(self, z):
    """Compute softmax function for multi-class classification"""
    exp_z = np.exp(z - np.max(z, axis=1, keepdims=True))
    return exp_z / np.sum(exp_z, axis=1, keepdims=True)

def _cross_entropy_loss(self, y_true, y_pred):
    """Compute cross-entropy loss with regularization"""
    y_one_hot = np.eye(self.n_classes)[y_true]
    epsilon = 1e-15
    y_pred = np.clip(y_pred, epsilon, 1 - epsilon)
    
    loss = -np.mean(np.sum(y_one_hot * np.log(y_pred), axis=1))
    
    # Add regularization term
    if self.reg_type == 'l1':
        reg_term = self.reg_strength * np.sum(np.abs(self.weights))
    elif self.reg_type == 'l2':
        reg_term = self.reg_strength * np.sum(self.weights ** 2)
    else:
        reg_term = 0
        
    return loss + reg_term
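
The notebook's full fit method isn't reproduced here, but a minimal sketch of the batch gradient descent loop it describes looks roughly like the following. The attribute names (max_iterations, learning_rate, loss_history) and the exact update rule are assumptions; the gradients shown match the softmax and cross-entropy loss defined above.

def fit(self, X, y):
    """Sketch of batch gradient descent for softmax regression."""
    n_samples, n_features = X.shape
    self.n_classes = len(np.unique(y))
    self.weights = np.zeros((n_features, self.n_classes))
    self.bias = np.zeros(self.n_classes)
    y_one_hot = np.eye(self.n_classes)[y]
    self.loss_history = []

    for i in range(self.max_iterations):
        # Forward pass: class probabilities via softmax
        probs = self._softmax(X @ self.weights + self.bias)

        # Gradient of the cross-entropy loss w.r.t. weights and bias
        error = probs - y_one_hot                 # shape (n_samples, n_classes)
        grad_w = X.T @ error / n_samples
        grad_b = error.mean(axis=0)

        # Regularization gradient, matching the penalty terms in the loss
        if self.reg_type == 'l1':
            grad_w += self.reg_strength * np.sign(self.weights)
        elif self.reg_type == 'l2':
            grad_w += 2 * self.reg_strength * self.weights

        # Parameter update
        self.weights -= self.learning_rate * grad_w
        self.bias -= self.learning_rate * grad_b

        self.loss_history.append(self._cross_entropy_loss(y, probs))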

Training Process and Results

The model was trained with the following parameters:

  • Learning Rate: 0.01
  • Maximum Iterations: 1000
  • Regularization: L2 with strength 0.01
  • Random State: 42 for reproducibility
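
Tying these hyperparameters to the class, the training call looks roughly like this; the constructor argument names and the predict method are assumptions based on the attributes used in the methods shown earlier:

# Hypothetical constructor signature mirroring the attributes used above
model = LogisticRegression(
    learning_rate=0.01,
    max_iterations=1000,
    reg_type='l2',
    reg_strength=0.01,
    random_state=42
)
model.fit(X_train, y_train)

train_acc = np.mean(model.predict(X_train) == y_train)
test_acc = np.mean(model.predict(X_test) == y_test)
print(f"Train accuracy: {train_acc:.2%}, Test accuracy: {test_acc:.2%}")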

Training Progress

The loss decreased consistently from 0.4128 at iteration 100 to 0.1474 at iteration 1000, demonstrating stable convergence.

Model Performance

  • Training Accuracy: 99.30%
  • Test Accuracy: 100.00%
  • Final Loss: 0.1474
  • Model Parameters: 13 features × 3 classes = 39 weights, plus 3 bias terms (42 parameters total)

Comprehensive Model Evaluation

Classification Metrics

The model achieved perfect performance on the test set:

  • Precision: 1.00 for all classes
  • Recall: 1.00 for all classes
  • F1-Score: 1.00 for all classes
  • Overall Accuracy: 100%

Confusion Matrix Analysis

The confusion matrix showed no misclassifications, indicating the model perfectly distinguished between all three wine types.
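
A minimal way to reproduce the confusion matrix and per-class metrics with scikit-learn, assuming y_pred holds the custom model's predictions on the test set:

from sklearn.datasets import load_wine
from sklearn.metrics import confusion_matrix, classification_report

y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes; any off-diagonal
# entry would indicate a misclassification (all zeros here).
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred,
                            target_names=load_wine().target_names))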

Feature Importance Analysis

We analyzed feature importance based on the learned weights:

Top 5 Most Important Features

  1. color_intensity: 0.441 (highest importance)
  2. alcohol: 0.421
  3. proline: 0.406
  4. alcalinity_of_ash: 0.291
  5. hue: 0.275

This analysis revealed that wine color intensity and alcohol content are the most discriminative features for classification.
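
One common way to derive such a ranking from the learned weights, and roughly what the notebook's analysis does, is to aggregate each feature's weights across the three classes. The exact aggregation and the (n_features, n_classes) weight shape are assumptions; here importance is the mean absolute weight per feature:

from sklearn.datasets import load_wine

feature_names = load_wine().feature_names

# Collapse each feature's per-class weights into a single importance score
importance = np.mean(np.abs(model.weights), axis=1)   # shape (13,)

# Print the five features with the largest mean absolute weight
for name, score in sorted(zip(feature_names, importance),
                          key=lambda pair: pair[1], reverse=True)[:5]:
    print(f"{name}: {score:.3f}")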

Learning Outcomes

Technical Skills Developed

  • Algorithm Implementation: Built logistic regression from mathematical principles
  • Multi-class Classification: Handled three wine types with softmax activation
  • Regularization Techniques: Implemented L1/L2 regularization for model robustness
  • Gradient Descent: Optimized model parameters using iterative optimization
  • Model Evaluation: Applied comprehensive metrics and visualization techniques

Mathematical Understanding

  • Softmax Function: Multi-class probability distribution
  • Cross-entropy Loss: Classification loss function with regularization
  • Gradient Computation: Analytic gradients of the loss for parameter updates
  • Feature Importance: Weight-based feature significance analysis

Best Practices Learned

  • Fallback Mechanisms: Robust data loading with error handling
  • Reproducibility: Consistent random seeds and parameter tracking
  • Regularization: Preventing overfitting through L1/L2 penalties
  • Comprehensive Evaluation: Multiple metrics for thorough model assessment

Code Repository

All implementation code is saved in the ai-sprint project directory as 03_logistic_regression_implementation.ipynb, including:

  • Complete LogisticRegression class implementation
  • Training and evaluation pipeline
  • Feature importance analysis
  • Visualization and reporting functions

Next Steps

Tomorrow's focus will be on implementing additional machine learning algorithms (SVM, Random Forest) and comparing their performance with our custom logistic regression implementation.