When data cannot be perfectly divided by a linear boundary, SVMs adapt using soft margins and the kernel trick to handle overlaps and complex patterns.
Soft Margin SVM
Definition: Extends SVM to allow some misclassifications, adding flexibility for data that is not linearly separable.
Formula: minimize ½‖w‖² + C ∑ᵢ₌₁ⁿ ξᵢ, subject to yᵢ(w⋅xᵢ + b) ≥ 1 − ξᵢ and ξᵢ ≥ 0
Where: w is the weight vector defining the hyperplane, b is the bias, ξᵢ is the slack variable measuring how far point i violates the margin, and C > 0 is the misclassification penalty.
Key points:
Slack variables ξᵢ permit data points to fall inside the margin or on its wrong side.
C balances margin width against the misclassification penalty: a small C favors a wider margin, while a large C penalizes violations more heavily. This trade-off is crucial for model performance.
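As a minimal sketch of the objective above (assuming NumPy, with a hand-picked hyperplane and toy points rather than a trained model), the slack variables and the soft-margin cost can be computed directly:

```python
import numpy as np

# Toy 2-D data and an assumed (not optimized) hyperplane w·x + b = 0
X = np.array([[2.0, 2.0], [1.0, -1.0], [-1.5, -0.5], [-2.0, -2.0]])
y = np.array([1, 1, -1, -1])   # class labels
w = np.array([1.0, 1.0])       # hypothetical weight vector
b = 0.0
C = 1.0                        # misclassification penalty

# Slack: xi_i = max(0, 1 - y_i(w·x_i + b)); zero when the point
# lies on or beyond the correct side of the margin
margins = y * (X @ w + b)
xi = np.maximum(0.0, 1.0 - margins)

# Soft-margin objective: 1/2 ||w||^2 + C * sum(xi)
objective = 0.5 * np.dot(w, w) + C * xi.sum()
print(xi, objective)  # only the second point violates the margin
```

Here only the point at (1, −1) sits exactly on the decision boundary, so it incurs slack ξ = 1 while the others incur none.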
The Kernel Trick
Definition: Implicitly transforms data into a higher-dimensional space where a separating hyperplane exists, enabling non-linear classification.
Formula: Uses kernel functions such as RBF (exp(−γ‖x−x′‖²)), Polynomial ((x⋅x′ + r)ᵈ), or Sigmoid (tanh(αx⋅x′ + r)).
Key Points:
Allows complex decision boundaries without explicitly computing high-dimensional transformations.
Selection of the kernel and its parameters (e.g., d, γ) is vital for capturing the data's structure.
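The three kernel functions above can be sketched in a few lines of NumPy; the parameter values (γ = 0.5, d = 2, α = 0.1, r) are illustrative assumptions, not recommended defaults:

```python
import numpy as np

def rbf_kernel(x, xp, gamma=0.5):
    # exp(-gamma * ||x - x'||^2)
    return np.exp(-gamma * np.sum((x - xp) ** 2))

def poly_kernel(x, xp, d=2, r=1.0):
    # (x·x' + r)^d
    return (np.dot(x, xp) + r) ** d

def sigmoid_kernel(x, xp, alpha=0.1, r=0.0):
    # tanh(alpha * x·x' + r)
    return np.tanh(alpha * np.dot(x, xp) + r)

x  = np.array([1.0, 0.0])
xp = np.array([0.0, 1.0])
print(rbf_kernel(x, xp), poly_kernel(x, xp), sigmoid_kernel(x, xp))
```

Each function returns a similarity score between two points; an SVM only ever needs these pairwise values, never the high-dimensional coordinates themselves.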
Handling Non-separable Data
Soft Margins: Introduce flexibility by penalizing misclassifications to a degree controlled by C.
Kernel-Based Mapping: Facilitates classification in cases where linear separation is not feasible in the original feature space.
Parameter Tuning: Critical for optimizing the trade-off between model complexity and generalization ability, often achieved through cross-validation.
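The three points above can be combined in one short sketch, assuming scikit-learn is available; the synthetic dataset and the parameter grid are illustrative choices:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic data that is not linearly separable in the original space
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# Jointly search C (soft-margin penalty) and gamma (RBF kernel width)
# using 5-fold cross-validation
param_grid = {"C": [0.1, 1, 10], "gamma": [0.1, 1, 10]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Cross-validation here scores each (C, γ) pair on held-out folds, so the selected combination reflects generalization rather than training-set fit.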