Category: Easy

  • ML0012 F1 Score

    What is F1 Score?

    Answer

    The F1 score is a metric for evaluating classification models, and it is particularly useful when the classes are imbalanced. It is the harmonic mean of Precision and Recall, so it combines both into a single number that is high only when both are high.

    {\large \text{F1 Score} = \displaystyle\frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}}
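
    As a minimal sketch of the formula above, the hypothetical helper `f1_score` below computes F1 from a precision and a recall value. Returning 0.0 when both inputs are zero is an assumed convention, since the formula itself is undefined in that case.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (hypothetical helper)."""
    if precision + recall == 0:
        # Assumed convention: the formula is undefined here, so report 0.0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative values: a high precision cannot hide a low recall.
balanced = f1_score(0.9, 0.9)
skewed = f1_score(0.99, 0.10)
print(f"balanced: {balanced:.3f}, skewed: {skewed:.3f}")
```

    Because the harmonic mean is pulled toward the smaller of the two values, the skewed case scores far below the arithmetic average of its inputs.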


  • ML0011 Precision and Recall

    What are Precision and Recall?

    Answer

    Precision and recall are two fundamental metrics used to evaluate the performance of classification models, especially when dealing with imbalanced datasets or when the cost of different types of errors varies.

    Precision
    Precision (also known as positive predictive value) is the ratio of correctly predicted positive observations to the total predicted positives. In other words, it tells you, “When the model predicts a positive, how often is it right?” Mathematically, it’s defined as:

    {\large \text{Precision} = \displaystyle\frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Positives (FP)}}}

    For example, if a spam detector labels 100 emails as spam and 99 of them are actually spam, its precision is 99%.

    Recall
    Recall (also known as sensitivity or true positive rate) is the ratio of correctly predicted positive observations to all observations that are actually positive. It answers the question, “Out of all the actual positives, how many did the model capture?” Mathematically, it’s defined as:

    {\large \text{Recall} = \displaystyle\frac{\text{True Positives (TP)}}{\text{True Positives (TP)} + \text{False Negatives (FN)}}}

    For example, if there are 100 spam emails in total and the model correctly identifies 90 of them, its recall is 90%.

    True Positives (TP): The model correctly predicts the positive class.
    False Positives (FP): The model incorrectly predicts the positive class (it predicted positive, but it was actually negative).
    True Negatives (TN): The model correctly predicts the negative class.
    False Negatives (FN): The model incorrectly predicts the negative class (it predicted negative, but it was actually positive).
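
    The definitions above can be sketched in a few lines of Python. The helper names (`confusion_counts`, `precision`, `recall`) and the toy label lists are hypothetical, and returning 0.0 for an empty denominator is an assumed convention.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, TN, FN for a binary task (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if actual == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def precision(tp, fp):
    """TP / (TP + FP); 0.0 if nothing was predicted positive (assumed convention)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """TP / (TP + FN); 0.0 if there are no actual positives (assumed convention)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# The spam examples from the text.
spam_precision = precision(tp=99, fp=1)
spam_recall = recall(tp=90, fn=10)

# Toy labels (1 = spam, 0 = not spam), made up for illustration.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(f"precision={precision(tp, fp):.2f}, recall={recall(tp, fn):.2f}")
```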



  • ML0010 Epoch Selection

    What are effective strategies for selecting the appropriate number of training epochs in machine learning?

    Answer

    An epoch is one complete pass through the entire training dataset. Training runs for multiple epochs because the model’s weights are updated iteratively, and each pass gives the model another opportunity to refine them.

    Choosing the right number of epochs involves striking a balance between undertraining and overfitting.
    (1) Monitor Validation Metrics: Regularly evaluate performance on a validation set. If the validation loss begins to plateau or increase, it may indicate that further training won’t yield improvements.
    (2) Implement Early Stopping: Use early stopping techniques to automatically halt training when the model’s performance ceases to improve, thereby avoiding overfitting.
    (3) Experimentation: Begin with a moderate range (e.g., 10–100 epochs) and adjust based on observed training and validation curves.
    (4) Assess Model and Data Complexity: More intricate models or complex datasets may require additional epochs to capture underlying patterns, while simpler scenarios can converge more rapidly.

    In short, select the number of epochs by closely monitoring the model’s performance, employing early stopping to refine the process, and tailoring the approach to the complexities of the specific task.
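
    Early stopping, as described in point (2), can be sketched without any framework. The function name `early_stopping_epoch`, the `patience` and `min_delta` parameters, and the validation losses are all assumptions made for illustration; deep learning libraries ship their own versions of this logic.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than `min_delta` for `patience` consecutive epochs.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss - min_delta:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    # The loss never stalled for `patience` epochs: train for all epochs.
    return len(val_losses)


# Made-up validation losses: improvement until epoch 4, then a slow rise.
losses = [0.90, 0.70, 0.60, 0.55, 0.56, 0.57, 0.58, 0.60]
stop_epoch = early_stopping_epoch(losses, patience=3)
print(f"stop after epoch {stop_epoch}")
```

    In practice the weights from the best epoch (epoch 4 in this made-up run) are usually restored after stopping, which this sketch leaves out.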



  • ML0009 Batch Size Selection

    What are the best strategies for selecting the appropriate batch size?

    Answer

    Batch size is another crucial hyperparameter in neural network training, affecting both model performance and training efficiency. Choose a batch size that balances hardware constraints against optimization trade-offs. Start with a moderate value (e.g., 16, 32, or 64) and adjust it based on the available memory, the stability of your gradient updates, and the model’s validation performance.

    (1) Memory Constraints: Larger batch sizes require more GPU (or CPU) memory.
    (2) Dataset Size: Larger datasets can generally accommodate larger batch sizes. Smaller datasets may benefit from smaller batch sizes to introduce more variability in the training process.
    (3) Learning Rate Interaction: The appropriate learning rate often depends on the batch size; large batches might allow or even require a higher learning rate, while small batches might need a lower one. Some practitioners adjust the learning rate proportionally to the batch size.
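
    The proportional adjustment mentioned in point (3) is often called the linear scaling heuristic. The sketch below assumes a hypothetical baseline (batch size 32, learning rate 0.001, 10,000 samples) and shows how the number of updates per epoch and the scaled learning rate change with batch size. It is a starting point for tuning, not a guarantee of good results.

```python
import math


def steps_per_epoch(dataset_size: int, batch_size: int) -> int:
    """Number of weight updates in one epoch, counting a final partial batch."""
    return math.ceil(dataset_size / batch_size)


def scale_learning_rate(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Linear scaling heuristic: change the learning rate in proportion to batch size."""
    return base_lr * new_batch_size / base_batch_size


# Hypothetical setup: 10,000 samples, tuned at batch size 32 with lr 0.001.
for batch_size in (16, 32, 64, 128):
    updates = steps_per_epoch(10_000, batch_size)
    lr = scale_learning_rate(0.001, 32, batch_size)
    print(f"batch={batch_size:>3}  updates/epoch={updates:>3}  scaled lr={lr:.4f}")
```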



  • ML0008 Learning Rate Selection

    What are the best practices for selecting an optimal learning rate?

    Answer

    Selecting an appropriate learning rate is a crucial part of training neural networks. It significantly impacts how quickly and effectively your model learns.

    1. Grid or Random Search: Experiment with a range of learning rates (for example, 0.0001, 0.001, 0.01, etc.) and observe training performance. This systematic exploration can help narrow down an effective learning rate, though it may be computationally intensive.
    2. Adaptive Optimizers: Use optimizers like Adam, RMSProp, or Adagrad that adjust the learning rate for each parameter automatically. These methods often require less manual tuning because they adapt based on the gradient history.
    3. Learning Rate Schedules: Implement strategies such as step decay, exponential decay, or cosine annealing. These schedules reduce the learning rate over time, which can help fine-tune the model as it approaches convergence.
    4. Monitoring Training Loss: Watch the loss curve during training. If the loss oscillates or diverges, the learning rate is likely too high; if it decreases very slowly or not at all, the learning rate may be too low.
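
    The three schedules named in point 3 can be written as plain functions of the epoch number. This is a sketch with assumed default values (`drop`, `epochs_per_drop`, `decay_rate`, `min_lr`); deep learning frameworks provide their own scheduler classes with different parameter names.

```python
import math


def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=10):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)


def exponential_decay(initial_lr, epoch, decay_rate=0.05):
    """Shrink the learning rate smoothly by a constant factor per epoch."""
    return initial_lr * math.exp(-decay_rate * epoch)


def cosine_annealing(initial_lr, epoch, total_epochs, min_lr=0.0):
    """Follow half a cosine wave from `initial_lr` down to `min_lr`."""
    cosine = 0.5 * (1 + math.cos(math.pi * epoch / total_epochs))
    return min_lr + (initial_lr - min_lr) * cosine


# Compare the schedules at a few epochs of a hypothetical 100-epoch run.
for epoch in (0, 10, 50, 100):
    print(
        f"epoch {epoch:>3}: "
        f"step={step_decay(0.1, epoch):.5f}  "
        f"exp={exponential_decay(0.1, epoch):.5f}  "
        f"cosine={cosine_annealing(0.1, epoch, total_epochs=100):.5f}"
    )
```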


  • ML0004 Underfitting

    Which of the following descriptions is inaccurate in regard to underfitting?

    A. Underfitting occurs when a model is too simple to capture the underlying patterns from the data.

    B. When underfitting occurs, the model will have high bias and low variance.

    C. Increasing the model’s complexity and reducing regularization can address underfitting.

    D. An underfit model performs well with the training data but performs poorly on new, unseen data.

    Answer

    D
    Explanation:
    Underfitting means the model performs poorly on both the training data and the unseen test data because it hasn’t learned enough from the training set.
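
    A small synthetic experiment illustrates the point. In the sketch below, the data follows y = 3x + 1 (made up for illustration), a constant model stands in for a model that is too simple, and a least-squares line stands in for a model with adequate capacity. All function names are hypothetical.

```python
def mse(targets, predictions):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def fit_constant(xs, ys):
    """Too-simple model: always predicts the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Least-squares line, which has enough capacity for linear data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


# Synthetic data from y = 3x + 1, split into training and test points.
train_x, test_x = [0, 2, 4, 6, 8], [1, 3, 5, 7, 9]
train_y = [3 * x + 1 for x in train_x]
test_y = [3 * x + 1 for x in test_x]

errors = {}
for name, fit in [("constant", fit_constant), ("line", fit_line)]:
    model = fit(train_x, train_y)
    errors[name] = (
        mse(train_y, [model(x) for x in train_x]),
        mse(test_y, [model(x) for x in test_x]),
    )
    print(f"{name:>8}: train MSE={errors[name][0]:.2f}, test MSE={errors[name][1]:.2f}")
```

    The constant model has a large error on both splits, which is the signature of underfitting, while the line fits both.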


  • ML0003 Overfitting

    What is overfitting, and how can it be avoided?

    Answer

    Overfitting happens when a model tries to learn the training data too well, including its noise and outliers, leading to poor performance on new, unseen data. The model becomes too specialized to the training data, failing to generalize to other data.

    To avoid overfitting:
    1. Simplify the model: Use less complex models.
    2. Get more data or use data augmentation: A larger dataset helps the model generalize.
    3. Regularization: Penalize complex models with techniques like L1/L2 regularization.
    4. Validation & Early Stopping: Validate frequently and stop training when performance plateaus.
    5. Dropout: For neural networks, dropout layers can also be used to reduce overfitting.
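
    To make the regularization item concrete, the sketch below uses the simplest case it applies to: a one-feature linear model without an intercept, where the L2-penalized (ridge) solution has a closed form. The function name `ridge_slope` and the data are assumptions for illustration.

```python
def ridge_slope(xs, ys, l2_penalty):
    """Slope w that minimizes sum((y - w*x)^2) + l2_penalty * w^2 (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + l2_penalty)


xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]  # made-up data with y = 2x
unregularized = ridge_slope(xs, ys, l2_penalty=0.0)
regularized = ridge_slope(xs, ys, l2_penalty=14.0)
print(f"no penalty: w={unregularized:.2f}, with penalty: w={regularized:.2f}")
```

    The penalty shrinks the weight toward zero, trading a slightly worse fit on the training data for a simpler model.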


  • ML0002 Machine Learning Type

    What is the difference between supervised learning and unsupervised learning?

    Answer

    Supervised learning relies on labeled datasets: each training sample comes with a label or target output. The algorithm learns a mapping from inputs to outputs that can then predict the output for new, unseen inputs.

    Unsupervised learning works with unlabeled data. The algorithm aims to find hidden patterns or structures within the data.
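
    The contrast can be sketched on a toy one-dimensional dataset. Below, a nearest-class-mean classifier stands in for supervised learning and a two-cluster k-means loop stands in for unsupervised learning. Both were chosen only for brevity, and the data, labels, and function names are made up.

```python
def fit_nearest_mean(points, labels):
    """Supervised: use the labels to compute one mean per class."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        sums[label] = sums.get(label, 0.0) + point
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(class_means, point):
    """Predict the label whose class mean is closest to the point."""
    return min(class_means, key=lambda label: abs(point - class_means[label]))


def two_means(points, iterations=10):
    """Unsupervised: split unlabeled 1-D points into two clusters."""
    centers = [min(points), max(points)]
    for _ in range(iterations):
        clusters = [[], []]
        for point in points:
            nearest = 0 if abs(point - centers[0]) <= abs(point - centers[1]) else 1
            clusters[nearest].append(point)
        # Keep the old center if a cluster ends up empty.
        centers = [
            sum(cluster) / len(cluster) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers


points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # made-up measurements
labels = ["small", "small", "small", "large", "large", "large"]

class_means = fit_nearest_mean(points, labels)  # needs the labels
predicted = predict(class_means, 3.9)           # label for an unseen input
centers = two_means(points)                     # never sees the labels
print(predicted, [round(c, 2) for c in centers])
```

    The supervised model returns a named label for a new input, whereas the clustering step only discovers that the points fall into two groups and cannot name them.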


  • ML0001 Loss Curve Plot

    The following training loss curves were plotted with different experiment settings. Which of these training loss curves most likely indicates the correct experiment settings?

    Answer

    A
    Explanation:
    In a well-configured training run, the training loss is expected to decrease steadily over time, which indicates that the model is learning and improving its performance.

