Deep Learning

Deep Learning Interview Questions Table

| ID | Title | Content | Tags | Category |
| --- | --- | --- | --- | --- |
| DL0052 | Rotary Positional Embedding | What is Rotary Positional Embeddin… | Transformer | Medium |
| DL0051 | Sparsity in NN | Explain the concept of "Sparsity" … | NN | Medium |
| DL0050 | Knowledge Distillation | Describe the process and benefits … | Basics | Medium |
| DL0049 | Weight Init | Why is "weight initialization" imp… | Basics | Easy |
| DL0048 | Adam Optimizer | Can you explain how the Adam optim… | Basics | Easy |
| DL0047 | Focal Loss II | Please compare focal loss and weig… | Loss | Medium |
| DL0046 | Focal Loss | What is focal loss, and why does i… | Loss | Easy |
| DL0045 | Dimension in FFN | In Transformers, why does the feed… | Transformer | Medium |
| DL0044 | Multi-Query Attention | What is Multi-Query Attention in t… | Transformer | Medium |
| DL0043 | KV Cache | What is KV Cache in transformers, … | Transformer | Easy |
| DL0042 | Attention Computation | Please break down the computationa… | Transformer | Medium |
| DL0041 | Hierarchical Attention | Could you explain the concept of h… | Transformer | Medium |
| DL0040 | Attention Mask | What is the role of masking in att… | Transformer | Easy |
| DL0039 | Transformer Weight Tying | Explain weight sharing in Transfor… | Transformer | Hard |
| DL0038 | Transformer Activation | Which activation functions do tran… | Transformer | Easy |
| DL0037 | Transformer Architecture III | Why do Transformers use a dot prod… | Transformer | Medium |
| DL0036 | Transformer Architecture II | What are the main differences betw… | Transformer | Easy |
| DL0035 | Transformer Architecture | Describe the original Transformer … | Transformer | Easy |
| DL0034 | Layer Norm | What is layer normalization, and w… | Norm, Transformer | Easy |
| DL0033 | Transformer Computation | In a Transformer architecture, whi… | Transformer | Hard |
| DL0032 | Transformer vs RNN | What makes Transformers more paral… | RNN, Transformer | Easy |
| DL0031 | FFN in Transformer | What is the purpose of the feed-fo… | Transformer | Easy |
| DL0030 | Positional Encoding | Explain "Positional Encoding" in T… | Transformer | Easy |
| DL0029 | Dilated Attention | Could you explain the concept of d… | Transformer | Medium |
| DL0028 | Sliding Window Attention | Explain the sliding window attenti… | Transformer | Medium |
| DL0027 | Multi-Head Attention | How does multi-head attention work… | Transformer | Easy |
| DL0026 | Self-Attention vs Cross-Attention | What distinguishes self-attention … | Transformer | Easy |
| DL0025 | Attention Mechanism | Please explain the concept of "Att… | Transformer | Easy |
| DL0024 | Fixed-size Input in CNN | What is the "dilemma of fixed-size… | CNN | Medium |
| DL0023 | Dilated Convolution | What are dilated convolutions? Whe… | CNN | Medium |
| DL0022 | CNN Architecture | Describe the typical architecture … | CNN | Easy |
| DL0021 | Feature Map | What is the feature map in Convolu… | CNN | Easy |
| DL0020 | CNN Parameter Sharing | How do Convolutional Neural Networ… | CNN | Easy |
| DL0019 | Go Deep | How does increasing network depth … | NN | Medium |
| DL0018 | NaN Values | What are the common causes for a d… | Basics | Medium |
| DL0017 | Reproducibility | How to ensure the reproducibility … | Basics | Easy |
| DL0016 | Learning Rate Warmup | What is Learning Rate Warmup? What… | Basics | Easy |
| DL0015 | Cold Start | What is a "cold start" problem in … | Basics | Medium |
| DL0014 | Mixed Precision Training | Can you explain the primary benefi… | Basics | Medium |
| DL0013 | Instance Normalization | Can you explain what Instance Norm… | Norm | Medium |
| DL0012 | Zero Padding | Why is zero padding used in deep l… | Basics, NN | Medium |
| DL0011 | Fully Connected Layer | Can you explain what a fully conne… | Basics, NN | Easy |
| DL0010 | Receptive Field | What is the receptive field in con… | Basics, NN | Medium |
| DL0009 | Pooling | Please compare max pooling and ave… | Basics | Easy |
| DL0008 | Hyperparameter Tuning | What are the common strategies for… | Basics | Easy |
| DL0007 | Batch Norm | Why use batch normalization in dee… | Norm | Easy |
| DL0006 | Layer Freeze in TL | What are the common strategies for… | Basics | Easy |
| DL0005 | Transfer Learning | Why use transfer learning in deep … | Basics | Easy |
| DL0004 | Small Kernels | What are the key advantages of usi… | NN | Easy |
| DL0003 | 1×1 Convolution | What are the benefits of using 1×1… | NN | Medium |
| DL0002 | All Ones Init | What are the potential consequence… | Basics, NN | Medium |
| DL0001 | Residual Connection | Why are residual connections impor… | Basics, NN | Medium |
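
Many of the Transformer entries above (DL0025 Attention Mechanism, DL0040 Attention Mask, DL0027 Multi-Head Attention, DL0037 Transformer Architecture III) center on scaled dot-product attention. As a quick reference, here is a minimal NumPy sketch of single-head scaled dot-product attention with an optional causal mask; the function name, shapes, and the large negative masking constant are illustrative choices, not drawn from any of the questions themselves.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Compute softmax(Q K^T / sqrt(d_k)) V for a single attention head.

    Q: (seq_q, d_k), K: (seq_k, d_k), V: (seq_k, d_v).
    mask: optional boolean (seq_q, seq_k) array; True = position may be attended to.
    """
    d_k = Q.shape[-1]
    # Similarity scores, scaled by sqrt(d_k) to keep softmax gradients well-behaved
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)
    if mask is not None:
        # Masked positions get a large negative score, so softmax assigns them ~0 weight
        scores = np.where(mask, scores, -1e9)
    # Numerically stable softmax over the key dimension
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage with a causal (lower-triangular) mask, as in decoder self-attention
rng = np.random.default_rng(0)
seq, d = 4, 8
Q, K, V = (rng.normal(size=(seq, d)) for _ in range(3))
causal_mask = np.tril(np.ones((seq, seq), dtype=bool))
out = scaled_dot_product_attention(Q, K, V, mask=causal_mask)
print(out.shape)  # (4, 8)
```

Multi-head attention (DL0027) simply runs several such heads in parallel on learned projections of the input and concatenates their outputs; the KV Cache question (DL0043) concerns reusing the K and V arrays across decoding steps instead of recomputing them.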