In-depth k8s interview questions
1. Pod Scheduling & Resource Management
Q: Can you explain how the Kubernetes scheduler decides where to place pods? How do you configure requests, limits, and QoS classes to ensure optimal resource usage in a multi-tenant environment?
A:
Scheduler Process: The Kubernetes scheduler first filters out nodes that can’t satisfy a pod’s resource requests, node selectors, taints/tolerations, and affinity rules, then scores the remaining nodes to pick the best fit. The default scoring spreads load across nodes (least-allocated); it can instead be configured for bin-packing (most-allocated) when you want to consolidate workloads onto fewer nodes.
Requests & Limits: In my previous e-commerce system, each microservice declared CPU and memory requests to ensure it always had the minimum resources to run effectively. We set limits so a runaway process doesn’t starve others. For example, the payment service might have request: 200m CPU and limit: 500m CPU.
QoS Classes: If requests equal limits for every container, the pod is Guaranteed; if requests are set but below the limits (or only some values are set), it is Burstable; with neither, it is BestEffort. The class matters mostly under node pressure, because BestEffort and Burstable pods are evicted before Guaranteed ones. Our critical workloads got Guaranteed QoS to minimize eviction risk.
Real Impact: By fine-tuning these values, we avoided frequent OOM kills (out-of-memory) and prevented noisy neighbors from impacting payment transactions during high traffic (e.g., Black Friday).
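A minimal sketch of how this looks in a manifest, assuming placeholder names and resource values (setting requests equal to limits for every resource yields the Guaranteed QoS class):
apiVersion: v1
kind: Pod
metadata:
  name: payment-service              # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: example.com/payment-service:1.0   # placeholder image
    resources:
      requests:
        cpu: "500m"
        memory: "512Mi"
      limits:
        cpu: "500m"                  # requests == limits for every resource => Guaranteed
        memory: "512Mi"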
2. Affinity, Anti-Affinity, and Topology Spread Constraints
Q: How do you configure PodAntiAffinity or topology spread constraints to ensure workloads are evenly distributed across Availability Zones, thereby avoiding single points of failure?
A:
PodAntiAffinity: We used rules like:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: payment-service
      topologyKey: "topology.kubernetes.io/zone"
With this rule, 3 replicas land in 3 different zones of our Amazon EKS cluster, since the hard anti-affinity allows at most one matching pod per zone.
Real Scenario: In production, if us-east-1a had networking issues, pods in us-east-1b and us-east-1c could still handle traffic. This significantly reduced the impact of an AZ outage.
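Since the question also asks about topology spread constraints, here is a hedged sketch of the softer alternative we could use instead of hard anti-affinity (labels and values are illustrative):
topologySpreadConstraints:
- maxSkew: 1
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule   # use ScheduleAnyway for a best-effort spread
  labelSelector:
    matchLabels:
      app: payment-service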
3. Kubernetes Probes (Liveness & Readiness)
Q: How do you design health checks for complex microservices? Can you discuss the interplay between liveness and readiness probes when dealing with rolling updates?
A:
Readiness Probe: Tells the service mesh/load balancer when a pod is ready to receive traffic. For instance, a /health endpoint might check if the microservice has established a database connection.
Liveness Probe: Ensures the container hasn’t hung. If the probe fails repeatedly, Kubernetes restarts the pod.
Rolling Updates: During a rolling deployment of our product catalog service, readiness probes prevented traffic from hitting new pods until they’d fully loaded product data. Liveness probes caught any pods stuck in initialization. This combo allowed near-zero downtime deployments.
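An illustrative sketch of both probes on a container (endpoint paths, ports, and timings are assumptions, not our exact production values):
readinessProbe:
  httpGet:
    path: /health/ready              # e.g., verifies DB connection and product data loaded
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
livenessProbe:
  httpGet:
    path: /health/live               # cheap check that the process isn’t hung
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3                # restart after 3 consecutive failures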
4. Stateful Applications & Persistent Volumes
Q: Describe how you would run a stateful application (e.g., PostgreSQL, Cassandra) on EKS. How do PVCs map to StorageClasses (e.g., EBS, EFS), and what are the operational challenges?
A:
StatefulSets: In a previous project, we deployed Postgres in a StatefulSet so each pod got a stable network identity and persistent storage.
PVCs and StorageClasses: The volumeClaimTemplates in the StatefulSet automatically created PersistentVolumeClaims (PVCs), which referenced an EBS-based StorageClass (for single-AZ). If we needed multi-AZ, we considered EFS for shared storage (though we kept performance in mind).
Challenges: EBS volumes are zone-scoped, so if a node in us-east-1a fails, the volume can only be re-attached to another node in the same AZ; moving the data to us-east-1b means a snapshot and restore. That’s why we used Multi-AZ RDS for critical databases and kept EBS-backed storage for ephemeral or single-AZ test environments.
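A trimmed-down sketch of how volumeClaimTemplates tie a StatefulSet to an EBS-backed StorageClass (names, image, and sizes are illustrative):
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres              # headless Service that gives each pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:              # creates one PVC per replica, e.g., data-postgres-0
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: gp3          # assumed EBS-backed StorageClass name
      resources:
        requests:
          storage: 50Gi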
5. Networking & CNI Plugins
Q: Explain the role of a CNI plugin in Kubernetes networking. How does AWS VPC CNI differ from other options (e.g., Calico or Weave Net), and what are the trade-offs?
A:
CNI Role: A Container Network Interface plugin is responsible for provisioning IP addresses to pods and setting up network routes.
AWS VPC CNI: Pods get IPs directly from the VPC subnet, which simplifies security group usage but can consume many IPs. Great for AWS-native performance and integration.
Other CNIs (Calico): Offers advanced network policies, IP-in-IP overlay, and is cloud-agnostic. The trade-off can be more complexity in multi-AZ routing.
Real Impact: Our e-commerce environment used AWS VPC CNI for performance and AWS security group integration. However, we had to carefully plan subnets to avoid IP exhaustion under high scale.
6. Service Discovery & Ingress
Q: In an Amazon EKS cluster, how do you decide between using a Service of type LoadBalancer versus an Ingress controller (like AWS Load Balancer Controller)?
A:
Type LoadBalancer: We used it initially for simple services (e.g., a development environment) where each service needed its own external endpoint. This was convenient but quickly became expensive, since every Service of this type provisions its own load balancer.
Ingress Controller: For our public-facing APIs, we used the AWS Load Balancer Controller to create a single ALB that routes traffic to multiple microservices via Ingress rules. This reduced costs and simplified management.
Best Practice: We typically adopt Ingress for consolidated routing. We only use type LoadBalancer for specialized needs (like separate IP or dedicated NLB for TCP/UDP).
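A hedged sketch of a consolidated Ingress for the AWS Load Balancer Controller (the annotation set varies with controller version; hostnames and service names are placeholders):
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: public-api
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip
spec:
  ingressClassName: alb              # one shared ALB fronting several microservices
  rules:
  - host: shop.example.com           # placeholder hostname
    http:
      paths:
      - path: /cart
        pathType: Prefix
        backend:
          service:
            name: cart-service
            port:
              number: 80
      - path: /payments
        pathType: Prefix
        backend:
          service:
            name: payment-service
            port:
              number: 80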
7. Security in Kubernetes
Q: How do you handle RBAC (Role-Based Access Control) in a multi-team Kubernetes environment? Can you discuss NetworkPolicies, secrets management, and container isolation?
A:
RBAC: We created separate Namespaces per team (e.g., team-a, team-b) and assigned granular Role and RoleBinding objects so each team could manage only their own Deployments. Cluster-wide changes required cluster admin roles.
NetworkPolicies: For internal microservices, we used NetworkPolicies to limit traffic to known ports and namespaces, preventing cross-team infiltration.
Secrets Management: We integrated AWS Secrets Manager with the EKS cluster using an external secrets operator. This let developers store credentials in AWS Secrets Manager but retrieve them as native K8s Secrets at runtime.
Isolation: We enforced resource quotas so one team couldn’t saturate cluster CPU/memory.
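A minimal sketch of the per-namespace RBAC pattern (group name, resources, and verbs are illustrative):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: team-a-developer
  namespace: team-a
rules:
- apiGroups: ["", "apps"]
  resources: ["pods", "services", "configmaps", "deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: team-a-developer-binding
  namespace: team-a
subjects:
- kind: Group
  name: team-a-devs                  # assumed group mapped to IAM identities via aws-auth
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: team-a-developer
  apiGroup: rbac.authorization.k8s.io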
8. Cluster Autoscaler vs. Horizontal Pod Autoscaler (HPA)
Q: Can you explain the difference between Cluster Autoscaler and the Horizontal Pod Autoscaler? Under what scenarios would you tune them differently in EKS?
A:
Horizontal Pod Autoscaler (HPA): Scales pod replicas based on metrics like CPU, memory, or custom metrics. E.g., if the checkout service hits 70% CPU usage, we add more pod replicas.
Cluster Autoscaler: Checks for unschedulable pods. If a new pod can’t find space on any node, it spins up additional EC2 instances (in managed node groups).
Real Use Case: On Black Friday, our HPA often triggered scaling for the “checkout” Deployment from 5 to 50 pods. The Cluster Autoscaler then provisioned extra nodes if the existing ones got maxed out. We tuned the max node count carefully to avoid cost blowouts while ensuring we could handle peak loads.
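A sketch of that checkout HPA with the autoscaling/v2 API (the 70% target and replica bounds mirror the numbers above; names are illustrative):
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout
  minReplicas: 5
  maxReplicas: 50
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70       # add replicas when average CPU usage exceeds 70%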
9. Upgrades & Version Management
Q: How do you perform a zero-downtime upgrade of a Kubernetes cluster—particularly on EKS—while avoiding disruptions to running workloads?
A:
Managed Node Groups: We used a rolling update strategy for node groups, where new nodes (with the updated AMI/K8s version) joined the cluster before old ones were drained and terminated.
Control Plane Upgrades: EKS control plane versions can be upgraded via the AWS console or CLI. We do this in a maintenance window, but it’s usually seamless for running pods.
Best Practices: We pinned the cluster version in our Terraform (Infrastructure as Code) configuration. Before upgrading from 1.22 to 1.23, we tested all CRDs and resources in staging. During the real upgrade, we had zero service disruptions thanks to the rolling approach.
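One detail that keeps node drains non-disruptive (an assumption about the setup, not something stated above) is a PodDisruptionBudget on critical Deployments, for example:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-pdb
spec:
  minAvailable: "80%"                # the drain only evicts pods while this floor holds
  selector:
    matchLabels:
      app: checkout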
10. etcd
Q: What role does etcd play in Kubernetes? If you were self-hosting Kubernetes, how would you back up etcd and ensure consistent state recovery?
A:
Role: etcd is the key-value store for the entire cluster state—Deployments, Services, Secrets, etc.
Backup Strategy: We tested self-managed clusters in a dev environment. We ran a cron job to take etcd snapshots (etcdctl snapshot save) and stored them in Amazon S3 daily. We also used Velero for cluster resource backups.
Recovery: In a failure, we’d restore etcd from the latest snapshot, ensuring version compatibility. We kept the etcd version the same during restore to avoid data corruption.
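A rough sketch of the snapshot cron job, assuming a kubeadm-style self-hosted control plane (the cert paths, node label, and backup image are assumptions; the image is hypothetical and would need etcdctl plus the AWS CLI):
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 2 * * *"              # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          hostNetwork: true          # reach etcd on 127.0.0.1:2379
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
          - key: node-role.kubernetes.io/control-plane
            operator: Exists
            effect: NoSchedule
          containers:
          - name: backup
            image: example.com/etcd-backup:latest   # hypothetical image with etcdctl + aws cli
            command:
            - /bin/sh
            - -c
            - |
              ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd.db \
                --endpoints=https://127.0.0.1:2379 \
                --cacert=/etc/kubernetes/pki/etcd/ca.crt \
                --cert=/etc/kubernetes/pki/etcd/server.crt \
                --key=/etc/kubernetes/pki/etcd/server.key
              aws s3 cp /tmp/etcd.db s3://my-etcd-backups/etcd-$(date +%F).db
            volumeMounts:
            - name: etcd-certs
              mountPath: /etc/kubernetes/pki/etcd
              readOnly: true
          volumes:
          - name: etcd-certs
            hostPath:
              path: /etc/kubernetes/pki/etcd
          restartPolicy: OnFailure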
11. Managing Secrets at Scale
Q: How do you safely store and manage secrets in Kubernetes (e.g., environment variables vs. volumes, encryption at rest)? How would you integrate with AWS Secrets Manager or HashiCorp Vault?
A:
K8s Native Secrets: By default, they’re base64-encoded, not encrypted. We enabled KMS encryption of secrets at rest in EKS.
External Secrets: For our e-commerce DB credentials, we used the external-secrets operator that synced secrets from AWS Secrets Manager to a K8s Secret. This meant rotation in AWS automatically updated in the cluster.
Vault: In a previous fintech project, we used HashiCorp Vault for dynamic secrets (short-lived DB creds). An agent injector container retrieved secrets and exposed them via a mounted volume.
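A hedged sketch of the external-secrets pattern (the API group/version and field names depend on the operator release; store and key names are placeholders):
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
spec:
  refreshInterval: 1h                # periodic re-sync so rotations in AWS propagate
  secretStoreRef:
    name: aws-secrets-manager        # assumed ClusterSecretStore name
    kind: ClusterSecretStore
  target:
    name: db-credentials             # resulting native K8s Secret
  data:
  - secretKey: password
    remoteRef:
      key: prod/ecommerce/db         # hypothetical Secrets Manager secret name
      property: password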
12. Service Mesh
Q: When is it appropriate to introduce a service mesh (e.g., Istio, Linkerd, or AWS App Mesh)? What are the benefits and potential overhead?
A:
Benefits: We introduced Istio in a microservices architecture to get traffic splitting (canary deployments), mTLS between services, and robust observability (tracing, metrics).
Overhead: A service mesh adds a sidecar proxy to each pod, which can increase resource usage and complexity in debugging. We found it valuable in a scenario with ~30 microservices that needed consistent security and advanced routing. For smaller apps with fewer services, it might be overkill.
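As an example of the traffic-splitting benefit, a canary in Istio is a couple of weighted routes in a VirtualService (a sketch; host and subsets are placeholders and assume a matching DestinationRule, not shown):
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: cart-service
spec:
  hosts:
  - cart-service
  http:
  - route:
    - destination:
        host: cart-service
        subset: stable
      weight: 90
    - destination:
        host: cart-service
        subset: canary
      weight: 10                     # send 10% of traffic to the canary version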
13. Multi-Tenancy
Q: If you had multiple teams or services sharing a single EKS cluster, how would you enforce isolation and resource quotas (CPU, memory)?
A:
Namespaces: Each team had a dedicated namespace. We used ResourceQuota objects to cap CPU and memory usage.
Network Policies: At a minimum, we restricted inbound traffic so that a team’s services could only talk to their own namespace or explicitly allowed namespaces.
Security Boundaries: We used RBAC to limit who could deploy to production namespaces. We also integrated with IAM Roles for Service Accounts (IRSA) so that pods used team-specific IAM roles, preventing cross-team AWS resource access.
Real Impact: This kept a single cluster cost-effective while ensuring no team’s resource spike would cripple the entire environment.
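A sketch of the per-namespace ResourceQuota (the caps are placeholders):
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 64Gi
    limits.cpu: "40"
    limits.memory: 128Gi
    pods: "200"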
14. GitOps Strategy
Q: Can you describe a GitOps approach (e.g., Argo CD, Flux) for automated deployment in Kubernetes? How do you handle secrets, image updates, and rollbacks using Git as the source of truth?
A:
Git Repo as Source of Truth: Each microservice (payment, cart, inventory) had a Helm chart stored in Git.
Argo CD: Continuously watched the Git repo. When we merged changes to main, Argo CD automatically applied them to the cluster.
Secrets: We used the external-secrets pattern so that placeholders in Helm charts pointed to AWS Secrets Manager keys. No plain-text secrets in Git.
Rollbacks: If a deployment caused issues, we reverted the Git commit. Argo CD recognized the difference and reverted the cluster state automatically.
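A sketch of one Argo CD Application pointing at a microservice’s Helm chart (repo URL, path, and namespaces are placeholders):
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payment-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/ecommerce/deploy.git   # placeholder repo
    targetRevision: main
    path: charts/payment-service
  destination:
    server: https://kubernetes.default.svc
    namespace: payments
  syncPolicy:
    automated:
      prune: true                    # remove resources that were deleted from Git
      selfHeal: true                 # revert manual drift back to the Git state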
15. Advanced Debugging
Q: How would you troubleshoot a scenario where pods are repeatedly evicted or stuck in CrashLoopBackOff due to out-of-memory conditions?
A:
kubectl describe: First step: kubectl describe pod <pod-name> to see whether the last container state shows OOMKilled. We confirmed containers were being OOM-killed once they exceeded their memory limits, and some pods were additionally evicted under node memory pressure.
Logs & Metrics: Checked container logs (kubectl logs) for memory-related exceptions. Monitored memory usage in Prometheus.
Solution: Adjusted requests/limits so the pod had enough memory overhead. In one case, our payment service’s memory usage spiked due to a large number of concurrent checkouts. We scaled horizontally and added a small memory buffer in the limit.
Outcome: Post-fix, pods stabilized, and CrashLoopBackOff events ceased.
16. Multi-Cluster Management
Q: If you manage multiple EKS clusters (dev, staging, prod), how do you keep configurations and policies consistent? Do you use tools like Rancher, ClusterAPI, or manage them individually?
A:
GitOps for Each Cluster: Each environment had its own Git repo branch (e.g., dev, staging, prod). Argo CD in each cluster pulled from the corresponding branch.
Common Baseline: We used Helm umbrella charts or Kustomize overlays so dev/staging/prod had a consistent set of microservices but different replicas or CPU requests.
Rancher / Other Tools: For a larger org, we introduced Rancher to get a centralized cluster view and manage RBAC. But for a smaller e-commerce startup, GitOps was enough, with some Terraform for provisioning the EKS clusters.
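As one possible layout (illustrative, not the exact repo structure), a Kustomize overlay per environment patches only what differs from the shared base:
# overlays/prod/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../../base                         # shared Deployments/Services for every environment
patches:
- target:
    kind: Deployment
    name: checkout
  patch: |-
    - op: replace
      path: /spec/replicas
      value: 10                      # prod runs more replicas than dev/staging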
17. High-Performance & Large Scale
Q: What special considerations do you have for a Kubernetes cluster that needs to scale to hundreds or thousands of nodes? How do you address API server performance and etcd scaling?
A:
Node Limits: We recognized that etcd and the API server can get stressed above certain thresholds (~5,000 nodes for large clusters). We segmented workloads into multiple clusters if necessary (e.g., one cluster for front-end microservices, another for batch/ML jobs).
Watch Efficiency: We used watch caching in the API server config to reduce load from high-churn resources (like pods scaling up and down).
Auto Scaling Groups: For hundreds of nodes, we carefully sized subnets and monitored IP usage with the AWS VPC CNI.
Shard the Services: If the e-commerce platform saw extreme traffic, we might shard the product catalog across multiple clusters to keep etcd from becoming a bottleneck.
18. Explain a real-time issue that you experienced in EKS recently.
S (Situation)
Our team noticed increased 5xx/504 error rates in our production environment hosted on Amazon EKS. Services behind the AWS Load Balancer Controller were intermittently timing out. Investigating further, we found 429 Too Many Requests responses in the Kubernetes API server logs, suggesting the API server was throttling the Load Balancer Controller’s requests. This cluster had multiple controllers (including custom CRD operators) making heavy use of list/watch calls, contributing to the load on the APF (API Priority and Fairness) subsystem.
Key Observations:
User-facing logs showed 504 errors from the Application Load Balancer (ALB) side.
Load Balancer Controller logs indicated API server timeouts and 429 errors.
Control plane logs (viewable in CloudWatch) confirmed the throttling due to concurrency limits.
T (Task)
We needed to:
Determine why the Load Balancer Controller was receiving 429 responses from the API server (i.e., confirm it was an APF issue).
Tune the API Priority and Fairness configurations to ensure critical system controllers (like the LB controller) have sufficient priority under load.
Verify that the fix resolved the throttling and restored normal ALB reconciliation without production downtime.
A (Action)
Below is a step-by-step breakdown with commands, CloudWatch usage, and YAML references.
1. Gather Logs & Metrics
Check Controller Logs (Kubernetes)
# kubectl logs -n kube-system <lb-controller-pod-name>
In the logs, we found repeated messages such as error creating load balancer: context deadline exceeded or references to 429 Too Many Requests.
Check EKS Control Plane Logs (CloudWatch)
If Control Plane Logging is enabled on EKS, you can see api or audit logs in CloudWatch.
Look for 429 entries in these logs. You might see lines like:
"verb":"WATCH","URI":"/apis/elbv2.k8s.aws/v1beta1","code":429
Prometheus Metrics (Optional)
If you scrape API server metrics (the kube-apiserver /metrics endpoint; metrics-server only serves resource usage, not these), look for concurrency usage or request rejections. Metric names to check include apiserver_flowcontrol_request_concurrency_in_use and apiserver_flowcontrol_request_queue_length_after_enqueue.
2. Investigate APF Configuration
Kubernetes API Priority and Fairness is configured via FlowSchema and PriorityLevelConfiguration objects. We can inspect what’s currently in place:
# kubectl get flowschema,prioritylevelconfiguration -A
You might see something like:
NAME AGE
flowschema.scheduling.k8s 30d
flowschema.workload-ops 30d
...
NAME AGE
prioritylevelconfiguration.xxx 30d
prioritylevelconfiguration.yyy 30d
...
Check the references to see if the aws-load-balancer-controller is lumped into a “general” FlowSchema with a lot of other controllers.
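If it is, one remedy is a dedicated FlowSchema that maps the controller’s ServiceAccount to a higher built-in priority level. A hedged sketch follows (the flowcontrol API version depends on your Kubernetes release, e.g., v1beta2/v1beta3/v1, and the precedence value is illustrative):
apiVersion: flowcontrol.apiserver.k8s.io/v1beta3
kind: FlowSchema
metadata:
  name: aws-lb-controller
spec:
  priorityLevelConfiguration:
    name: workload-high              # built-in priority level; a custom one also works
  matchingPrecedence: 1000           # lower values are matched before broader schemas
  distinguisherMethod:
    type: ByUser
  rules:
  - subjects:
    - kind: ServiceAccount
      serviceAccount:
        name: aws-load-balancer-controller
        namespace: kube-system
    resourceRules:
    - verbs: ["*"]
      apiGroups: ["*"]
      resources: ["*"]
      clusterScope: true
      namespaces: ["*"]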