
AI Cloud Engineer

LockedIn AI
Full-time
On-site
33 Irving Pl, Manhattan, New York, United States
Engineering

About LockedIn AI

LockedIn AI is the #1 real-time AI interview and meeting copilot, trusted by over 1 million users worldwide. We are building a next-generation AI career platform that helps users succeed in interviews, coding assessments, and professional communication using real-time AI assistance.

Our system runs on complex, high-performance AI infrastructure that must scale reliably, globally, and in real time.


Role Overview

We are hiring an AI Cloud Engineer to design, build, and optimize the cloud-native infrastructure powering LockedIn AI’s machine learning systems and real-time AI products.

This is a specialized role at the intersection of cloud engineering, distributed systems, and AI infrastructure. You will be responsible for the environments where models are trained, fine-tuned, deployed, and served at scale to over 1 million users.

You will own the full AI cloud stack — from GPU compute clusters to inference serving infrastructure and cost-optimized scaling systems.


Key Responsibilities

AI Cloud Architecture

  • Design cloud-native infrastructure for AI/ML workloads
  • Build GPU clusters for training, fine-tuning, and evaluation
  • Architect multi-environment setups (training, staging, production)
  • Optimize AWS/GCP/Azure systems for AI performance

Inference & Model Serving Infrastructure

  • Deploy and manage real-time AI inference systems (LLMs, STT, RAG)
  • Optimize serving frameworks like vLLM, Triton, TensorRT, or TGI
  • Improve latency, throughput, batching, and GPU utilization
  • Build failover, routing, and load balancing for AI endpoints

GPU & Distributed Compute

  • Manage GPU infrastructure for distributed training and inference
  • Configure multi-node training and model parallelism
  • Optimize scheduling for spot and reserved GPU instances
  • Automate scaling of compute resources based on demand

Cloud Cost Optimization (FinOps for AI)

  • Reduce cloud spend across GPU, storage, and inference workloads
  • Implement reserved/spot instance strategies for AI training
  • Track cost-per-inference and cost-per-training job metrics
  • Optimize LLM API usage and caching strategies

Infrastructure as Code & Automation

  • Define all infrastructure as code using Terraform, Pulumi, or CloudFormation
  • Automate provisioning of AI environments and services
  • Implement GitOps workflows for cloud infrastructure
  • Ensure reproducible and version-controlled cloud systems

Observability & Reliability

  • Monitor GPU health, inference latency, and system performance
  • Build dashboards for AI infrastructure metrics
  • Set up alerting for failures, spikes, and performance degradation
  • Ensure high availability of real-time AI systems

Security & Networking

  • Design secure cloud networks (VPC, IAM, encryption, access control)
  • Protect model weights, embeddings, and AI pipelines
  • Ensure compliance readiness (SOC 2, GDPR, CCPA)
  • Secure inference endpoints and data flows

Required Qualifications

Experience

  • 3+ years in cloud engineering, DevOps, or infrastructure roles
  • Experience with ML/AI production systems
  • Hands-on GPU infrastructure or AI deployment experience
  • Experience working with engineering and AI teams in production environments

Technical Skills

  • Cloud platforms: AWS, GCP, or Azure (deep expertise)
  • Kubernetes + Docker (production-grade usage)
  • Infrastructure as Code: Terraform / Pulumi / CloudFormation
  • AI serving frameworks: vLLM, Triton, TensorRT, or similar
  • Monitoring tools: Prometheus, Grafana, Datadog, CloudWatch
  • Python / Go / Bash for automation and tooling

Preferred Qualifications

  • Experience with LLM inference at scale
  • Distributed training systems (multi-GPU / multi-node)
  • Real-time systems (WebSockets, streaming, low-latency APIs)
  • Knowledge of RDMA, InfiniBand, or GPU networking
  • Multi-cloud infrastructure experience
  • Background in SaaS, edtech, or AI consumer products

What We Offer

  • Equity in a fast-growing AI company
  • Direct impact on a product used by 1M+ users
  • Remote-first flexibility with optional NYC collaboration
  • High ownership over AI infrastructure systems
  • Fast-paced startup environment with real technical challenges

Why Join LockedIn AI?

  • Build infrastructure powering real-time AI at scale
  • Work on GPU-heavy, latency-critical AI systems
  • Own cloud systems that directly impact model performance
  • Join a category-defining AI career tools platform
  • Operate at the frontier of applied AI infrastructure

How to Apply

Please submit:

  • Resume / CV
  • Short note including:
    • Why you want to join LockedIn AI
    • Whether you’ve used the product
    • What improvements you would suggest
  • GitHub, projects, or technical writing (optional)

Equal Opportunity

LockedIn AI is committed to building a diverse and inclusive team. All hiring decisions are based on merit, skills, and business needs.