Software Engineer, Containers and Kubernetes
Seoul (Hybrid) • Full time
- KOREAN: NOT REQUIRED
- Machine Learning
- Python
Responsibilities
- Integrate the SW stack with DNN frameworks (e.g., PyTorch) to support eager mode, the model export path, the custom kernel interface, and the DNN compiler.
- Collaborate across teams, including the compiler and algorithm teams, to achieve native integration between the SW stack and DNN frameworks.
- Design and implement pre-processing modules for DNN model graphs, including sub-graph pattern matching and replacement and graph partitioning, to enable model parallelism and optimize end-to-end inference performance.
- Participate in designing and implementing the user-facing interface of the LLM stack.
- Write and maintain API references and development documentation.
Minimum Qualifications
- Bachelor’s degree in Computer Science or equivalent work experience.
- Experience with ML/DNN frameworks, such as TensorFlow, PyTorch, or JAX.
- Strong programming skills with 3+ years of experience in Python.
- Excellent communication skills for gathering and clarifying requirements.
- Ability to identify inefficiencies in programs and processes, with a proactive approach to proposing sustainable solutions.
Preferred Qualifications
- Deep understanding of transformer-based DNN model architectures.
- 2+ years of experience working in AI research.
- Familiarity with PyTorch 2.0 technologies (e.g., TorchDynamo, TorchInductor) or DNN compiler technologies (e.g., Triton, MLIR).
- Proficiency in programming, particularly in CUDA or Rust.
- Experience developing production-grade software for customers.
- Understanding of LLM serving optimization techniques, such as mixture of experts, paged attention, speculative decoding, chunked prefill, or continuous batching.
- Experience with Hugging Face Transformers, PEFT, TGI, vLLM, or TensorRT-LLM.
- Proven track record of contributing to open-source projects.
- Knowledge of testing and CI/CD pipelines, using tools such as Jenkins.