Lead Generative AI Engineer at Tykhe Inc (pronounced Tie-key) in Palo Alto, California

Posted 12 days ago.

Type: full-time

Job Description:

We are looking for an experienced Lead Generative AI Engineer to train, optimize, scale, and deploy a variety of generative AI models, including large language models, voice/speech foundation models, and vision and multi-modal foundation models, using cutting-edge techniques and frameworks. In this hands-on role, you will architect and implement state-of-the-art neural architectures and robust training and inference infrastructure to efficiently take complex models with billions of parameters to production while optimizing for low latency, high throughput, and cost efficiency.

Key Responsibilities:

1. Architect and refine foundation model infrastructure to support the deployment of optimized AI models, with a focus on C/C++, CUDA, and kernel-level programming enhancements.
2. Implement state-of-the-art optimization techniques, including quantization, distillation, sparsity, streaming, and caching, to enhance model performance.
3. Spearhead the development of vision pipelines, ensuring scalable training and inference workflows for foundation models with tens to hundreds of billions of parameters.
4. Innovate on state-of-the-art architectures for panoptic segmentation, image classification, and image generation. The candidate is expected to experiment with the internals of Vision Transformers and convolutional models such as ConvNeXt, as well as CLIP, Visual Question Answering (VQA), and diffusion models. Experience with AI art, image prompting, and conditional image generation is an additional advantage.
5. Execute training and inference processes with a key emphasis on minimizing latency and maximizing throughput, utilizing GPU clusters and custom hardware.
6. Innovate on current model deployment platforms, employing AWS, GCP, and GPU clusters, to enable high scalability and responsiveness.
7. Integrate and tailor frameworks such as PyTorch, TensorFlow, DeepSpeed, and FSDP to advance high-speed model training and inference.
8. Advance the deployment infrastructure with MLOps frameworks such as Kubeflow, MosaicML, Anyscale, and Terraform, ensuring robust development and deployment cycles.
9. Enhance post-deployment mechanisms with exhaustive testing, real-time monitoring, and sophisticated explainability and robustness checks.
10. Drive continuous improvement of deployed models with automated pipelines for detecting drift and performance degradation.
11. Lead the charge in model management, encompassing version control, reproducibility, and lineage tracking.
12. Cultivate a culture of high-performance computing and optimization within the AI/ML domain, propagating best practices and knowledge sharing.

Qualifications:

1. Ph.D. with 5+ years or MS with 8+ years of experience in ML Engineering, Data Science, or related fields.
2. Demonstrated expertise in high-performance computing, with proficiency in Python, C/C++, CUDA, and kernel-level programming for AI applications.
3. Extensive experience optimizing training and inference for large-scale AI models, including practical knowledge of quantization, distillation, and vision pipelines.
4. Understanding of Diffusion Models (DDPM), Variational Autoencoders, Bayesian modelling, Stochastic Variational Inference (SVI), and Reinforcement Learning is an additional benefit.
5. Experience building generative AI foundation models with tens to hundreds of billions of parameters.
6. Experience with AI training job scheduling, orchestration, and management via SLURM and Kubeflow.
7. Proven success in deploying optimized ML systems at large scale, utilizing cloud infrastructure and GPU resources.
8. In-depth understanding of, and hands-on experience with, advanced model optimization frameworks such as DeepSpeed, FSDP, PyTorch, and TensorFlow, and the corresponding MLOps tools.
9. Familiarity with contemporary MLOps frameworks such as MosaicML, Anyscale, and Terraform, and their application in production environments.
10. Strong grasp of state-of-the-art ML infrastructures, deployment strategies, and optimization methodologies.
11. An innovative problem-solver with strategic acumen and a collaborative mindset.
12. Exceptional communication and team-collaboration skills, with an ability to lead and inspire.