Lead Audio ML Engineer
Hark
Summary
Hark is seeking a Lead Audio ML Engineer to develop and deploy on-device audio models for consumer products. This role involves the full ML lifecycle, from model design and training to evaluation and deployment on constrained hardware. You will collaborate with DSP, firmware, and hardware teams to integrate advanced audio features and optimize models for performance and resource efficiency. The ideal candidate has 3+ years of experience with audio/speech ML models, PyTorch or TensorFlow, and embedded system deployment.
Details
- Salary: $120,000 – $300,000/yr
- Experience Required: 3+ years
- Posted: Jun 29, 2026
Description
About Hark
Hark is an artificial intelligence company building advanced, personalized intelligence: one that is proactive, multimodal, and capable of interacting with the world through speech, text, vision, and persistent memory.
We're pairing that intelligence with next-generation hardware to create a universal interface between humans and machines. While today's AI largely operates through chat boxes and decade-old devices, Hark is focused on what comes next: agentic systems that interact naturally with people and the real world.
To get there, we're developing multimodal models and next-generation AI hardware together, designed from the ground up as a single, unified interface for a new era of intelligent systems.
About the Role
We are looking for a Lead Audio ML Engineer to implement, train, and ship audio models that run on-device across our consumer products. This role spans the full lifecycle of on-device audio intelligence: model design, training, evaluation, and deployment to constrained hardware. You will work alongside our DSP, firmware, and product teams to turn audio model research into production features that ship at scale.
Responsibilities
- Implement and train audio models for wake-word detection, voice activity detection, source separation, speech enhancement, and similar audio tasks
- Take models from research prototype to on-device deployment within latency, memory, and power budgets
- Build and maintain training data pipelines, evaluation harnesses, and re-training cadence across model families
- Partner with DSP and firmware engineers to integrate models into the Hark Audio Engine and DSP runtime
- Collaborate with hardware and acoustics teams to characterize the signal conditions models must operate under
- Profile and optimize models on target platforms (DSP, NPU, CPU) and define accuracy and resource budgets per product
Requirements
- 3+ years of professional experience building and shipping audio or speech ML models
- Strong fluency in PyTorch or TensorFlow and modern audio deep learning toolchains
- Hands-on experience deploying models to embedded targets such as DSPs, NPUs, or mobile CPUs
- Comfort working across the full ML lifecycle: data, training, evaluation, deployment, and monitoring
- Solid foundation in audio signal processing concepts and how they intersect with ML pipelines
- Experience collaborating with DSP, firmware, and hardware engineers on resource-constrained systems
Bonus Qualifications
- Background shipping voice-first or far-field audio products
- Experience with on-device wake-word, ASR front-ends, or speech enhancement at production scale
- Familiarity with model compression techniques such as quantization, pruning, and distillation
- Familiarity with Qualcomm AI stacks or comparable stacks from other vendors
- Open-source contributions to audio ML projects
Compensation
The US base salary range for this full-time position is $120,000 to $300,000 annually.
The pay offered for this position may vary based on several individual factors, including job-related knowledge, skills, and experience. The total compensation package may also include additional components and benefits depending on the specific role. This information will be shared if an employment offer is extended.
