at Apple
Location
San Jose, United States of America
Compensation
$147k–$272k USD
Type
Full-time
Posted
Yesterday
We are looking for a Visual Intelligence and Machine Learning Scientist who bridges computational neuroscience and modern AI. You will build models that reason about human visual attention, behavioral states, comfort, and perceptual salience from video and multimodal sensor streams. You bring a track record of original research contributions, a drive to deploy algorithms on real-world data, and the versatility to take an idea from exploratory research to a polished user experience.
Apply large and on-device vision foundation models, multimodal transformers, and generative AI to build visual intelligence systems that predict saliency, understand scene context, and estimate perceptual state from video and sensor streams.
Design and own end-to-end AI pipelines: from data curation and model training to fine-tuning and adapting pretrained vision-language and multimodal foundation models for on-device, real-time inference.
Build multimodal AI systems that fuse visual signals with IMU, depth, and temporal context, improving robustness and perceptual fidelity across diverse real-world conditions.
Use generative AI and computational modeling to develop learned representations and priors that extend model generalization beyond labeled data, across devices, environments, and user populations.
Partner with app teams, UX designers, and systems engineers to translate algorithmic outputs directly into user-facing features on iOS, macOS, and spatial computing platforms.
Define and track perceptual and behavioral KPIs at the object, scene, and user level to rigorously validate algorithm quality and drive continuous improvement.
Stay at the frontier of visual intelligence, multimodal AI, and computational neuroscience, rapidly prototyping novel ideas and carrying the most promising from research to shipped product.
MS in Computer Science, Biomedical Engineering, Computational Neuroscience, Applied Mathematics, or a related field, with a focus on computer vision or machine learning, and 3+ years of research experience.
Demonstrated research contributions to visual intelligence or perceptual modeling, including peer-reviewed publications or equivalent industry impact.
Deep expertise in visual perception: saliency modeling, object detection and localization, optical flow, depth estimation, and multimodal learning including sensor fusion with IMU and temporal signals.
Strong hands-on experience training and deploying deep learning models using modern frameworks (PyTorch, TensorFlow, JAX) and Python scientific libraries (NumPy, OpenCV, scikit-learn).
Strong mathematical foundations in linear algebra, probability, optimization, and signal processing, with experience in large-scale dataset curation and evaluation methodology.
Excellent communication and collaboration skills; ability to thrive in a fast-paced environment alongside scientists, engineers, designers, and domain experts from the behavioral and cognitive sciences.
PhD in Computer Science, Biomedical Engineering, Computational Neuroscience, Applied Mathematics, or a related field, with a focus on computer vision or machine learning.
Experience with vision foundation models, vision transformers (ViT), and adapter-based fine-tuning for perceptual or behavioral downstream tasks.
Familiarity with generative AI and neural rendering approaches (diffusion models, implicit neural representations) as components of computational perception pipelines.
Background in computational modeling of perceptual or behavioral phenomena: modeling user state, comfort, visual salience, or behavioral response from video or sensor data.
Experience developing or deploying models for spatial computing platforms (AR/VR headsets, iOS, macOS).
Track record of shipping algorithms in resource-constrained, real-time on-device inference environments.
Versatility across a multifaceted role: the ability to move fluidly between deep research, engineering rigor, and direct collaboration with product and design teams to bring science to life in user-facing features.
Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, every service we deliver, is the result of us making each other's ideas stronger. Here, you'll do more than join something — you'll add something.
Our team sits at the intersection of computational neuroscience, visual intelligence, and AI, translating cutting-edge research directly into experiences that users see and feel every day. We prototype algorithms from first principles all the way through to shipping features, working hand-in-hand with app teams and designers so that the science we build has immediate, tangible impact on how people interact with Apple products. If you want your research to matter beyond a paper, this is where that happens.
Apple is an equal opportunity employer that is committed to inclusion and diversity. We seek to promote equal opportunity for all applicants without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, Veteran status, or other legally protected characteristics. Learn more about your EEO rights as an applicant.
At Apple, we believe accessibility is a fundamental human right. You’ll find that idea reflected in everything here — in our culture, our benefits and our digital tools. By welcoming as many perspectives as possible, we help you build a career where you feel like you belong.
Learn about accessibility in Apple’s workplace
Learn about reasonable accommodations for job applicants
Apple accepts applications to this posting on an ongoing basis.