Research Engineer, Frontier Safety Mitigations, DeepMind
at Google
Location
San Francisco, CA, USA; Mountain View, CA, USA
Compensation
$174k–$252k USD
Type
full time
Posted
2 weeks ago
Job description
Our team focuses on de-risking model launches by defending against critical misuse domains such as Cybersecurity, CBRNE (Chemical, Biological, Radiological, Nuclear, Conventional Explosive), and Harmful Manipulation. Currently, this involves building novel evaluations, red-teaming, researching and deploying advanced mitigations (both in-model and out-of-model), and monitoring emerging risks. We ensure that our mitigations are highly robust, while still enabling the beneficial use of our technology.
Google DeepMind (GDM) is a dedicated scientific community, committed to ‘solving intelligence’ and ensuring our technology is used for widespread public benefit.
The Frontier Safety Mitigation team operates in a fast-paced, highly collaborative environment. We have a strong culture of support, dedication, and teamwork. We take the possibility of tangibly dangerous model capabilities seriously as AI advances. Because of this, we believe that proactively researching and implementing robust, defense-in-depth mitigations is a critical part of the overall strategy for building safe AI.
We are looking for a research engineer for the Frontier Safety Mitigation team within the Gemini Safety team. In this role, you will help us build the next generation of safety mitigations for frontier models. This role is highly applied and focuses on building robust, end-to-end defenses against severe risks. This work feeds directly into DeepMind's Frontier Safety Framework commitments.
Artificial intelligence will be one of humanity’s most transformative inventions. At Google DeepMind, we are a pioneering AI lab with exceptional interdisciplinary teams focused on advancing AI development to solve complex global challenges and accelerate high-quality product innovation for billions of users. We use our technologies for widespread public benefit and scientific discovery, ensuring safety and ethics are always our highest priority.
Responsibilities
- Build advanced classifiers and data pipelines to detect misuse, owning the end-to-end process from automated evaluation to rapid model iteration.
- Build cross-context monitoring systems to detect coordinated harms, developing novel signal aggregation methods across disparate user sessions to identify large-scale attack vectors.
- Implement data-driven, semi-automated account-level response systems to detect, track, and apply strikes against persistent malicious actors using rich signals from production traffic.
- Evaluate and secure agentic AI systems by developing threat models, creating testing environments, and deploying robust mitigations against frontier-level agentic hacking and long-horizon attacks.
- Advance research in automated red-teaming and adversarial robustness, leveraging multi-turn/agentic attacks to systematically test for and uncover misuse vulnerabilities.
Minimum qualifications:
- Bachelor’s degree or equivalent practical experience.
- 5 years of experience with software development in one or more programming languages.
- 3 years of experience testing, maintaining, or launching software products, and 1 year of experience with software design and architecture.
- Experience working across the research-to-deployment pipeline in a frontier AI environment.
Preferred qualifications:
- PhD in Computer Science or Machine Learning, or publications at venues such as NeurIPS, ICLR, ICML, or EMNLP.
- Experience with cybersecurity detection and response, building classifiers and anomaly detection systems at scale, and taking safety defenses or mitigations from research concepts to scalable production systems.
- Experience collaborating on or leading applied ML projects, including LLM training, inference, and fine-tuning.
- Experience using AI coding agents with strong architectural judgment, and experience with TPUs and JAX.
- Knowledge of AI control, chain-of-thought monitoring, faithfulness, monitorability, and related frontier safety research.
- Background in adversarial machine learning, automated red-teaming, or model interpretability and probes.
Applicants in San Francisco: Qualified applications with arrest or conviction records will be considered for employment in accordance with the San Francisco Fair Chance Ordinance for Employers and the California Fair Chance Act.