Posted Mar 26, 2026

Site Reliability Engineer

Thales is a leader in digital security, providing identity management and data protection solutions. They are seeking a Site Reliability Engineer to ensure high service levels for their Telecommunication solution deployed in the public cloud, focusing on automation, reliability engineering, and incident management.

Responsibilities
- Design, build, and maintain scalable infrastructure using tools such as Terraform, Ansible, and Kubernetes
- Develop automated CI/CD pipelines via GitLab to reduce manual toil
- Define and monitor Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
- Manage error budgets to balance the velocity of new features with the stability of the platform
- Participate in 24/7 on-call rotations to provide emergency response and perform deep-dive troubleshooting for production issues
- Conduct system performance analysis, identify bottlenecks, and perform capacity planning to ensure the infrastructure can handle growth and peak loads
- Implement and refine symptom-based alerting and comprehensive monitoring strategies using platforms like Datadog to ensure high visibility into system health
- Lead blameless postmortems after incidents to identify root causes and implement long-term technical fixes to prevent recurrence
- Partner with Cloud Security teams to implement security best practices, manage access controls, and respond to security breaches or vulnerabilities
- Interface with other stakeholders to define a solution improvement plan
- Take ownership of the solution's service availability

Skills
- Engineer or equivalent
- At least 1 year of experience
- Java development skills are required
- You are familiar with public cloud (GCP, AWS), containers and microservices (Docker, Kubernetes, Java), CI/CD and automation (Jenkins, GitLab, Helm), and NoSQL databases
- Must have U.S. or dual citizenship and be able to obtain post-hire clearance from the Committee on Foreign Investment in the United States (CFIUS) and the Department of the Treasury
- GCP cloud architect certification is a plus
- You have already set up monitoring for a product and its underlying infrastructure
- You have development experience in a distributed systems and/or high availability context
- You are familiar with microservices development
- You have participated in the definition of architectures, data structures, and algorithms under performance, security, and reliability constraints
- Public cloud architect certification
- You are interested in the main aspects of Site Reliability Engineering: CI/CD, automation, monitoring and observability, and continuous improvement
- You are an accomplished, versatile, and multi-tasking development engineer

Benefits
- Elective Health, Dental, Vision, FSA/HSA, Voluntary Life and AD&D, Whole Group Life w/LTC, Critical Illness, Hospital Indemnity, Accident Insurance, Legal Plan, Identity Theft, and Pet Insurance
- Retirement Savings Plan after 30 days of employment with a company contribution and a match, and with no vesting period
- Company paid holidays and Paid Time Off
- Company provided Life Insurance, AD&D, Disability, Employee Assistance Plan, and Well-being Program

Company Overview
- Thales (Euronext Paris: HO) is a global leader in advanced technologies for the Defence, Aerospace, and Cyber & Digital sectors. It was founded in 1893 and is headquartered in Paris, Ile-de-France, FRA, with a workforce of 10001+ employees. Its website is http://www.thalesgroup.com.