Senior Site Reliability Engineer

Mars Capital

Dublin
Permanent
Full-time

13 days ago

Company Description

WELCOME TO MARS CAPITAL

Mars Capital Finance Ireland DAC (“Mars Capital”) is part of Arrow Global Group Limited, one of Europe’s leading integrated asset managers, with €112bn Assets Under Management (“AUM”) and 4,400 employees across five jurisdictions.

Mars Capital was established in 2015 and is a Regulated Credit Servicing Firm, authorised by the Central Bank of Ireland and located in Dublin 2. Mars Capital has the capacity and regulatory authority to service loans secured against properties and has €8bn AUM, having experienced significant growth in recent years. We provide services that span the full lifecycle of a loan post origination, from drawdown through to loan administration, asset management and enforcement, as well as syndication, securitisation, and standby servicing.

We value our relationship with our clients and believe that our deployment of dedicated, Dublin-based project and asset management teams gives us a competitive advantage over competitors who operate a shared services model across their portfolios. Placing the customer first in a simple and efficient manner, in line with our regulatory obligation, is our number one priority.

Our local strategic ambition is to provide value accretive services to institutional clients, while building better financial futures for our customers, clients, communities and colleagues.

Our Culture and Way of Working

We’re entrepreneurial, fast-paced and decisive, working together safely and supportively. We trust our colleagues to make the right decisions and are brave enough to acknowledge mistakes and to learn from them. Celebrating success, we reward those who help us to achieve exceptional long-term results.

We’re inclusive and encourage our colleagues to be themselves. Our culture supports the difference that makes each of us unique. We’re open and eager to embrace new ways of working and have a diverse community, enriched by our local identities, that works collaboratively to build a unified and dynamic organisation.

We believe that a supported flexible working approach helps us to retain valued colleagues, enhance wellbeing, increase motivation and encourage a healthy work/life balance. If you’re interviewed, ask about the flexibility involved in this role.

Our Values

Our stakeholders expect us to act in an ethical and responsible way and this is at the heart of how we conduct our business. Our values support this philosophy, and we seek out and reward behaviours that will make us more sustainable, responsible and successful. Our values are: we succeed together; we do the right thing; we’re trusted and valued; and we’re brave and creative. You’ll hear more about these throughout the interview process.

Our STAR Awards provide a fantastic platform to send a special thank you to a colleague who may have gone above and beyond to help you and the business succeed, or someone who has been brave and creative with their own approach to how they work. Values orientated, our Recognition Scheme plays a pivotal role in aligning and celebrating our culture, and simply wouldn’t exist without colleague participation and input.

Our Environmental, Social and Governance (“ESG”) Commitment

We’re committed to investing responsibly and supporting our local communities and charitable organisations. Every colleague is encouraged to take a paid day each calendar year to volunteer for our nominated charities. Internally, we are very proud of our four colleague-led engagement groups, which run events and initiatives that promote our culture and values, a key part of the lifeblood of this business.

Job Description

Role Overview

We are seeking a Senior Site Reliability Engineer (SRE) with strong expertise in AWS cloud infrastructure, containerised platforms, and Azure DevOps CI/CD pipelines. The successful candidate will focus on improving system reliability, availability, performance, and scalability while enabling engineering teams to deliver high-quality services efficiently.

This role combines engineering and operational excellence, with a focus on automation, observability, scalability, and resilience across cloud-native environments. As a senior engineer, you will drive engineering-led solutions to reduce operational toil, enhance system reliability, and promote DevOps and SRE best practices.

Note: This is a reliability-focused engineering role with on-call responsibilities and involvement in platform modernisation initiatives.

Key Responsibilities

Design, implement, and manage highly available and scalable infrastructure on AWS.
Build, maintain, and optimise Azure DevOps Pipelines (CI/CD) for automated build, test, and deployment processes.
Implement end-to-end CI/CD workflows, including multi-stage pipelines, approvals, and release strategies.
Manage and support Windows (IIS, .NET) and Linux-based production systems.
Deploy, manage, and optimise containerised applications using Docker and Kubernetes (EKS/AKS).
Implement Infrastructure as Code (IaC) using Terraform, CloudFormation, or ARM templates.
Develop and maintain automation scripts using PowerShell, Bash, or Python.
Define and monitor SLIs, SLOs, and SLAs to ensure system reliability.
Implement robust monitoring, logging, and alerting solutions (CloudWatch, Prometheus, Grafana, Azure Monitor).
Lead incident management, troubleshooting, and root cause analysis (RCA) for production issues.
Drive performance tuning and capacity planning for applications and infrastructure.
Collaborate with development teams to improve deployment strategies (blue-green, canary releases).
Ensure security, compliance, and best practices across CI/CD pipelines and infrastructure.

Qualifications

Required Skills & Experience

8+ years of experience in Site Reliability Engineering / DevOps / Infrastructure Engineering
Strong hands-on experience with AWS services (EC2, S3, RDS, VPC, IAM, ELB, Auto Scaling, CloudWatch)
Deep expertise in Azure DevOps Pipelines (CI/CD), including YAML pipelines and release automation
Experience designing multi-stage pipelines and deployment strategies
Expertise in Windows Server administration, including IIS and .NET application support
Strong experience with Linux system administration
Hands-on experience with Docker and Kubernetes (EKS/AKS)
Experience with Infrastructure as Code (Terraform, CloudFormation, or ARM templates)
Strong scripting skills in PowerShell (mandatory) and Bash/Python
Experience with monitoring and logging tools (Prometheus, Grafana, ELK, CloudWatch)
Solid understanding of networking, security, and cloud architecture principles

Preferred Qualifications

Experience with hybrid cloud or multi-cloud environments
Knowledge of Active Directory, Group Policy, and enterprise Windows environments
Familiarity with Helm, GitOps practices, or service mesh technologies
Experience with performance testing and tuning
Relevant certifications (AWS, Kubernetes, Azure DevOps)

Key Competencies / Characteristics

Reliability-driven: Focused on uptime, performance, and system resilience
Automation-first mindset: Continuously reduces manual effort and operational toil
Ownership mentality: Takes end-to-end responsibility from design through production
Strong communicator: Clearly articulates incidents, RCA outcomes, and technical concepts
Collaborative: Works effectively with platform, security, and application teams
Mentorship mindset: Actively supports and develops junior team members
Continuous learner: Keeps up with evolving SRE practices and cloud-native technologies

Additional Information

D&I statement
