
Site Reliability Engineer (Swing shift 4 days a week)
- Dublin
- Permanent
- Full-time
- Assist in supporting production systems and performing troubleshooting tasks.
- Provide immediate relief and sustainable resolutions for issues within our infrastructure.
- Help maintain and enhance system observability, including metrics, logging, and alerts.
- Participate in troubleshooting bridges and provide support during critical incidents.
- Collaborate on infrastructure automation projects using internal or open-source tools and frameworks.
- Develop skills in building and maintaining CI/CD pipelines and deployment scripts.
- Adhere to established SRE practices, such as documenting runbooks, composing postmortems, and contributing to team retrospectives.
- Automate repetitive tasks to improve efficiency and reduce human error.
- Ability to work one weekend day per week.
- A degree in computer science/engineering or equivalent
- 3+ years of experience in a Site Reliability Engineering or similar role.
- Experience integrating AI into work processes, decision-making, or problem-solving
- Solid knowledge of ITIL-driven IT operations: Incident, Problem, and Change Management
- Excellent communication skills and a collaborative mindset.
- Good knowledge of Unix/Linux operating systems, including memory management, process management, disk/IO troubleshooting, and network troubleshooting
- Understanding of networking concepts, including TCP, IP addressing, routing, HTTP, HTTPS/TLS/SSL, DNS, DHCP, FTP/SFTP
- Understanding of relational databases (e.g., MySQL, Postgres)
- Knowledge of one (or more) scripting languages: JavaScript, Python, Unix Shell
- Hands-on experience with Kubernetes and containerization
- Familiarity with at least one programming language (Shell, Python, Go, or Java preferred)
- Familiarity with cloud computing (AWS, Azure, or GCP)
- Basic knowledge of monitoring and alerting tools (e.g., Prometheus, Grafana, or similar)
- Strong desire to learn and grow in infrastructure reliability, automation, and DevOps/SRE principles
- Certifications in one or more public cloud platforms
- Exposure to DevOps and Agile methodologies.
- Familiarity with CI/CD pipelines and tools like Jenkins or GitLab CI.
- Understanding of development on the ServiceNow platform
- Basic understanding of configuration management tools (Ansible, Puppet, etc.).
- Knowledge of monitoring and observability tools (Prometheus, Grafana, etc.)
- Experience with Splunk and GitLab