
Senior Site Reliability Engineer
- Dublin
- Permanent
- Full-time
Have you heard of Tenable? Our cloud-based exposure management platform built for today's dynamic IT assets, like cloud, identity, containers and web apps? Well, that's what you'll be working on in this role. You will need to continue to quickly build out the platform, scale it automatically, and make it more self-managing for our cloud customers!

As a Senior SRE, you will be a champion for customer experience, proactively driving initiatives to improve our service availability and user satisfaction. This is a hands-on role where you will actively contribute to building and maintaining our robust infrastructure and services while collaborating closely with product teams to embed SRE principles throughout the entire software development lifecycle.

Your Opportunity:
- Drive a culture of customer-centricity and service level objective (SLO) focus within the SRE team and throughout the organization. Integrate this philosophy into all design decisions, onboarding of new services, and daily operations, whether for infrastructure or user-facing services.
- As a reliability expert, you'll partner with teams to embed SRE principles into our development lifecycle. Help translate product SLOs into actionable engineering requirements and provide guidance on building scalable, reliable, and operable services.
- You'll have the autonomy to dive deep into the code of our cloud products, troubleshoot incidents, identify performance bottlenecks, and contribute code to make them more resilient and perform flawlessly for our customers.
- Design, implement, and maintain comprehensive observability solutions for internal services, leveraging metrics, logging, and tracing to proactively identify and address potential issues.
- Employ a strong understanding of security principles to build solutions with security considerations embedded from the ground up. Champion a secure-by-design methodology across all development phases.
- Excel in a cloud-native environment utilizing Kubernetes for service deployment and management. Develop automation tools and solutions for robust operations in this environment.
- Design and lead projects within the SRE organization.
- Collaborate with cloud engineers to understand new cloud technologies, assess their impact on security services operations, and propose solutions to existing business problems.
- Collaborate in the secure software development lifecycle to develop detailed enhancement/bug definitions, write functional requirements, translate the requirements into solution designs/code, and navigate the functional requirements through to Production deployments.
- Proactively look for ways to create efficiencies within operations in the tools and technology Tenable uses to support its customer base.
- Create/maintain documentation for operational procedures.
- Participate in an on-call rotation and support 24x7 availability of production application systems.
- 4+ years of related SRE or DevOps experience.
- 2+ years deploying public cloud infrastructure (AWS, GCP, Azure) preferred, including administering managed AWS services (EKS, OpenSearch, MSK, Batch, etc.).
- Bachelor's or Master's degree in a technical field such as Computer Science, Information Technology Engineering or equivalent work experience.
- Experience with Terraform or similar IaC technologies.
- Experience with orchestration tooling such as Kubernetes.
- Experience with Docker or similar container solutions.
- Experience with datastore technologies: DynamoDB, RDS Aurora PostgreSQL, Elasticsearch, etc.
- Experience with Jenkins or similar CI/CD tools and processes.
- Good understanding of distributed systems architecture, particularly those leveraging technologies like Kafka and distributed datastores.
- Strong experience with the Agile software development methodology and collaboration with internal teams to deliver software and configuration artifacts.
- Strong background in bash scripting in addition to experience in higher-level scripting languages like Python or Node.js.
- Experience leading projects through to completion with a team of peers.
- Be an enthusiastic learner, user, and advocate of our technologies.
- Have a desire to win as a team: make big things happen by working together and being open and willing to try new ideas.
- Strong interpersonal and communication skills (written, verbal, & virtual) with the ability to work in a team-oriented, collaborative environment.
- Must have a high degree of personal integrity and the ability to maintain strict confidentiality.
- Must have a strong drive, be self-motivated and logical, and have keen attention to detail.
- Experience deploying distributed, microservice-oriented applications at scale.
- Experience with distributed monitoring tools such as Datadog, Splunk, Coralogix, and solutions based on OpenTelemetry (OTEL), etc.
- Experience with Helm.
- Experience with Go, Java/Kotlin, and/or Groovy.
- Experience designing, developing, and operating distributed systems.
- 1+ years of operational experience with industry-leading "big data" services, data warehousing, and analytics technologies, such as Snowflake.