Chief Architect - Cluster Management Reliability - Permanent
Huawei
- Dublin
- Permanent
- Full-time
- Lead the organization responsible for defining and defending meaningful cluster management reliability KPIs.
- Work closely with cross-functional teams including hardware systems engineering, SRE, and cloud service owners to understand the gaps in the cluster management ecosystem and create solutions that meet their stringent reliability demands.
- Define and execute the key technical projects needed to achieve highly reliable and scalable scheduling and workload management abstractions.
- Research and develop key foundational capabilities around multi-tenancy, isolation, performance optimization, observability, simulation, hitless upgrades, machine failure resilience, failure domain awareness, autoscaling, etc.
- Maintain academic partnerships, collaborate with hardware vendors and engage with open communities that are relevant to this domain.
- Publish key findings at relevant conferences and in journals, or file patents as appropriate.
- Ph.D. or Master’s degree in Computer Science or a related field.
- 10+ years of experience leading organizations or teams that support large scale cloud infrastructure.
- Deep practical experience in designing and scaling clusters of more than 10,000 machines using technologies such as Kubernetes, OpenStack, Google Borg, or Facebook Twine.
- Technical mastery of foundational technologies used in cluster management, such as cgroups, namespaces, overlay networking, and hardware virtualization.
- Strong API and system design skills, along with fluency in reading and writing code in a modern programming language such as Go or Rust.
- Exceptional communication skills, as needed to negotiate with, collaborate with, and educate cross-functional teams across the globe.
- Optional: Understanding of the Open Compute Project (OCP) ecosystem.
- Competitive salary package
- Room for long-term personal growth
- Opportunities to work on high profile initiatives that impact the whole company
- Opportunities to work with the brightest minds in software engineering (including Huawei Fellows and world-renowned professors)
- A multi-cultural, international working environment
- Work for an international world leader, an established yet still rapidly growing Fortune 500 company