Clickhouse · Singapore(Remote)

Senior Site Reliability Engineer - Remote

🏢 Clickhouse 📍 Singapore (Remote) 🕐 Posted 75 days ago
⏱ Full-time 🌐 Remote · Engineering ✅ Direct from employer ATS
ℹ️ Please note: This listing is sourced from a third-party job board. Jobnique is a job search platform and is not the employer for this role. The hiring company is Clickhouse.

About this role

About ClickHouse

Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads.

The company’s sustained, accelerating momentum was recently validated by a $400M Series D financing round. Over the past three months, customers including Capital One, Lovable, Decagon, Polymarket, and Airwallex have adopted the platform or expanded existing deployments. These customers join an established base of AI innovators and global brands such as Meta, Cursor, Sony, and Tesla.

We’re on a mission to transform how companies use data. Come be a part of our journey!

About the role

We are committed to providing our customers with reliable and secure services, so we are expanding our central Site Reliability Engineering team. You will be responsible for building and leading processes to ensure the reliability, availability, scalability, and performance of our cloud infrastructure. You will collaborate with teams such as Control Plane, Data Plane, Core, Security, Support, and Operations, and guide them in designing and implementing scalable, secure, highly available, and fault-tolerant distributed systems. You will also own incident management and response, post-mortem analysis (including running blameless postmortems), and continuous improvement of our Cloud services. You will leverage your software engineering expertise to develop software platforms and tools that improve the operational and engineering efficiency of ClickHouse Cloud. This role is a unique opportunity to make a significant impact on our elastic, high-performance ClickHouse Cloud, built for limitless scale.

What will you do?

Collaborate with various engineering teams at ClickHouse to design and implement scalable, secure, and highly available systems for ClickHouse

Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud

Ensure all infrastructure components in ClickHouse Cloud (including Data Plane, Control Plane, ClickHouse Core, etc.) have monitoring and alerting in place for timely detection and resolution of incidents

Enhance and refine incident response processes and post-mortem analysis for any outages in ClickHouse Cloud, including working with the support team to communicate with impacted customers

Continuously improve the reliability and performance of our ClickHouse services

Plan, enable, and drive Chaos initiatives across Engineering teams, based upon internal priorities

Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating e
