At Talkdesk, we are disruptive risk-takers who don't follow a traditional way of doing things and will do whatever it takes to build a product that improves customer experiences. Join the adventure and grow with us. With $268 million in funding and more than 1,800 customers, we’re revolutionizing the contact center industry. Join a hyper-growth company that's innovative, bold, and ambitious. Be a part of the future of technology and make an impact on an evolving industry. If you're a self-starter and hungry to learn, apply now and enjoy the ride.
Here, our Engineering team follows a microservice architecture approach to build the next generation of Talkdesk, with vertical teams responsible for all the decisions under their services. Through our Agile Coaches, we promote agile and collaborative practices. We are huge fans of Scrum and pair programming, and we won’t let a single line of code reach production without peer code review. We strongly believe that the only true authority stems from knowledge, not from position, and we always treat others with respect, deference, and patience.
We are currently looking for a Senior Operations Engineer to join our Artificial Intelligence Unit. You will work with our Engineering, Product, Support, and Customer Success teams and be responsible for keeping our platform and services running at full steam.
- Experience in developing dashboards and appropriate monitoring metrics;
- Ability to understand software technical stack and the components that require thorough monitoring;
- Working with the service teams and independently identifying the alarms that need to be configured;
- Setting the alarms and testing them in all the relevant environments;
- Work with the teams to develop and implement operational best practices for a highly available, scalable, and reliable software platform;
- Mentor teams and develop an operational mindset within individual teams;
- Continuous monitoring of the system, proper escalation and reporting;
- Track all issues, drive root cause analysis, and see them through to closure;
- Analytical skills and deep understanding of the system are required to ensure systems are functioning properly and that important issues are being prioritized and closed;
- Develop proper escalation procedures;
- Ensure playbooks are available and, more importantly, make sure teams understand the various failure scenarios;
- Execute dry runs;
- Measure and track uptime of the system;
- BS in Computer Science or related field, or equivalent work experience;
- 4+ years of experience;
- Experience with cloud environments such as AWS;
- Experience with Java, Python;
- Experience with monitoring and data observability/visualization tools such as New Relic, Datadog, Splunk, and Kibana;
- Experience monitoring Big Data processing frameworks such as Spark, and data warehousing systems;
- Experience with database monitoring: RDBMS, NoSQL;
- Good understanding of container orchestration: EKS (Elastic Kubernetes Service);
- C1 or equivalent level of English proficiency.
The Talkdesk story hinges on empathy and acceptance. It is the shared goal among all Talkdeskers to empower a new kind of customer hero through our innovative software solution, and we firmly believe that the best path to success for our mission is inclusivity, diversity, and genuine acceptance. To that end, we will hire, promote, work alongside, cheer for, bond with, and warmly welcome into the Talkdesk family all persons without regard to ethnic and racial identity, indigenous heritage, national origin, religion, gender, gender identity, gender expression, sexual orientation, age, disability, marital status, veteran status, genetic information, or any other legally protected status.