We are looking for Site Reliability Engineers (SREs) focused on Developer Experience who can help us design, build, and maintain high-performance, scalable, and reliable services. Because Talkdesk provides a Contact Center service, we play a critical role in our customers’ business operations and therefore need to provide a highly available and fault-tolerant service.
We believe in a DevOps philosophy in which every engineering team at Talkdesk is responsible for the software it builds and deploys. SREs play a critical role in ensuring that teams have the tools, practices, and expertise to make that happen in a blame-free culture. Our mission is to improve developers’ experience by giving them the tools to manage the entire software lifecycle and to be self-sufficient.
To help with this, we are building our own internal PaaS using the latest technologies, such as Kubernetes, Prometheus, and Kotlin. This platform is an important pillar in Talkdesk’s engineering effort and helps us deliver better, faster, and more reliable solutions for our customers.
Responsibilities:
- Design, build, harden, and maintain the core infrastructure used by all of Talkdesk’s engineering teams
- Design, build, harden, and maintain key parts of our internal platform (from CI/CD to developer tools and other internal tools that increase R&D productivity)
- Coach and up-skill other engineering team members
- Solve challenging technical problems and put your skills to the test every day; see the immediate impact of your work and the value you create for other engineers
- Automate every aspect of our infrastructure to remove as much human intervention as possible
- Develop effective tooling, alerts, and responses to identify and address reliability risks
- Drive and promote protocols on production readiness and operational excellence
- Partner with product engineering teams to debug production outages and carry out action items to improve the reliability of those systems
- Advocate for automated testing, continuous integration and delivery, feature toggles, and progressive rollouts
- Plan for the growth of Talkdesk’s infrastructure
Skills and Qualifications:
- Understand large-scale complex systems from a reliability perspective
- Design, implement and maintain CI/CD processes and tools
- Passion for producing clean, standards-compliant, secure code
- Bring a developer mindset and apply it to infrastructure
- Know your way around Linux/Unix systems
- Experience building software in a JVM-based language such as Java, Kotlin, or Scala
- Experience writing scripts to automate tasks in a scripting language such as Ruby, Python, or Bash
- Experience with at least one relational and one non-relational database (e.g., PostgreSQL, MySQL, MongoDB, Redis, Elasticsearch)
- Ability to identify time-consuming and error-prone manual tasks and then build tooling to automate them
- Ability to identify root causes of instability in a large-scale distributed system across stacks
Nice to haves / Pluses:
- Experience with cloud platforms such as Amazon Web Services (AWS), Google Cloud, or Microsoft Azure
- Experience with Kubernetes
- Experience with CI/CD platforms (e.g., Jenkins, GitLab CI), containers (e.g., Docker, Kubernetes), and artifact management tools (e.g., Nexus)
- Some experience with infrastructure-as-code tools such as Terraform and Ansible
- Experience with the Go programming language
The Talkdesk story hinges on empathy and acceptance. It is the shared goal among all Talkdeskers to empower a new kind of customer hero through our innovative software solution, and we firmly believe that the best path to success for our mission is inclusivity, diversity, and genuine acceptance. To that end, we will hire, promote, work alongside, cheer for, bond with, and warmly welcome into the Talkdesk family all persons without regard to ethnic and racial identity, indigenous heritage, national origin, religion, gender, gender identity, gender expression, sexual orientation, age, disability, marital status, veteran status, genetic information, or any other legally protected status.