Director of Site Reliability Engineering

at QGenda
Location Atlanta
Date Posted May 5, 2021
Category Engineering
Job Type Not Specified



QGenda is a fast-growing, Atlanta-based healthcare software company with an amazing corporate culture, where we strive to be the best place to be a customer. Thousands of hospital departments around the world use our software to automatically generate optimized physician work schedules that accommodate complex business rules and assign the appropriate medical provider based on skill level, specialty, availability, and preferences.

As Director of Site Reliability Engineering, you will apply your engineering leadership skills and knowledge of infrastructure and software development to drive scalable and highly reliable software systems. You'll work closely with leaders in Product, Quality Assurance, and Software Development to align our technical initiatives to business objectives. You will be responsible for driving project initiatives, owning the state of infrastructure, leading an SRE scrum team to consistently meet commitments, participating in code reviews of infrastructure changes, and mentoring team members.

Job Duties & Responsibilities

  • Drive Operational Excellence
    • Utilize best practices and tools to champion our primary SRE goals:
      • Assist in Development Operations
      • Build and Maintain Infrastructure
      • Ensure Application Uptime and Performance
      • Assure High Security Across the Application and Organization
    • Own the state of the infrastructure, monitor trends over time, and lead right-sizing efforts
    • Bring a focus on data security and compliance in all aspects of SRE work
  • Promote Innovation and Learning
    • Evaluate new technologies and continually look for ways to improve the code base
    • Ensure we're using the most performant AWS services to deliver exceptional user experiences for our customers
  • Lead Project Initiatives
    • Collaborate with internal stakeholders to identify infrastructure initiatives
    • Work closely with leadership to gather requirements, scope resources, and track project implementations
    • Ensure a robust roadmap and healthy backlog of SRE projects
  • Manage the SRE Team
    • Lead the SRE team as a scrum team, participating in planning and work allocation and consistently meeting sprint commitments
    • Coordinate efforts across departments and identify overlaps between teams
    • Perform code deployments and participate in code reviews for SRE team members
    • Manage and mentor team members, focusing on growth opportunities through clear goals


  • Minimum of 5 years leading a Site Reliability or DevOps team
  • Experience designing and delivering secure, high-performance, and highly available cloud services
  • Experience working with stakeholders to define and track SLIs, SLOs, and SLAs, using metrics and monitoring to ensure the objectives are met or exceeded
  • Bachelor's degree specializing in computing, engineering, or related field
  • Hands‐on experience building infrastructure and supporting applications in AWS using services such as Lambda, EC2, ECS, S3, SNS, SQS, RDS, Redshift, and ElastiCache
  • Strong understanding of networking and DNS
  • Familiarity with configuration management and infrastructure as code (IaC) tools such as Ansible, Terraform, or CloudFormation
  • Availability for off-hours deployment and upgrades of production systems during release and maintenance windows
  • Firm understanding of and experience with Agile and Scrum SDLC processes
  • Solid Windows administration experience and familiarity with environments using Active Directory
  • Experience using a distributed version control system (Git or Mercurial preferred) for code check‐ins, branching, merging, pull requests, code reviews, etc.
  • Knowledge of CI/CD best practices and tools such as AWS CodeBuild, Jenkins and TeamCity
  • Strong critical-thinking and problem-solving skills
  • Excellent communication skills and upbeat personality


  • 2018 - EY Entrepreneur of the Year
  • 2018 - GA Fast 40
  • 2018 - Deloitte Technology Fast 500
  • 2018 - Glassdoor Top 50 CEO
  • 2019 - GA Fast 40
  • 2019 - AJC Best Places to Work
  • 2020 - Deloitte Technology Fast 500
  • 2020 - AJC Best Places to Work


& Perks:

  • Competitive Salary
  • Bonus Eligible
  • 401k Employer Match
  • Pluralsight Subscription

Great Benefits & Culture:

  • Full Health and Dental (QGenda pays 100% of the individual premiums)
  • Employee-centric work culture
  • Work remotely when needed
  • 3 "Flex Hours" per week
  • Relaxed vacation policy
  • Company outings
  • Costco membership
  • Casual dress
  • Opportunity to be part of a fast-growing software company with hundreds of customers and thousands of users around the world
