Big Data SRE Manager

Job Description

Description

We are seeking a hands-on manager with experience leading large Big Data environments spread across thousands of nodes and petabytes of data. We are looking for a people manager whose background and experience look like this:
  • You have grown into leadership roles after proving your technical skills in individual contributor roles, but you still enjoy hands-on work when the situation calls for it.
  • You have designed and built large data environments for availability, security and reliability.
  • You keep yourself informed about the choices and trade-offs as new technology evolves in the big data landscape.
  • You have an eye for talent, and you hire and grow your engineers by mentoring and challenging them.
  • You will collaborate across many teams to deliver projects related to the big data platform and data pipelines, and provide SRE support for the reliability of these managed services.
  • You will have significant opportunity to influence and shape our big data platform strategy and data products as we work on the next generation of our architecture, platform and processes.

Key Qualifications
  • 10 years of management experience leading teams of engineers
  • Hands-on manager who likes troubleshooting complex performance and scale problems
  • Excellent problem-solving, critical-thinking, and communication skills; leads by example to motivate and challenge the team to deliver their best
  • 5+ years of experience in Hadoop-based technologies: HDFS/YARN cluster administration, Hive, Spark
  • Strong experience leading cross-functional initiatives and providing thought leadership
  • Ability to zoom in and zoom out to clear up ambiguity and set a clear path forward
  • Experience managing Hadoop/YARN clusters with thousands of nodes and tens of petabytes of data, running tens of thousands of jobs
  • Passion for automation, creating tools using Python, Java or other JVM languages
  • Strong expertise in troubleshooting complex production issues.
  • The candidate should be adept at prioritizing multiple issues in a high-pressure environment
  • Should be able to understand complex architectures and be comfortable working with multiple teams
  • Ability to conduct performance analysis and troubleshoot large-scale distributed systems
  • Should be highly proactive with a keen focus on improving the uptime and availability of our mission-critical services
  • Comfortable working in a fast-paced environment while continuously evaluating emerging technologies
  • The position requires solid knowledge of secure coding practices and experience with open source technologies
 
 