NEEDED FOR THIS ROLE
- Java, Python, or Golang: intermediate or better in one or more (expert preferred)
- NoSQL DB experience (Cassandra or similar, or a time-series / unstructured data store)
- Big Data and Data at very large scale
- Experienced battle-hardened SW engineer (large distributed systems, large scale)
This is NOT an SRE role!
This is a software engineering role on a team that provides ALL monitoring and is responsible for developing a custom stack for data integration and retrieval. The team monitors time-series data ingest of upwards of 1.5M records a minute.
MUST HAVE
- Ability to develop code that accesses resident data, then digests and correlates it.
- Experienced, battle-hardened SW engineer with distributed systems experience, deploying and implementing at large scale.
- Solid programmer: knows one or more of Java, Python, and Golang, and is expert in at least one.
THEY ARE NOT looking for a script writer.
Ideal candidate has experience with a time-series data store (e.g. Cassandra).
- Expertise in NoSQL DB at a GIGA scale
The SRE Monitoring Infrastructure team (note: this is NOT an SRE role) is looking for a backend software engineer with experience working with large-scale systems and an operational mindset to help scale our operational metrics platform. This is a fantastic opportunity to enable all engineers to monitor our site and keep it up and running. In return, you will get to work with a world-class team supporting a platform that serves billions of metrics at millions of QPS.
The engineers fill the mission-critical role of ensuring that our complex, web-scale systems are healthy, monitored, automated, and designed to scale. You will use your background as an operations generalist to work closely with our development teams from the early stages of design all the way through identifying and resolving production issues. The ideal candidate will be passionate about an operations role that involves deep knowledge of both the application and the product, and will also believe that automation is a key component to operating large-scale systems.
Responsibilities:
• Serve as a primary point of contact responsible for the overall health, performance, and capacity of one or more of our Internet-facing services.
• Gain deep knowledge of our complex applications.
• Assist in the roll-out and deployment of new product features and installations to facilitate our rapid iteration and constant growth.
• Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale UNIX environment.
• Work closely with development teams to ensure that platforms are designed with "operability" in mind.
• Function well in a fast-paced, rapidly-changing environment.
• Participate in a 24x7 rotation for second-tier escalations.
Basic Qualifications:
• B.S. or higher in Computer Science or other technical discipline, or related practical experience.
• UNIX/Linux systems administration background.
• Programming skills (Golang, Python)
Preferred Qualifications:
• 5+ years in a UNIX-based large-scale web operations role.
• Golang and/or Python experience
• Previous experience working with geographically-distributed coworkers.
• Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, Engineers, Product Managers, etc.
• Basic knowledge of most of these: data structures, relational and non-relational databases, networking, Linux internals, filesystems, web architecture, and related topics.
Team
- Interact with 4-5 people (stand-ups, but not true scrum)
- No interaction with outside teams
Candidate workflow
- 2 rounds
- 1 technical coding
- 1 team fit