Cloud Platform SRE

At Bossa Nova we create service robots for the global retail industry. Our robots’ mission is to make large-scale stores run efficiently by automating the collection and analysis of on-shelf inventory data. We drive autonomously through aisles, navigating safely among customers and store associates. If we were a self-driving car, we’d be operating at level 5 autonomy.

Oh, and we should add: it’s real and happening today. You can meet our robots in some of the world’s biggest retailers.

Position: Cloud Platform SRE

Location: Pittsburgh, PA

Help keep our distributed autonomous fleet of robots up, running and delivering data. This SRE role will contribute to the cloud side of the ingestion, processing and delivery of customer data.

What you will own:

  • Review and influence the design and standards of the software with a focus on reliability
  • Respond to and resolve unexpected service problems, participate in post-mortems and provide guidance to other teams
  • Manage system releases and coordinate all aspects of each release, including coverage and communication plans
  • Create dashboards and instrument the code to capture and publish essential metrics, and use this data to define alerts 
  • Build data analysis tools to keep track of important service level agreements

Requirements:

  • 2-3 years as a Site Reliability or DevOps Engineer
  • 6 to 10 years of overall experience, including experience as a software engineer
  • 2 years of Python 2.7/3.5 or 2 years of Java 8+
  • Experience solving for scalability, performance and stability issues
  • Exposure to the modern container ecosystem, such as Docker and Kubernetes
  • Experience with Linux networking fundamentals or software-defined networking in the data center
  • Working knowledge of SQL
  • Ability to participate in a 24x7 on-call rotation
  • Production experience with a cloud provider

Bonus:

  • A degree in a computer science, technology, engineering, physical science, or math discipline
  • Experience increasing system observability through logs and metrics
  • Working with the ELK stack
  • Exposure to machine learning frameworks such as TensorFlow or PyTorch
  • Desire to learn Go
  • Distributed systems design

 
