Requirements
Define and enforce production standards, processes, and tools to ensure operational excellence
Guide and mentor team members, fostering technical growth and helping to develop the next generation of engineering leaders
Strong coding ability in at least one language (e.g., Golang, Python, Java, TypeScript) with the capability to solve complex issues through code
Deep understanding of production reliability concepts, including SLIs, SLOs, and incident management
Familiarity with working in dynamic, reliability-focused production environments (preferred)

What We Use
Our infrastructure runs primarily in Kubernetes, hosted on AWS EKS
Infrastructure tooling includes Istio, Datadog, Terraform, Cloudflare, and Helm
Our backend is Java / Spring Boot microservices, built with Gradle, coupled with things like DynamoDB, Kinesis, Airflow, Postgres, PlanetScale, and Redis, hosted via AWS
Our frontend is built with React and TypeScript, and uses tools like GraphQL, Storybook, Radix UI, Vite, esbuild, and Playwright
Our automation is driven by custom and open source machine learning models and lots of data, and is built with Python, Metaflow, Hugging Face 🤗, PyTorch, TensorFlow, and Pandas

You'll get competitive perks and benefits, from health & wellness to equity, to help you bring your best self to work.