Service Reliability Engineer (and many other positions) in Orange County, CA, USA

Patrick McLean patrickm at gaikai.com
Wed Nov 28 02:03:05 UTC 2012

Gaikai ( http://www.gaikai.com ) is a cloud-based gaming service that
allows users to play high-end PC and console games rendered on remote
servers via internet streaming.

In August 2012, we were purchased by Sony Computer Entertainment.

We do take international applicants; no pre-existing visas are
required. Telecommuting is generally not permitted, but working from
other locations may be negotiable.

There are quite a few other open positions listed on our career page
at http://gaikai.com/careers

Job Description:

Our SREs focus on three things: overall ownership of production,
production code quality, and deployments.

We expect our SREs to have opinions on the state of our network, what
we are doing right, and what we can do better. They are empowered to
say when new features are ready for production, and work with other
teams to make sure our requirements are met as early in the lifecycle
as possible.

• 5+ years in either Software Development or Systems Administration
(or both!): We expect you to be knowledgeable in one or two core
fields and open to coming up to speed quickly in everything else.
• Strong interpersonal and communication skills: You will interact
with other teams on a daily basis.
• A strong sense of responsibility: SREs are largely self-directed
and are key decision makers, so they must take pride in the part(s)
of production they own.
• Available for on call: There will be times when your expertise is
needed outside of core hours.

Skills & Knowledge
• Development experience in one or more languages: SRE tools are
primarily written in Python and Bash, but SREs often need to inspect
code in other languages such as Java, Node.js, C++ and Ruby.
• Comfortable at a Bash prompt: The ideal candidate will also be
familiar with *nix debugging tools, both at the system level (lsof,
strace, tcpdump) and the code level (gdb, jvisualvm, etc.).

One or more of the following areas of expertise:
• SQL (ideally MySQL or Postgres)
• NoSQL at scale (ideally Hadoop, Mongo clusters and/or sharded Redis)
• Event Aggregation (e.g. Graphite, Zenoss, Flume, Splunk)
• Virtualization (ideally in-house clouds using OpenStack or Eucalyptus)
• Release Engineering (package management and distribution at scale)
• Load Testing (QA or SDET experience is a big plus)

