[Jobposts] Senior Systems and Reliability Engineer at Marketo (Portland)

Thu Jan 8 16:45:20 UTC 2015

Title: Senior Systems and Reliability Engineer
Location: Portland, OR, USA.

Please visit the following link to take a shortcut to the Marketo recruiter's
inbox and take an online technical interview on a live server now:
http://bit.ly/1FufiEw

This member of Marketo’s Operations team will be responsible for
maintaining the Test and Hosted Production systems for a fast-growing SaaS
company in the Marketing Automation space. As the primary technical bridge
between Engineering and the production environment, this position is a key
player in ensuring a secure, robust and available solution. This person
should be energetic, bright, and focused on team dynamics and results.

Responsibilities:
The Systems and Reliability Engineer will provide technical leadership to
the operations team as we deploy, grow and scale our SaaS offerings leveraging
NoSQL technologies. Additional responsibilities include:
• Work with Engineering and Production Operations to recommend changes to
ensure application robustness
• Hands-on management of the production systems, fault resolution, capacity
planning and root cause analysis for critical issues.
• Design and deploy large-scale distributed systems with a strong
understanding of scaling, performance and scheduling.
• Build tools to automate processes and make recommendations for product
and infrastructure changes to make the application more robust.
• Monitor and respond to production application issues
• Create tools to automate system recovery
• Establish operations policies and procedures for production (with the
Operations, DBA and Engineering teams)
• Script and procedure development
• Communicate clearly with peers as well as management, and provide
technical leadership to more junior team members.
• Socialize ideas, make recommendations and gather team consensus to move
forward

Required Skills/Experience:
• MS/BS degree in Computing Sciences or equivalent is preferred. Credit
will be given to candidates who do not have relevant degrees but can
demonstrate equivalent work experience.
• 3+ years of experience with NoSQL data stores (Hadoop/HBase, MongoDB,
Solr)
• 5+ years of experience with MySQL databases, SQL and Linux
• Experience in Hadoop, MongoDB, NoSQL administration and scaling
• 5+ years performing capacity planning, performance analysis and
development of data disaster recovery plans for large scale data systems
• Expert understanding of ETL techniques and best practices to handle
extremely large and complex volumes of data
• Server scripting (PHP, Perl, awk etc.)
• Implementing and supporting our Management tools and scripts
• Experience in large Web/DB projects
• Systems and infrastructure security
• High availability system design/support – ideally in a SaaS or commodity
website environment (Google, Yahoo, MSN, Salesforce etc.)
• Ultra high volume messaging
• Management infrastructure tools (Nagios, Cacti, Tivoli, OpenView, Splunk
etc.)

Bonus:
• Deployment and distribution automation tools (YUM, Puppet, RPM, etc.)
• Consumer software/website database design
• Expertise in mail systems and MTAs
• Salesforce’s Force.com AppExchange platform
• Virtual systems build and management utilizing VMware, Xen or an
equivalent

With Regards,
Ron D

TrueAbility <https://trueability.com/>