Operations Engineer Kuala Lumpur
Job Description
<h2 style="margin-top: 18pt; margin-bottom: 4pt;"><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">ABOUT YOU</span></strong></h2>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">We are looking for an </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">Operations Engineer</span></strong><span style="font-size: 12pt; font-family: Arial, sans-serif;"> who is </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">technically curious, detail-oriented, a strong communicator, and proactive</span></strong><span style="font-size: 12pt; font-family: Arial, sans-serif;"> to join our </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">Global Technical Operations (GTO)</span></strong><span style="font-size: 12pt; font-family: Arial, sans-serif;"> team. The best candidate will be someone who thrives in a fast-paced, highly collaborative, and exceptionally dynamic setting and is excited to </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">monitor and investigate production issues across a global platform, help improve how we detect and respond to incidents, analyze trends and patterns in production data, and contribute to better communication with partners and stakeholders during incidents.</span></strong></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Strong </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">troubleshooting skills, observability platform experience, and scripting ability</span></strong><span style="font-size: 12pt; font-family: Arial, sans-serif;"> are essential, along with experience in </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">SRE, DevOps, production operations, or NOC environments supporting high-availability platforms (payments, e-commerce, SaaS, or gaming).</span></strong><span style="font-size: 12pt; font-family: Arial, sans-serif;"> The ability to </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">communicate clearly and effectively in English, both written and verbal, when writing incident updates, shift handoffs, and status page communications</span></strong><span style="font-size: 12pt; font-family: Arial, sans-serif;"> will be key to your success in this role.</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">If you're passionate about </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">keeping critical systems running and continuously improving operational processes</span></strong><span style="font-size: 12pt; font-family: Arial, sans-serif;"> and love </span><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">being the first to spot issues and the one who drives them to resolution for game developers and players worldwide,</span></strong><span style="font-size: 12pt; font-family: Arial, sans-serif;"> we would love to hear from you!</span></p>
<p style="margin-top: 0pt; margin-bottom: 0pt;"><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">Operations Engineer, Kuala Lumpur</span></strong></p>
<h2 style="margin-top: 18pt; margin-bottom: 4pt;"><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">ABOUT US</span></strong></h2>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Xsolla is a global commerce company with robust tools and services to help developers solve the inherent challenges of the video game industry. From indie to AAA, companies partner with Xsolla to help them fund, distribute, market, and monetize their games. Grounded in its belief in the future of video games, Xsolla is resolute in its mission to bring opportunities together and continually make new resources available to creators. Headquartered and incorporated in Los Angeles, California, Xsolla operates as the merchant of record and has helped more than 1,500 game developers reach more players and grow their businesses around the world. With more paths to profits and ways to win, developers have all the things needed to enjoy the game.</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">For more information, visit </span><a href="http://xsolla.com/" style="text-decoration: none;"><span style="font-size: 12pt; font-family: Arial, sans-serif; color: #1155cc; text-decoration-line: underline; text-decoration-color: currentcolor; text-decoration-skip-ink: none;">xsolla.com</span></a><span style="font-size: 12pt; font-family: Arial, sans-serif;">.</span></p>
<p></p><p><br></p><b>Responsibilities:</b><div>
<ul style="margin-top: 0px; margin-bottom: 0px; padding-inline-start: 48px;">
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 12pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Serve as the primary dashboard monitor during your shift: continuously watch the GTO Operational Dashboard in Datadog, detect anomalies by correlating signals across APM, logs, metrics, synthetic tests, and Real User Monitoring, and determine whether alerts warrant an incident ticket or can be resolved through immediate investigation.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Triage and investigate production incidents: create incident tickets in JIRA Service Management, perform initial technical investigation using Datadog (traces, logs, infrastructure and application metrics), determine blast radius and likely root cause domain, and route to the correct team (Product SRE, Infrastructure SRE, or Engineering) using the smart routing model.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Own lower-severity incidents end-to-end from detection through resolution: diagnose, execute runbook procedures, and resolve without escalation where possible. Escalate promptly when an incident is unresolved within defined thresholds or requires a code-level fix.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Support the TSO Lead during major incidents as the technical right hand in the war room: surface real-time data (error rates, impact scope, deployment history, related alerts), maintain the incident ticket with live timeline entries and linked evidence, and execute mitigation actions as directed.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Draft incident communications under TSO Lead direction, including internal Slack updates, stakeholder notifications, and customer-facing status page updates (status.xsolla.com). Support clear, timely communication throughout the incident lifecycle.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">During non-incident periods, analyze incident trends, recurring issues, and production bugs: compile data from Datadog, JIRA, and Slack, identify patterns, and contribute findings to regular reports for product and engineering teams.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Periodically publish health reports for critical applications.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Compile incident timelines and draft initial PIR documents for Post-Incident Review preparation. Track PIR action items post-session and flag overdue items to the TSO Lead.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Build and maintain operational automation (alert enrichment scripts, incident templates, Slack workflows, dashboard widgets) and contribute to runbook development, documenting new resolution procedures so they can be repeated by any Operations Engineer on any shift.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 12pt;"><span style="font-size: 12pt;">Conduct structured shift handoffs covering active incidents, at-risk services, upcoming deployments, and follow-up items. Participate in knowledge transfer sessions with SREs to continuously expand independent resolution capability.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 12pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Cover for the TSO Lead during vacations, absences, or emergencies, including severity classification, escalation decisions, stakeholder communications, and basic Incident Commander functions.</span></p>
</li>
</ul>
</div><p><br></p><b>Qualifications:</b><div>
<ul style="margin-top: 0px; margin-bottom: 0px; padding-inline-start: 48px;">
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 12pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">4+ years of experience in SRE, DevOps, production operations, NOC, or technical operations in a high-availability environment. Experience with platforms that handle payments, e-commerce, SaaS, or gaming workloads is preferred.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Strong troubleshooting and investigation skills: the ability to take an alert or user-reported symptom and methodically trace it through the stack, covering application logs, APM traces, infrastructure metrics, database queries, and network paths.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Hands-on experience with Datadog (or equivalent observability platform: Grafana, Splunk, New Relic, Elastic), including navigating APM, building log queries, reading infrastructure dashboards, interpreting SLO burn rates, and configuring monitors and alerts.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Proficiency in at least one scripting language: Python, Go, or Bash. You will write automation scripts, build operational tooling, and work with APIs.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Clear written and verbal communication skills in English: the ability to write incident tickets, investigation notes, Slack updates, shift handoff reports, status page communications, and PIR drafts that are clear, concise, and useful to both technical and non-technical audiences.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Working knowledge of Kubernetes and cloud infrastructure (GCP preferred, AWS/Azure acceptable): an understanding of pods, deployments, services, ingress, node health, and how to investigate Kubernetes-related production issues.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Understanding of SLOs, error budgets, and burn-rate alerting: knowing what a multi-window burn-rate alert means, how error budgets deplete, and how SLO breaches translate into incident severity.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Experience with incident management tooling: JIRA or JIRA Service Management, PagerDuty or OpsGenie, Slack, and Confluence.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 12pt;"><span style="font-size: 12pt;">Experience with or strong interest in AI/ML-assisted operations: anomaly detection, alert correlation, predictive monitoring, or automated remediation.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 12pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Comfort with 24x7 shift-based operations as part of a follow-the-sun model with handoff overlaps. Weekend on-call (rotating) is required.</span></p>
</li>
</ul>
</div><p><br></p><b>Nice to have:</b><div>
<ul style="margin-top: 0px; margin-bottom: 0px; padding-inline-start: 48px;">
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 12pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Experience in the gaming, payments, or fintech industry, particularly environments where transaction processing, checkout flows, or player-facing services must meet strict uptime requirements.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Familiarity with Datadog Service Catalog, synthetic monitoring, and RUM (Real User Monitoring).</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Experience with distributed systems debugging: tracing failures across microservices, understanding cascading failures, and reading distributed traces end-to-end.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Exposure to database operations (MySQL, PostgreSQL, Redis, Kafka) at a level sufficient to investigate connection pool exhaustion, replication lag, slow queries, or queue backlogs during incidents.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">Familiarity with CI/CD pipelines and deployment tooling (GitLab CI, ArgoCD, Helm), enough to correlate recent deployments with production issues and identify rollback targets.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 0pt; margin-bottom: 12pt;"><span style="font-size: 12pt;">JIRA Service Management administration experience: workflows, automation rules, SLA timers, and queues.</span></p>
</li>
<li style="font-size: 12pt; font-family: Arial, sans-serif;">
<p style="margin-top: 12pt; margin-bottom: 0pt;"><span style="font-size: 12pt;">ITIL Foundation certification is a plus but not required; practical experience matters more.</span></p>
</li>
</ul>
</div><p><br></p><p></p>
<div>RM144,000 - RM216,000 a year</div>
<p>
</p><h2 style="margin-top: 18pt; margin-bottom: 4pt;"><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">BENEFITS</span></strong></h2>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">Convenient work tools</span></strong></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Latest Mac workplaces + additional hardware to make you more effective at work</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Google Chat, Gmail, Google Drive, Confluence, Jira, GitLab</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">Professional growth</span></strong></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Free training and participation in specialized conferences</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Rich knowledge exchange within the company</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><strong><span style="font-size: 12pt; font-family: Arial, sans-serif;">More perks</span></strong></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Health insurance (medical, dental, and optical) for employees and dependants</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Flexible hours: organize your day around your needs and the demands of sprints and teamwork</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">No dress code</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">Comfortable and new office environment</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">The duties of this position may change from time to time so the individual and organization can achieve their results. This job description is intended to describe the general level of work being performed. It is not intended to be all-inclusive. By submitting your application, you consent to Xsolla conducting background checks, where permitted by law, after the final interview stage. All checks will comply with local regulations, and your information will be handled confidentially. Xsolla KL Sdn Bhd takes your privacy very seriously, and will not sell or externally distribute any data received during the hiring process. Pursuant to the Personal Data Protection Act 2010 ("PDPA"), Xsolla KL Sdn Bhd is mindful and committed to the protection of your personal information and your privacy. Please direct any inquiries regarding your data privacy to </span><a href="mailto:careers@xsolla.com" style="text-decoration: none;"><span style="font-size: 12pt; font-family: Arial, sans-serif; color: #1155cc; text-decoration-line: underline; text-decoration-color: currentcolor; text-decoration-skip-ink: none;">careers@xsolla.com</span></a><span style="font-size: 12pt; font-family: Arial, sans-serif;">.</span></p>
<p style="margin-top: 12pt; margin-bottom: 12pt;"><span style="font-size: 12pt; font-family: Arial, sans-serif;">For more vacancies: </span><a href="https://xsolla.com/careers" style="text-decoration: none;"><span style="font-size: 12pt; font-family: Arial, sans-serif; color: #1155cc; text-decoration-line: underline; text-decoration-color: currentcolor; text-decoration-skip-ink: none;">Careers | Xsolla</span></a></p>
Employment Type
Remote
Category
jira, game, gaming, technical, support
About Xsolla
Location: Kuala Lumpur
Industry: jira, game, gaming, technical, support