Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territories, and help them connect with the stories and characters they crave. Whether that experience is online or in-person, streaming video, theatrical, games, merchandise, events and more, it’s powered by the anime content we all love.
Join our team, and help us shape the future of anime!
Crunchyroll, LLC is an independently operated joint venture between US-based Sony Pictures Entertainment, and Japan's Aniplex, a subsidiary of Sony Music Entertainment (Japan) Inc., both subsidiaries of Tokyo-based Sony Group Corporation.
The Insights & Reliability Engineering team builds data-intensive software systems that help Crunchyroll proactively detect, investigate, and improve reliability across user-facing experiences, including device QoE, payments, partner integrations, and core product journeys.
As a Senior Software Engineer, you will build, operate, and evolve tools, telemetry workflows, alerting systems, and operational insight platforms that turn high-volume product and platform data into actionable reliability signals. Your work will help engineering teams identify regressions, understand customer impact, improve signal quality, and drive systemic fixes before issues meaningfully affect users.
This is a software engineering role for someone who enjoys working close to data and production systems. The ideal candidate has strong coding fundamentals, SQL/data fluency, experience building reliable tools or data workflows, and the instincts to solve ambiguous production problems across systems. As a senior engineer, you will help shape technical direction, define scalable operational practices, and influence cross-functional partners across engineering, product, data, and partner-facing teams.
- Build, operate, and evolve reliability tooling, telemetry workflows, alerting systems, dashboards, and automation that help teams detect and investigate customer-impacting issues.
- Transform high-volume product, device, payment, partner, and platform telemetry into reliable operational signals for anomaly detection, alerting, debugging, and leadership visibility.
- Use SQL, logs, metrics, dashboards, and product context to identify anomalies, baseline performance, validate hypotheses, and uncover patterns that drive reliability improvements.
- Lead cross-team improvements to monitoring, alerting, signal quality, triage workflows, and operational automation, defining scalable patterns that reduce time to detection and resolution.
- Participate in incident response, alert review, and production issue investigation for device, payment, and partner-related issues, with a focus on improving the systems and processes that make future issues easier to detect and resolve.
- Work with internal service owners and external technical partners when reliability issues span device, payment, or partner ecosystems, with a focus on improving the software systems, telemetry, and ownership paths that make future issues easier to detect and resolve.
- Collaborate with engineering, product, data, analytics, and partner teams to drive systemic fixes, postmortem learnings, and long-term reliability improvements.
- Help set technical direction for reliability tooling and investigation workflows within your area, establishing patterns for telemetry quality, debugging, and operational readiness.
- Communicate technical findings, customer impact, tradeoffs, and recommendations clearly to both technical and non-technical stakeholders.
- Bachelor’s degree in Computer Science, Engineering, Data Science, or a related field, and/or 8+ years of practical engineering experience.
- Strong coding skills, with Python preferred; experience with similar modern programming languages is also welcome if you can ramp up quickly in Python-based workflows.
- Strong SQL skills and experience working with high-volume event, telemetry, product, or operational datasets.
- Experience building, testing, deploying, and maintaining production-quality software, internal tools, data workflows, automation, or observability systems.
- Experience with data-processing systems, analytics engineering workflows, reliability tooling, alerting systems, or operational insight platforms.
- Ability to debug ambiguous production problems using data, logs, metrics, dashboards, code, system behavior, and product context.
- Experience working across data, observability, and analytics platforms, such as Databricks/Spark for data processing, Airflow/dbt for workflow orchestration and modeling, Datadog/Grafana for monitoring, and Tableau/Mixpanel/Mux for product or experience analytics.
- Experience defining scalable operational processes, improving signal quality, reducing alert noise, and turning repeated manual workflows into durable automation.
- Experience in reliability engineering, SRE, incident response, support engineering, analytics engineering, data engineering, or operational tooling roles.
- Comfortable working with consumer device, payment, partner integration, or distributed system domains.
- Proven ability to influence stakeholders across Engineering, Product, Data, Analytics, and partner-facing teams.
- Experience leading ambiguous, cross-system initiatives and aligning multiple teams around technical priorities, operational tradeoffs, and durable reliability improvements.
- Experience using AI-assisted engineering tools, including agentic workflows where appropriate, to accelerate investigation, implementation, testing, or automation, with strong judgment around correctness, maintainability, security, and production safety.
- Clear communicator, capable of translating technical findings, reliability risks, and customer impact for broad audiences.
- Enjoy building and evolving software, data workflows, and observability systems that make complex production behavior easier to understand and improve.
- Are fluent in both code and data, and like using telemetry, SQL, logs, metrics, and product context to solve ambiguous problems.
- Are obsessed with understanding root causes, not just patching symptoms.
- Believe in turning repeated manual workflows into reliable tools, automation, and scalable operating practices.
- Have a strong sense of ownership and pride in improving customer experience through better reliability systems.
- Enjoy working behind the scenes to make things “just work” for millions of fans.
The Scaling Client and Partnership Engineering team at Crunchyroll plays a pivotal role in enhancing and expanding our users' experiences. We collaborate extensively with a diverse network of device, payment, and gaming partners to broaden the reach of Crunchyroll's offerings. Our primary objective is to drive growth, open up new acquisition channels, and optimize both the scope and quality of our services. Situated at the crossroads of technology and business, we are dedicated to continually enabling experiences that delight our fans.
#LifeAtCrunchyroll #LI-Hybrid
We want to be everything for someone rather than something for everyone, and we do this by living and modeling our values in all that we do. We value:
- Courage. We believe that when we overcome fear, we enable our best selves.
- Curiosity. We are curious, which is the gateway to empathy, inclusion, and understanding.
- Kaizen. We have a growth mindset committed to constant forward progress.
- Service. We serve our community with humility, enabling joy and belonging for others.
Our mission of helping people belong reflects our commitment to diversity & inclusion. It's just the way we do business.
We are an equal opportunity employer and value diversity at Crunchyroll. Pursuant to applicable law, we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Questions about Crunchyroll’s hiring process? Please check out our Hiring FAQs: https://help.crunchyroll.com/hc/en-us/articles/360040471712-Crunchyroll-Hiring-FAQs
Please refer to our Candidate Privacy Policy for more information about how we process your personal information, and your data protection rights: https://tbcdn.talentbrew.com/company/22978/v1_0/docs/spe-jobs-privacy-policy-update-for-crpa-dec-21-22.pdf
Please beware of recent scams targeting online job seekers. Those applying to our job openings will only be contacted directly from an @crunchyroll.com email account.