“We used ContainIQ as a drop-in replacement for Datadog to handle our metrics, traces, and logs. Their dashboards give us exactly what we’d spend an hour or two making ourselves yet are more responsive. ContainIQ’s highly predictable pricing model reduced our observability costs by nearly 8x.” Dan Goodman, Co-Founder and CTO of Ultimate Tournament.
Background on Ultimate Tournament
Ultimate Tournament is a fast-growing technology company in the process of scaling a new type of social gaming. Ultimate Tournament operates the Ultimate Arcade™, a skill-based arcade where players win real money by defeating other players. Ultimate Tournament created a new model for monetizing games that does not rely on ads, in-game item sales, or one-time purchases. The arcade increases the fun of games because real money is on the line. It also functions as a platform where third-party game developers can build skill-based games using the company’s infrastructure and transaction processing technology.
As a social gaming platform, the company has invested significant time and resources into its observability stack. Heading into the new year, the company began looking for a scalable and comprehensive monitoring platform to support its growth objectives and the demands of its own customers.
Challenges
After a recent fundraise, Ultimate Tournament was in the fortunate position to be scaling its infrastructure and the size of its engineering team to meet the demands of its new Ultimate Arcade. Increasing demands from end-users also pushed the Ultimate Tournament team to invest in a scalable platform for metrics, logs, and traces.
Using AWS, Ultimate Tournament had recently scaled to two Kubernetes clusters, two Redis clusters, and multiple on-demand Kinesis shards. While scaling its infrastructure, Ultimate Tournament noticed that the cost of its existing observability tooling was also increasing. “Our Datadog bill matched our AWS bill. The thought of observability matching the very thing it monitors is borderline offensive,” said Goodman.
Ultimate Tournament needed to find a drop-in solution that was cost-effective and that would provide the mission-critical metrics, logs, and traces that the engineering team needed to improve the user experience. In summary, Ultimate Tournament had to find a new tool that:
- Was cost-effective for the company both today and as it continued to scale.
- Was a managed solution that wasn’t built on top of open-source tools prone to breaking.
- Could aggregate metrics, logs, and traces into one view and give the engineering team the information it needed to debug issues quickly.
- Could correlate Kubernetes events and latencies with logs at given points in time, which is notoriously difficult on EKS.
- Would help the engineering team maintain low latency and performant experiences for end-users.
Solution & Benefits
After learning about ContainIQ, a Kubernetes monitoring and tracing platform, Ultimate Tournament decided to deploy ContainIQ across the company’s Kubernetes infrastructure on EKS. With fewer than 50 nodes between its two clusters, the Ultimate Tournament team was able to use ContainIQ’s self-service sign-up process.
Ultimate Tournament chose ContainIQ’s Power plan, which offers five core features for $20 per node per month and $0.50 per GB of log ingest:
(1) metrics: CPU and memory for pods/nodes, view limits and capacity, correlate to events, alert on changes;
(2) events: K8s events dashboard, correlate to logs, alerting (e.g., crash loops and evictions);
(3) latency: monitor RPS, p95, and p99 latencies by microservice, including by URL path, with alerts;
(4) logs: container-level log storage and search; and
(5) tracing: view all incoming and outgoing HTTP requests alongside metadata (e.g., the status code of the response, the latency of the request, and the pods/services involved in the request).
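To make the pricing arithmetic concrete, here is a minimal sketch that estimates a monthly bill from the two rates quoted above. The function name and the example workload (40 nodes, 500 GB of logs) are hypothetical and chosen only for illustration; the sketch assumes the bill is simply the sum of the per-node and per-GB charges, with no tiers, discounts, or taxes.

```python
# Rates taken from the Power plan figures quoted in this case study.
NODE_RATE_USD = 20.00        # per node per month
LOG_RATE_USD_PER_GB = 0.50   # per GB of log ingest


def estimate_monthly_cost(nodes: int, log_gb: float) -> float:
    """Return an estimated monthly bill in USD (illustrative only).

    Assumes cost = nodes * node rate + ingested GB * log rate.
    """
    if nodes < 0 or log_gb < 0:
        raise ValueError("nodes and log_gb must be non-negative")
    return nodes * NODE_RATE_USD + log_gb * LOG_RATE_USD_PER_GB


# Hypothetical workload: 40 nodes and 500 GB of logs in one month.
example_cost = estimate_monthly_cost(nodes=40, log_gb=500)
print(f"${example_cost:,.2f}")  # prints "$1,050.00"
```

Because both inputs are linear, the estimate scales predictably: adding a node adds $20 and adding a gigabyte of logs adds $0.50 under these assumptions.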
Using ContainIQ’s Kubernetes Events dashboard, Ultimate Tournament has been able to find issues, including pod evictions, and to correlate these events to logs at given points in time. By clicking once on Warning events on the Events dashboard, engineers from Ultimate Tournament are instantly able to view logs at the time when the events occurred.
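The idea of correlating an event to logs "at a given point in time" can be sketched as a simple time-window filter. This is not ContainIQ's implementation; the `LogLine` type, the `logs_near_event` helper, and the 30-second default window are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LogLine:
    """Hypothetical log record: when it was written, by which pod, and the text."""
    timestamp: datetime
    pod: str
    message: str


def logs_near_event(logs, event_time, pod, window=timedelta(seconds=30)):
    """Return log lines from `pod` written within `window` of `event_time`."""
    return [
        line for line in logs
        if line.pod == pod and abs(line.timestamp - event_time) <= window
    ]


# Hypothetical data: an eviction warning at 12:00:00 on pod "api-1".
eviction_time = datetime(2022, 1, 1, 12, 0, 0)
sample_logs = [
    LogLine(datetime(2022, 1, 1, 11, 59, 50), "api-1", "memory usage high"),
    LogLine(datetime(2022, 1, 1, 12, 5, 0), "api-1", "started"),
    LogLine(datetime(2022, 1, 1, 12, 0, 5), "api-2", "healthy"),
]
for line in logs_near_event(sample_logs, eviction_time, pod="api-1"):
    print(line.timestamp.isoformat(), line.message)
```

Only the first sample line matches: the second falls outside the window and the third belongs to a different pod.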
By using ContainIQ’s latency and tracing dashboards, Ultimate Tournament is able to identify problematic changes in latency for particular services, paths, and individual requests. Latency is a concern for any gaming company, and for skill-based gaming companies like Ultimate Tournament, where real money is on the line, latency spikes for end users are even more damaging. ContainIQ makes it easy to identify and alert on these spikes, and to instantly view the logs from the points in time when they occurred.
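For readers unfamiliar with p95 and p99, the following sketch shows one common way to compute a percentile latency (the nearest-rank method) and flag a spike against a threshold. It is an illustration only, not how ContainIQ computes or alerts on latency; the function names, sample values, and the 100 ms threshold are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def latency_spiked(samples_ms, threshold_ms, pct=95):
    """Return True when the chosen percentile latency exceeds the threshold."""
    return percentile(samples_ms, pct) > threshold_ms


# Hypothetical request latencies in milliseconds: 90 fast requests, 10 slow ones.
latencies_ms = [10] * 90 + [500] * 10
print("p95:", percentile(latencies_ms, 95), "ms")
print("spike:", latency_spiked(latencies_ms, threshold_ms=100))
```

Tail percentiles are useful here because an average would hide the slow requests: the mean of this sample is 59 ms, while the p95 is 500 ms.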
Finally, Ultimate Tournament is able to monitor for particular status codes in real-time using the Tracer dashboard.
“For us at Ultimate Tournament, it’s all about knowing when to do things manually, and when to let a tool handle the work for us. Using ContainIQ enables us to spend our time actually working on our platform, instead of configuring observability, dashboards, and alerts,” said Goodman.
ContainIQ’s affordable and simple pricing model gave the team confidence that ContainIQ would remain the right solution both today and as its infrastructure scales dramatically this year. Ultimate Tournament estimates that it was able to reduce its observability costs by about 90%.
Onboarding Experience
Setting up ContainIQ was a relatively straightforward process for Ultimate Tournament. After completing the online self-service registration and initial installation, the Ultimate Tournament team was able to invite the entire engineering team to the shared account.
Next, the Ultimate Tournament team connected a Slack channel for notifications, which is especially important for real-time alerts on changes in latency and Warning events.
All five of ContainIQ’s dashboards came pre-configured and started to populate within a couple of minutes.
“There is beauty in simplicity. Even the most managed services like Datadog still have you spending hours tuning Helm values and designing dashboards that fit the needs of modern developers. So many of these tools provide too many capabilities without prioritizing the important ones, so developers get lost in the noise,” said Goodman.
When asked for an estimate, Ultimate Tournament said that it took less than 30 minutes to sign up and get running with ContainIQ.
Because ContainIQ delivers its value at a predictable price point, Ultimate Tournament was able to confidently deploy ContainIQ across its entire environment, including development and production clusters. Ultimate Tournament is also able to toggle between these clusters seamlessly and aggregate data when needed.
Get Started
ContainIQ is a Kubernetes monitoring and tracing platform. With ContainIQ, it is easy to correlate metrics, logs, events, latencies, and traces.
To get started with ContainIQ, you can sign up directly on our website using the self-service flow, just like Ultimate Tournament did. You can also Book a Demo with a member of our team to learn more about how ContainIQ can help improve cluster health and observability at your company.