PubMatic is one of the United States’ largest AdTech companies. Since 2006, PubMatic has built an efficient global infrastructure spanning eight data centers. The company is one of the industry’s leaders in programmatic advertising innovation. As of December 2020, PubMatic served 200 billion ad impressions, handled one trillion advertiser bids, and processed more than 2 petabytes of new data every day.
The Acceldata Platform isolated bottlenecks, automated performance improvements, and distinguished between mandatory and unnecessary data. This allowed PubMatic to rapidly scale its big data environment to meet expanding business requirements and to reliably support mission-critical and customer-facing analytics.
PubMatic is in hyper-scale mode. Its current environment includes 3,000+ nodes, 150+ petabytes, and 65+ open-source HDP (Hortonworks Data Platform) clusters, and it is expanding rapidly.
In addition, PubMatic uses other tools in the Hadoop big data stack, including YARN, Kafka (50+ small Kafka clusters with 10-15+ nodes per cluster), Spark, and HBase.
Situation
Because of its massively scaled environment, PubMatic consistently experienced high MTTR (Mean Time to Resolution), frequent outages, and performance bottlenecks. Many of the issues stemmed from the sheer number of nodes; in one case, a single cluster contained 1,500 nodes.
The system’s instability resulted in time-consuming operational issues and constant daily firefighting. In addition, PubMatic was looking for ways to reduce its infrastructure and OEM support costs.
Business Impact
When PubMatic’s data system performance could not keep pace with its rapidly expanding business requirements, the company decided to implement a data observability platform to improve the reliability, scalability, and return on investment of its data operations.
The inability to correlate events across the infrastructure, data layers, and pipelines meant that PubMatic could not materially improve its ‘cost per ad impression’ metric, which is one of its most critical performance metrics. In addition, the company’s rapid scaling had left it with unnecessary software licenses, and it wanted its licensing to align better with actual needs.
Finally, engineering’s constant involvement in resolving operational issues distracted the team from its real objective: scaling the data system to support fast-growing business requirements.
PubMatic began using the Acceldata Platform in mid-2020. At the data compute layer, Acceldata immediately provided improved visibility into the inner workings of PubMatic’s data applications and comprehensive observability for complex, interconnected data systems.
One of Acceldata’s most important benefits was its ability to predict and prevent performance problems, and to optimize PubMatic’s data systems, at the very large scale that today’s digital ad market requires. In PubMatic’s environment, Acceldata isolated bottlenecks and automated performance improvements.
The Acceldata Platform distinguished between mandatory and unnecessary data to ensure scaled growth that could reliably support all critical enterprise and customer-facing analytics requirements.
Acceldata has helped PubMatic: