MOUNTAIN VIEW, Calif., Aug. 05, 2025 (GLOBE NEWSWIRE) — DataPelago today announced the launch of the DataPelago Accelerator for Spark, the world's first accelerator to combine native execution, CPU vectorization, and GPU acceleration for Apache Spark workloads. Powered by DataPelago Nucleus — the company's universal data processing engine — the Accelerator delivers significant price-performance advantages with no code changes required.
DataPelago Accelerator for Spark is a plug-and-play accelerator that integrates seamlessly with existing Apache Spark clusters and infrastructure. Organizations can deploy the Accelerator in minutes without migrating to new platforms, rewriting code, or changing data connectors or security policies.
“In recent years, the cost of operating a data processing stack over Apache Spark infrastructure has become untenable due to the growth in size and complexity of data,” said Rajan Goyal, CEO of DataPelago. “For organizations to unlock the full value of their data and maximize efficiency and cost-savings, open-source software systems like Apache Spark must take advantage of accelerated computing.”
The Accelerator is designed for companies processing data for GenAI and analytics workloads. As data volumes explode and compute costs soar, organizations face significant tradeoffs between open-source systems with lower licensing fees but higher infrastructure costs, and proprietary services with better performance but much higher licensing fees.
DataPelago Accelerator for Spark runs natively within Spark worker nodes, accelerating query execution plans in real time. It leverages vectorized execution, columnar processing, hardware acceleration, and I/O optimization to maximize throughput across CPUs, GPUs, and everything in between. The Accelerator intelligently determines at runtime which hardware is optimal for each operation based on the data being processed.
With the Accelerator, organizations can achieve both lower licensing fees and lower infrastructure costs:
- Up to 10x faster performance with zero application or security changes
- Up to 80% cost reduction for data processing workloads
- Plug-and-play deployment in minutes, not weeks or months
- Full compatibility with Spark's authentication and encryption protocols
- Real-time performance tracking with built-in observability tools
“Since its inception, Velox has been deeply focused on accelerating analytical workloads. To date, this acceleration has been oriented around CPUs, and we've seen the impact that lower latency and improved resource utilization have on businesses' data management efforts,” said Orri Erling, co-founder of Velox. “DataPelago's Accelerator for Spark, leveraging Nucleus for GPU architectures, introduces the potential for even greater speed and efficiency gains for organizations' most demanding data processing tasks.”
Many customers have deployed or are piloting DataPelago Accelerator for Spark, realizing major benefits even on their existing servers. Some of the results include:
- A Fortune 100 customer achieved a 3-4x speedup and a 60-70% cost reduction on petabyte-scale ETL workloads
- RevSure, a major e-commerce company, deployed the Accelerator in just 48 hours, achieving measurable performance gains and cost savings while processing hundreds of terabytes through its data pipelines daily
- ShareChat, India's premier social media platform serving over 350 million users, increased job speeds by 2x with a 50% cost reduction
The Accelerator is available on GCP and AWS. Expanded availability through Google Cloud Marketplace gives customers a streamlined way to discover, procure, and deploy the Accelerator.
You can find DataPelago's Accelerator for Spark at datapelago.ai/dpa.
About DataPelago
DataPelago is unleashing the data acceleration revolution that AI demands. Today, AI's relentless hunger for data acceleration at massive scale has created the ultimate chokepoint — without economically scaled data processing, AI innovation itself will be throttled. At DataPelago, we're unleashing breakthrough thinking to transform data processing economics and ignite the next wave of AI-powered revolution.
DataPelago Nucleus is the world's first universal data processing engine built for accelerated computing. It is designed to process any type of data, operate across any hardware, and support any query engine, delivering new price/performance benefits that make it viable to extract value from all the data in the world.
DataPelago is backed by Eclipse, Taiwania Capital, Qualcomm Ventures, Alter Venture Partners, Nautilus Venture Partners, and Silicon Valley Bank, a division of First Citizens Bank. To learn more, visit datapelago.ai.
Media Contact
LaunchSquad for DataPelago
datapelago@launchsquad.com