StreamSets Transformer Released
September 09, 2019
StreamSets released StreamSets Transformer, a simple-to-use, drag-and-drop UI tool to create native Apache Spark applications.

Designed for a wide range of users — even those without specialized skills — StreamSets Transformer enables the creation of pipelines for performing ETL, stream processing and machine-learning operations. Now, data engineers, scientists, architects and operators gain deep visibility into the execution of Apache Spark while broadening usage across the business.

Apache Spark delivers on the promise of advanced data processing and machine learning at scale. But there are drawbacks. Developing and operating applications on Apache Spark is complex and requires hand-coding, so its use is typically restricted to developers and companies with mature data engineering and data science practices. In addition, users often have very limited visibility into how their Apache Spark jobs are running.

StreamSets Transformer solves these issues. Its easy-to-use, logical user interface and rich tools for designing data transformations eliminate the complexity and the need for specialized skills. Pipelines instrumented with StreamSets Transformer provide unparalleled visibility into every Spark execution. Equally important, developers now have a single tool to build both batch and streaming pipelines.

The key features of StreamSets Transformer include:

- Continuous monitoring — Unparalleled visibility into Apache Spark application execution

- Continuous data — Runs in both batch and streaming modes

- Progressive error handling — Finds where and why errors occur without the need for Apache Spark skills to decipher complex log files

- Execute on Apache Spark anywhere — Works in the cloud, Kubernetes or on premises

- Highly extensible — Higher order transformation primitives for the ETL developer, SparkSQL for the analyst, PySpark for the data scientist, and custom Java/Scala processors for the Apache Spark developer

- Sets-based processing — For ETL, machine learning and complex event processing

“With StreamSets Transformer, Apache Spark is finally available to a wide range of users, enabling visibility, monitoring and reporting for mission-critical workloads,” said Arvind Prabhakar, CTO of StreamSets. “In essence, StreamSets Transformer brings the power of Apache Spark to businesses, while eliminating its complexity and guesswork.”

“With StreamSets Transformer and Databricks integrated together, even more users can easily access the powerful capabilities of Delta Lake and our optimized Apache Spark for data science and analytics,” said Michael Hoff, SVP of Business Development and Partners at Databricks. “Especially as organizations migrate from legacy on premises platforms, our partnership will help them efficiently make that transition to manage their data and machine learning workloads in the cloud.”

StreamSets Transformer is available immediately.
