New Version of StreamSets Data Collector Released
April 01, 2016

StreamSets Inc. announced the latest release of StreamSets Data Collector, continuous ingest software that automatically addresses the problem of data drift.

The new release helps enterprises accelerate their time to insights by proactively managing the completeness, accuracy and consistency of their data as it flows from collection to consumption.

“Given the siloed, yet strategic nature of data, enterprises must develop a culture of data performance management,” said Girish Pancha, CEO, StreamSets, Inc. “Just as network operations and security operations matured from numerous siloed projects into centers of excellence, we believe it is time for data operations to make that same critical leap. StreamSets was founded to build the cornerstone infrastructure upon which enterprises can institute disciplined performance management for their data-in-motion.”

With its latest v1.2 release, StreamSets Data Collector automates data drift handling and now supports the three major Hadoop distributions from Cloudera, MapR and Hortonworks. This version is also certified with the MapR Converged Data Platform, including extended support for MapR Streams. It also provides connectors for other popular big data technologies such as Elasticsearch, NoSQL databases such as MongoDB and Cassandra, and transient stores such as Apache Kafka, MapR Streams and JMS-compliant message queues.

StreamSets Data Collector gives enterprises the control, efficiency and agility they need to manage the performance of their data flows.

- Data flow KPIs for real-time control: Uniquely, StreamSets Data Collector monitors, detects and acts on changes in data patterns while also providing fine-grained metrics on data flow throughput, latency and error rates. Data drift-handling rules ensure that pipelines flow correctly even when the schema changes. Threshold rules, alerts and plug-in processors combine to identify, filter, re-route and sanitize anomalies in-stream to ensure that data lands ready for consumption.

- Adaptable pipelines for efficiency: StreamSets Data Collector provides a visual integrated development environment (IDE) for the design and execution of intent-driven data flows with minimal schema specification and custom code. It is a highly flexible environment that handles both batch and streaming data and deploys on edge nodes, natively in clusters and as part of an application stack.

- Containerized architecture for agility: Built for continuous operations, StreamSets Data Collector addresses the issues of constant infrastructure upgrades and data flow evolution head-on. Each source, stage and destination in a pipeline is isolated, allowing you to maintain and modernize your data infrastructure while ensuring zero downtime.
