Data Accelerator, Microsoft's new contribution to open source

Over the past year Microsoft has steadily been opening and creating open source projects, a sign that it is putting aside its long rivalry with the open source world (or so it seems).

This is not to belittle its work, but unlike the past decade, when it was in declared war with open source, things have changed, at least in recent years. A few days ago Microsoft announced to the community that it has decided to open source Data Accelerator, a large-scale data processing project that was originally used internally.

Since development began in 2017, the project has been used at large scale across several Microsoft product pipelines.

About Data Accelerator

Data Accelerator started in 2017 as a large-scale data processing project in Microsoft's Developer Division, which ended up adopting Apache Spark for reasons of scale and speed.

Data Accelerator is more than just a conduit between EventHub and the database.

It lets users reshape incoming events while they are still streaming, and then route different parts of the same event to different data stores, while providing health monitoring and alerts for the state of the entire pipeline.
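As a rough illustration of the reshape-and-route idea, here is a minimal plain-Python sketch. It does not use Data Accelerator or Spark at all: the event layout, the `reshape` and `route` functions, and the two in-memory lists standing in for data stores are all hypothetical names invented for this example.

```python
from typing import Any, Dict, List

Event = Dict[str, Any]

# Hypothetical in-memory "stores" standing in for real sinks
# (for example a blob container and an alerting queue).
metrics_store: List[Event] = []
alerts_store: List[Event] = []


def reshape(event: Event) -> Event:
    """Flatten the nested payload and convert the temperature to Celsius."""
    payload = event["payload"]
    return {
        "device": event["deviceId"],
        "temp_c": round((payload["tempF"] - 32) * 5 / 9, 1),
        "status": payload["status"],
    }


def route(event: Event) -> None:
    """Send different parts of the same event to different stores."""
    shaped = reshape(event)
    # Every event contributes a measurement to the metrics store.
    metrics_store.append({"device": shaped["device"], "temp_c": shaped["temp_c"]})
    # Only events with a non-"ok" status are also sent to the alerts store.
    if shaped["status"] != "ok":
        alerts_store.append({"device": shaped["device"], "status": shaped["status"]})


incoming = [
    {"deviceId": "d1", "payload": {"tempF": 212.0, "status": "ok"}},
    {"deviceId": "d2", "payload": {"tempF": 32.0, "status": "fault"}},
]
for e in incoming:
    route(e)
```

In a real pipeline the two lists would be replaced by actual output sinks, and the routing rules would come from configuration rather than hard-coded conditions.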

Data Accelerator also provides a configuration user interface and rule / query design experience that allows users to get up and running without having to write any code.

In addition, as anyone who does stream processing knows, you usually need sliding windows to process the data, a way to handle data that arrives late, or a way to accumulate data over time.
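For readers unfamiliar with the term, a sliding window can be sketched in a few lines of plain Python. This is only a conceptual example, not how Data Accelerator or Spark implement windows: the function name `sliding_window_avg` and the sample readings are made up, and the sketch assumes events arrive in timestamp order, so it does not deal with late-arriving data.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple


def sliding_window_avg(
    events: Iterable[Tuple[float, float]], window_seconds: float
) -> Iterator[Tuple[float, float]]:
    """Yield (timestamp, average of the values seen in the last window_seconds).

    `events` is an iterable of (timestamp, value) pairs, assumed to be
    sorted by timestamp.
    """
    window: Deque[Tuple[float, float]] = deque()
    total = 0.0
    for ts, value in events:
        window.append((ts, value))
        total += value
        # Evict events that have fallen out of the window (ts - w, ts].
        while window and window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            total -= old_value
        yield ts, total / len(window)


readings = [(0, 10.0), (5, 20.0), (10, 30.0), (20, 40.0)]
averages = list(sliding_window_avg(readings, window_seconds=10))
print(averages)  # [(0, 10.0), (5, 15.0), (10, 25.0), (20, 40.0)]
```

Handling late data typically requires buffering events and deciding how long to wait for stragglers, which is exactly the kind of bookkeeping a streaming framework is meant to take off your hands.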

Features

Data Accelerator supports and simplifies the use of these advanced stream-processing features.

According to the official Microsoft open source blog, Data Accelerator makes it easier to build streaming pipelines on Apache Spark in several ways:

Plug&Play: allows you to easily configure input sources and output sinks to create pipelines in minutes.

Data Accelerator supports ingesting data from Eventhub and IoThub and supports writing data to Azure blobs, CosmosDB, Eventhub, and more.

No-Code Experience: supports the ability to configure alerts and data processing without writing any code.

With the Rule Designer experience, you can specify simple and aggregate rules for data processing, tagging, and alerts.

SQL queries: lets you write complex processing in SQL, with no need to work in Scala.

The built-in extensibility model also supports user-defined functions and can leverage Azure Functions, for example to apply machine learning to streaming data.

Live queries: saves time when setting up and testing pipeline processing by running your queries against samples of incoming data and validating them in seconds.

Finally, Microsoft mentioned that Data Accelerator supports a fast validation cycle for the develop-and-test loop: queries can be run against locally sampled events and corrected iteratively before deployment, which can save a lot of time when testing pipelines.
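The idea of writing the processing in SQL, extending it with a user-defined function, and checking the query against a small local sample before deploying it can be sketched as follows. This example uses Python's built-in `sqlite3` module purely as a local stand-in; it is not Spark SQL and not Data Accelerator's query dialect, and the table, the sample rows, and the `is_hot` function are hypothetical.

```python
import sqlite3

# A small, hand-made sample of events: (device, temperature in Fahrenheit).
sample_events = [
    ("d1", 71.5),
    ("d1", 75.0),
    ("d2", 101.2),
    ("d2", 99.8),
]


def is_hot(temp_f: float) -> int:
    """Hypothetical user-defined function: 1 if the reading is above 90 F."""
    return 1 if temp_f > 90.0 else 0


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (device TEXT, temp_f REAL)")
conn.executemany("INSERT INTO events VALUES (?, ?)", sample_events)
# Register the Python function so that it can be called from SQL.
conn.create_function("is_hot", 1, is_hot)

query = """
    SELECT device, COUNT(*) AS readings, MAX(temp_f) AS max_temp
    FROM events
    WHERE is_hot(temp_f) = 1
    GROUP BY device
    ORDER BY device
"""
rows = conn.execute(query).fetchall()
conn.close()
print(rows)  # [('d2', 2, 101.2)]
```

Running a query like this against a handful of sampled events gives feedback in seconds, which is the point of the quick verification cycle described above.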

According to Microsoft, Data Accelerator is used daily by its Developer Division, which will keep improving the toolchain over time, although the company acknowledges that the toolset could do much more.

Data Accelerator gives anyone who wishes the chance to use these advanced features in a simpler way.

Microsoft hopes that, by opening the project, others will find Data Accelerator even more useful.

If you want more information about Data Accelerator and its code, you can visit the announcement on Microsoft's open source blog.

