Every Apache Spark development company has a reason to rejoice: MapR Technologies has released version 3.0 of the MapR Ecosystem Pack (MEP). The release brings improved security for Apache Spark, new Apache Spark connectors for MapR-DB and HBase, integrations with Drill, and a faster version of Hive.
The following provides an overview of the latest MEP 3.0 updates for the projects running on MapR:
Apache Drill 1.10: enhancements to BI tool integration, end-to-end application security, and overall application performance.
Apache Hive 2.1.1: a faster Hive, with many performance-focused improvements to data processing and querying.
Apache Spark 2.1.0: significant improvements to enterprise-level stability and security, which make applications enterprise-ready.
Spark connector for MapR-DB: tight, real-time integration with MapR-DB records, which improves the efficiency of database transactions in the application.
This package provides new APIs for C and Python.
This installer has features to simplify upgrades and add-ons to the existing MEP version.
How will these features benefit developers?
MapR Technologies claims that MEP addresses the complexity that arises from coordinating community projects and their versions. In practice, it takes on the development, testing, and integration work across open source projects such as Apache Drill, Apache Spark, Hive, and Myriad.
New versions of MEP are released quarterly, with 3.0 being the latest. This lets developers work with the latest features of community-driven software such as Apache Spark and Apache Drill.
MEP handles version compatibility for developers. Instead of installing and upgrading each open source project separately, developers install the latest MEP, which guarantees inter-project compatibility across the software versions bundled in it. This frees developers at an Apache Spark development company to work on the actual business logic in the project code rather than troubleshoot compatibility issues between projects.
Upgrading the MEP version does not change the existing core MapR platform installation. This makes the quarterly updates easier to install, since they upgrade only the open source project stacks.
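The compatibility idea described above can be illustrated with a minimal sketch. It assumes a pack is simply a set of pinned component versions and flags any component that was upgraded outside the pack. The `MEP_3_0` dictionary and the `find_mismatches` helper are hypothetical names used only for illustration; they are not MapR's actual manifest format or tooling. The version numbers are the ones named in this post.

```python
# Hypothetical illustration only: not MapR's real manifest format or tooling.
# Pinned versions are the ones this post names for MEP 3.0.
MEP_3_0 = {
    "spark": "2.1.0",
    "hive": "2.1.1",
    "drill": "1.10",
}


def find_mismatches(installed, pack):
    """Return {component: (installed_version, pinned_version)} for every
    component whose installed version differs from the pack's pinned one."""
    mismatches = {}
    for component, pinned in pack.items():
        found = installed.get(component)  # None if the component is absent
        if found != pinned:
            mismatches[component] = (found, pinned)
    return mismatches


# A cluster where Hive was upgraded on its own, outside the pack.
hand_upgraded = {"spark": "2.1.0", "hive": "2.3.0", "drill": "1.10"}
print(find_mismatches(hand_upgraded, MEP_3_0))
# {'hive': ('2.3.0', '2.1.1')}
```

Treating the pack as the single source of pinned versions is what removes the need to reason about each pair of projects separately.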
Feature additions to Apache Drill and Apache Hive
Apache Hive 2.1.1
As mentioned above, Hive 2.1.1 speeds up data processing, lowers latency for interactive queries, and increases throughput for batch queries, which improves big data processing overall.
Key features:
Apache Drill 1.10
As mentioned above, this update improves BI tool integration, end-to-end security, and performance.
Key features:
What is special about the Apache Spark release in MEP 3.0?
As mentioned above, Apache Spark 2.1.0 focuses on enterprise-level stability and security. This is the result of the following features incorporated in the release:
All the content shared in this post belongs to the author of Apache Spark Consulting Company. Share your thoughts and views with other readers.