Source: OStatic Blog

Ranger Joins Many Big Data Projects Graduating at Apache

Over the past couple of years, we've steadily taken note of the many projects that the Apache Software Foundation has been elevating to Top-Level Status. The organization incubates more than 350 open source projects and initiatives, and has recently turned its focus squarely to Big Data and developer-focused tools. As Apache moves Big Data projects to Top-Level Status, they gain valuable community support.

Recently, the foundation announced that Apache Kudu had graduated as a Top-Level Project. Then came the news that Apache Geode had graduated from the Apache Incubator as well. Geode is an open source in-memory data grid that provides transactional data management for scale-out applications that need low-latency response times during highly concurrent processing.

Now comes the announcement that Apache Ranger has graduated from the Apache Incubator to become a Top-Level Project (TLP). Ranger is a centralized framework used to define, administer and manage security policies consistently across Apache Hadoop components. It also offers the most comprehensive security coverage, with native support for numerous Apache projects, including Atlas (incubating), HBase, HDFS, Hive, Kafka, Knox, NiFi, Solr, Storm, and YARN.

According to Apache:

Apache Ranger provides a simple and effective way to set access control policies and audit the data access across the entire Hadoop stack by following industry best practices. One of the key benefits of Ranger is that access control policies can be managed by security administrators from a single place and consistently across hadoop ecosystem. Ranger also enables the community to add new systems for authorization even outside Hadoop ecosystem, with a robust plugin architecture, that can be extended with minimal effort.
In addition, Apache Ranger provides many advanced features, such as:
- Ranger Key Management Service (compatible with Hadoop's native KMS API to store and manage encryption keys for HDFS Transparent Data Encryption);
- Dynamic column masking and row filtering;
- Dynamic policy conditions (such as prohibition of toxic joins);
- User context enrichers (such as geo-location and time of day mappings)

"As early adopters of Apache Ranger and having contributed to Apache Ranger, we have come to rely upon Apache Ranger as a key part of our security infrastructure for data," said Ferd Scheepers, Chief Information Architect at ING. "We are therefore pleased to learn that the project has now graduated to a TLP project through the efforts of the Apache community. We believe that Apache Ranger represents the best-in-class Open Source security framework for authorization, encryption management, and auditing across Hadoop ecosystem. We laud the community's efforts in building an extensible and enterprise grade architecture for Apache Ranger, and for innovative features such as tag or classification based security (built in conjunction with Apache Atlas). We congratulate the Apache Ranger community on achieving this significant milestone and are confident Apache Ranger will evolve into the de-facto standard for security stack across the Hadoop ecosystem."

"As heavy users of Apache Ranger in production, we are pleased to see the project become a TLP through validation across community efforts," said Timothy R. Connor, Big Data & Advanced Analytics Manager at Sprint. "Apache Ranger has built a next generation ABAC model for authorization along with a robust data-centric Open Source security framework supporting advanced security capabilities such as dynamic row filtering and column masking.
All of these point to Apache Ranger maturing into a robust and comprehensive security product for authorization, encryption management and auditing through the Apache community."

Here is more on numerous other Apache Big Data projects that are moving forward:

Allura. According to the Allura project page, new features include an Admin Nav Bar, which improves how users customize the tools of a project. There is also a new interface. Apache encourages users to read an admin toolbar post to see how easy it is to access tool configurations and add new tools with Allura.

Brooklyn. The foundation announced that Apache Brooklyn is now a Top-Level Project (TLP), "signifying that the project's community and products have been well-governed under the ASF's meritocratic process and principles." Brooklyn is an application blueprint and management platform used for integrating services across multiple data centers as well as a wide range of software in the cloud.

According to the Brooklyn announcement:

"With modern applications being composed of many components, and increasing interest in micro-services architecture, the deployment and ongoing evolution of deployed apps is an increasingly difficult problem. Apache Brooklyn's blueprints provide a clear, concise way to model an application, its components and their configuration, and the relationships between components, before deploying to public Cloud or private infrastructure. Policy-based management, built on the foundation of autonomic computing theory, continually evaluates the running application and makes modifications to it to keep it healthy and optimize for metrics such as cost and responsiveness."

Brooklyn is in use at some notable organizations. Cloud service providers Canopy and Virtustream have created product offerings built on Brooklyn. IBM has also made extensive use of Apache Brooklyn to migrate large workloads from AWS to IBM SoftLayer.

Kylin.
Meanwhile, the foundation has also just announced that Apache Kylin, an open source big data project born at eBay, has graduated to Top-Level status. Kylin is an open source Distributed Analytics Engine designed to provide an SQL interface and multi-dimensional analysis (OLAP) on Apache Hadoop, supporting extremely large datasets. It is widely used at eBay and at a few other organizations.

"Apache Kylin's incubation journey has demonstrated the value of Open Source governance at ASF and the power of building an open-source community and ecosystem around the project," said Luke Han, Vice President of Apache Kylin. "Our community is engaging the world's biggest local developer community in alignment with the Apache Way."

As an OLAP-on-Hadoop solution, Apache Kylin aims to fill the gap between Big Data exploration and human use, "enabling interactive analysis on massive datasets with sub-second latency for analysts, end users, developers, and data enthusiasts," according to developers. "Apache Kylin brings back business intelligence (BI) to Apache Hadoop to unleash the value of Big Data," they added.

Lens. Apache recently announced that Apache Lens, an open source Big Data and analytics tool, has graduated from the Apache Incubator to become a Top-Level Project (TLP).

According to the announcement:

"Apache Lens is a Unified Analytics platform. It provides an optimal execution environment for analytical queries in the unified view. Apache Lens aims to cut the Data Analytics silos by providing a single view of data across multiple tiered data stores."

"By providing an online analytical processing (OLAP) model on top of data, Lens seamlessly integrates Apache Hadoop with traditional data warehouses to appear as one. It also provides query history and statistics for queries running in the system along with query life cycle management."

"Incubating Apache Lens has been an amazing experience at the ASF," said Amareshwari Sriramadasu, Vice President of Apache Lens. "Apache Lens solves a very critical problem in Big Data analytics space with respect to end users. It enables business users, analysts, data scientists, developers and other users to do complex analysis with ease, without knowing the underlying data layout."

Ignite. The ASF has announced that Apache Ignite is to become a Top-Level Project. It's an open source effort to build an in-memory data fabric that was driven by GridGain Systems and WANdisco.

Apache Ignite is a high-performance, integrated and distributed In-Memory Data Fabric for computing and transacting on large-scale data sets in real-time, "orders of magnitude faster than possible with traditional disk-based or flash technologies," according to Apache. It is designed to easily power both existing and new applications in a distributed, massively parallel architecture on affordable, industry-standard hardware.

Tajo. Apache Tajo v0.11.0, an advanced open source data warehousing system in Apache Hadoop, is another new Top-Level Project. Apache claims that Tajo provides the ability to rapidly extract more intelligence from Hadoop deployments, third party databases, and commercial business intelligence tools.

And of course, Spark and other previously announced Big Data tools overseen by Apache are flourishing. Look for many more data- and developer-focused tools to move forward at Apache in the months to come.
