Thursday, March 1, 2018

The Apache Sentry security service - part IV

This is the fourth in a series of blog posts on the Apache Sentry security service. The first post looked at how to get started with the Apache Sentry security service, both from scratch and via a docker image. The second post looked at how to define the authorization privileges held in the Sentry security service. The third post looked at securing Apache Kafka with Apache Sentry, where the privileges were defined in the Sentry security service. In this post, we will update an earlier tutorial I wrote on securing Apache Hive using Apache Sentry, so that it also retrieves the privileges from the Sentry security service.

1) Configure authorization in Apache Hive

Please follow this tutorial to install and configure Apache Hadoop and Apache Hive, except use version 2.3.2 of Apache Hive, which is the version supported by Apache Sentry 2.0.0. After installation, follow the instructions to create a table in Hive and make sure that a query is successful. Now we will integrate Apache Sentry 2.0.0 with Apache Hive. First copy the jars from the "lib" directory of the Sentry distribution to the Hive "lib" directory. We need to add three new configuration files to the "conf" directory of Apache Hive.

Create a file called 'conf/hiveserver2-site.xml' with the content:
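A minimal sketch of such a file, written here as a shell heredoc run from the Hive installation directory. It assumes the standard Hive authorization properties, that Sentry 2.0.0 provides its non-"v2" authorizer factory under the class name shown, and that the plugin finds its own configuration via "hive.sentry.conf.url". Check these names against your Sentry distribution, which may need further properties (for example a session hook):

```shell
# Assumption: run from the Hive home directory, or set HIVE_CONF_DIR explicitly.
HIVE_CONF_DIR="${HIVE_CONF_DIR:-conf}"
mkdir -p "$HIVE_CONF_DIR"

# Write a sketch of hiveserver2-site.xml; the Sentry class name is an assumption.
cat > "$HIVE_CONF_DIR/hiveserver2-site.xml" <<'EOF'
<configuration>
  <!-- Turn on authorization checks in HiveServer2 -->
  <property>
    <name>hive.security.authorization.enabled</name>
    <value>true</value>
  </property>
  <!-- Assumed class name of the (non-"v2") Sentry authorization plugin -->
  <property>
    <name>hive.security.authorization.manager</name>
    <value>org.apache.sentry.binding.hive.authz.SentryHiveAuthorizerFactory</value>
  </property>
  <!-- Use the user name supplied by the client session -->
  <property>
    <name>hive.security.authenticator.manager</name>
    <value>org.apache.hadoop.hive.ql.security.SessionStateUserAuthenticator</value>
  </property>
  <!-- Assumed location of the Sentry plugin configuration -->
  <property>
    <name>hive.sentry.conf.url</name>
    <value>file:./conf/sentry-site.xml</value>
  </property>
</configuration>
EOF
```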

Here we are enabling authorization and adding the Sentry authorization plugin. Note that it differs slightly from the hiveserver2-site.xml given in the previous tutorial, in that we are no longer using the "v2" Sentry Hive binding.

Next create a new file in the "conf" directory of Apache Hive called "sentry-site.xml" with the following content:
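A minimal sketch of such a file, again as a shell heredoc. The property and class names are assumptions based on the Sentry Hive binding and should be verified against your distribution. The service address and port in particular are hypothetical values for a Sentry security service running locally without Kerberos:

```shell
# Assumption: run from the Hive home directory, or set HIVE_CONF_DIR explicitly.
HIVE_CONF_DIR="${HIVE_CONF_DIR:-conf}"
mkdir -p "$HIVE_CONF_DIR"

# Write a sketch of sentry-site.xml; all names and values below are assumptions.
cat > "$HIVE_CONF_DIR/sentry-site.xml" <<'EOF'
<configuration>
  <!-- Retrieve privileges from the Sentry security service -->
  <property>
    <name>sentry.hive.provider.backend</name>
    <value>org.apache.sentry.provider.db.SimpleDBProviderBackend</value>
  </property>
  <!-- Resolve user-to-group mappings from a local file -->
  <property>
    <name>sentry.hive.provider</name>
    <value>org.apache.sentry.provider.file.LocalGroupResourceAuthorizationProvider</value>
  </property>
  <property>
    <name>sentry.hive.provider.resource</name>
    <value>classpath:sentry.ini</value>
  </property>
  <!-- Must match the "Server=" value used when granting privileges -->
  <property>
    <name>sentry.hive.server</name>
    <value>server1</value>
  </property>
  <!-- Required because Kerberos is not in use -->
  <property>
    <name>sentry.hive.testing.mode</name>
    <value>true</value>
  </property>
  <!-- Hypothetical address of a local, non-Kerberos Sentry service -->
  <property>
    <name>sentry.service.client.server.rpc-addresses</name>
    <value>localhost</value>
  </property>
  <property>
    <name>sentry.service.client.server.rpc-port</name>
    <value>8038</value>
  </property>
  <property>
    <name>sentry.service.security.mode</name>
    <value>none</value>
  </property>
</configuration>
EOF
```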

This is the configuration file for the Sentry plugin for Hive. It instructs Sentry to retrieve the authorization privileges from the Sentry security service, and to get the groups of authenticated users from the 'sentry.ini' configuration file. As we are not using Kerberos, the "testing.mode" configuration parameter must be set to "true". Finally, we need to define the groups associated with a given user in 'sentry.ini' in the conf directory:
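A minimal sketch of this file, assuming the Sentry policy-file convention of a "[users]" section that maps each user name to a comma-separated list of groups:

```shell
# Assumption: run from the Hive home directory, or set HIVE_CONF_DIR explicitly.
HIVE_CONF_DIR="${HIVE_CONF_DIR:-conf}"
mkdir -p "$HIVE_CONF_DIR"

# Map the user "alice" to the group "user"; no privileges are defined here.
cat > "$HIVE_CONF_DIR/sentry.ini" <<'EOF'
[users]
alice = user
EOF
```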

Here we assign "alice" the group "user". Note that in the earlier tutorial this file also contained the authorization privileges, but they are not required in this scenario as we are using the Apache Sentry security service.

2) Configure the Apache Sentry security service

Follow the first tutorial to install the Apache Sentry security service. Now we need to create the authorization privileges for our Apache Hive test scenario as per the second tutorial. Start the "sentryCli" in the Apache Sentry distribution, and assign a role to the "user" group (of which "alice" is a member) with the privilege to perform a "select" statement on the "words" table:
  • cr select_role
  • gp select_role "Server=server1->Db=default->Table=words->Column=*->action=select"
  • gr select_role user
Now we can test authorization after restarting Apache Hive. The user 'alice' should now be able to query the table according to our policy:
  • bin/beeline -u jdbc:hive2://localhost:10000 -n alice
  • select * from words where word == 'Dare'; (works)