Tuesday, June 13, 2017

Securing Apache HBase - part I

This is the first in a short series of blog posts on securing Apache HBase. HBase is a column-oriented database that provides random read/write access to data stored in the Hadoop Distributed File System (HDFS). In this post we will focus on setting up a standalone instance of Apache HBase, and then demonstrate how to use Apache Ranger to authorize access to an HBase table.

1) Install Apache HBase

Download Apache HBase (version 1.2.6 was used for this tutorial) and extract it. As stated above, we will set up a standalone version of HBase, which means that HBase itself and Apache ZooKeeper run in a single JVM, and data is stored in the local filesystem instead of HDFS. Normally we would authenticate users via Kerberos, but as we are just running HBase in standalone mode, we will focus solely on authorization in this series of tutorials. Start HBase via:
  • bin/start-hbase.sh
Then start the shell and create a sample table called "data", with two column families, and add some rows to the table:
  • bin/hbase shell
  • create 'data', 'colfam1', 'colfam2'
  • put 'data', 'row1', 'colfam1:col1', 'val1'
  • put 'data', 'row1', 'colfam2:col1', 'val2'
  • scan 'data'
The last command prints the values stored in the table. Next we will look at using Apache Ranger to restrict access to the 'data' table to authorized users only.
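
As an aside, the table setup from this step can also be replayed from a file rather than typed interactively, since the HBase shell accepts the path to a command file as its argument. The following is only a sketch: the file name "create-data-table.txt" is an arbitrary choice, and it assumes you run it from the HBase installation directory.

```shell
# Write the shell commands from step 1 to a command file (the file name is arbitrary).
cat > create-data-table.txt <<'EOF'
create 'data', 'colfam1', 'colfam2'
put 'data', 'row1', 'colfam1:col1', 'val1'
put 'data', 'row1', 'colfam2:col1', 'val2'
scan 'data'
exit
EOF

# Replay the file only if we are inside an HBase installation directory.
if [ -x bin/hbase ]; then
  bin/hbase shell ./create-data-table.txt
fi
```

The trailing "exit" makes the shell quit once the commands have run, instead of leaving you at the HBase prompt.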

2) Install the Apache Ranger HBase plugin 

Download Apache Ranger and verify that the signature is valid and that the message digests match. Extract and build the source, and copy the resulting plugin to a location where you will configure and install it, e.g.:
  • mvn clean package assembly:assembly -DskipTests
  • tar zxvf target/ranger-1.0.0-SNAPSHOT-hbase-plugin.tar.gz
  • mv ranger-1.0.0-SNAPSHOT-hbase-plugin ${ranger.hbase.home}
Now go to ${ranger.hbase.home} and edit "install.properties". You need to specify the following properties:
  • POLICY_MGR_URL: Set this to "http://localhost:6080"
  • REPOSITORY_NAME: Set this to "cl1_hbase".
  • COMPONENT_INSTALL_DIR_NAME: The location of your Apache HBase installation
Save "install.properties" and install the plugin as root via "sudo ./enable-hbase-plugin.sh". The Apache Ranger HBase plugin should now be successfully installed. By default, the Ranger plugin will try to store policies in "/etc/ranger/cl1_hbase/policycache". As we installed the plugin as "root", make sure that this directory is accessible to the user that is running HBase.
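
If you prefer to script the edits to "install.properties" rather than make them by hand, something like the following could work. It is a sketch under a few assumptions: the helper "set_prop" is a name made up here, the file is assumed to use plain KEY=value lines, and HBASE_HOME is an assumed environment variable pointing at your HBase installation. It needs to run before "enable-hbase-plugin.sh".

```shell
# set_prop KEY VALUE FILE
# Replace the "KEY=..." line in FILE, or append "KEY=VALUE" if the key is missing.
# Assumes VALUE contains no '|' or '&' characters (both are special to sed here).
set_prop() {
  if grep -q "^$1=" "$3"; then
    # '|' is used as the sed delimiter so URLs and paths need no escaping.
    sed -i.bak "s|^$1=.*|$1=$2|" "$3"
  else
    printf '%s=%s\n' "$1" "$2" >> "$3"
  fi
}

# Apply the values from this post when run inside ${ranger.hbase.home}.
if [ -f install.properties ]; then
  set_prop POLICY_MGR_URL "http://localhost:6080" install.properties
  set_prop REPOSITORY_NAME "cl1_hbase" install.properties
  set_prop COMPONENT_INSTALL_DIR_NAME "${HBASE_HOME:?set HBASE_HOME first}" install.properties
fi
```

The "-i.bak" option leaves a backup copy of the file next to the edited one, which is handy if a value turns out to be wrong.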

3) Configure authorization policies in the Apache Ranger Admin UI 

The next step is to create some authorization policies for Apache HBase in the Apache Ranger admin service. Please refer to this blog post for information on how to install the Apache Ranger admin service. Assuming the admin service is already installed, start it via "sudo ranger-admin start". Open a browser and log on to "localhost:6080" with the credentials "admin/admin".

Create a new HBase service, adding the following configuration items to the default values:
  • Service Name: cl1_hbase
  • Username/Password: admin
  • hbase.zookeeper.quorum: localhost
Click on "Test Connection" (if HBase is running) to verify that the connection is successful (note: only works from 1.0.0 onwards - see RANGER-1640) and then save the service. Click on "cl1_hbase" and edit the default policy which has been created, and add the user running HBase to the "Allow Condition" permission.

Now we will add a new authorization policy to test access to HBase. Under "Settings + Users/Groups" add two new users called "alice" and "bob", and also create these new users in your local system. Now we can create a new authorization policy to grant "alice" the "Read" permission for the "data" table (all column families and columns).
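
For reference, the policy for "alice" could probably also be created through Ranger's public REST API instead of the UI. The sketch below was not part of the original steps and rests on several assumptions: the endpoint path "/service/public/v2/api/policy", the policy name "alice-read-data" and the JSON file name are chosen here for illustration, so compare the result with what the UI produces before relying on it.

```shell
# Policy definition: "alice" may read all column families and columns of table "data".
# The policy name and file name are illustrative choices, not from the original post.
cat > alice-read-data.json <<'EOF'
{
  "service": "cl1_hbase",
  "name": "alice-read-data",
  "resources": {
    "table": { "values": ["data"] },
    "column-family": { "values": ["*"] },
    "column": { "values": ["*"] }
  },
  "policyItems": [
    {
      "users": ["alice"],
      "accesses": [ { "type": "read", "isAllowed": true } ]
    }
  ]
}
EOF

# Submit the policy to the Ranger admin service using the admin/admin credentials
# from this post. The endpoint path is an assumption; verify it for your Ranger version.
create_policy() {
  curl -u admin:admin -H "Content-Type: application/json" \
       -X POST -d @alice-read-data.json \
       "http://localhost:6080/service/public/v2/api/policy"
}
```

Call "create_policy" once the admin service is running and the "cl1_hbase" service has been saved.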



4) Testing authorization in HBase

The policy we created above will be downloaded and enforced by the Ranger HBase plugin we installed into HBase. Restart HBase before proceeding: if it was running with the Ranger plugin before the plugin downloaded the policy that grants "admin" privileges to the user running HBase, it might not be working properly. Now start the shell as "alice" and try to read the table we created earlier:
  • sudo -E -u alice bin/hbase shell
  • scan 'data'
This should work due to the authorization policy we created. However, "alice" should not be allowed to write to "data", e.g. the following should result in an "AccessDeniedException":
  • put 'data', 'row1', 'colfam1:col1', 'val3'
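
The write check can also be run without an interactive session. This is a rough sketch: the helper name "saw_access_denied" is made up here, and it assumes the HBase shell's non-interactive "-n" flag is available in your version and that the exception name appears in the shell's output.

```shell
# Succeeds if the text on stdin mentions AccessDeniedException.
saw_access_denied() {
  grep -q 'AccessDeniedException'
}

# Run the check only from within an HBase installation directory.
if [ -x bin/hbase ]; then
  if echo "put 'data', 'row1', 'colfam1:col1', 'val3'" \
       | sudo -E -u alice bin/hbase shell -n 2>&1 | saw_access_denied; then
    echo "write denied for alice, as expected"
  else
    echo "no AccessDeniedException seen; check the Ranger policy"
  fi
fi
```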
