Tuesday, February 21, 2017

WS-Security with MTOM support in Apache CXF 3.2.0

Getting WS-Security to work with MTOM-enabled web services has been a long-standing feature request in Apache CXF. A couple of years ago, support was added to CXF and WSS4J to store raw cipher data in message attachments when MTOM is enabled, to avoid the cost of BASE-64 encoding the bytes and inlining them in the message. However, CXF did not properly support signing/encrypting content that contained xop:Include Elements. In this case, only the references were signed/encrypted and not the attachments themselves (the user was alerted to this via a warning log). From Apache CXF 3.2.0, WS-Security with MTOM will be properly supported, which is what we will cover in this post.

1) Securing an MTOM-enabled message with WS-Security

Let's look at the outbound case first. There is a new configuration option in WSS4J 2.2.0:
  • expandXOPInclude: Whether to search for and expand xop:Include Elements for encryption and signature (on the outbound side). This means that the referenced bytes are encrypted/signed, and not just the references. The default is false on the outbound side in WSS4J.
CXF will set this configuration option to "true" automatically for both the "action" based and WS-SecurityPolicy based approaches if MTOM is enabled. Note that this configuration option also applies on the inbound side with slightly different semantics (see below).

This configuration option works by scanning all children of all message elements to be signed/encrypted, and inlining any xop:Include bytes that it finds before signature/encryption. For the encryption case, if the "storeBytesInAttachment" configuration option is set to true (false by default in WSS4J, true by default in CXF if MTOM is enabled), the encrypted bytes are then stored in a message attachment. For signature, the inlined version is only used to compute the signature and is then discarded, while the original Element is retained in the request, meaning that the signed bytes still travel unmodified as a message attachment.
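The expansion step described above can be illustrated with a small, self-contained Python sketch. This is a conceptual illustration only and not how WSS4J implements it: the function name "expand_xop_includes" and the "attachments" dictionary (content id to bytes) are assumptions made for the example.

```python
import base64
import copy
import xml.etree.ElementTree as ET

XOP_NS = "http://www.w3.org/2004/08/xop/include"
XOP_INCLUDE = f"{{{XOP_NS}}}Include"


def expand_xop_includes(element, attachments):
    """Return a copy of `element` in which every xop:Include child is
    replaced by the Base64 text of the attachment it references.

    `attachments` is a hypothetical mapping of content id -> raw bytes.
    The original element is left untouched, mirroring the signature case
    where the inlined version is only used for the digest.
    """
    expanded = copy.deepcopy(element)
    # Snapshot the elements first so we can safely modify the tree.
    for parent in list(expanded.iter()):
        for child in list(parent):
            if child.tag != XOP_INCLUDE:
                continue
            href = child.get("href", "")
            content_id = href[len("cid:"):] if href.startswith("cid:") else href
            inlined = base64.b64encode(attachments[content_id]).decode("ascii")
            parent.remove(child)
            parent.text = (parent.text or "") + inlined + (child.tail or "")
    return expanded


body = ET.fromstring(
    '<doc xmlns:xop="http://www.w3.org/2004/08/xop/include">'
    '<image><xop:Include href="cid:img1"/></image></doc>'
)
expanded_body = expand_xop_includes(body, {"img1": b"\x89PNG"})
```

After the call, "expanded_body" carries the attachment bytes inline (this is what would be signed or encrypted), while "body" still contains the xop:Include reference.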

2) Validating an MTOM-enabled message with WS-Security

On the inbound side, the "expandXOPInclude" configuration option also applies:
  • expandXOPInclude: Whether to search for and expand xop:Include Elements in signed elements prior to signature verification. The default is "true". Note that this replaces the "expandXOPIncludeForSignature" configuration option that was used prior to WSS4J 2.2.0.
CXF overrides this default behaviour by only setting "expandXOPInclude" to "true" on the inbound side if MTOM is enabled. So to summarize, if you wish to support WS-Security with MTOM in CXF from the (future) 3.2.0 release, you don't need to set any configuration option by default to get it to properly sign and encrypt the message bytes. CXF will take care of setting everything up for you.

Friday, February 17, 2017

Securing an Apache Kafka broker using Apache Ranger and Apache Atlas

Last year, I wrote a series of articles on securing Apache Kafka. In particular, the third article looked at how to use Apache Ranger to create authorization policies for Apache Kafka in the Ranger security admin UI, and how to install the Ranger plugin for Kafka so that it picks up and enforces the authorization policies. In this article, we will cover an alternative way of creating and enforcing authorization policies in Apache Ranger for Apache Kafka using Apache Atlas.

The Apache Ranger security admin UI allows you to assign users or groups a particular permission associated with a given Kafka topic. This is what is called a "Resource Based Policy" in Apache Ranger. However, an alternative is also available, called a "Tag Based Policy". Instead of explicitly associating the user/group + permission with a resource (such as a Kafka topic), we can associate the user/group + permission with a "tag" (we can also create "deny" based policies associated with a "tag"). The "tag" itself contains the information about the resource that is being secured.

How does Apache Ranger obtain the relevant tags and associated entities? This is where Apache Atlas comes in. The previous post described how to secure access to Apache Atlas using Apache Ranger. Apache Atlas allows you to associate "tags" with entities such as Kafka topics, Hive tables, etc. Apache Ranger provides a "tagsync" service which runs periodically and obtains the tags from Apache Atlas and uploads them to Apache Ranger. The Ranger authorization plugin for Kafka downloads the authorization policies, including tags, from the Ranger admin service and evaluates whether access is allowed or not based on the policy evaluation. Let's look at an example...
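To make the difference between the two policy types concrete, here is a deliberately simplified Python sketch. It is not Ranger's actual evaluation algorithm (which has more rules and a defined evaluation order); the "Policy" class, the "is_allowed" function and the user name "client" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    target: str             # a resource name (e.g. a topic) or a tag name
    user: str
    permissions: frozenset
    deny: bool = False


def is_allowed(user, permission, resource,
               resource_policies, tag_policies, tags_by_resource):
    """Simplified check: a matching deny policy wins, otherwise any
    matching allow policy (resource-based or tag-based) grants access."""
    tags = tags_by_resource.get(resource, set())
    candidates = [p for p in tag_policies if p.target in tags]
    candidates += [p for p in resource_policies if p.target == resource]
    matching = [p for p in candidates
                if p.user == user and permission in p.permissions]
    if any(p.deny for p in matching):
        return False
    return any(not p.deny for p in matching)


# The tag -> resource association is what the tagsync service would supply.
tags_by_resource = {"test": {"KafkaTag"}}
tag_policies = [Policy("KafkaTag", "client", frozenset({"consume", "describe"}))]
```

With no resource-based policy at all, "is_allowed("client", "consume", "test", [], tag_policies, tags_by_resource)" is granted purely through the tag, which is the situation the rest of this article sets up.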

1) Start Apache Atlas and create entities/tags for Kafka

The first step is to start Apache Atlas as per the previous tutorial. Note that we are not using the Apache Ranger authorization plugin for Atlas, so there is no need to follow step 2). Next we need to upload the Kafka entity of type "kafka_topic" that we are interested in securing. That can be done via the following command:
  • curl -v -H 'Accept: application/json, text/plain, */*' -H 'Content-Type: application/json;  charset=UTF-8' -u admin:admin -d @kafka-create.json http://localhost:21000/api/atlas/entities
where "kafka-create.json" is defined as:
Once this is done, log in to the admin console using credentials "admin/admin" at http://localhost:21000. Click on "Tags" and then "Create Tag" to create a tag called "KafkaTag". Next go to "Search" and search for the entity we have uploaded ("KafkaTest"). Click on the "+" button under "Tags" and associate the entity with the tag we have created.


2) Start Apache Ranger and create resource-based authorization policies for Kafka

Next we will follow the first tutorial to install Apache Kafka and to get a simple test-case working with SSL authentication, but no authorization (there is no need to start Zookeeper as we already have Apache Atlas running, which starts a Zookeeper instance). Next follow the third tutorial to install the Apache Ranger admin service, as well as the Ranger plugin for Kafka. Create ("resource-based") authorization policies for the Kafka "test" topic in Apache Ranger. There is just one thing we need to change: call the Ranger service "cl1_kafka" instead of "KafkaTest" (this change needs to happen in Ranger, and in the "install.properties" when installing the Ranger plugin to Kafka).

Now verify that the producer has permission to publish to the topic, and that the consumer has permission to consume from the topic. Once this is working, remove the resource-based policy for the consumer, and verify that the consumer no longer has permission to consume from the topic.

3) Use the Apache Ranger TagSync service to import tags from Atlas into Ranger

To create tag-based policies in Apache Ranger, we have to import the entity + tag we have created in Apache Atlas into Ranger via the Ranger TagSync service. After building Apache Ranger, extract the file called "target/ranger-<version>-tagsync.tar.gz". The Ranger TagSync service can obtain tag information from three alternative sources: from Apache Atlas via a Kafka topic, from Apache Atlas via the REST API, and from a file. We will use the REST API of Atlas here. Edit 'install.properties' as follows:
  • Set TAG_SOURCE_ATLAS_ENABLED to "false"
  • Set TAG_SOURCE_ATLASREST_ENABLED to "true"
  • Set TAG_SOURCE_ATLASREST_DOWNLOAD_INTERVAL_IN_MILLIS to "60000" (just for testing purposes)
  • Specify "admin" for both TAG_SOURCE_ATLASREST_USERNAME and TAG_SOURCE_ATLASREST_PASSWORD
Save 'install.properties' and install the tagsync service via "sudo ./setup.sh". It can now be started via "sudo ranger-tagsync-services.sh start".
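If you prefer to script the changes to 'install.properties' rather than edit the file by hand, something like the following Python sketch could be used. The helper "apply_overrides" is hypothetical and assumes simple "KEY=value" lines; the keys and values are the ones listed above.

```python
def apply_overrides(properties_text, overrides):
    """Return properties_text with each KEY=value line updated from
    `overrides`; keys that are not present yet are appended at the end.
    Comment lines and unrelated keys are left unchanged."""
    remaining = dict(overrides)
    lines = []
    for line in properties_text.splitlines():
        key, sep, _ = line.partition("=")
        key = key.strip()
        if sep and key in remaining:
            lines.append(f"{key}={remaining.pop(key)}")
        else:
            lines.append(line)
    lines.extend(f"{key}={value}" for key, value in remaining.items())
    return "\n".join(lines) + "\n"


# The settings described in this section.
TAGSYNC_OVERRIDES = {
    "TAG_SOURCE_ATLAS_ENABLED": "false",
    "TAG_SOURCE_ATLASREST_ENABLED": "true",
    "TAG_SOURCE_ATLASREST_DOWNLOAD_INTERVAL_IN_MILLIS": "60000",
    "TAG_SOURCE_ATLASREST_USERNAME": "admin",
    "TAG_SOURCE_ATLASREST_PASSWORD": "admin",
}
```

The function works on text, so you would read 'install.properties', pass its content through "apply_overrides(content, TAGSYNC_OVERRIDES)" and write the result back before running the setup script.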

4) Create Tag-based authorization policies in Apache Ranger

Now we can create tag-based authorization policies in Apache Ranger. Earlier we used the name "cl1_kafka" for the service name instead of "KafkaTest" as in the previous tutorial. The reason for this is that the service name must match the qualified name attribute of the Kafka entity that we are syncing into Ranger.

In Ranger, click on "Access Manager" and "Tag Based Policies". Create a new "TAG" service called "KafkaTagService". When this is done, go into the new service and click on "Add New Policy". Hit (upper-case) "K" in the "TAG" field and "KafkaTag" should pop up automatically (which shows that the import of tags from Atlas was successful). Add an "allow condition" for the client user with permissions to "consume" and "describe" for "kafka" as shown in the following picture:


Finally, edit the "cl1_kafka" service we created and for "Select Tag Service" select "KafkaTagService" and save. Wait some time for the Ranger plugin to download the new policies and tags, and then try the consumer again. This time it should work! So we have shown how Ranger can create authorization policies based on tags as well as resources.

Monday, February 6, 2017

Securing Apache Atlas using Apache Ranger

Apache Atlas, currently in the Apache Incubator, is a data governance and metadata framework for Apache Hadoop. It allows you to import data from a backend such as Apache Hive or Apache Falcon, and to classify and tag the data according to a set of business rules. In this tutorial we will show how to use Apache Ranger to create authorization policies to secure access to Apache Atlas.

1) Set up Apache Atlas

First let's look at setting up Apache Atlas. Download the latest released version (0.7.1-incubating) and extract it. Build the distribution that contains an embedded HBase and Solr instance via:
  • mvn clean package -Pdist,embedded-hbase-solr -DskipTests
The distribution will then be available in 'distro/target/apache-atlas-0.7.1-incubating-bin'. To launch Atlas, we need to set some variables to tell it to use the local HBase and Solr instances:
  • export MANAGE_LOCAL_HBASE=true
  • export MANAGE_LOCAL_SOLR=true
Before starting Atlas, for testing purposes let's add a new user called 'alice' in the group 'DATA_SCIENTIST' with password 'password'. Edit 'conf/users-credentials.properties' and add:
  • alice=DATA_SCIENTIST::5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8
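The long hexadecimal value in the entry above matches the SHA-256 hex digest of the string 'password'. As a small sketch (the helper name "credentials_line" is made up for this example), an entry for another user could be generated in Python like this:

```python
import hashlib


def credentials_line(username, group, password):
    """Build a 'user=GROUP::<sha256 hex digest>' entry in the same shape
    as the line added to conf/users-credentials.properties above."""
    digest = hashlib.sha256(password.encode("utf-8")).hexdigest()
    return f"{username}={group}::{digest}"


alice_entry = credentials_line("alice", "DATA_SCIENTIST", "password")
```

Here "alice_entry" reproduces the exact line shown above for 'alice'.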
Now let's start Apache Atlas with 'bin/atlas_start.py'. The Apache Atlas web service can be explored via 'http://localhost:21000/'. To populate some sample data in Apache Atlas, run the command 'bin/quick_start.py' (using credentials admin/admin). To see all traits/tags that have been created, use Curl as follows:
  • curl -u alice:password http://localhost:21000/api/atlas/types?type=TRAIT

2) Install the Apache Ranger Atlas plugin

To use Apache Ranger to secure Apache Atlas, the next step is to configure and install the Apache Ranger Atlas plugin. Follow the steps in an earlier tutorial to build Apache Ranger and to set up and start the Apache Ranger Admin service. I recommend using the latest SNAPSHOT of Ranger (0.7.0-SNAPSHOT at this time), as some bugs relating to Atlas support have been fixed since the 0.6.x release. Once this is done, go back to the Apache Ranger distribution that you have built and extract the Atlas plugin:
  • tar zxvf target/ranger-0.7.0-SNAPSHOT-atlas-plugin.tar.gz
Edit 'install.properties' with the following changes:
  • POLICY_MGR_URL=http://localhost:6080
  • Specify location for SQL_CONNECTOR_JAR 
  • Specify REPOSITORY_NAME (AtlasTest)
  • COMPONENT_INSTALL_DIR_NAME pointing to your Atlas install
Now install the plugin via 'sudo ./enable-atlas-plugin.sh'. If you see an error about "libext", then create a new empty directory called "libext" in the Atlas distribution and try again. Note that the Ranger plugin will try to store policies by default in "/etc/ranger/AtlasTest/policycache". As we installed the plugin as "root", make sure that this directory is accessible to the user that is running Apache Atlas. Now restart Apache Atlas to enable the Ranger plugin.

3) Creating authorization policies for Atlas in the Ranger Admin Service

Now that we have set up Apache Atlas to use Apache Ranger for authorization, what remains is to start the Apache Ranger Admin Service and to create some authorization policies. Start Apache Ranger ('sudo ranger-admin start'). Log in to 'http://localhost:6080/' (credentials admin/admin). Click on the "+" button for Atlas, and specify the following fields:
  • Service Name: AtlasTest
  • Username: admin
  • Password: admin
  • atlas.rest.address: http://localhost:21000
Click on "Test Connection" to make sure that we can communicate successfully with Apache Atlas and then "Add". Click on the new link for "AtlasTest". Let's see if our new user "alice" is authorized to read the tags in Atlas. Execute the Curl command defined above (allowing 30 seconds for the Ranger plugin to pull the policies from the Ranger Admin Service). You should see a 403 Forbidden message from Atlas.
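The same check can be scripted instead of using Curl. The following is a minimal Python sketch using only the standard library; the function names "build_request" and "fetch_status" are made up for this example, and the URL and credentials are the ones used earlier in this tutorial.

```python
import base64
import urllib.error
import urllib.request

ATLAS_TYPES_URL = "http://localhost:21000/api/atlas/types?type=TRAIT"


def build_request(username, password, url=ATLAS_TYPES_URL):
    """Create a GET request carrying HTTP Basic credentials,
    equivalent to 'curl -u username:password url'."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token.decode("ascii"))
    return request


def fetch_status(request):
    """Send the request and return the HTTP status code, including
    error codes such as 403 (which urllib raises as HTTPError)."""
    try:
        with urllib.request.urlopen(request) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code


alice_request = build_request("alice", "password")
```

Calling "fetch_status(alice_request)" against a running Atlas instance should return 403 at this point, as "alice" has not yet been granted any permission in Ranger.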

Now let's update the authorization policies to allow "alice" to read the tags. Back in Apache Ranger, click on "Settings" and then "Users/Groups" and "Groups". Click on "Add new group" and enter "DATA_SCIENTIST" for the name. Now go back into "AtlasTest", and edit the policy called "all - type". Create a new "Allow Condition" for the group "DATA_SCIENTIST" with permission "read" and click "Save". After waiting some time for the policies to sync, try again with the "Curl" command and it should work.


Thursday, February 2, 2017

Authenticating users in the Apache Ranger Admin Service via PAM

Over the past few months, I've written various tutorials about different ways you can authenticate to the Apache Ranger Admin Service. In summary, here are the options that have been covered so far:
The remaining option is to authenticate users directly to the local UNIX machine. There is a legacy way of doing this that supports authentication using shadow files. However, a much better approach is to support user authentication using Pluggable Authentication Modules (PAM). This means we can delegate user authentication to various PAM modules, and so we have a wide range of user authentication options. In this post we will show how to configure the Ranger Admin Service to authenticate users on a local Linux machine using PAM. There is also an excellent in-depth tutorial that covers PAM and Ranger available here.

1) Configuring the Apache Ranger Admin Service to use PAM for authentication

Follow the steps in a previous tutorial to build Apache Ranger and to set up and install the Apache Ranger Admin service. Edit 'conf/ranger-admin-site.xml' and change the following configuration value:
  • ranger.authentication.method: PAM

2) Add a PAM configuration file for Apache Ranger

The next step is to add a PAM configuration file for Apache Ranger. Create a file called '/etc/pam.d/ranger-admin' with the content:
  • auth    required    pam_unix.so
  • account    required    pam_unix.so
Essentially this means that we are delegating authentication to the local UNIX machine. Now start the Apache Ranger Admin service. You should be able to log on to http://localhost:6080/login.jsp using the credentials of a local user.