Archive for: February, 2016

How to semi-automate deploying dev hdp cluster

Purpose of this article:

When you install HDP for a dev/test environment, you end up repeating the same commands to set up your host OS. To save time, I created a BASH script which helps to set up the host OS (Ubuntu only) and a Docker image (CentOS).

 

What this script does:

  1. Installs packages on the Ubuntu host OS
  2. Sets up Docker, such as creating an image and spawning containers
  3. [Optional] Sets up a local repository for HDP (not Ambari) with Apache2

 

What this script does NOT do:

  1. As of this writing, it does not install HDP itself.
  2. Please use Ambari Blueprints if you would like to automate the HDP installation as well:
    http://crazyadmins.com/automate-hdp-installation-using-ambari-blueprints-part-2/
  3. These steps are NOT for a production environment, but they are useful for testing HA components.

Host setup steps:

 

Step 1: Install Ubuntu 14.x LTS on VirtualBox/VMware/Azure/AWS.

It should be easy to deploy an Ubuntu VM if you use Azure or AWS.
If you are using VirtualBox/VMware, you might want to keep a freshly installed Ubuntu VM as a template, so that you can clone it later.

 

Step 2: Log in to Ubuntu and become root (sudo -i)

 

Step 3: Download the script using the command below

wget https://raw.githubusercontent.com/hajimeo/samples/master/bash/start_hdp.sh -O ./start_hdp.sh && chmod u+x ./start_hdp.sh

 

Step 4: Start the script in install mode

./start_hdp.sh -i

 

Step 5: Answer the interview questions

The script will ask a few questions, such as your choice of guest OS, Ambari version, HDP version, etc. Normally the default values should be OK, so you can just keep pressing the Enter key.
NOTE: At the end of the interview, it asks you to save your answers in a text file. You can reuse this file to skip the interview when you install a new cluster.

 

Step 6: Confirm your answers 

After saving your responses, it will ask you “Would you like to start setup this host? [Y]:”. If you answer yes, it starts setting up your Ubuntu host OS. After a while, the script finishes, or stops if there is an error.

The time required depends on your choices. If you selected to set up a local repo, downloading the repository may take a long time.

 

Step 7: Complete the setup

Once the script has completed successfully, your chosen version of Ambari Server should be installed and running in the specified Docker container on port 8080.

NOTE: At the moment, the Docker containers are in a private network, so you need to do one of the following (“1” would be the easiest):

  1. The following command creates a SOCKS proxy from your local PC on port 18080:

ssh -D 18080 username@ubuntu-hostname

  2. The following command forwards your localhost:8080 to node1:8080:

ssh -L 8080:node1.localdomain:8080 username@ubuntu-hostname

  3. Set up a proper proxy, such as Squid.

If you decided to set up a proxy, installing a browser add-on such as “SwitchySharp” can be handy.
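Once a tunnel is up, you can quickly verify that Ambari is reachable before opening a browser. A minimal sketch using curl on your local PC, assuming the tunnels from options 1 and 2 above:

```shell
# After running: ssh -L 8080:node1.localdomain:8080 username@ubuntu-hostname
# Ambari should answer on your local port 8080:
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080/
# A 200 (or a redirect such as 302) means the tunnel and Ambari Server are up.

# With the SOCKS proxy from option 1, curl can use the proxy directly:
curl -s -o /dev/null -w "%{http_code}\n" --socks5-hostname localhost:18080 http://node1.localdomain:8080/
```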

  1. Once you have confirmed that you can use the Ambari web interface, please proceed to install HDP.
    If you chose to set up an HDP local repository, please replace “public-repo-1.hortonworks.com” with “dockerhost1.localdomain” (if you used the default value)
  2. The private key is /root/.ssh/id_rsa on every node
  3. The remaining steps are the same as a normal HDP installation.
    NOTE: if you decided to install an older Ambari version, there is a known issue, AMBARI-8620

 

Host start-up steps

If you shut down the VM, next time you can just run “./start_hdp.sh -s”, which starts up the containers, Ambari Server, Ambari Agents, and the HDP services.

 

How to semi-automate deploying dev hdp cluster – Did you like this article? Please feel free to send an email to info@crazyadmins.com if you have any further questions. Please don’t forget to like our Facebook page. Happy Hadooping!! :)

 


Automate HDP installation using Ambari Blueprints – Part 2

In the previous post we saw how to install a single-node HDP cluster using Ambari Blueprints. In this post we will see how to automate a multi-node HDP installation using Ambari Blueprints.

 

Below are simple steps to install a multi-node HDP cluster using an internal repository via Ambari Blueprints.

 

Step 1: Install Ambari server using the steps mentioned under the link below

http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Installing_HDP_AMB/content/_download_the_ambari_repo_lnx6.html

 

Step 2: Register ambari-agent manually

Install the ambari-agent package on all the nodes in the cluster and set hostname to the Ambari server host (FQDN) in /etc/ambari-agent/conf/ambari-agent.ini
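The agent registration above can be scripted. A minimal sketch for RHEL/CentOS nodes, assuming the Ambari repo is already configured and using ambari.crazyadmins.com as a stand-in for your Ambari server's FQDN:

```shell
# Run on every cluster node as root
yum install -y ambari-agent

# Point the agent at the Ambari server (replace the FQDN with your own)
sed -i 's/^hostname=.*/hostname=ambari.crazyadmins.com/' /etc/ambari-agent/conf/ambari-agent.ini

ambari-agent start
```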

 

Step 3: Configure blueprints

Please follow below steps to create Blueprints

 

3.1 Create hostmapping.json file as shown below:

Note – This file contains information about all the hosts which are part of your HDP cluster.

{
"blueprint" : "multinode-hdp",
"default_password" : "hadoop",
"host_groups" :[
   {
     "name" : "host2",
     "hosts" : [
       {
         "fqdn" : "host2.crazyadmins.com"
       }
     ]
   },
   {
     "name" : "host3",
     "hosts" : [
       {
         "fqdn" : "host3.crazyadmins.com"
       }
     ]
   },
   {
     "name" : "host4",
     "hosts" : [
       {
         "fqdn" : "host4.crazyadmins.com"
       }
     ]
   }
]
}

 

3.2 Create a cluster_configuration.json file; it contains the mapping of hosts to HDP components

{
 "configurations": [],
 "host_groups": [{
 "name": "host2",
 "components": [{
 "name": "PIG"
 }, {
 "name": "METRICS_COLLECTOR"
 }, {
 "name": "KAFKA_BROKER"
 }, {
 "name": "HISTORYSERVER"
 }, {
 "name": "HBASE_REGIONSERVER"
 }, {
 "name": "OOZIE_CLIENT"
 }, {
 "name": "HBASE_CLIENT"
 }, {
 "name": "NAMENODE"
 }, {
 "name": "SUPERVISOR"
 }, {
 "name": "HCAT"
 }, {
 "name": "METRICS_MONITOR"
 }, {
 "name": "APP_TIMELINE_SERVER"
 }, {
 "name": "NODEMANAGER"
 }, {
 "name": "HDFS_CLIENT"
 }, {
 "name": "HIVE_CLIENT"
 }, {
 "name": "FLUME_HANDLER"
 }, {
 "name": "DATANODE"
 }, {
 "name": "WEBHCAT_SERVER"
 }, {
 "name": "ZOOKEEPER_CLIENT"
 }, {
 "name": "ZOOKEEPER_SERVER"
 }, {
 "name": "STORM_UI_SERVER"
 }, {
 "name": "HIVE_SERVER"
 }, {
 "name": "FALCON_CLIENT"
 }, {
 "name": "TEZ_CLIENT"
 }, {
 "name": "HIVE_METASTORE"
 }, {
 "name": "SQOOP"
 }, {
 "name": "YARN_CLIENT"
 }, {
 "name": "MAPREDUCE2_CLIENT"
 }, {
 "name": "NIMBUS"
 }, {
 "name": "DRPC_SERVER"
 }],
 "cardinality": "1"
 }, {
 "name": "host3",
 "components": [{
 "name": "ZOOKEEPER_SERVER"
 }, {
 "name": "OOZIE_SERVER"
 }, {
 "name": "SECONDARY_NAMENODE"
 }, {
 "name": "FALCON_SERVER"
 }, {
 "name": "ZOOKEEPER_CLIENT"
 }, {
 "name": "PIG"
 }, {
 "name": "KAFKA_BROKER"
 }, {
 "name": "OOZIE_CLIENT"
 }, {
 "name": "HBASE_REGIONSERVER"
 }, {
 "name": "HBASE_CLIENT"
 }, {
 "name": "HCAT"
 }, {
 "name": "METRICS_MONITOR"
 }, {
 "name": "FALCON_CLIENT"
 }, {
 "name": "TEZ_CLIENT"
 }, {
 "name": "SQOOP"
 }, {
 "name": "HIVE_CLIENT"
 }, {
 "name": "HDFS_CLIENT"
 }, {
 "name": "NODEMANAGER"
 }, {
 "name": "YARN_CLIENT"
 }, {
 "name": "MAPREDUCE2_CLIENT"
 }, {
 "name": "DATANODE"
 }],
 "cardinality": "1"
 }, {
 "name": "host4",
 "components": [{
 "name": "ZOOKEEPER_SERVER"
 }, {
 "name": "ZOOKEEPER_CLIENT"
 }, {
 "name": "PIG"
 }, {
 "name": "KAFKA_BROKER"
 }, {
 "name": "OOZIE_CLIENT"
 }, {
 "name": "HBASE_MASTER"
 }, {
 "name": "HBASE_REGIONSERVER"
 }, {
 "name": "HBASE_CLIENT"
 }, {
 "name": "HCAT"
 }, {
 "name": "RESOURCEMANAGER"
 }, {
 "name": "METRICS_MONITOR"
 }, {
 "name": "FALCON_CLIENT"
 }, {
 "name": "TEZ_CLIENT"
 }, {
 "name": "SQOOP"
 }, {
 "name": "HIVE_CLIENT"
 }, {
 "name": "HDFS_CLIENT"
 }, {
 "name": "NODEMANAGER"
 }, {
 "name": "YARN_CLIENT"
 }, {
 "name": "MAPREDUCE2_CLIENT"
 }, {
 "name": "DATANODE"
 }],
 "cardinality": "1"
 }],
 "Blueprints": {
 "blueprint_name": "multinode-hdp",
 "stack_name": "HDP",
 "stack_version": "2.3"
 }
}
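A typo in either JSON file will make the later curl calls fail with an unhelpful error, so it is worth validating both files first. A quick check using Python's built-in json.tool (the interpreter is `python` on stock CentOS 6, `python3` on newer systems):

```shell
python -m json.tool hostmapping.json > /dev/null && echo "hostmapping.json is valid JSON"
python -m json.tool cluster_configuration.json > /dev/null && echo "cluster_configuration.json is valid JSON"
```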

 

Step 4: Create an internal repository map

 

4.1: HDP repository – copy the contents below, modify base_url to the hostname/IP address of your internal repository server, and save it in a repo.json file.

{
"Repositories" : {
   "base_url" : "http://<ip-address-of-repo-server>/hdp/centos6/HDP-2.3.4.0",
   "verify_base_url" : true
}
}

 

4.2: HDP-UTILS repository – copy the contents below, modify base_url to the hostname/IP address of your internal repository server, and save it in an hdputils-repo.json file.

{
"Repositories" : {
   "base_url" : "http://<ip-address-of-repo-server>/hdp/centos6/HDP-UTILS-1.1.0.20",
   "verify_base_url" : true
}
}

 

Step 5: Register the blueprint with the Ambari server by executing the command below

curl -H "X-Requested-By: ambari" -X POST -u admin:admin http://<ambari-server-hostname>:8080/api/v1/blueprints/multinode-hdp -d @cluster_configuration.json

 

Step 6: Set up the internal repo via the REST API.

Execute the curl calls below to set up the internal repositories.

curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://<ambari-server-hostname>:8080/api/v1/stacks/HDP/versions/2.3/operating_systems/redhat6/repositories/HDP-2.3 -d @repo.json

 

curl -H "X-Requested-By: ambari" -X PUT -u admin:admin http://<ambari-server-hostname>:8080/api/v1/stacks/HDP/versions/2.3/operating_systems/redhat6/repositories/HDP-UTILS-1.1.0.20 -d @hdputils-repo.json

 

Step 7: Pull the trigger! The command below will start the cluster installation.

curl -H "X-Requested-By: ambari" -X POST -u admin:admin http://<ambari-server-hostname>:8080/api/v1/clusters/multinode-hdp -d @hostmapping.json
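The cluster-create POST returns a JSON body with an href to a request resource, which you can poll to track installation progress instead of watching the Ambari UI. A sketch, assuming this is the first request (id 1) on the server:

```shell
# The cluster-create call returns something like:
#   { "href" : "http://<ambari-server-hostname>:8080/api/v1/clusters/multinode-hdp/requests/1", ... }
# Poll that request to watch progress and status:
curl -s -H "X-Requested-By: ambari" -u admin:admin \
  "http://<ambari-server-hostname>:8080/api/v1/clusters/multinode-hdp/requests/1?fields=Requests/progress_percent,Requests/request_status"
```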

 

Please feel free to comment or send us an email to info@crazyadmins.com if you need any further help on this. Happy Hadooping!! :)

 

 


How to integrate Ranger with LDAP

In this blog post we will see how to integrate Ranger with LDAP.

 

Ranger

 

What is Ranger ?

Ranger is an open-source utility to control authorization for different Hadoop components, such as HDFS, Hive, HBase, YARN, etc., in a centralized way. Apache Ranger also keeps auditing information related to these Hadoop components, which can be useful for tracking purposes.

 

What is LDAP?

LDAP stands for Lightweight Directory Access Protocol. LDAP is an application protocol used over an IP network to manage and access a distributed directory information service. The primary purpose of a directory service is to provide a systematic set of records, usually organized in a hierarchical structure.

 

I’m assuming that you have already installed an OpenLDAP server; if not, please follow this link to install OpenLDAP on CentOS 6.x/RHEL 6.x (please skip the SSL part while installing OpenLDAP from the given link).

Note – For your reference, I have attached my slapd.conf file; please download it from here.

 

Setup Environment:

HDP Version: 2.3.2

Ambari Version: 2.1.2

Ranger Version: 0.5.0.2.3

Openldap Version: 2.4.40-7.el6_7.x86_64

 

Below are the configuration changes we need to make in order to implement LDAP authentication for Ranger.

Note – I’m adding changes according to my slapd.conf file; you might need to modify the configurations according to your OpenLDAP settings.

 

Step 1: Log in to the Ambari UI, select the Ranger service, and go to the configuration tab.

 

Step 2: Under the “Ranger Settings” section, select the authentication method “LDAP”

 


 

Step 3: Under the “LDAP Settings” section, add the configuration properties below

 

ranger.ldap.user.searchfilter = (uid={0})
ranger.ldap.user.dnpattern = cn=Manager,dc=example,dc=com
ranger.ldap.url = ldap://<ip-address-of-openldap-server>:389
ranger.ldap.referral = ignore
ranger.ldap.group.roleattribute = uid
ranger.ldap.bind.password = *****     <-- Admin password of openldap
ranger.ldap.bind.dn = cn=Manager,dc=example,dc=com
ranger.ldap.base.dn = dc=example,dc=com
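Before saving, you can confirm that the bind DN and user search filter actually work against OpenLDAP. A sketch using ldapsearch (from the openldap-clients package); student1 is one of the example users that shows up in the sync logs later in this post:

```shell
# Prompts for the Manager password (-W); returns the user's entry on success
ldapsearch -x -H ldap://<ip-address-of-openldap-server>:389 \
  -D "cn=Manager,dc=example,dc=com" -W \
  -b "ou=users,dc=example,dc=com" "(uid=student1)"
# If this returns the entry, the same DN/base/filter values should work for Ranger.
```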

 


 

Step 4: Under the “Advanced ranger-admin-site” section, set the properties below

 

ranger.ldap.group.searchfilter = (member=uid={0},ou=users,dc=example,dc=com)
ranger.ldap.group.searchbase = dc=example,dc=com

 


 

Step 5: Under the “Advanced ranger-ugsync-site” section, set the properties below

 

ranger.usersync.ldap.username.caseconversion = none
ranger.usersync.group.memberattributename = member
ranger.usersync.group.nameattribute = cn
ranger.usersync.group.objectclass = groupofnames
ranger.usersync.group.searchbase = dc=example,dc=com
ranger.usersync.group.searchenabled = false
ranger.usersync.group.searchscope = sub
ranger.usersync.group.usermapsyncenabled = false
ranger.usersync.ldap.user.searchscope = sub
ranger.usersync.ldap.user.searchbase= ou=users,dc=example,dc=com
ranger.usersync.ldap.user.objectclass = person
ranger.usersync.ldap.user.nameattribute = uid
ranger.usersync.ldap.url = ldap://<ip-address-of-openldap-server>:389
ranger.usersync.ldap.searchBase = dc=example,dc=com
ranger.usersync.ldap.referral = ignore
ranger.usersync.ldap.ldapbindpassword = *****   <-- openldap admin password
ranger.usersync.ldap.groupname.caseconversion = none
ranger.usersync.ldap.binddn = cn=Manager,dc=example,dc=com
ranger.usersync.ldap.bindalias = ranger.usersync.ldap.bindalias
ranger.usersync.source.impl.class = org.apache.ranger.ldapusersync.process.LdapUserGroupBuilder
ranger.usersync.sink.impl.class = org.apache.ranger.unixusersync.process.PolicyMgrUserGroupBuilder

 


 

Step 6: Save all the above configuration changes and restart all the affected services from the Ambari UI

 

Step 7: Log in to the Ranger UI using the admin account and check whether all the LDAP users have been synced by the Ranger usersync process. If for some reason you cannot see the OpenLDAP users in the Ranger UI, check the usersync daemon’s log to figure out what went wrong.

 

Note – You should see lines like the ones below in the usersync logs, which confirm that your OpenLDAP users are getting synced correctly into Ranger.

 

12 Feb 2016 11:15:08 INFO UserGroupSync [UnixUserSyncThread] - Begin: update user/group from source==>sink
12 Feb 2016 11:15:08 INFO LdapUserGroupBuilder [UnixUserSyncThread] - LDAPUserGroupBuilder updateSink started
12 Feb 2016 11:15:08 INFO LdapUserGroupBuilder [UnixUserSyncThread] - LdapUserGroupBuilder initialization started
12 Feb 2016 11:15:08 INFO LdapUserGroupBuilder [UnixUserSyncThread] - LdapUserGroupBuilder initialization completed with -- ldapUrl: ldap://172.25.17.3:389, ldapBindDn: cn=Manager,dc=example,dc=com, ldapBindPassword: ***** , ldapAuthenticationMechanism: simple, searchBase: dc=example,dc=com, userSearchBase: ou=users,dc=example,dc=com, userSearchScope: 2, userObjectClass: person, userSearchFilter: , extendedUserSearchFilter: (objectclass=person), userNameAttribute: uid, userSearchAttributes: [uid, ismemberof, memberof], userGroupNameAttributeSet: [ismemberof, memberof], pagedResultsEnabled: true, pagedResultsSize: 500, groupSearchEnabled: false, groupSearchBase: dc=example,dc=com, groupSearchScope: 2, groupObjectClass: groupofnames, groupSearchFilter: , extendedGroupSearchFilter: (&(objectclass=groupofnames)(member={0})), extendedAllGroupsSearchFilter: (&(objectclass=groupofnames)), groupMemberAttributeName: member, groupNameAttribute: cn, groupUserMapSyncEnabled: false, ldapReferral: ignore
12 Feb 2016 11:15:08 INFO LdapUserGroupBuilder [UnixUserSyncThread] - Updating user count: 1, userName: student1, groupList: []
12 Feb 2016 11:15:08 INFO LdapUserGroupBuilder [UnixUserSyncThread] - Updating user count: 2, userName: student2, groupList: []
12 Feb 2016 11:15:08 INFO LdapUserGroupBuilder [UnixUserSyncThread] - LDAPUserGroupBuilder.updateSink() completed with user count: 2
12 Feb 2016 11:15:08 INFO UserGroupSync [UnixUserSyncThread] - End: update user/group from source==>sink
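If you want to script this health check, grepping the usersync log for the completion line works well. A sketch; the log location varies per install, but /var/log/ranger/usersync/usersync.log is a common default:

```shell
# Exit status 0 means at least one successful sync has completed
grep -E 'updateSink\(\) completed with user count: [0-9]+' /var/log/ranger/usersync/usersync.log
```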

 

Step 8: Now try to log in to the Ranger UI with any OpenLDAP user, and you should be able to get in :)
