
How to set up cross-realm trust between two MIT KDCs

How to set up cross-realm trust between two MIT KDCs – In this post, we will see how to set up cross-realm trust between two MIT KDCs. Once the trust is configured correctly, one cluster can access and copy data from the other.

 

In our example, we have two clusters running the same HDP version (2.4.2.0) and Ambari version (2.2.2.0).

Cluster 1:

172.26.68.47 hwx-1.hwx.com  hwx-1
172.26.68.46 hwx-2.hwx.com  hwx-2
172.26.68.45 hwx-3.hwx.com  hwx-3

Cluster 2:

172.26.68.48 support-1.support.com   support-1
172.26.68.49 support-2.support.com   support-2
172.26.68.50 support-3.support.com   support-3

 

Below are the steps:

 

Step 1: Make sure both clusters are Kerberized with an MIT KDC. You can use the automated script below for configuring Kerberos on HDP:

https://community.hortonworks.com/articles/29203/automated-kerberos-installation-and-configuration.html
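
Before proceeding, it is worth confirming that Kerberos actually works on each cluster. A minimal smoke test, assuming the HDP default keytab location and the headless hdfs principals that appear later in this post (hdfs-hadoop@HWX.COM on Cluster 1, hdfs-support@SUPPORT.COM on Cluster 2):

# On a Cluster 1 node: authenticate with the hdfs headless keytab and list HDFS
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-hadoop@HWX.COM
klist
hdfs dfs -ls /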

 

Step 2: Please configure the /etc/hosts file on both clusters to have IP <-> hostname mappings for all nodes of both clusters.

Example:

On both clusters, the /etc/hosts file should look like below:

172.26.68.47 hwx-1.hwx.com  hwx-1
172.26.68.46 hwx-2.hwx.com  hwx-2
172.26.68.45 hwx-3.hwx.com  hwx-3
172.26.68.48 support-1.support.com   support-1
172.26.68.49 support-2.support.com   support-2
172.26.68.50 support-3.support.com   support-3
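
To push the combined mappings to every node, a simple loop is enough. A sketch, assuming passwordless SSH as root and that you have already assembled the file above locally:

# Distribute the combined /etc/hosts to all nodes of both clusters
for host in hwx-1.hwx.com hwx-2.hwx.com hwx-3.hwx.com \
            support-1.support.com support-2.support.com support-3.support.com; do
  scp /etc/hosts root@${host}:/etc/hosts
done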

 

Step 3: Configure krb5.conf:

 

3.1 Configure the [realms] section to add the other cluster’s KDC server details – this is required so that each cluster can find the KDC that authenticates users belonging to the other realm.

Example on Cluster 1:

[realms]
  HWX.COM = {
    admin_server = hwx-1.hwx.com
    kdc = hwx-1.hwx.com
  }
  SUPPORT.COM = {
    admin_server = support-1.support.com
    kdc = support-1.support.com
  }

3.2 Configure the [domain_realm] section to add the other cluster’s domain <-> realm mapping.

[domain_realm]
  .hwx.com = HWX.COM
  hwx.com = HWX.COM
  .support.com = SUPPORT.COM
  support.com = SUPPORT.COM

3.3 Configure the [capaths] section to add a direct trust path to the other cluster’s realm:

[capaths]
    HWX.COM = {
         SUPPORT.COM = .
    }

On Cluster 1, the krb5.conf should look like below:

[libdefaults]
  renew_lifetime = 7d
  forwardable = true
  default_realm = HWX.COM
  ticket_lifetime = 24h
  dns_lookup_realm = false
  dns_lookup_kdc = false
  #default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
  #default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
[logging]
  default = FILE:/var/log/krb5kdc.log
  admin_server = FILE:/var/log/kadmind.log
  kdc = FILE:/var/log/krb5kdc.log
[realms]
  HWX.COM = {
    admin_server = hwx-1.hwx.com
    kdc = hwx-1.hwx.com
  }
  SUPPORT.COM = {
    admin_server = support-1.support.com
    kdc = support-1.support.com
  }
[domain_realm]
  .hwx.com = HWX.COM
  hwx.com = HWX.COM
  .support.com = SUPPORT.COM
  support.com = SUPPORT.COM
[capaths]
    HWX.COM = {
         SUPPORT.COM = .
    }

Note – Please copy the modified /etc/krb5.conf to all the nodes in Cluster 1.
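
A quick way to do that, assuming passwordless SSH as root (adjust the host list to your environment):

# Copy the updated krb5.conf from hwx-1 to the remaining Cluster 1 nodes
for host in hwx-2.hwx.com hwx-3.hwx.com; do
  scp /etc/krb5.conf root@${host}:/etc/krb5.conf
done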

 

Similarly, on Cluster 2 the krb5.conf should look like below:

[libdefaults]
  renew_lifetime = 7d
  forwardable = true
  default_realm = SUPPORT.COM
  ticket_lifetime = 24h
  dns_lookup_realm = false
  dns_lookup_kdc = false
  #default_tgs_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
  #default_tkt_enctypes = aes des3-cbc-sha1 rc4 des-cbc-md5
[logging]
  default = FILE:/var/log/krb5kdc.log
  admin_server = FILE:/var/log/kadmind.log
  kdc = FILE:/var/log/krb5kdc.log
[realms]
  SUPPORT.COM = {
    admin_server = support-1.support.com
    kdc = support-1.support.com
  }
  HWX.COM = {
    admin_server = hwx-1.hwx.com
    kdc = hwx-1.hwx.com
  }
[domain_realm]
  .hwx.com = HWX.COM
  hwx.com = HWX.COM
  .support.com = SUPPORT.COM
  support.com = SUPPORT.COM
[capaths]
    SUPPORT.COM = {
        HWX.COM = .
    }

Note – Please copy the modified /etc/krb5.conf to all the nodes in Cluster 2 (the same approach as above works here).

 

Step 4: Modify the below property in hdfs-site.xml on the cluster from which you want to execute the distcp command (specifically speaking – the client side):

dfs.namenode.kerberos.principal.pattern=*
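
Via Ambari you only need to set this key/value pair in the HDFS configs and restart the required services. For reference, in raw hdfs-site.xml the same setting takes the standard Hadoop XML form:

<property>
  <name>dfs.namenode.kerberos.principal.pattern</name>
  <value>*</value>
</property>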

 

Step 5: Add the cross-realm trust principals to both KDCs. Each principal must be created with the same password in both KDCs, so that the resulting keys are identical.

On both Cluster 1 and Cluster 2, execute the below commands in the kadmin utility:

addprinc krbtgt/HWX.COM@SUPPORT.COM
addprinc krbtgt/SUPPORT.COM@HWX.COM
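
You can confirm the principals exist on each KDC with kadmin.local on the KDC host. A sketch; the key version number (kvno) and encryption types printed by getprinc should match across the two KDCs, since the trust relies on identical keys:

kadmin.local -q "getprinc krbtgt/HWX.COM@SUPPORT.COM"
kadmin.local -q "getprinc krbtgt/SUPPORT.COM@HWX.COM"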

 

Step 6: Configure the auth_to_local rules (the hadoop.security.auth_to_local property in core-site.xml) on both clusters:

On Cluster 1, append the auth_to_local rules from Cluster 2.

Example on Cluster 1:

RULE:[1:$1@$0](ambari-qa-hadoop@HWX.COM)s/.*/ambari-qa/
RULE:[1:$1@$0](hdfs-hadoop@HWX.COM)s/.*/hdfs/
RULE:[1:$1@$0](spark-hadoop@HWX.COM)s/.*/spark/
RULE:[1:$1@$0](.*@HWX.COM)s/@.*//
RULE:[2:$1@$0](dn@HWX.COM)s/.*/hdfs/
RULE:[2:$1@$0](hive@HWX.COM)s/.*/hive/
RULE:[2:$1@$0](jhs@HWX.COM)s/.*/mapred/
RULE:[2:$1@$0](jn@HWX.COM)s/.*/hdfs/
RULE:[2:$1@$0](nm@HWX.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@HWX.COM)s/.*/hdfs/
RULE:[2:$1@$0](rm@HWX.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@HWX.COM)s/.*/yarn/
DEFAULT
RULE:[1:$1@$0](ambari-qa-support@SUPPORT.COM)s/.*/ambari-qa/
RULE:[1:$1@$0](hdfs-support@SUPPORT.COM)s/.*/hdfs/
RULE:[1:$1@$0](spark-support@SUPPORT.COM)s/.*/spark/
RULE:[1:$1@$0](.*@SUPPORT.COM)s/@.*//
RULE:[2:$1@$0](dn@SUPPORT.COM)s/.*/hdfs/
RULE:[2:$1@$0](hive@SUPPORT.COM)s/.*/hive/
RULE:[2:$1@$0](jhs@SUPPORT.COM)s/.*/mapred/
RULE:[2:$1@$0](jn@SUPPORT.COM)s/.*/hdfs/
RULE:[2:$1@$0](nm@SUPPORT.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@SUPPORT.COM)s/.*/hdfs/
RULE:[2:$1@$0](rm@SUPPORT.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@SUPPORT.COM)s/.*/yarn/

 

On Cluster 2, append the auth_to_local rules from Cluster 1.

 

Example on Cluster 2:

RULE:[1:$1@$0](ambari-qa-support@SUPPORT.COM)s/.*/ambari-qa/
RULE:[1:$1@$0](hdfs-support@SUPPORT.COM)s/.*/hdfs/
RULE:[1:$1@$0](spark-support@SUPPORT.COM)s/.*/spark/
RULE:[1:$1@$0](.*@SUPPORT.COM)s/@.*//
RULE:[2:$1@$0](dn@SUPPORT.COM)s/.*/hdfs/
RULE:[2:$1@$0](hive@SUPPORT.COM)s/.*/hive/
RULE:[2:$1@$0](jhs@SUPPORT.COM)s/.*/mapred/
RULE:[2:$1@$0](jn@SUPPORT.COM)s/.*/hdfs/
RULE:[2:$1@$0](nm@SUPPORT.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@SUPPORT.COM)s/.*/hdfs/
RULE:[2:$1@$0](rm@SUPPORT.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@SUPPORT.COM)s/.*/yarn/
DEFAULT
RULE:[1:$1@$0](ambari-qa-hadoop@HWX.COM)s/.*/ambari-qa/
RULE:[1:$1@$0](hdfs-hadoop@HWX.COM)s/.*/hdfs/
RULE:[1:$1@$0](spark-hadoop@HWX.COM)s/.*/spark/
RULE:[1:$1@$0](.*@HWX.COM)s/@.*//
RULE:[2:$1@$0](dn@HWX.COM)s/.*/hdfs/
RULE:[2:$1@$0](hive@HWX.COM)s/.*/hive/
RULE:[2:$1@$0](jhs@HWX.COM)s/.*/mapred/
RULE:[2:$1@$0](jn@HWX.COM)s/.*/hdfs/
RULE:[2:$1@$0](nm@HWX.COM)s/.*/yarn/
RULE:[2:$1@$0](nn@HWX.COM)s/.*/hdfs/
RULE:[2:$1@$0](rm@HWX.COM)s/.*/yarn/
RULE:[2:$1@$0](yarn@HWX.COM)s/.*/yarn/
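
After restarting the affected services, you can sanity-check how a principal maps on either cluster using Hadoop’s built-in rule evaluator (part of hadoop-common; it applies the configured auth_to_local rules):

# Both commands should print a mapping to the local 'hdfs' user
hadoop org.apache.hadoop.security.HadoopKerberosName hdfs-hadoop@HWX.COM
hadoop org.apache.hadoop.security.HadoopKerberosName nn/hwx-1.hwx.com@HWX.COM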

 

Step 7: Add a common user principal to both KDCs. Execute the below commands on both KDCs, keeping the same password for both principals.

For Cluster 1:

7.1. Log in to kadmin

7.2. Execute the below command to add a user principal:

addprinc kuldeepk@HWX.COM

 

For Cluster 2:

7.3. Log in to kadmin

7.4. Execute the below command to add a user principal:

addprinc kuldeepk@SUPPORT.COM
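
To verify, kinit as the new user and inspect the ticket cache. After the cross-realm access in Step 8 succeeds, klist on Cluster 2 should additionally show a krbtgt/HWX.COM@SUPPORT.COM entry:

# On a Cluster 2 node
kinit kuldeepk@SUPPORT.COM
klist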

 

Step 8: Log in to Cluster 2, do a kinit as the user created above, and try to access HDFS files of Cluster 1.

Example:

hdfs dfs -ls hdfs://hwx-2.hwx.com:8020/tmp
Found 8 items
drwx------   - ambari-qa hdfs          0 2016-07-29 23:24 hdfs://hwx-2.hwx.com:8020/tmp/ambari-qa
drwxr-xr-x   - hdfs      hdfs          0 2016-07-29 22:02 hdfs://hwx-2.hwx.com:8020/tmp/entity-file-history
drwx-wx-wx   - ambari-qa hdfs          0 2016-07-29 23:25 hdfs://hwx-2.hwx.com:8020/tmp/hive
-rwxr-xr-x   3 hdfs      hdfs       1414 2016-07-29 23:50 hdfs://hwx-2.hwx.com:8020/tmp/id1aac2d44_date502916
-rwxr-xr-x   3 ambari-qa hdfs       1414 2016-07-29 23:26 hdfs://hwx-2.hwx.com:8020/tmp/idtest.ambari-qa.1469834803.19.in
-rwxr-xr-x   3 ambari-qa hdfs        957 2016-07-29 23:26 hdfs://hwx-2.hwx.com:8020/tmp/idtest.ambari-qa.1469834803.19.pig
drwxr-xr-x   - ambari-qa hdfs          0 2016-07-29 23:53 hdfs://hwx-2.hwx.com:8020/tmp/tezsmokeinput

Note – hwx-2.hwx.com is the Active Namenode of Cluster 1.

 

You can also copy files across the clusters using distcp. The example below, run from Cluster 2, pulls a file from Cluster 1 into Cluster 2.

 

Example:

[kuldeepk@support-1 root]$ hadoop distcp hdfs://hwx-1.hwx.com:8020/tmp/test.txt /tmp/
16/07/30 22:03:27 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, maxMaps=20, sslConfigurationFile='null', copyStrategy='uniformsize', sourceFileListing=null, sourcePaths=[hdfs://hwx-1.hwx.com:8020/tmp/test.txt], targetPath=/tmp, targetPathExists=true, preserveRawXattrs=false}
16/07/30 22:03:27 INFO impl.TimelineClientImpl: Timeline service address: http://support-3.support.com:8188/ws/v1/timeline/
16/07/30 22:03:27 INFO client.RMProxy: Connecting to ResourceManager at support-3.support.com/172.26.68.50:8050
16/07/30 22:03:28 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 20 for kuldeepk on 172.26.68.47:8020
16/07/30 22:03:28 INFO security.TokenCache: Got dt for hdfs://hwx-1.hwx.com:8020; Kind: HDFS_DELEGATION_TOKEN, Service: 172.26.68.47:8020, Ident: (HDFS_DELEGATION_TOKEN token 20 for kuldeepk)
16/07/30 22:03:29 INFO impl.TimelineClientImpl: Timeline service address: http://support-3.support.com:8188/ws/v1/timeline/
16/07/30 22:03:29 INFO client.RMProxy: Connecting to ResourceManager at support-3.support.com/172.26.68.50:8050
16/07/30 22:03:29 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 24 for kuldeepk on ha-hdfs:support
16/07/30 22:03:29 INFO security.TokenCache: Got dt for hdfs://support; Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:support, Ident: (HDFS_DELEGATION_TOKEN token 24 for kuldeepk)
16/07/30 22:03:29 INFO mapreduce.JobSubmitter: number of splits:1
16/07/30 22:03:29 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1469916118318_0003
16/07/30 22:03:29 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: 172.26.68.47:8020, Ident: (HDFS_DELEGATION_TOKEN token 20 for kuldeepk)
16/07/30 22:03:29 INFO mapreduce.JobSubmitter: Kind: HDFS_DELEGATION_TOKEN, Service: ha-hdfs:support, Ident: (HDFS_DELEGATION_TOKEN token 24 for kuldeepk)
16/07/30 22:03:30 INFO impl.YarnClientImpl: Submitted application application_1469916118318_0003
16/07/30 22:03:31 INFO mapreduce.Job: The url to track the job: http://support-3.support.com:8088/proxy/application_1469916118318_0003/
16/07/30 22:03:31 INFO tools.DistCp: DistCp job-id: job_1469916118318_0003
16/07/30 22:03:31 INFO mapreduce.Job: Running job: job_1469916118318_0003
16/07/30 22:03:43 INFO mapreduce.Job: Job job_1469916118318_0003 running in uber mode : false
16/07/30 22:03:43 INFO mapreduce.Job:  map 0% reduce 0%
16/07/30 22:03:52 INFO mapreduce.Job:  map 100% reduce 0%
16/07/30 22:03:53 INFO mapreduce.Job: Job job_1469916118318_0003 completed successfully
16/07/30 22:03:53 INFO mapreduce.Job: Counters: 32
  File System Counters
    FILE: Number of bytes read=0
    FILE: Number of bytes written=142927
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=346
    HDFS: Number of bytes written=45
    HDFS: Number of read operations=12
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
  Job Counters
    Launched map tasks=1
    Other local map tasks=1
    Total time spent by all maps in occupied slots (ms)=14324
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=7162
    Total vcore-seconds taken by all map tasks=7162
    Total megabyte-seconds taken by all map tasks=7333888
  Map-Reduce Framework
    Map input records=1
    Map output records=1
    Input split bytes=118
    Spilled Records=0
    Failed Shuffles=0
    Merged Map outputs=0
    GC time elapsed (ms)=77
    CPU time spent (ms)=1210
    Physical memory (bytes) snapshot=169885696
    Virtual memory (bytes) snapshot=2337554432
    Total committed heap usage (bytes)=66584576
  File Input Format Counters
    Bytes Read=228
  File Output Format Counters
    Bytes Written=45
  org.apache.hadoop.tools.mapred.CopyMapper$Counter
    BYTESSKIPPED=0
    SKIP=1

Note – hwx-1.hwx.com is the Active Namenode of Cluster 1.

 

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!! :)

 

 


How to configure Ambari Hive View for a Kerberized cluster

How to configure Ambari Hive View for a Kerberized cluster – This tutorial has been successfully tried and tested on HDP-2.4.0.0 and Ambari 2.2.1.0.

 

My HDP cluster is Kerberized, and Ambari has been configured for SSL.

 

Note – The steps are the same for Ambari with or without SSL.

 

Please follow the below steps for configuring the Hive View on a Kerberized HDP cluster.

 

Step 1 – Please configure your Ambari Server for Kerberos with the steps mentioned in the below article (follow steps 1 to 5):

https://community.hortonworks.com/articles/40635/configure-tez-view-for-kerberized-hdp-cluster.html

 

Step 2 – Please add the below properties to core-site.xml via the Ambari UI and restart the required services.

 

Note – If you are running Ambari Server as the root user, then add the below properties:

hadoop.proxyuser.root.groups=*
hadoop.proxyuser.root.hosts=*

 

If you are running Ambari Server as a non-root user, then please add the below properties in core-site.xml:

hadoop.proxyuser.<ambari-server-user>.groups=*
hadoop.proxyuser.<ambari-server-user>.hosts=*

 

Please replace <ambari-server-user> with the user running Ambari Server in the above example.

 

I’m assuming that your Ambari Server principal is ambari-server@REALM.COM; if not, then please replace ‘ambari-server’ with your principal’s user part.

hadoop.proxyuser.ambari-server.groups=*
hadoop.proxyuser.ambari-server.hosts=*
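
Via Ambari you add these as key/value pairs in core-site. For reference, in raw core-site.xml they take the standard Hadoop XML form:

<property>
  <name>hadoop.proxyuser.ambari-server.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.ambari-server.hosts</name>
  <value>*</value>
</property>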

 

Step 3 – Create a user directory on HDFS for the user accessing the Hive View. For example, in my case I’m using the admin user to access the Hive View.

 

sudo -u hdfs hadoop fs -mkdir /user/admin 
sudo -u hdfs hadoop fs -chown admin:hdfs /user/admin
sudo -u hdfs hadoop fs -chmod 755 /user/admin

 

Step 4 – Go to the Admin tab –> click on Manage Ambari –> Views –> edit the Hive View (create a new one if it doesn’t exist already) and configure the settings as given below.

 

Note – You may need to modify values as per your environment settings!
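
The original post showed these settings as screenshots. As a rough guide only, the Kerberos-related Hive View properties usually look like the below; the realm is a placeholder, and you should double-check the exact field names against your Ambari version:

Hive Authentication: auth=KERBEROS;principal=hive/_HOST@REALM.COM
WebHDFS Authentication: auth=KERBEROS;proxyuser=ambari-server@REALM.COM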

 

 

After the above steps, you should be able to access your Hive View without any issues. If you receive any error(s), please check /var/log/ambari-server/ambari-server.log for more details and troubleshooting.

 

 

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!! :)


How to configure Ambari File View on a Kerberized cluster with NameNode HA

How to configure Ambari File View on a Kerberized cluster with NameNode HA – This tutorial has been successfully tried and tested on HDP-2.4.2.0 and Ambari 2.2.2.0.

 

My HDP cluster is Kerberized with NameNode HA enabled.

 

Please follow the below steps for configuring the File View on a Kerberized HDP cluster.

 

Step 1 – Please configure your Ambari Server for Kerberos with the steps mentioned in the below article (follow steps 1 to 5):

https://community.hortonworks.com/articles/40635/configure-tez-view-for-kerberized-hdp-cluster.html

 

Step 2 – Please add the below properties to core-site.xml via the Ambari UI and restart the required services.

 

Note – If you are running Ambari Server as the root user, then add the below properties:

hadoop.proxyuser.root.groups=*
hadoop.proxyuser.root.hosts=*

 

If you are running Ambari Server as a non-root user, then please add the below properties in core-site.xml:

hadoop.proxyuser.<ambari-server-user>.groups=*
hadoop.proxyuser.<ambari-server-user>.hosts=*

Please replace <ambari-server-user> with the user running Ambari Server in the above example.

 

I’m assuming that your Ambari Server principal is ambari-server@REALM.COM; if not, then please replace ‘ambari-server’ with your principal’s user part.

hadoop.proxyuser.ambari-server.groups=*
hadoop.proxyuser.ambari-server.hosts=*

 

Step 3 – Create a user directory on HDFS for the user accessing the File View. For example, in my case I’m using the admin user to access the File View.

 

sudo -u hdfs hadoop fs -mkdir /user/admin 
sudo -u hdfs hadoop fs -chown admin:hdfs /user/admin
sudo -u hdfs hadoop fs -chmod 755 /user/admin

 

Step 4 – Go to the Admin tab –> click on Manage Ambari –> Views –> edit the File View (create a new one if it doesn’t exist already) and configure the settings as given below.

 

Note – You may need to modify values as per your environment settings!
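
The original screenshots are not reproduced here. For a NameNode HA cluster, the File View’s cluster configuration must spell out the nameservice. A sketch using placeholder values – the nameservice ‘support’ matches the HA cluster from the first post in this archive, while the NameNode hosts and ports below are assumptions to replace with your own:

webhdfs.url=webhdfs://support
webhdfs.nameservices=support
webhdfs.ha.namenodes.support=nn1,nn2
webhdfs.ha.namenode.rpc-address.support.nn1=support-1.support.com:8020
webhdfs.ha.namenode.rpc-address.support.nn2=support-2.support.com:8020
webhdfs.ha.namenode.http-address.support.nn1=support-1.support.com:50070
webhdfs.ha.namenode.http-address.support.nn2=support-2.support.com:50070
webhdfs.client.failover.proxy.provider.support=org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
webhdfs.auth=auth=KERBEROS;proxyuser=ambari-server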

 

 

 

After the above steps, you should be able to access your File View without any issues. If you receive any error(s), please check /var/log/ambari-server/ambari-server.log for more details and troubleshooting.

 

Please comment if you have any feedback/questions/suggestions. Happy Hadooping!! :)
