Category: Hadoop

How to enable Ranger Admin High Availability

Enabling Ranger Admin High Availability (HA) is useful because it keeps the Policy Manager accessible even if one of the Ranger Admin instances is down. This document provides steps to enable Ranger Admin HA using an example. To configure Ranger Admin HA, it is also required to configure Load Balancing in

Read More →

Oozie HA configuration with Kerberos

Please follow the steps below to set up Oozie HA in a Kerberos environment. Step 1: Configure a MySQL/Oracle database for Oozie, as the HA configuration does not work with the default embedded Derby database. Please refer to https://community.hortonworks.com/articles/183/moving-oozie-to-mysql-with-ambari.html for steps to migrate the Oozie database. Step 2: Log in to the Ambari UI, go to Hosts, and select the host on which you need to add additional

Read More →

Automated Kerberos Installation and Configuration

For this post, I have written a shell script which uses the Ambari APIs to configure Kerberos on single-node or multi-node HDP clusters. You just need to clone our GitHub repository, modify the property file according to your cluster environment, execute the setup script, and phew! Within 5-10 minutes you should have

Read More →

Oozie coordinator based on input data events

This article explains how to start an Oozie workflow when input data is available. Here is an example of scheduling an Oozie coordinator based on input data events. In this example, the coordinator will start at 2016-04-10, 6:00 GMT and

Read More →

Role of Journal nodes in Namenode HA

What is the role of JournalNodes in NameNode HA? Many of us know that the role of the JournalNodes is to keep both NameNodes in sync and to avoid an HDFS split-brain scenario by allowing only the active NameNode to write to the journals. Have you ever wondered how it works? Here you

Read More →

Oozie Tutorials – SSH Action

The Oozie SSH action executes a shell script on a remote machine over a secure shell; the workflow waits until the SSH script completes and then moves to the next action. Prerequisites: 1. The shell script must be present on the remote host at the given path. 2. The shell script will be executed in the home directory of

Read More →

Hadoop Cluster Maintenance

As Hadoop admins, it’s our responsibility to perform Hadoop cluster maintenance frequently. Let’s see what we can do to keep our big elephant happy! 😉 1. FileSystem Checks: We should check the health of HDFS periodically by running the fsck command: sudo -u hdfs hadoop fsck / This command contacts the NameNode and checks

Read More →

Oozie Tutorials – Basics of Oozie and Oozie SHELL action

Our Oozie tutorials will cover most of the available workflow actions, with and without Kerberos authentication. Let’s have a look at some basic concepts of Oozie. What is Oozie? Oozie is an open source workflow management system. We can schedule Hadoop jobs via Oozie, including Hive, Pig, Sqoop, and other actions. Oozie provides great features to trigger workflows

Read More →

How to semi-automate deploying dev hdp cluster

Purpose of this article: When you install HDP for a dev/test environment, you repeat the same commands to set up your host OS. To save time, this Bash script helps to set up the host OS (Ubuntu only) and a Docker image (CentOS). What this script does: Install packages on Ubuntu host OS Set

Read More →

Automate HDP installation using Ambari Blueprints – Part 2

In the previous post, we saw how to install a single-node HDP cluster using Ambari Blueprints. In this post, we will see how to automate HDP installation using Ambari Blueprints. Below are simple steps to install a multi-node HDP cluster using an internal repository via Ambari Blueprints. Step 1: Install Ambari server using the steps at the link below: http://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.1/bk_Installing_HDP_AMB/content/_download_the_ambari_repo_lnx6.html

Read More →
