Category: Hadoop

How to integrate Ranger with LDAP

In this blog post we will see how to integrate Ranger with LDAP. What is Ranger? Ranger is an open-source utility that provides centralized authorization control for Hadoop components such as HDFS, Hive, HBase, YARN, etc. Apache Ranger also keeps audit information for these Hadoop components, which can be useful for

Read More →

Automate HDP installation using Ambari Blueprints – Part 1

It has been a while since my last blog post. In this post we will see how to automate HDP installation using Ambari Blueprints. What are Ambari Blueprints? Ambari Blueprints are a definition of your HDP cluster in JSON format; they contain information about all the hosts in your cluster, their components, and the mapping of stack components to each host
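To make the JSON structure concrete, here is a minimal, hypothetical single-node blueprint sketch (the blueprint name, stack version, and component list are illustrative placeholders, not taken from the original post):

```json
{
  "Blueprints": {
    "blueprint_name": "single-node-hdp",
    "stack_name": "HDP",
    "stack_version": "2.2"
  },
  "host_groups": [
    {
      "name": "host_group_1",
      "cardinality": "1",
      "components": [
        { "name": "NAMENODE" },
        { "name": "DATANODE" },
        { "name": "ZOOKEEPER_SERVER" }
      ]
    }
  ]
}
```

A real cluster blueprint would list one host group per hardware profile, with the full set of master and slave components mapped to each group.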

Read More →

Setup SQL Based authorization in hive

In this tutorial we will see how to set up SQL-based authorization in Hive. Step 1 – Go to the Ambari UI and add/modify the properties below. Go to the Hive service → Configs and change the authorization to SQLStdAuth. Step 2 – In hive-site.xml, make sure you have set the following properties: hive.server2.enable.doAs → false hive.users.in.admin.role →
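For reference, a hive-site.xml fragment with these settings might look like the sketch below. The authorizer and admin-user values are the ones typically used with SQLStdAuth; treat them as assumptions and verify against your HDP version:

```xml
<property>
  <name>hive.server2.enable.doAs</name>
  <value>false</value>
</property>
<property>
  <!-- comma-separated list of users allowed to take the Hive admin role -->
  <name>hive.users.in.admin.role</name>
  <value>hive</value>
</property>
<property>
  <name>hive.security.authorization.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdHiveAuthorizerFactory</value>
</property>
```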

Read More →

Unable to delete STORM REST API service component after hdp upgrade

Unable to delete the STORM REST API service component after upgrading to HDP 2.2.0.0? Relax! You are in the right place; this guide will show you how to handle this kind of error. Initially I had installed HDP 2.1 with Ambari 1.7, then I upgraded Ambari to 2.1.2 and upgraded the HDP stack to 2.2.0.0 as
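The general approach is to remove the stale component through the Ambari REST API. A hedged sketch follows; the host, cluster name, and credentials are placeholders, and the component usually has to be stopped (or in INIT state) before Ambari will delete it:

```shell
# Placeholder host, cluster name, and credentials - substitute your own.
AMBARI_HOST="ambari.example.com"
CLUSTER="mycluster"

# Ask Ambari to delete the leftover STORM_REST_API component
curl -u admin:admin \
     -H "X-Requested-By: ambari" \
     -X DELETE \
     "http://${AMBARI_HOST}:8080/api/v1/clusters/${CLUSTER}/services/STORM/components/STORM_REST_API"
```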

Read More →

Kafka integration with Ganglia

In this blog post I will show you how to integrate Kafka with Ganglia. This is an interesting and important topic for anyone who wants to do benchmarking and measure performance by monitoring specific Kafka metrics via Ganglia. Before going ahead, let me briefly explain what Kafka and Ganglia are. Kafka – Kafka is an open-source distributed

Read More →

Minimum user id error while submitting mapreduce job

Hello everyone! Hope you are enjoying our blogs on crazyadmins.com. This is a short post to help you solve the minimum user id error while submitting a MapReduce job in Hadoop. Error: Application application_XXXXXXXXX_XXXX failed 2 times due to AM Container for appattempt_XXXXXXXXX_XXXX_XXXXXX exited with exitCode: -1000 For more detailed output,
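This error typically appears when the LinuxContainerExecutor refuses to launch containers for a user whose UID is below the configured minimum (1000 by default). One common fix, sketched below under that assumption, is to lower min.user.id in container-executor.cfg (or the corresponding YARN setting in Ambari) so it covers the submitting user's UID; alternatively, resubmit the job as a user whose UID is at or above the limit:

```
# /etc/hadoop/conf/container-executor.cfg (path may vary by distribution)
# Allow users with UID >= 500 to submit jobs; 1000 is the usual default.
min.user.id=500
```

After changing this, restart the NodeManagers so the new limit takes effect.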

Read More →

Tune Hadoop Cluster to get Maximum Performance (Part 2)

In the previous part we saw how to tune the operating system to get maximum performance for Hadoop. In this article I will focus on how to tune the Hadoop cluster itself to get a performance boost at the Hadoop level. Before I start explaining the tuning parameters, let me cover some basic terms

Read More →

Tune Hadoop Cluster to get Maximum Performance (Part 1)

I have been working on production Hadoop clusters for a while and have learned many performance tuning tips and tricks. In this blog I will explain how to tune a Hadoop cluster to get maximum performance. Simply installing Hadoop for a production cluster or a development POC does not give the expected results, because the default Hadoop

Read More →

Migration of Ambari server in Hortonworks Hadoop 2.2

There could be various scenarios in which migration of the Hortonworks management server (Ambari server) is required, for instance due to hardware issues with the server. In this article we will discuss migration of the Ambari server in a production Hortonworks Hadoop cluster, assuming that your Ambari server has been set up using the default DB, which is PostgreSQL.

Read More →

Install and configure Apache Phoenix on Cloudera Hadoop CDH5

What is Apache Phoenix? Apache Phoenix is a relational database layer over HBase, delivered as a client-embedded JDBC driver targeting low-latency queries over HBase data. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC
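As a quick illustration of the SQL-over-HBase model, here is a small hypothetical example you could run through Phoenix's sqlline.py client (the table and column names are made up for illustration):

```sql
-- Creates a Phoenix table backed by an HBase table of the same name
CREATE TABLE IF NOT EXISTS web_stat (
    host       VARCHAR NOT NULL PRIMARY KEY,
    usage_core BIGINT
);

-- Phoenix uses UPSERT instead of INSERT
UPSERT INTO web_stat VALUES ('srv1.example.com', 42);

-- This SELECT is compiled by Phoenix into HBase scans under the hood
SELECT host, usage_core FROM web_stat WHERE usage_core > 10;
```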

Read More →
