Archive for: June, 2015

Tune Hadoop Cluster to get Maximum Performance (Part 1)

I have been working on production Hadoop clusters for a while and have learned many performance tuning tips and tricks. In this blog I will explain how to tune a Hadoop cluster to get maximum performance. Simply installing Hadoop for a production cluster or for a development POC does not give the expected results, because the default Hadoop configuration settings assume minimal hardware. It is the Hadoop administrator's responsibility to understand the hardware specs: the amount of RAM, the total number of CPU cores (physical and virtual), whether the processor supports hyper-threading, the NIC cards, the number of disks mounted on the datanodes, and so on.

 


 

For better understanding, I have divided this blog into two main parts.

1. Tune your Hadoop Cluster to get Maximum Performance (Part 1) – In this part I will explain how to tune your operating system in order to get maximum performance for your Hadoop jobs.

2. Tune your Hadoop Cluster to get Maximum Performance (Part 2) – In this part I will explain how to modify your Hadoop configuration parameters so that Hadoop uses your hardware efficiently.

 

How will OS tuning improve the performance of Hadoop?

Tuning your CentOS 6.x/RedHat 6.x OS can increase the performance of Hadoop by 20-25%. Yes! 20-25% :-)

 

Let’s get started and see what parameters we need to change on OS level.

 

1. Turn off the Power savings option in BIOS:

This will increase overall system performance and hence Hadoop performance. Go to your BIOS settings and change the power profile from power saving mode to PerfOptimized (the option name may differ depending on your server vendor). If a remote console command line is available, you can use racadm commands to check the status and update it. You need to restart the system for the change to take effect.

 

 

2. Open file handles and files:

By default the open file limit is 1024 for each user, and if you keep the default you may hit java.io.FileNotFoundException: (Too many open files) and your job will fail. To avoid this scenario, raise the open file limit to a higher number such as 32832.

 

Commands:

ulimit -Sn 4096
ulimit -Hn 32832

Also, please set the system-wide file descriptor limit using the command below:

sysctl -w fs.file-max=6544018

The above kernel setting is temporary, so we need to make it permanent by adding it to /etc/sysctl.conf. Just edit /etc/sysctl.conf and add the value below at the end of the file:

fs.file-max=6544018
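
To make the per-user open file limit persistent across logins and reboots as well, you can also add entries to /etc/security/limits.conf. This is just a minimal sketch assuming hdfs, yarn and mapred service accounts; adjust the user names and the limit to whatever your installation actually uses:

hdfs    -    nofile    32832
yarn    -    nofile    32832
mapred  -    nofile    32832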

 

 

 

3. FileSystem Type & Reserved Space:

To get maximum performance from your Hadoop jobs, I personally suggest using the ext4 filesystem, as it has some advantages over ext3 such as multi-block and delayed allocation. How you mount the filesystem also makes a difference: with the default options, every file or directory access triggers an extra write to update access times, which we do not need for Hadoop. Mounting your local disks with the noatime option will improve performance by disabling those unnecessary writes to disk.

Below is a sample of how the /etc/fstab entries should look:

 

UUID=gfd3f77-6b11-4ba0-8df9-75feb03476efs /disk1                 ext4   noatime       0 0
UUID=524cb4ff-a2a5-4e8b-bd24-0bbfd829a099 /disk2                 ext4   noatime       0 0
UUID=s04877f0-40e0-4f5e-a15b-b9e4b0d94bb6 /disk3                 ext4   noatime       0 0
UUID=3420618c-dd41-4b58-ab23-476b719aaes  /disk4                 ext4   noatime       0 0

 

Note – the noatime option also implies nodiratime, so there is no need to specify that separately.
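
If the data disks are already mounted, you can apply noatime without a reboot by remounting each of them; shown here for /disk1 only, repeat for the remaining disks:

mount -o remount,noatime /disk1
mount | grep disk1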

 

Many of you may be aware that after formatting a partition with ext4, 5% of the space is reserved for root, so that if the disk becomes 100% full root can still operate and delete files using this reserved space. For Hadoop data disks we do not need that 5% reservation, so remove it using the tune2fs command.

 

Command:

tune2fs -m 0 /dev/sdXY

 

Note – 0 indicates that 0% space is reserved.
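
To verify the change (or to check the current reservation before changing it), you can read the reserved block count from the superblock:

tune2fs -l /dev/sdXY | grep -i "reserved block count"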

 

 

4. Network Parameters Tuning:

 

Tuning network parameters also gives a performance boost! This is somewhat risky, because if you are working on a remote server and make a mistake while updating the network parameters, you can lose connectivity and may not be able to reach the server again until you correct the mistake through an IPMI/iDRAC/iLO console. Modify these parameters only when you know what you are doing :-)

Increasing net.core.somaxconn from its default value of 128 to 1024 helps Hadoop because it enlarges the listen queue on the master services, so a higher number of simultaneous connections between the masters and slaves can be handled than before.

 

Command to modify net.core.somaxconn:

sysctl -w net.core.somaxconn=1024

To make the above change permanent, simply add the value below at the end of /etc/sysctl.conf:

net.core.somaxconn=1024
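
You can confirm that the new value is active with:

sysctl net.core.somaxconn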

 

 

MTU Settings:

MTU stands for Maximum Transmission Unit: the largest size, in bytes, that can be sent in a single packet/frame over the network. By default the MTU is set to 1500, and you can tune it to 9000; when the MTU is larger than the default 1500 bytes, the frames are called jumbo frames.

 

How to change the value of MTU:

You need to add MTU=9000 to /etc/sysconfig/network-scripts/ifcfg-eth0 (or the file for whichever network device you are using). Restart the network service for the change to take effect.

 

Note – before modifying this value, please make sure that every node in your cluster, including the switches, supports jumbo frames; if not, *PLEASE DO NOT ATTEMPT THIS*.
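
If you do enable jumbo frames, a quick sanity check is to verify the interface MTU and then ping another node with a large, non-fragmentable packet (8972 bytes of payload = 9000 minus 28 bytes of IP/ICMP headers; the target hostname below is just a placeholder):

ip link show eth0 | grep mtu
ping -M do -s 8972 -c 3 <another-datanode>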

 

 

5. Transparent Huge Page Compaction:

 

Transparent Huge Pages is a Linux feature that generally helps application performance, including Hadoop workloads; however, one part of it called compaction causes issues with Hadoop jobs (it drives processor usage up while defragmenting memory). When I was benchmarking a client's cluster I observed fluctuations of around 15% in the results, and when I disabled compaction the fluctuation disappeared. So I recommend disabling it for Hadoop.

 

Command:

echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag

 

To make the above change permanent, please add the line below to your /etc/rc.local file:

if test -f /sys/kernel/mm/redhat_transparent_hugepage/defrag; then echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag ;fi
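
You can check the current setting at any time; the value in effect is shown in square brackets:

cat /sys/kernel/mm/redhat_transparent_hugepage/defrag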

 

 

6. Memory Swapping:

For Hadoop, swapping reduces job performance. You want as much data as possible kept in memory, so tune the OS to swap only when it is close to an out-of-memory (OOM) situation. To do so, set the vm.swappiness kernel parameter to 0.

 

Command:

sysctl -w vm.swappiness=0

 

Please add the variable below to /etc/sysctl.conf to make it persistent:

vm.swappiness=0
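
To confirm the value currently in effect:

cat /proc/sys/vm/swappiness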

 
I hope this information helps anyone who is looking for OS-level tuning parameters for Hadoop. Please don’t forget to give your feedback via the comments, or ask questions if you have any.
Thank you :-) I will publish the second part next week!

 

 


Migration of Ambari server in Hortonworks Hadoop 2.2

There are various scenarios in which migration of the Hortonworks management server (Ambari server) is required, for instance due to hardware issues with the server.

In this article we will discuss migration of the Ambari server in a production Hortonworks Hadoop cluster, assuming that your Ambari server has been set up with the default DB, which is PostgreSQL.

PS: This requires Ambari server downtime. This means that the management UI won’t be available, but the services will continue running in the background. Plan accordingly.

 


 

Prerequisites:

 

• Please ensure that passwordless SSH is set up between the new Ambari server and the other hosts in the cluster.

• Also, the /etc/hosts file should be updated with the info of all hosts in the cluster.

• Java must be installed.

• The HDP and Ambari repos must be configured.

 

This is not a long exercise. So let us begin :-)

 

1. Stop ambari-server.

 

ambari-server stop

 

 

2. Now we need to back up the Ambari database for migration. In production environments, backups should ideally be scheduled daily/weekly to prevent data loss in case of an irrecoverable server crash.

 

  • Make a directory to store your backup:
mkdir /tmp/ambari-db-backup
cd /tmp/ambari-db-backup
  • Back up the Ambari DBs (each pg_dump will prompt for the password):
pg_dump -U $AMBARI-SERVER-USERNAME ambari > ambari.sql
Password: $AMBARI-SERVER-PASSWORD
pg_dump -U $MAPRED-USERNAME ambarirca > ambarirca.sql
Password: $MAPRED-PASSWORD

 

Note: the defaults are ambari-server for $AMBARI-SERVER-USERNAME, bigdata for $AMBARI-SERVER-PASSWORD, mapred for $MAPRED-USERNAME and mapred for $MAPRED-PASSWORD.

These values are configured during ambari-server setup.
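
As mentioned above, it is a good idea to schedule these backups rather than taking them only before a migration. Below is just a minimal sketch of daily root cron entries (crontab -e); the backup directory is an assumption and the password handling (e.g. a ~/.pgpass file) is left to you, so adapt it to your environment:

0 2 * * * pg_dump -U ambari-server ambari > /backup/ambari-$(date +\%F).sql
5 2 * * * pg_dump -U mapred ambarirca > /backup/ambarirca-$(date +\%F).sql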

 

 

3. Stop ambari-agent on each host in the cluster

 

ambari-agent stop

 

 

4. Edit /etc/ambari-agent/conf/ambari-agent.ini and update the hostname to point to the new ambari-server host.
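
After the edit, the relevant part of /etc/ambari-agent/conf/ambari-agent.ini should look like the snippet below (new-ambari-host.example.com is a placeholder for your new server's FQDN; the ports shown are the defaults):

[server]
hostname=new-ambari-host.example.com
url_port=8440
secured_url_port=8441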

 

 

5. Install the ambari-server on the new host.

 

yum install ambari-server

 

 

6. Stop this new Ambari server so that we can restore its database from the backup of the old one.

 

ambari-server stop

 

 

7. Restart PostgreSQL.

 

service postgresql restart

 

 

8. Log in to the PostgreSQL terminal:

 

su - postgres
psql

 

 

9. Drop the DBs that were newly created while installing Ambari on the new host.

 

drop database ambari;
drop database ambarirca;

 

 

10. Check that the DBs have been dropped.

 

\list (you have to execute this command in the psql terminal)

 

 

11. Now create new DBs to restore the old backup into them.

 

create database ambari;
create database ambarirca;

 

 

12. Check that the DBs have been created.

 

\list

 

 

13. Exit the psql prompt (\q).

 

 

14. Copy the DB backup from the old to the new Ambari host using scp.
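
For example, assuming the backup directory created in step 2 and replacing the placeholder with your new Ambari host:

scp /tmp/ambari-db-backup/ambari.sql /tmp/ambari-db-backup/ambarirca.sql root@<new-ambari-host>:/tmp/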

 

 

15. Restore the DBs on the new Ambari host from the backup. PS: you are still logged in as the postgres user for this.

 

psql -d ambari -f /tmp/ambari.sql
psql -d ambarirca -f /tmp/ambarirca.sql

 

 

16. Start ambari-server as root now.

 

ambari-server start

 

 

17. Open the Ambari Web UI

http://<new ambari host>:8080   (Open in your favorite browser)

 

 

18. Restart a service (say MapReduce) using the Management UI to verify all is working fine.

 

 

Done! We have successfully migrated Ambari server to a new host and restored it to its original state. Enjoy! :)


Install and configure Apache Phoenix on Cloudera Hadoop CDH5

 

What is Apache Phoenix?


 

 

 

Apache Phoenix is a relational database layer over HBase delivered as a client-embedded JDBC driver targeting low latency queries over HBase data. Apache Phoenix takes your SQL query, compiles it into a series of HBase scans, and orchestrates the running of those scans to produce regular JDBC result sets. The table metadata is stored in an HBase table and versioned, such that snapshot queries over prior versions will automatically use the correct schema. Direct use of the HBase API, along with coprocessors and custom filters, results in performance on the order of milliseconds for small queries, or seconds for tens of millions of rows.

 

 

Now let’s see how to install Apache Phoenix on a Cloudera Hadoop CDH5 cluster :-)

 

 

 

Step 1: Download the latest version of Phoenix using the command given below.

 

 

wget http://mirror.reverse.net/pub/apache/phoenix/phoenix-4.3.1/bin/phoenix-4.3.1-bin.tar.gz
--2015-04-10 13:30:41-- http://mirror.reverse.net/pub/apache/phoenix/phoenix-4.3.1/bin/phoenix-4.3.1-bin.tar.gz
Resolving mirror.reverse.net... 208.100.14.200
Connecting to mirror.reverse.net|208.100.14.200|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 72155049 (69M) [application/x-gzip]
Saving to: “phoenix-4.3.1-bin.tar.gz.1”
100%[======================================================================================================================================================>] 72,155,049   614K/s   in 2m 15s
2015-04-10 13:32:56 (521 KB/s) - “phoenix-4.3.1-bin.tar.gz.1” saved [72155049/72155049]

 

 

 

 

Step 2: Extract the downloaded tar file to a convenient location.

 

 

 

[root@crazyadmins ~]# tar -zxvf phoenix-4.3.1-bin.tar.gz
phoenix-4.3.1-bin/bin/hadoop-metrics2-phoenix.properties
-
-
phoenix-4.3.1-bin/examples/WEB_STAT.sql

 

 

 

 

Step 3: Copy phoenix-4.3.1-server.jar to the HBase lib directory on each region server and on the master server.

 

 

 

On the master server, copy “phoenix-4.3.1-server.jar” to the “/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/” location.

 

 

 

On each HBase region server, copy “phoenix-4.3.1-server.jar” to the /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/ location.

 

 

 

 

Step 4: Copy phoenix-4.3.1-client.jar to each HBase region server.

 

 

 

 

Please make sure phoenix-4.3.1-client.jar is present at /opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/ on each region server.
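
If you have many region servers, a small loop saves time. region-servers.txt is a hypothetical file listing one region server hostname per line, the jars are assumed to sit at the top level of the extracted phoenix-4.3.1-bin directory, and the parcel path should be adjusted to your CDH version:

for host in $(cat region-servers.txt); do
  scp phoenix-4.3.1-bin/phoenix-4.3.1-server.jar phoenix-4.3.1-bin/phoenix-4.3.1-client.jar root@$host:/opt/cloudera/parcels/CDH-5.3.0-1.cdh5.3.0.p0.30/lib/hbase/lib/
done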

 

 

 

 

Step 5: Restart the HBase service via Cloudera Manager.

 

 

 

 

Step 6: Testing – go to <extracted_dir>/bin and run the command below.

 

 

 

[root@crazyadmins bin]# ./psql.py localhost ../examples/WEB_STAT.sql ../examples/WEB_STAT.csv ../examples/WEB_STAT_QUERIES.sql 
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
15/04/10 13:51:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
no rows upserted
Time: 2.297 sec(s)
csv columns from database.
CSV Upsert complete. 39 rows upserted
Time: 0.554 sec(s)
DOMAIN                                                          AVERAGE_CPU_USAGE                        AVERAGE_DB_USAGE
---------------------------------------- ---------------------------------------- ----------------------------------------
Salesforce.com                                                           260.727                                 257.636
Google.com                                                              212.875                                   213.75
Apple.com                                                                 114.111                                 119.556
Time: 0.2 sec(s)
DAY                                             TOTAL_CPU_USAGE                           MIN_CPU_USAGE                           MAX_CPU_USAGE
----------------------- ---------------------------------------- ---------------------------------------- ----------------------------------------
2013-01-01 00:00:00.000                                       35                                       35                                       35
2013-01-02 00:00:00.000                                     150                                       25                                      125
2013-01-03 00:00:00.000                                       88                                       88                                       88
-
-
2013-01-04 00:00:00.000                                       26                                        3                                       23
2013-01-05 00:00:00.000                                      550                                       75                                      475
Time: 0.09 sec(s)
HO                   TOTAL_ACTIVE_VISITORS
-- ----------------------------------------
EU                                     150
NA                                       1
Time: 0.052 sec(s)
Done.

 

 

 

 

Step 7: To get the SQL shell:

 

[root@crazyadmins bin]# ./sqlline.py localhost
Setting property: [isolation, TRANSACTION_READ_COMMITTED]
issuing: !connect jdbc:phoenix:localhost none none org.apache.phoenix.jdbc.PhoenixDriver
Connecting to jdbc:phoenix:localhost
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
15/04/10 13:58:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Connected to: Phoenix (version 4.3)
Driver: PhoenixEmbeddedDriver (version 4.3)
Autocommit status: true
Transaction isolation: TRANSACTION_READ_COMMITTED
Building list of tables and columns for tab-completion (set fastconnect to true to skip)...
77/77 (100%) Done
Done
sqlline version 1.1.8
0: jdbc:phoenix:localhost>

 
