Installation Guide for Hadoop, Hive, and Derby on Linux CentOS

Hi all! Here at OSC we have been testing different technologies for Big Data, both SQL and NoSQL solutions.

So here is a super simple install guide; in future posts I will cover each program in more depth.

  • Installing Hadoop on CentOS 6
  • Installing Hive on CentOS 6
  • Installing Derby on CentOS 6

hadoop-0.20.203.0rc1

This is the guide for installing the Hadoop ecosystem. It is quite long, so please follow it step by step.

INSTALLATION OF HADOOP

1. Installing Java
yum install java-1.6.0-openjdk-devel    # note: sun-java6-jdk is a Debian package name; CentOS ships OpenJDK

2. Adding a dedicated user for Hadoop
This will add the user hdoopuser and the group hdoopgroup to your local machine.
/usr/sbin/useradd hdoopuser
groupadd hdoopgroup
usermod -a -G hdoopgroup hdoopuser

3. Configuring SSH
su - hdoopuser        # log in as hdoopuser
ssh-keygen -t rsa -P ""    # generate a key without a passphrase
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys    # authorize the new key
chmod 0600 $HOME/.ssh/authorized_keys    # restrict permissions so SSH accepts the key

4. Disabling IPv6
sed -i 's/^\(NETWORKING_IPV6\s*=\s*\).*$/\1no/' /etc/sysconfig/network
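On CentOS the usual way to disable IPv6 in this file is to set NETWORKING_IPV6=no. Before touching the real /etc/sysconfig/network, you can try the substitution on a scratch copy; this is a minimal sketch, and /tmp/network-demo is a hypothetical path used only for the illustration.

```shell
# Build a scratch copy with the typical CentOS contents.
cat > /tmp/network-demo <<'EOF'
NETWORKING=yes
NETWORKING_IPV6=yes
HOSTNAME=localhost.localdomain
EOF

# Apply the substitution: capture "NETWORKING_IPV6=" and replace the value with "no".
sed -i 's/^\(NETWORKING_IPV6\s*=\s*\).*$/\1no/' /tmp/network-demo

# Only the IPv6 line changes; NETWORKING=yes stays untouched.
grep '^NETWORKING' /tmp/network-demo
```

Note that the pattern anchors on NETWORKING_IPV6 specifically; a pattern matching plain NETWORKING would also hit the line that keeps networking enabled.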

5. Installing, configuring, and starting Hadoop
mkdir /hadoop
chown -R hdoopuser /hadoop
cd /hadoop/
wget http://mirrors.abdicar.com/Apache-HTTP-Server//hadoop/common/stable/hadoop-0.20.203.0rc1.tar.gz
tar -xvzf hadoop-0.20.203.0rc1.tar.gz
ln -s /hadoop/hadoop-0.20.203.0rc1/ /hadoop/hadoop
cd /hadoop/hadoop
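The extract-then-symlink layout above is a deliberate pattern: scripts and config always reference the stable /hadoop/hadoop path, and upgrading is just repointing the link at a new versioned directory. A minimal sketch of the pattern using scratch paths (/tmp/pkg-demo is hypothetical; the real guide uses /hadoop):

```shell
# Versioned directory plus stable symlink.
mkdir -p /tmp/pkg-demo/hadoop-0.20.203.0
ln -sfn /tmp/pkg-demo/hadoop-0.20.203.0 /tmp/pkg-demo/hadoop

# Everything else references the stable path:
readlink /tmp/pkg-demo/hadoop

# A later upgrade is just repointing the link:
mkdir -p /tmp/pkg-demo/hadoop-0.20.205.0
ln -sfn /tmp/pkg-demo/hadoop-0.20.205.0 /tmp/pkg-demo/hadoop
readlink /tmp/pkg-demo/hadoop
```

The -n flag matters on the second ln: without it, the new link would be created inside the directory the old link points to.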

Configuring Hadoop

1)
vim conf/core-site.xml
#Add the following inside the <configuration> tag
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000/</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
2)
vim conf/hdfs-site.xml
#Add the following inside the <configuration> tag
<property>
<name>dfs.name.dir</name>
<value>/hadoop/hdfs/name</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/hadoop/hdfs/data</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
3)
vim conf/mapred-site.xml
#Add the following inside the <configuration> tag
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
4)
vim conf/hadoop-env.sh
export JAVA_HOME=/opt/jre/
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
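After steps 1)-4) it is easy to leave a config file with a stray or missing tag, so a quick sanity check helps. This sketch writes the step 1) property block to a scratch file and greps for the NameNode URI; /tmp/core-site-demo.xml is a hypothetical path used only for the illustration (the real file is conf/core-site.xml).

```shell
# What conf/core-site.xml should contain after step 1).
cat > /tmp/core-site-demo.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000/</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
EOF

# Sanity check: the NameNode URI made it into the file.
grep -A1 'fs.default.name' /tmp/core-site-demo.xml | grep '<value>'
```

The same check works on hdfs-site.xml and mapred-site.xml by swapping in the property names from steps 2) and 3).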
5)
Format the namenode
su – hdoopuser
cd /hadoop/hadoop
bin/hadoop namenode -format
6)Start hadoop
bin/start-all.sh
Notes: Hadoop HTTP consoles
http://localhost:50030/ for the JobTracker
http://localhost:50070/ for the NameNode

6. Installing, configuring, and starting Hive/Derby

cd /hadoop
wget http://mirrors.ucr.ac.cr/apache//hive/stable/hive-0.8.1-bin.tar.gz
tar -xvzf hive-0.8.1-bin.tar.gz
ln -s /hadoop/hive-0.8.1-bin/ /hadoop/hive
export HADOOP_HOME=/hadoop/hadoop/
cd /hadoop/hive
cp conf/hive-default.xml.template conf/hive-site.xml    # copy (rather than move) so the template is preserved
Testing Hive
bin/hive
hive> show tables;
Installing the Derby metastore
cd /hadoop
wget http://archive.apache.org/dist/db/derby/db-derby-10.4.2.0/db-derby-10.4.2.0-bin.tar.gz
tar -xzf db-derby-10.4.2.0-bin.tar.gz
ln -s db-derby-10.4.2.0-bin derby
mkdir derby/data
export DERBY_INSTALL=/hadoop/derby/
export DERBY_HOME=/hadoop/derby/
export HADOOP=/hadoop/hadoop/bin/hadoop

vim /hadoop/hadoop/bin/start-dfs.sh
#add to the file start-dfs.sh the next 2 lines
cd /hadoop/derby/data
nohup /hadoop/derby/bin/startNetworkServer -h 0.0.0.0 &

vim /hadoop/hadoop/bin/start-all.sh
#add to the file start-all.sh the next 2 lines
cd /hadoop/derby/data
nohup /hadoop/derby/bin/startNetworkServer -h 0.0.0.0 &
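The two lines added to the startup scripts use the nohup + & pattern: the Derby network server is launched in the background and detached from the shell, so it survives after the startup script exits. A minimal sketch of the pattern with a stand-in command instead of startNetworkServer (/tmp/nohup-demo.log is a hypothetical path for the illustration):

```shell
# Run a command detached from the terminal, in the background.
cd /tmp    # nohup drops its nohup.out in the current directory
nohup sh -c 'echo "server up" > /tmp/nohup-demo.log' &

# The real server keeps running; this demo command exits, so just wait for it.
wait

cat /tmp/nohup-demo.log
```

The `cd /hadoop/derby/data` line above serves the same purpose as the `cd /tmp` here: it decides where Derby creates its databases and where nohup's output lands.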

 

Configuring hive

Installing the web panel for Hive (search and replace):

vim /hadoop/hive/conf/hive-site.xml
#search for “javax.jdo.option.ConnectionURL” and edit like the following
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
HTTP CONSOLE OF HIVE
bin/hive --service hwi &
URL: http://localhost:9999/

#create new file
vim /hadoop/hive/conf/jpox.properties
#add the following
javax.jdo.PersistenceManagerFactoryClass=org.jpox.PersistenceManagerFactoryImpl
org.jpox.autoCreateSchema=false
org.jpox.validateTables=false
org.jpox.validateColumns=false
org.jpox.validateConstraints=false
org.jpox.storeManagerType=rdbms
org.jpox.autoCreateSchema=true
org.jpox.autoStartMechanismMode=checked
org.jpox.transactionIsolation=read_committed
javax.jdo.option.DetachAllOnCommit=true
javax.jdo.option.NontransactionalRead=true
javax.jdo.option.ConnectionDriverName=org.apache.derby.jdbc.ClientDriver
javax.jdo.option.ConnectionURL=jdbc:derby://localhost:1527/metastore_db;create=true
javax.jdo.option.ConnectionUserName=APP
javax.jdo.option.ConnectionPassword=mine
#now copy derby jar sources to Hive lib
cp /hadoop/derby/lib/derbyclient.jar /hadoop/hive/lib
cp /hadoop/derby/lib/derbytools.jar  /hadoop/hive/lib

HTTP CONSOLE OF HIVE
http://localhost:9999/hwi/ for the hive

7. START CLUSTER

/hadoop/hadoop/bin/start-all.sh
/hadoop/hive/bin/hive --service hwi &   # hwi = web panel

8. FOR NEXT TIME AND ALWAYS: create a bash profile

vi /etc/profile
export JAVA_HOME=/opt/jre/
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
export HADOOP_HOME=/hadoop/hadoop/
export DERBY_INSTALL=/hadoop/derby/
export DERBY_HOME=/hadoop/derby/
export HADOOP=/hadoop/hadoop/bin/hadoop
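To confirm the exports take effect you can source the file and echo a variable back. This sketch uses a scratch copy of the fragment so it can be tried without touching the real /etc/profile; /tmp/profile-demo is a hypothetical path for the illustration.

```shell
# Scratch copy of the profile fragment.
cat > /tmp/profile-demo <<'EOF'
export JAVA_HOME=/opt/jre/
export HADOOP_HOME=/hadoop/hadoop/
export DERBY_HOME=/hadoop/derby/
EOF

# Source it into the current shell and verify.
. /tmp/profile-demo
echo "HADOOP_HOME=$HADOOP_HOME"
```

On a real box, either log out and back in or run `. /etc/profile` in each open shell so the variables are picked up.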

 

Running the whole Hadoop + Hive ecosystem

PANELS:
http://localhost:50030/ for the JobTracker
http://localhost:50060/ for the TaskTracker
http://localhost:50070/ for the NameNode
http://localhost:9999/hwi/ for Hive

 

Thanks for reading, @jamesjara, O.S.C
