Cloud
Cloud services, intelligent integrations
- Oracle Cloud – CDI
- Kubernetes – 1 (OKE)
- Kubernetes – 2 (OKE)
- OCI Certification – 1
- Oracle Storage
- NoSQL – Pricing / Price Calculator / RU–WU Estimation
- OCI Data Catalog – 1
Always Free Services:
https://docs.oracle.com/en-us/iaas/Content/FreeTier/freetier_topic-Always_Free_Resources.htm
Microsoft Cognitive Services

Microsoft Cloud Services
Vision: Computer Vision, Content Moderator, Emotion, Face, Video
Speech: Bing Speech, Custom Speech Service, Speaker Recognition
Language: Bing Spell Check, Language Understanding, Linguistic Analysis, Text Analytics, Translator, WebLM
Knowledge: Academic, Entity Linking, Knowledge Exploration, QnA Maker, Recommendations
Search: Bing Autosuggest, Bing Image Search, Bing News Search, Bing Video Search, Bing Web Search
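Each of these services is exposed as an HTTP endpoint authenticated with a subscription key. As a minimal sketch (the region, API version, and key below are placeholder assumptions; check the current Azure documentation for the exact endpoint), a Computer Vision analyze request looks roughly like:

curl -X POST "https://westus.api.cognitive.microsoft.com/vision/v1.0/analyze?visualFeatures=Description" \
  -H "Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY" \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/image.jpg"}'

The response is a JSON document with the requested visual features, here a natural-language description of the image.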
Installing Hadoop 2.4.0 on Ubuntu Server
The setup described below assumes:
* Ubuntu Server 14.04
* a VM in VirtualBox with 1 GB RAM
* swap space: more than 2 GB
* an up-to-date system:
kutayzorlu@coder_telekom:~$ sudo apt-get update
kutayzorlu@coder_telekom:~$ sudo apt-get upgrade
Moreover, it is advisable not to run the Hadoop services as a general-purpose user, so the next step is to add a group hadoop and a user hadoop-user belonging to that group:
kutayzorlu@coder_telekom:~$ sudo addgroup hadoop
kutayzorlu@coder_telekom:~$ sudo adduser --ingroup hadoop hadoop-user
Installing Java
Some tutorials suggest a potentially unsafe procedure for installing the JDK through apt-get; it is safer to download the JDK archive and install it manually:
kutayzorlu@coder_telekom:~$ wget "http://server12.kutayzorlu.com/7u45-b18/jdk-7u45-linux-x64.tar.gz"
...
kutayzorlu@coder_telekom:~$ tar -xvzf jdk-7u45-linux-x64.tar.gz
kutayzorlu@coder_telekom:~$ sudo mkdir /usr/local/java
kutayzorlu@coder_telekom:~$ sudo cp -r jdk1.7.0_45 /usr/local/java
kutayzorlu@coder_telekom:~$ sudo update-alternatives --install "/usr/bin/javac" "javac" "/usr/local/java/jdk1.7.0_45/bin/javac" 1
kutayzorlu@coder_telekom:~$ sudo update-alternatives --set javac /usr/local/java/jdk1.7.0_45/bin/javac
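Note that the commands above only register javac; if the java launcher should come from this JDK as well, the same update-alternatives pattern applies:

kutayzorlu@coder_telekom:~$ sudo update-alternatives --install "/usr/bin/java" "java" "/usr/local/java/jdk1.7.0_45/bin/java" 1
kutayzorlu@coder_telekom:~$ sudo update-alternatives --set java /usr/local/java/jdk1.7.0_45/bin/java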
Finally, a couple of environment variables should be set so that the Java executables are on $PATH and Hadoop knows where Java is installed. This is easily accomplished by adding
JAVA_HOME=/usr/local/java/jdk1.7.0_45
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export JAVA_HOME
export PATH
at the end of /etc/profile, so the Java paths are set automatically at login. Reload the profile and verify the installation:
kutayzorlu@coder_telekom:~$ . /etc/profile
kutayzorlu@coder_telekom:~$ javac -version
javac 1.7.0_45
Set up SSH
Hadoop's control scripts manage the daemons over SSH, thus the corresponding server should be installed:
kutayzorlu@coder_telekom:~$ sudo apt-get install openssh-server
and hadoop-user must be given a key pair, which is then authorized for access to the local machine:
kutayzorlu@coder_telekom:~$ su - hadoop-user
hadoop-user@coder_telekom:~$ ssh-keygen -t rsa -P ""   # empty passphrase
Generating public/private rsa key pair.
...
hadoop-user@coder_telekom:~$ cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
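If ssh still prompts for a password after this, the usual culprit is permissions: sshd ignores keys whose files are group- or world-writable. Tightening them is harmless in any case:

hadoop-user@coder_telekom:~$ chmod 700 $HOME/.ssh
hadoop-user@coder_telekom:~$ chmod 600 $HOME/.ssh/authorized_keys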
Now hadoop-user should be able to ssh to localhost without providing a password (the key was generated with an empty passphrase):
hadoop-user@coder_telekom:~$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
...
Last login: ...
Disable IPv6
Hadoop and IPv6 do not agree on the meaning of the 0.0.0.0 address, so IPv6 should be disabled. Add the following lines to /etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
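These settings are picked up at boot; they can also be applied immediately, without waiting for a reboot:

kutayzorlu@coder_telekom:~$ sudo sysctl -p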
Reboot the system (or rely on sysctl -p as above), then verify that IPv6 is really off:
cat /proc/sys/net/ipv6/conf/all/disable_ipv6
# should print 1, meaning that IPv6 is actually disabled
Hadoop
Download and install Hadoop
Download hadoop-2.4.0.tar.gz, unpack it and move the result to /usr/local, adding a symlink with the friendlier name hadoop and changing ownership of the directory contents to the hadoop-user user:
kutayzorlu@coder_telekom:~$ wget http://apache.mirrors.pair.com/hadoop/common/hadoop-2.4.0/hadoop-2.4.0.tar.gz
...
kutayzorlu@coder_telekom:~$ tar -xzvf hadoop-2.4.0.tar.gz
kutayzorlu@coder_telekom:~$ sudo mv hadoop-2.4.0 /usr/local
kutayzorlu@coder_telekom:~$ cd /usr/local
kutayzorlu@coder_telekom:/usr/local$ sudo ln -s hadoop-2.4.0 hadoop
kutayzorlu@coder_telekom:/usr/local$ sudo chown -R hadoop-user:hadoop hadoop-2.4.0
Set up the dedicated user environment
Switch to the hadoop-user user and add the following lines at the end of ~/.bashrc:
# Set Hadoop-related environment variables
export HADOOP_PREFIX=/usr/local/hadoop
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_MAPRED_HOME=${HADOOP_HOME}
export HADOOP_COMMON_HOME=${HADOOP_HOME}
export HADOOP_HDFS_HOME=${HADOOP_HOME}
export YARN_HOME=${HADOOP_HOME}
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
# Native path
export HADOOP_COMMON_LIB_NATIVE_DIR=${HADOOP_PREFIX}/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_PREFIX/lib"
# Java path
export JAVA_HOME='/usr/local/java/jdk1.7.0_45'
# Add Hadoop bin/ and sbin/ directories, plus the JDK, to PATH
export PATH=$PATH:$HADOOP_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/sbin
To have the new environment variables in place:
1. Reload .bashrc with "source ~/.bashrc".
2. Open /usr/local/hadoop/etc/hadoop/hadoop-env.sh.
3. Uncomment the line setting JAVA_HOME and set its value to the JDK directory:
export JAVA_HOME=/usr/local/java/jdk1.7.0_45
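At this point a quick sanity check is possible: hadoop version should report the installed release (output trimmed here):

hadoop-user@coder_telekom:~$ hadoop version
Hadoop 2.4.0
...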
Configure Hadoop
Before being able to actually use the Hadoop file system, it is necessary to modify some configuration files inside /usr/local/hadoop/etc/hadoop. All these files follow an XML format, and the updates go inside the top-level configuration node (likely empty after the Hadoop installation). Specifically:
- in yarn-site.xml:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
- in core-site.xml:

<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:9000</value>
</property>
- in mapred-site.xml (if this file is missing, see the note after this list):

<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
- in hdfs-site.xml:

<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:/usr/local/hadoop/yarn_data/hdfs/namenode</value>
</property>
<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:/usr/local/hadoop/yarn_data/hdfs/datanode</value>
</property>
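A note on mapred-site.xml: a fresh Hadoop 2.4.0 unpacking ships only a template for it, so if the file is missing, create it from that template before editing:

hadoop-user@coder_telekom:~$ cp /usr/local/hadoop/etc/hadoop/mapred-site.xml.template /usr/local/hadoop/etc/hadoop/mapred-site.xml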
The name and data directories referenced in hdfs-site.xml must exist, so create them:

hadoop-user@coder_telekom:~$ mkdir -p /usr/local/hadoop/yarn_data/hdfs/namenode
hadoop-user@coder_telekom:~$ mkdir -p /usr/local/hadoop/yarn_data/hdfs/datanode
Formatting the distributed file system
This must be run as hadoop-user:

hadoop-user@coder_telekom:~$ hdfs namenode -format
...
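If formatting succeeds, the output should end with a line similar to the following (the exact wording may vary between Hadoop releases):

... INFO common.Storage: Storage directory /usr/local/hadoop/yarn_data/hdfs/namenode has been successfully formatted.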
Finally, start the HDFS and YARN daemons with the start-dfs.sh and start-yarn.sh scripts (both are on $PATH via $HADOOP_HOME/sbin), create a home directory on HDFS, and run an example job:
hadoop-user@coder_telekom:~$ start-dfs.sh
8/08/10 13:18:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop-2.4.0/logs/hadoop-hadoop-user-namenode-coder_telekom.out
localhost: starting datanode, logging to /usr/local/hadoop-2.4.0/logs/hadoop-hadoop-user-datanode-coder_telekom.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.4.0/logs/hadoop-hadoop-user-secondarynamenode-coder_telekom.out
8/08/10 13:18:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hadoop-user@coder_telekom:~$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hadoop-user-resourcemanager-coder_telekom.out
localhost: starting nodemanager, logging to /usr/local/hadoop-2.4.0/logs/yarn-hadoop-user-nodemanager-coder_telekom.out
...
hadoop-user@coder_telekom:~$ hdfs dfs -mkdir /user
hadoop-user@coder_telekom:~$ hdfs dfs -mkdir /user/hadoop-user
...
hadoop-user@coder_telekom:~$ hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.0.jar pi 10 1000
...
Job Finished in 22.136 seconds
Estimated value of Pi is 3.142857....
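To confirm that all daemons are up, jps (shipped with the JDK) lists the running Java processes; a healthy single-node setup shows these five daemons (the PIDs below are illustrative):

hadoop-user@coder_telekom:~$ jps
3842 NameNode
3967 DataNode
4128 SecondaryNameNode
4285 ResourceManager
4411 NodeManager
4703 Jps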
# To stop the daemons:
$ stop-dfs.sh
$ stop-yarn.sh
Hadoop security: file and directory permissions
The following table lists various paths on HDFS and on the local filesystem (on all nodes) together with their recommended owners and permissions:
Filesystem | Path | User:Group | Permissions |
---|---|---|---|
local | dfs.namenode.name.dir | hdfs:hadoop | drwx------ |
local | dfs.datanode.data.dir | hdfs:hadoop | drwx------ |
local | $HADOOP_LOG_DIR | hdfs:hadoop | drwxrwxr-x |
local | $YARN_LOG_DIR | yarn:hadoop | drwxrwxr-x |
local | yarn.nodemanager.local-dirs | yarn:hadoop | drwxr-xr-x |
local | yarn.nodemanager.log-dirs | yarn:hadoop | drwxr-xr-x |
local | container-executor | root:hadoop | --Sr-s--* |
local | conf/container-executor.cfg | root:hadoop | r-------* |
hdfs | / | hdfs:hadoop | drwxr-xr-x |
hdfs | /tmp | hdfs:hadoop | drwxrwxrwxt |
hdfs | /user | hdfs:hadoop | drwxr-xr-x |
hdfs | yarn.nodemanager.remote-app-log-dir | yarn:hadoop | drwxrwxrwxt |
hdfs | mapreduce.jobhistory.intermediate-done-dir | mapred:hadoop | drwxrwxrwxt |
hdfs | mapreduce.jobhistory.done-dir | mapred:hadoop | drwxr-x--- |
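As a sketch of how the first two rows translate into commands, here they are applied to the directories created earlier in this tutorial (drwx------ is mode 700; this single-node setup has no separate hdfs user, so hadoop-user:hadoop stands in for hdfs:hadoop):

sudo chown -R hadoop-user:hadoop /usr/local/hadoop/yarn_data/hdfs/namenode /usr/local/hadoop/yarn_data/hdfs/datanode
sudo chmod 700 /usr/local/hadoop/yarn_data/hdfs/namenode /usr/local/hadoop/yarn_data/hdfs/datanode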
For more details, see the Hadoop Secure Mode documentation:
http://hadoop.apache.org/docs/r2.7.0/hadoop-project-dist/hadoop-common/SecureMode.html#Proxy_user