Configure Hadoop and start cluster services using Ansible Playbook.

To configure Hadoop using a playbook, we first need to install Ansible on our controller node.

Prasantmahato

Dec 6, 2020

Ansible playbooks are written in YAML.

In the playbook, we first have to configure and start the NameNode services.

We need to update the inventory file with the IP addresses and authentication details of the nodes.

After updating the inventory file, we have to edit the Ansible configuration file.
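As a rough sketch, an inventory for the two nodes used later in this article might look like the following. It assumes a YAML-format inventory file (the original may well have used the INI format instead), and the root user and the password value are placeholders, not the real credentials.

```yaml
# Hypothetical inventory, e.g. saved as inventory.yml
all:
  hosts:
    192.168.43.233:                 # NameNode
      ansible_user: root            # assumed login user
      ansible_password: CHANGE_ME   # placeholder, not a real password
    192.168.43.76:                  # DataNode
      ansible_user: root            # assumed login user
      ansible_password: CHANGE_ME   # placeholder, not a real password
```

Password-based SSH login from Ansible needs the sshpass program on the controller node.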

vim /etc/ansible/ansible.cfg

In the Ansible configuration file, we need to set the location of the inventory file and tell Ansible not to check host keys (the inventory and host_key_checking settings). This is because SSH asks us to confirm the host key the first time we log in to a remote host, which would interrupt the run.

To list all the hosts and confirm that the configuration is correct:

ansible all --list-hosts

To check that every IP in the inventory is reachable before continuing with the next step:

ansible all -m ping

CHECKING CONNECTIVITY TO THE TARGET NODE

For this task, I set up 192.168.43.233 as the NameNode and 192.168.43.76 as the DataNode.

So let's start.

I created a separate variable file to keep the Ansible playbook organised, so that anyone can easily make changes in the future if required.
I created a file named nn.yml, also called the parameter file.

vim /root/ansible_wp/nn.yml
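The contents of nn.yml are not shown here, so the following is only a guess at what such a parameter file could hold. All variable names and the RPM file names are hypothetical; the folder and port values are the ones used in the steps of this article.

```yaml
# nn.yml: hypothetical variable names, not the original file
jdk_rpm: jdk.rpm          # placeholder file name for the JDK RPM
hadoop_rpm: hadoop.rpm    # placeholder file name for the Hadoop RPM
nn_dir: /nn1              # NameNode folder
nn_port: 9001             # port the DataNode connects to
```

Keeping these values in one file means the playbook itself does not need to change when a path or port does.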

I created a file named namenode.yml to configure any server as a NameNode.

vim /root/ansible_wp/namenode.yml

In namenode.yml, I set 192.168.43.233 as the host.

HOST

I attached the attribute file in this playbook.

INSERTING ATTRIBUTE FILE
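A minimal sketch of how the top of namenode.yml could look, assuming nn.yml sits in the same directory as the playbook:

```yaml
- hosts: 192.168.43.233    # the NameNode
  vars_files:
    - nn.yml               # the variable (attribute) file
  tasks: []                # the steps described below go here
```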

Then I assigned the tasks.

STEP 1

Copying the JDK RPM from the controller node to the managed node.

STEP 2

Installing the JDK RPM on the managed node.

STEP 3

Copying the Hadoop RPM to the managed node.

STEP 4

Installing the Hadoop RPM on the managed node.
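Steps 1 to 4 could be written roughly as the tasks below. This is a sketch, not the original playbook: jdk_rpm and hadoop_rpm are hypothetical variables assumed to be defined in nn.yml, and /root is an assumed destination folder.

```yaml
# Tasks for steps 1 to 4 (variable names and paths are assumptions)
- name: Copy the JDK RPM to the managed node
  copy:
    src: "{{ jdk_rpm }}"
    dest: "/root/{{ jdk_rpm }}"

- name: Install the JDK RPM
  command: rpm -i --force /root/{{ jdk_rpm }}

- name: Copy the Hadoop RPM to the managed node
  copy:
    src: "{{ hadoop_rpm }}"
    dest: "/root/{{ hadoop_rpm }}"

- name: Install the Hadoop RPM
  command: rpm -i --force /root/{{ hadoop_rpm }}
```

Using --force lets the playbook be run again on a node where the packages are already installed, at the cost of reinstalling them each time.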

STEP 5

Copying the configured core-site.xml and hdfs-site.xml files to the managed node. This overwrites those files if they already exist.

**Copying the configured core-site.xml and hdfs-site.xml files**
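One possible form of this step uses the copy module with a loop. The /etc/hadoop destination is an assumption about where the Hadoop RPM keeps its configuration; adjust it to your installation.

```yaml
- name: Copy the configured Hadoop files (overwrites existing ones)
  copy:
    src: "{{ item }}"                  # file next to the playbook on the controller
    dest: "/etc/hadoop/{{ item }}"     # assumed configuration folder
  loop:
    - core-site.xml
    - hdfs-site.xml
```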

STEP 6

Deleting the NameNode folder. If a folder named /nn1 already exists, it is deleted and a new one is created with the same name.
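With the file module, this step can be sketched as two tasks. Here nn_dir is a hypothetical variable assumed to be set to /nn1 in nn.yml.

```yaml
- name: Delete the NameNode folder if it exists
  file:
    path: "{{ nn_dir }}"
    state: absent

- name: Create a fresh NameNode folder
  file:
    path: "{{ nn_dir }}"
    state: directory
```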

STEP 7

Formatting the NameNode. To start the Hadoop cluster, we first have to format the NameNode folder.
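A hedged sketch of the format task, assuming the hadoop command is on the PATH of the managed node. Piping Y into the command answers the confirmation prompt that Hadoop may show when the folder already exists.

```yaml
- name: Format the NameNode
  shell: echo Y | hadoop namenode -format
```

The shell module is used instead of command because the pipe needs a shell to interpret it.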

STEP 8

Stopping the NameNode service, so that any NameNode service that is already running is stopped.

STEP 9

Starting the NameNode service.

STARTING THE NAMENODE SERVICE
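Steps 8 and 9 might look like this, assuming hadoop-daemon.sh is on the PATH of the managed node. Errors from the stop task are ignored here so that the play continues when no NameNode was running.

```yaml
- name: Stop the NameNode if it is running
  command: hadoop-daemon.sh stop namenode
  ignore_errors: yes

- name: Start the NameNode
  command: hadoop-daemon.sh start namenode
```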

STEP 10

Checking whether the NameNode is ready or not.
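The check used in the original playbook is not shown, so this sketch swaps in one common approach: waiting until the NameNode port accepts connections. It assumes the inventory lists hosts by IP address and that nn_port is a hypothetical variable set to 9001 in nn.yml.

```yaml
- name: Wait until the NameNode accepts connections
  wait_for:
    host: "{{ inventory_hostname }}"   # the node's own address from the inventory
    port: "{{ nn_port }}"
    timeout: 60                        # give up after 60 seconds
```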

STEP 11

Creating a firewall rule for port 9001, so that the DataNode can connect to the NameNode on that port.

Now run the playbook:
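Assuming the managed node uses firewalld, the rule could be added with the firewalld module (part of the ansible.posix collection in newer Ansible releases). nn_port is again a hypothetical variable assumed to be 9001.

```yaml
- name: Open the NameNode port in the firewall
  firewalld:
    port: "{{ nn_port }}/tcp"
    permanent: yes     # keep the rule after a reboot
    immediate: yes     # also apply it to the running firewall
    state: enabled
```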

ansible-playbook namenode.yml

For the DataNode, I created a file named dn.yml, also called the parameter file.

I created a file named datanode.yml to configure any server as a DataNode.

In datanode.yml, I set 192.168.43.76 as the host.

HOST

I attached the attribute file in this playbook.

INSERTING ATTRIBUTE FILE

Then I used vars_prompt, because I wanted anyone who runs this playbook to first check that the NameNode IP in the attribute file is correct.
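A minimal sketch of such a prompt is shown below. The variable name ip_confirmed, the prompt wording, and the fail task are all assumptions made for illustration; the original playbook may handle the answer differently.

```yaml
- hosts: 192.168.43.76               # the DataNode
  vars_files:
    - dn.yml
  vars_prompt:
    - name: ip_confirmed             # hypothetical variable name
      prompt: "Is the NameNode IP in dn.yml correct? (yes/no)"
      private: false                 # show what the user types
  tasks:
    - name: Stop early if the IP has not been confirmed
      fail:
        msg: "Update the NameNode IP in dn.yml and run the playbook again."
      when: ip_confirmed != "yes"
```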

Tasks

STEP 1

Stopping the DataNode service, if one is already running.
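As a sketch, assuming hadoop-daemon.sh is on the PATH once Hadoop is installed:

```yaml
- name: Stop the DataNode if it is running
  command: hadoop-daemon.sh stop datanode
  ignore_errors: yes   # also covers a first run, when Hadoop is not installed yet
```

Because this task runs before the install steps, ignoring errors matters: on a fresh node the command does not exist yet and the task would otherwise end the play.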

STEP 2

Copying the JDK RPM from the controller node to the managed node.

STEP 3

Installing the JDK RPM on the managed node.

STEP 4

Copying the Hadoop RPM to the managed node.

STEP 5

Installing the Hadoop RPM on the managed node.

STEP 6

Copying the configured core-site.xml and hdfs-site.xml files to the managed node. This overwrites those files if they already exist.

In the core-site.xml file, I used a variable named ip, so that if the NameNode IP changes in the future, we can update it just by editing the attributes file, i.e. dn.yml.
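To illustrate the idea, the task below writes a core-site.xml whose NameNode address comes from the ip variable. This is a sketch under several assumptions: the original most likely keeps core-site.xml as a separate file, while this version inlines it with the copy module; /etc/hadoop is an assumed configuration folder; and fs.default.name is the Hadoop 1.x property name (newer releases call it fs.defaultFS).

```yaml
# dn.yml is assumed to contain:  ip: 192.168.43.233
- name: Write core-site.xml pointing at the NameNode
  copy:
    dest: /etc/hadoop/core-site.xml    # assumed configuration path
    content: |
      <?xml version="1.0"?>
      <configuration>
        <property>
          <name>fs.default.name</name>
          <value>hdfs://{{ ip }}:9001</value>
        </property>
      </configuration>
```

Ansible fills in the ip value before writing the file, so changing the NameNode address only requires editing dn.yml.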