Proving that the Client goes to the DataNode directly and reads the data stored on the DataNode

As we know, a Hadoop cluster consists of a NameNode, DataNodes, and a Client. A common myth is that the Client goes to the NameNode and reads the file stored on a DataNode via the NameNode. This is not true. I did a small experiment to test this myth: does the Client really read the file on the DataNode via the NameNode, or does the Client go to the DataNode directly and read the data?

Prasantmahato
Oct 14, 2020

Practical Demonstration

I have created a Hadoop cluster with 4 DataNodes, 1 NameNode, and 1 Client.

4 DataNodes with 1 Client.

All DataNodes are now connected to the NameNode, and the Client is waiting to upload a file.

Next, since the instances I am using run CentOS 7, I ran tcpdump -i eth0 -n tcp port 50010 on all 4 DataNodes. This command tells tcpdump to show every packet that passes through the network card eth0 to or from port 50010. We filter on tcp because the HDFS protocol is carried over TCP, and the whole Hadoop cluster works on the HDFS protocol.

Running the tcpdump command on all 4 DataNodes.
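
The capture command above can be wrapped in a small helper so that the same filter is started on every DataNode. This is only a minimal sketch and was not part of my original setup: the function name capture_cmd is hypothetical, and the defaults eth0 and 50010 are simply the values used in this post. Port 50010 is the default DataNode data transfer port in Hadoop 1.x and 2.x; Hadoop 3.x uses 9866 by default, so adjust the port for your version.

```shell
# Hypothetical helper: print the tcpdump command line used in this post.
# $1 = network interface (default eth0), $2 = DataNode data port (default 50010).
capture_cmd() {
  iface="${1:-eth0}"
  port="${2:-50010}"
  # -i selects the interface, -n skips DNS lookups so raw IPs are shown,
  # and "tcp port N" keeps only TCP packets to or from that port.
  printf 'tcpdump -i %s -n tcp port %s\n' "$iface" "$port"
}

# Show the command, then run it as root on each DataNode, for example:
#   sudo $(capture_cmd eth0 50010)
capture_cmd
```

Printing the command instead of running it makes it easy to double check the interface and port before starting a capture that needs root.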

After starting the tcpdump command on all 4 DataNodes, we read the file from the client side using hadoop fs -cat /filename.

Reading the file from the Client.

Now we can see that the 2nd DataNode is sending packets, and we can confirm this from the IP address of the DataNode that is sending packets back to our Client.

The 2nd DataNode is sending packets to the Client.
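
Instead of reading the tcpdump output by eye, the packets can be counted per source IP to see which machine is sending data. The sketch below assumes the usual tcpdump -n text format for IPv4 (timestamp, the word IP, then source.port > destination.port). The function name count_senders is hypothetical, and the sample lines in the usage comment use made-up addresses, not the ones from my cluster.

```shell
# Hypothetical helper: count packets per source IP.
# Reads tcpdump -n text output on stdin and prints "<count> <source ip>",
# with the busiest sender first.
count_senders() {
  awk '$2 == "IP" {
         src = $3                     # e.g. 172.31.0.12.50010
         sub(/\.[0-9]+$/, "", src)    # drop the trailing .port
         count[src]++
       }
       END { for (ip in count) print count[ip], ip }' | sort -rn
}

# Usage with made-up sample lines; on a DataNode, pipe the saved
# tcpdump output into count_senders instead.
printf '%s\n' \
  '10:00:00.000001 IP 172.31.0.12.50010 > 172.31.0.99.41234: Flags [P.], length 1024' \
  '10:00:00.000002 IP 172.31.0.99.41234 > 172.31.0.12.50010: Flags [.], length 0' \
  '10:00:00.000003 IP 172.31.0.12.50010 > 172.31.0.99.41234: Flags [P.], length 1024' |
  count_senders
```

If the DataNode's own IP is at the top of the list on one DataNode and the other DataNodes show no traffic, that DataNode is the one serving the file to the Client.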

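Hence, it is now clear that the Client goes to the DataNode directly and reads the data stored on the DataNode. The NameNode only tells the Client which DataNodes hold the file's blocks; the file data itself does not pass through the NameNode.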

I really appreciate that you are reading my post.

THANK YOU
