Proving that the Client uploads data to the DataNodes.
As we know, a Hadoop cluster consists of a NameNode, DataNodes, and a Client. A common myth is that when the Client uploads data, the NameNode is the one that uploads it to the DataNodes. This is not true. I did a small experiment to test this myth: is it actually the NameNode that uploads the data, or does the Client get the IP addresses of the DataNodes and then upload the file to the DataNodes itself?
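The claim being tested can be pictured with a small toy model. This is only a sketch of the idea, not Hadoop's API: the class names ToyNameNode and ToyDataNode, the function client_upload, and all IP addresses are made up for illustration, and the model leaves out blocks, replication details, and everything else a real cluster does. It only shows the NameNode handing out addresses while the bytes travel from the Client to the DataNodes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataNode:
    """Hypothetical stand-in for a DataNode: it only records who sent it data."""
    ip: str
    received_from: list = field(default_factory=list)

    def receive(self, sender_ip: str, payload: bytes) -> None:
        self.received_from.append(sender_ip)


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for the NameNode: it knows the DataNodes."""
    datanodes: list

    def choose_datanodes(self, count: int) -> list:
        # Only addresses are handed out; no file bytes pass through here.
        return self.datanodes[:count]


def client_upload(client_ip: str, namenode: ToyNameNode,
                  payload: bytes, replicas: int = 2) -> list:
    """Ask the NameNode where to write, then send the data ourselves."""
    targets = namenode.choose_datanodes(replicas)
    for datanode in targets:
        datanode.receive(client_ip, payload)
    return [datanode.ip for datanode in targets]


# Made-up addresses: 4 DataNodes and 1 Client, as in the experiment.
nodes = [ToyDataNode(f"10.0.0.{i}") for i in range(1, 5)]
namenode = ToyNameNode(nodes)
used = client_upload("10.0.0.100", namenode, b"hello hdfs")
print("written to:", used)
print("sender seen by first DataNode:", nodes[0].received_from)
```

In this model, every DataNode that holds data lists the Client's IP as the sender, which is exactly what the packet capture in the demonstration is meant to check on a real cluster.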
Practical Demonstration.
I have created a Hadoop cluster with 4 DataNodes, 1 NameNode, and 1 Client.
All DataNodes are now connected to the NameNode, and the Client is waiting to upload a file.
Next, since the instances I am using run CentOS 7, I ran tcpdump -i eth0 -n tcp port 50010 on all 4 DataNodes. This command tells tcpdump to show every TCP packet passing through my network card eth0 whose source or destination port is 50010, which is the default DataNode data-transfer port in Hadoop 1.x and 2.x. The -n option prints IP addresses and port numbers as they are instead of resolving them to names. We filter on tcp because HDFS data transfer runs over TCP.
Then, with tcpdump running on all 4 DataNodes, we upload a file from the Client.
Now we can see that the 2nd and 3rd DataNodes are receiving packets, and the source IP address in the capture confirms that it is the Client that is transferring packets to our DataNodes.
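The check described above, reading the sender's IP from the capture, can also be done with a short script instead of by eye. This is a minimal sketch that assumes the usual one-line-per-packet format tcpdump prints with -n for IPv4 traffic. The function name senders_to_port and the sample lines, including every IP address in them, are invented for illustration and are not output from my cluster.

```python
import re
from collections import Counter

# Matches the "IP src.sport > dst.dport:" part of a tcpdump -n line (IPv4).
LINE_RE = re.compile(
    r"IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+):"
)


def senders_to_port(lines, port=50010):
    """Count packets per source IP, keeping only those sent TO the given port."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and int(match.group("dport")) == port:
            counts[match.group("src")] += 1
    return counts


# Invented sample lines in tcpdump's format; 10.0.0.100 plays the Client.
sample = [
    "10:15:01.000001 IP 10.0.0.100.41234 > 10.0.0.2.50010: Flags [S], seq 1, win 29200, length 0",
    "10:15:01.000050 IP 10.0.0.100.41234 > 10.0.0.2.50010: Flags [P.], seq 1:1449, ack 1, win 229, length 1448",
    "10:15:01.000090 IP 10.0.0.2.50010 > 10.0.0.100.41234: Flags [.], ack 1449, win 240, length 0",
]
print(senders_to_port(sample))
```

The script looks only at packets whose destination port is 50010, because the filter tcp port 50010 also captures the replies the DataNode sends back from that port. If the only source IP in the result is the Client's, the data is arriving from the Client.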
Hence, it is now clear that the Client is the one that uploads data to the DataNodes.
I really appreciate that you are reading my post.