Hadoop FS Commands
Hadoop fs commands enable interaction with the Hadoop Distributed File System (HDFS), which stores big data reliably using a replication strategy. This article presents some of the most important commands for working with HDFS. Please find more details about YARN and other Hadoop commands in our article Hadoop Commands.
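As a quick sanity check before trying the commands below, the installed Hadoop version can be printed (this assumes the Hadoop bin directory is on the PATH; if the command is not found, the environment is not set up yet):
$ hdfs version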
Start and Stop HDFS
HDFS is a key component of Apache Hadoop. It can be started with the following command:
$ start-dfs.sh
Output:
17/04/02 17:46:40 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:0paYZ2E0RvPsE1bDbnA0FCchCebuJUvQOyj/MsL6Ifk.
Are you sure you want to continue connecting (yes/no)? yes
localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
localhost: starting namenode, logging to /home/ubuntu/hadoop-2.8.0/logs/hadoop-ubuntu-namenode-i-3992cc97.out
localhost: starting datanode, logging to /home/ubuntu/hadoop-2.8.0/logs/hadoop-ubuntu-datanode-i-3992cc97.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:0paYZ2E0RvPsE1bDbnA0FCchCebuJUvQOyj/MsL6Ifk.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/ubuntu/hadoop-2.8.0/logs/hadoop-ubuntu-secondarynamenode-i-3992cc97.out
17/04/02 17:49:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
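While HDFS is running, its daemons appear as Java processes and can be listed with the jps tool (assuming a JDK is installed); on a single-node setup, NameNode, DataNode and SecondaryNameNode entries should show up:
$ jps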
During startup (see the output above), there may be questions about host authenticity, which we answer with ‘yes’. If there are no critical error messages, HDFS has started and is ready to be used. HDFS can be stopped with the following command:
$ stop-dfs.sh
Output:
17/04/02 22:10:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Stopping namenodes on [localhost]
localhost: stopping namenode
localhost: namenode did not stop gracefully after 5 seconds: killing with kill -9
localhost: stopping datanode
localhost: datanode did not stop gracefully after 5 seconds: killing with kill -9
Stopping secondary namenodes [0.0.0.0]
0.0.0.0: stopping secondarynamenode
0.0.0.0: secondarynamenode did not stop gracefully after 5 seconds: killing with kill -9
17/04/02 22:11:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Format NameNode
In order to prepare HDFS for use, the HDFS ‘namenode’ needs to be formatted. This can be done with the following command:
$ hdfs namenode -format
Output:
17/04/02 17:38:32 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   user = ubuntu
STARTUP_MSG:   host = i-3992cc97.csdc3cloud.internal/10.1.1.240
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.8.0
STARTUP_MSG:   classpath = /home/ubuntu/hadoop-2.8.0/etc/hadoop:/ ...
...
17/04/02 17:38:42 INFO common.Storage: Storage directory /home/ubuntu/hadoopinfra/hdfs/namenode has been successfully formatted.
...
17/04/02 17:38:43 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at i-3992cc97.csdc3cloud.internal/10.1.1.240
************************************************************/
As shown above, the formatting was successful, but the HDFS ‘namenode’ shuts down again after this process.
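The result can also be double-checked by listing the local storage directory reported in the output above (the path is specific to this setup); after a successful format it contains a ‘current’ subdirectory with the VERSION file and the initial fsimage:
$ ls /home/ubuntu/hadoopinfra/hdfs/namenode/current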
Create Directory
Like standard file systems, HDFS supports the creation of directories. The following example creates the directory ‘/user/ubuntu/smalldata’ using the ‘-mkdir’ option:
$ hdfs dfs -mkdir -p /user/ubuntu/smalldata
Output:
17/04/03 00:41:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
We can verify that the directory was created using the ‘-ls’ option:
$ hdfs dfs -ls /user/ubuntu
Output:
17/04/03 00:46:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
drwxr-xr-x   - ubuntu supergroup          0 2017-04-03 00:41 /user/ubuntu/smalldata
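The ‘-ls’ option also supports recursive listing with the ‘-R’ flag, which becomes useful once more nested directories exist under ‘/user/ubuntu’ (at this point it shows the same single entry):
$ hdfs dfs -ls -R /user/ubuntu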
Load Data into HDFS
In order to use Hadoop's processing capabilities, we need to upload data from a standard file system into HDFS. The following example puts a file into the directory ‘/user/ubuntu/smalldata’ using the ‘-put’ option:
$ hdfs dfs -put LICENSE.txt /user/ubuntu/smalldata/
Output:
17/04/03 00:51:37 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
We can check whether the upload was successful by using the following command:
$ hdfs dfs -ls /user/ubuntu/smalldata
Output:
17/04/03 00:52:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 1 items
-rw-r--r--   1 ubuntu supergroup      99253 2017-04-03 00:51 /user/ubuntu/smalldata/LICENSE.txt
The LICENSE.txt file is now accessible in the Hadoop environment.
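The file can be read back directly from HDFS with the ‘-cat’ option (this prints the whole file to the terminal), and the replication factor mentioned at the beginning of this article can be inspected with the ‘-stat’ option, where ‘%r’ prints the replication factor (typically 1 on a single-node setup):
$ hdfs dfs -cat /user/ubuntu/smalldata/LICENSE.txt
$ hdfs dfs -stat %r /user/ubuntu/smalldata/LICENSE.txt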
Hadoop FS Commands Details
For more details on this subject, we refer to the following video: