Linux Training in Coimbatore & Best Linux Server Administration Training Institute: August 2023

Wednesday 30 August 2023

Linux interview questions

1.What is Apache HTTP Server?

The Apache HTTP Server, commonly referred to as Apache, is an open-source web server software widely used to serve websites and web applications.

You can use the following commands:

To start: sudo systemctl start apache2 (or httpd depending on the distribution)

To stop: sudo systemctl stop apache2 (or httpd)

2.What is web server configuration?

A web server is software that serves web content (HTML files, images, etc.) to clients over the internet using HTTP or HTTPS protocols.

Which popular web servers are commonly used in Linux?

Apache HTTP Server (httpd) and Nginx are two of the most commonly used web servers in Linux.

What is a virtual host in the context of a web server?

A virtual host allows a single physical server to host multiple websites or domains, each with its own configuration and content.

How can you install the Apache web server on a Linux system?

For Debian-based distributions: sudo apt-get install apache2

For Red Hat-based distributions: sudo yum install httpd

3.What is LVM?

LVM, or Logical Volume Manager, is a storage management technology that allows you to manage and organize physical storage devices (disks) as logical volumes.

What are the main components of LVM?

The main components of LVM are Physical Volumes (PVs), Volume Groups (VGs), and Logical Volumes (LVs).

What is a Physical Volume (PV)?

A Physical Volume is a physical storage device, such as a hard disk or a partition, that is used as a building block for LVM.

What is a Volume Group (VG)?

A Volume Group is a collection of one or more Physical Volumes. It acts as a pool of storage that can be divided into Logical Volumes.

4.What is FTP?

FTP (File Transfer Protocol) is a standard network protocol used to transfer files between a client and a server over a network, typically the internet.

How do you install an FTP server on a Linux system?

Common FTP server software includes vsftpd (Very Secure FTP Daemon) and proftpd. Installation commands may vary by Linux distribution.

What are the default ports used by FTP for data and control connections?

FTP control connections use port 21 by default, and data connections use additional ports (active mode) or a range of passive ports (passive mode).

5.What is SMTP?

SMTP (Simple Mail Transfer Protocol) is a protocol used for sending and relaying email messages between email servers and clients.

What are the standard ports used by SMTP for sending and receiving emails?

The standard port for outgoing SMTP (SMTP client to server) is port 25, and the standard port for incoming SMTP (server to server) is port 25.

How do you install an SMTP server on a Linux system?

Popular SMTP server software includes Postfix, Sendmail, and Exim. Installation methods may vary depending on the Linux distribution.

6.What is Samba?

Samba is an open-source software suite that allows Linux and Unix systems to share files, folders, and printers with Windows systems using the SMB/CIFS protocol.

How do you install Samba on a Linux system?

You can install Samba using package management tools. For example, on Debian-based systems: sudo apt-get install samba.

7.What is RPM?

RPM (Red Hat Package Manager) is a package management system used in Red Hat-based Linux distributions to manage software packages.

What is the purpose of RPM packages?

RPM packages contain software, files, and metadata required for installation, upgrading, or removal of software on a Linux system.

How do you install an RPM package?

You can use the rpm command to install RPM packages.

8.What is NFS?

NFS (Network File System) is a protocol that allows clients to access files and directories on remote servers as if they were local. It's used for sharing files and resources across a network.

How does NFS work?

NFS works by allowing a client system to mount remote directories as if they were local filesystems. This enables users to access and manipulate remote files seamlessly.

What is an NFS export?

An NFS export refers to a directory on a server that is made available for access by NFS clients. Clients can mount this export to access its contents.

How do you install and configure NFS on a Linux system?

Install the NFS server software (e.g., nfs-utils package), configure the exports in the /etc/exports file, and start the NFS server service.

9.What are the common methods of installing software on a Linux system?

Common methods include package managers (apt, yum, dnf), downloading and compiling source code, using software centers, and using graphical installers.

What is a package manager?

A package manager is a tool that automates the process of installing, upgrading, configuring, and removing software packages on a Linux system.

How do you install software using a package manager on a Debian-based system?

Use the apt-get or apt command followed by the package name. For example: sudo apt-get install package-name.

10.What is the booting process in Linux?

The booting process is the sequence of events that occur when a computer is powered on, leading to the loading of the operating system and the initiation of user sessions.

Explain the main stages of the Linux booting process.

The main stages are: BIOS/UEFI, bootloader (e.g., GRUB), kernel initialization, initial RAM disk (initramfs), user space initialization (init/systemd), and user login.

Intermediate Booting Process Questions:

What is the role of the BIOS/UEFI in the booting process?

BIOS (Basic Input/Output System) or UEFI (Unified Extensible Firmware Interface) performs hardware initialization, self-tests, and loads the bootloader from the boot device.

11.What are the file permissions in Linux?

There are 3 types of permissions in Linux OS that are given below:

Read: User can read the file and list the directory.

Write: User can write new files in the directory.

Execute: User can access and run the file in a directory.

12.What are port numbers in networking?

HTTP (Hypertext Transfer Protocol) - Port 80

HTTPS (HTTP Secure) - Port 443

FTP (File Transfer Protocol) - Port 21

SSH (Secure Shell) - Port 22

SMTP (Simple Mail Transfer Protocol) - Port 25

13.What is GUI and CLI?

GUI: GUI stands for Graphical User Interface. It is a human-computer interface that permits users to collaborate with electronic devices through visual indicators and graphical icons. The use of these graphical icons or elements makes it convenient for the user to collaborate with the system. It's visually intuitive and permits higher productivity. It uses the images and the icons which are clicked by the users to communicate with the system. It is more attractive and user-friendly because of the use of the images and icons.

CLI: CLI stands for Command Line Interface. It is a command-line program that takes text as input to run the tasks of the OS. It permits users to enter declarative commands to give instructions to the system to perform operations. It needs less memory than other interfaces, and it doesn't need Windows because a low-resolution monitor can also be used. It is an interface that allows users to type declarative commands to instruct the computer to perform operations.

14.What is a root account?

The root account is like a system administrator account. It provides you full control of the system. You can create and maintain user accounts, assign different permission for each account, etc.

15.How to exit from vi editors?

The following commands are used to exit from vi editors.

:wq saves the current work and exits the VI.

:q! exits the VI without saving current work.

Tuesday 29 August 2023

Linux Interview Questions :

1) What is Linux? Discuss its features.

Linux is a UNIX based operating system. Linus Torvalds first introduced it. It is an open source operating system that was designed to provide free and a low-cost operating system for the computer users.

Linux is an open-source Unix-like computer OS that directly manages the resources and hardware of a system, like storage, memory, and CPU, and handles the communication between hardware and software. First, it was published on 5 October 1991 by Linus Torvalds for systems and is considered faster and more secure than Windows. It's freely distributable and is basically established around Linux Kernel. Additionally, it can be installed in notebooks, computers, laptops, mobiles, etc. Linux operating system flavors include Gentoo, SUSE Linux, Debian, Ubuntu, etc.

2) What is the difference between UNIX and Linux?

UNIX was originally started as a propriety operating system for Bell Laboratories, which later release their commercial version while Linux is a free, open source and a non-propriety operating system for the mass uses.

3) What is Linux Kernel? Discuss its functions.

Linux Kernel is low-level system software. It is used to manage the hardware resources for the users. It provides an interface for user-level interaction.

Linux kernel is the primary component of the Linux operating system. Simply, it is a resource manager that works as a bridge between software and hardware. Its primary role is to handle hardware resources and is generally used to offer an interface for interaction. A Linux kernel is the initial program that's loaded whenever a system starts. Also, it is called low-level system software.

4) Is it legal to edit Linux Kernel?

Yes. You can edit Linux Kernel because it is released under General Public License (GPL) and anyone can edit it. It comes under the category of free and open source software.

5) What is LILO?

LILO is a boot loader for Linux. It is used to load the Linux operating system into the main memory to begin its operations. It is a bootloader that is used for loading Linux into memory and dawning the OS. Also, it is called a boat manager that offers a dual-boot of a system. It can work as either a secondary boot program or a master boot program that implements several functions, such as starting the kernel, loading memory, recognizing other supporting programs, and locating the kernel. We need to install a unique bootloader, i.e., LILO, if we wish to use Linux operating system because it permits Linux OS boot.

6) What is the advantage of open source?

Open source facilitates you to distribute your software, including source codes freely to anyone who is interested. So, you can add features and even debug and correct errors of the source code.

7) What are the basic components of Linux?

Just like other operating systems, Linux has all components like kernel, shells, GUIs, system utilities and application programs. Generally, Linux is composed of five components or elements, which are listed and explained as follows:

8) What is the advantage of Linux?

Every aspect comes with additional features, and it provides a free downloading facility for all codes.

9) What do you mean by Linux Shell? Explain its types.

Linux Shell is called the user interface available between the kernel and the user. It is used to execute communication and commands with Linux operating system. Linux shell takes human-readable commands as input and transforms them into a language that is kernel understandable.

Different shell types are used on classic Linux systems, as mentioned below:

Bourns Shell

ZSH

TCSH

Bourns Again Shell or BASH

Korn Shell or KSH

C Shell or CSH

10) Name the Linux which is specially designed by the Sun Microsystems.

Solaris is the Linux of Sun Microsystems.

11) Name the Linux loader.

LILO is the Linux loader.

12) If you have saved a file in Linux. Later you wish to rename that file, what command is designed for it?

The 'mv' command is used to rename a file.

13) Write about an internal command.

The commands which are built in the shells are called as the internal commands.

14) Explain Process Id and INODE.

Process Id: It's a unique Id provided to all processes. It is used to identify a running process uniquely throughout the computer until the process ends.

INODE: It's a unique name provided to all files by the operating system. All inodes have a unique inode number in a file system. INODE stores many details about files, including the number of links, access mode, file type, file size, ownership, etc.

15) If the programmer wishes to execute an instruction at the specified time. Which command is used?

The 'at' command is used for the same.

16) Name some Linux variants.

Some of the Linux commands are:

CentOS

Ubuntu

Redhat

Debian

Fedora

17) What is Swap Space?

Swap space is used to specify a space which is used by Linux to hold some concurrent running program temporarily. It is used when RAM does not have enough space to hold all programs that are executing.

As its name implies, swap space is a space on the hard disk used when the RAM or physical memory amount is full. It's a replacement for physical memory. Its primary function is to replace disk space for memory when actual RAM doesn't have sufficient space to hold every program that is running, and more space is needed. In other words, it can be used as a RAM extension by Linux.

18) What is BASH?

BASH is a short form of Bourne Again SHell. It was a replacement to the original Bourne shell, written by Steve Bourne.

It is generally a command language interpreter. BASH was written for GNU OS by Brian Fox and can replace Bourne Shell. It is the same as Bourne Shell but contains some extra features like a command-line amendment that make it more convenient and easier to use. It's the default user shell on almost every Linux installation. Basically, it is a non-compiled and interpreted process that can also execute in the terminal window. Also, it is able to read commands in shell scripts.

19) What is the basic difference between BASH and DOS?

BASH commands are case sensitive while DOS commands are not case sensitive.

DOS follows a convention in naming files. In DOS, 8 character file name is followed by a dot and 3 characters for the extension. BASH doesn't follow such convention.

20) What is a root account?

The root account is like a system administrator account. It provides you full control of the system. You can create and maintain user accounts, assign different permission for each account, etc.

21) What is GUI and CLI?

22) Which popular office suite is available free for both Microsoft and Linux?

Open Office Suite is available free for both Microsoft and Linux. You can install it on both of them.

23) Suppose your company is recently switched from Microsoft to Linux and you have some MS Word document to save and work in Linux, what will you do?

Install Open Office Suite on Linux. It facilitates you to work with Microsoft documents.

24) What is SMTP?

SMTP stands for Simple Mail Transfer Protocol. It is an internet standard for mail transmission.

25) What is Samba? Why is it used?

Samba service is used to connect Linux machines to Microsoft network resources by providing Microsoft SMB support.

26) What are the basic commands for user management?

last,

chage,

chsh,

lsof,

chown,

chmod,

useradd,

userdel,

newusers etc.

27) What is the maximum length for a filename in Linux?

255 characters.

28) Is Linux Operating system virus free?

No, there is no operating system till date that is virus free, but Linux is known to have less number of viruses.

29) Which partition stores the system configuration files in Linux system?

/stc partition.

30) Which command is used to uncompress gzip files?

gunzip command is used to uncompress gzip files.

31) Why do developers use MD5 options on passwords?

MD5 is an encryption method, so it is used to encrypt the passwords before saving.

32) What is a virtual desktop?

The virtual desktop is used as an alternative to minimizing and maximizing different windows on the current desktop. Virtual desktop facilitates you to open one or more programs on a clean slate rather than minimizing or restoring all the needed programs.

33) What is the difference between soft and hard mounting points?

In the soft mount, if the client fails to connect the server, it gives an error report and closes the connection whereas in the hard mount, if the client fails to access the server, the connection hangs; and once the system is up, it again accesses the server.

34) Does the Alt+Ctrl+Del key combination work in Linux?

Yes, it works like windows.

35) What are the file permissions in Linux?

There are 3 types of permissions in Linux OS that are given below:

Read: User can read the file and list the directory.

Write: User can write new files in the directory.

Execute: User can access and run the file in a directory.

36) What are the modes used in VI editor?

Visual Editor or VI editor is a default text editor that comes with almost every Linux operating system. There are three different types of modes utilized in the VI editor, as mentioned below:

Regular Mode or Command Mode: For the VI editor, it's the default mode. It's used for typing commands that perform specific or particular vi functions. To input this mode from other modes, one must click [esc]. It lets us see the content.

Edit Mode or Insertion Mode: It permits us to type text in a file or perform text editing. To input this mode from other modes, one must click [esc]. It lets us insert or delete content or text.

Replacement Mode or Ex Mode: It is used to store the files and implementation of the commands. It runs files with distinct parameters. To input this mode, one must click [:]. It lets us overwrite text or content.

37) How to exit from vi editors?

The following commands are used to exit from vi editors.

:wq saves the current work and exits the VI.

:q! exits the VI without saving current work.

38) How to delete information from a file in vi?

The following commands are used to delete information from vi editors.

x deletes a current character.

dd deletes the current line.

39) How to create a new file or modify an existing file in vi?

vi filename

40) What are the Linux user mode types?

Two Linux user modes are available, which are as follows:

GUI

Command Line

41) What are the Process States?

Linux process is a kind of process that can be in several distinct states. The process comes in these states from start to end. In Linux, the process states are mentioned below:

42) What is the typical swap partition size in the Linux System?

It should be double the amount of RAM or physical memory available in the system.

43) What is the name of the file that's used to mount file systems automatically?

The fstab file is the file that's used to mount file systems automatically.

44) What do you mean by LVM, and why it's important?

Logical Volume Management, or LVM, is a tool that offers the management of logical volume for the Linux kernel. It's simply being introduced to enable physical storage device management convenient. Also, it features allocating disks, resizing, mirroring, and striping logical volumes.

Its primary benefits are enhanced abstraction, control, and flexibility. It permits extensible disk space management. It is needed to online resize the file system. The LVM partition size can be increased with the "lvextend" command and decreased with the "lvreduce" command in Linux.

45) Define the "/proc" file system.

A proc file system is a virtual or pseudo file system that offers an interface to the data structure of the kernel. Generally, it contains useful details about processes that are currently running. Also, it can be used to modify a few kernel parameters during execution or runtime.

46) What are the daemons?

Daemons are also called the background process. They are a long-running program of Linux that executes in the background. They don't have any controlling terminal; hence they execute in the background. Daemons are the process starts when the computer is bootstrapped and ends or terminates only when the computer is shut down. It's simply the form of increasing the base OS functionality. It provides many functions that aren't available in OS.

47) What is the name of the daemon that can control the print spooling process?

The line printing daemon can control the print spooling process.

48) What do you mean by a Zombie Process?

In Linux, a Zombie Process is also known as a dead or defunct process. It is a process that has completed the execution, but its access remains inside the process table. Usually, it happens because of insufficient correspondence between child and parent processes. This process appears for the child process due to the parent process requires reading the child process's status. This process is deleted from the process table once it is finished with the wait system call.

49) What is the difference between an anacron and a cron?

Anacron: In Linux, anacron is a program used to run tasks at some intervals. It works on machines effectively that are powered off in a week or day.

Cron: In Linux, cron is a program used to run tasks at an expected time. It works on machines effectively that continuously run.

Anacron Cron

It's not a daemon. It's a daemon.

Only it can be utilized by super users. It can be expected by normal users.

It's ideal for laptops and desktops. It's ideal for servers.

It doesn't suppose the system to execute 24*7. It supposes the system to run 24*7.

The minimum granularity of an anacron is in days only. The minimum granularity of a cron is in minutes.

It periodically executes commands. It executes scheduled commands.

50) What do you mean by load average under Linux?

As its name implies, load average is the system average loaded on Linux servers being measured over a given time. The Linux server's load average can be detected using "uptime" and "top" commands. Also, it is used to record the system resources.

51) What is the Shell Script?

As its name implies, Shell Script is a script mainly written for the shell. The script describes the programming language being used to manage applications. It permits the execution of distinct commands entered inside the shell. Generally, it helps us to make complex programs, including functions, loops, and conditional statements.

52) What is the name of the first process initiated by the kernel, and what is the process id of it in Linux?

"init" is the first process initiated by the kernel, and 1 is the process id of it.

Linux Networking

53) What are /etc/hosts and /etc/resolv.conf files in Linux?

/etc/hosts: This file translates or maps any domain name or hostname to its corresponding IP address.

/etc/resolv.conf: This file configures DNS name servers because it includes the information of the name server, i.e., information of our DNS server. Then, the DNS server resolves the IP address hostname.

54) What is Network bonding in Linux?

Network bonding is also called NIC teaming. It is a kind of bonding that connects two or more network interfaces to a single interface. Usually, it improves redundancy and performance generally by increasing network bandwidth and throughput.

55) What are the default ports for squid, DHCP, SSH, FTP, SMTP, and DNS?

Default ports for several services are below:

Servers Default Ports

squid 3128

DHCP 68/UDP (dhcp client), 68/UDP (dhcp server)

SSH 22

FTP 21 (Connection established), 22 (Data transfer)

SMTP 25

DNS 53

Tuesday 1 August 2023

How To Install and Configure Hadoop on CentOS/RHEL 8

Hadoop is a free, open-source and Java-based software framework used for storage and processing of large datasets on clusters of machines. It uses HDFS to store its data and process these data using MapReduce. It is an ecosystem of Big Data tools that are primarily used for data mining and machine learning. It has four major components such as Hadoop Common, HDFS, YARN, and MapReduce.

In this guide, we will explain how to install Apache Hadoop on RHEL/CentOS 8.

Step 1 – Disable SELinux

Before starting, it is a good idea to disable the SELinux in your system.

To disable SELinux, open the /etc/selinux/config file:

nano /etc/selinux/config

Change the following line:

SELINUX=disabled

Save the file when you are finished. Next, restart your system to apply the SELinux changes.

Step 2 – Install Java

Hadoop is written in Java and supports only Java version 8. You can install OpenJDK 8 and ant using DNF command as shown below:

dnf install java-1.8.0-openjdk ant -y

Once installed, verify the installed version of Java with the following command:

java -version

You should get the following output:

openjdk version "1.8.0_232"
OpenJDK Runtime Environment (build 1.8.0_232-b09)
OpenJDK 64-Bit Server VM (build 25.232-b09, mixed mode)

Step 3 – Create a Hadoop User

It is a good idea to create a separate user to run Hadoop for security reasons.

Run the following command to create a new user with name hadoop:

useradd hadoop

Next, set the password for this user with the following command:

passwd hadoop

Provide and confirm the new password as shown below:

Changing password for user hadoop.
New password: 
Retype new password: 
passwd: all authentication tokens updated successfully.

Step 4 – Configure SSH Key-based Authentication

Next, you will need to configure passwordless SSH authentication for the local system.

First, change the user to hadoop with the following command:

su - hadoop

Next, run the following command to generate Public and Private Key Pairs:

ssh-keygen -t rsa

You will be asked to enter the filename. Just press Enter to complete the process:

Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa): 
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:a/og+N3cNBssyE1ulKK95gys0POOC0dvj+Yh1dfZpf8 hadoop@centos8
The key's randomart image is:
+---[RSA 2048]----+
|                 |
|                 |
|              .  |
|     .   o o o   |
|  . . o S o o    |
| o = + O o   .   |
|o * O = B =   .  |
| + O.O.O + +   . |
|  +=*oB.+ o     E|
+----[SHA256]-----+

Next, append the generated public keys from id_rsa.pub to authorized_keys and set proper permission:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 640 ~/.ssh/authorized_keys

Next, verify the passwordless SSH authentication with the following command:

ssh localhost

You will be asked to authenticate hosts by adding RSA keys to known hosts. Type yes and hit Enter to authenticate the localhost:

The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is SHA256:0YR1kDGu44AKg43PHn2gEnUzSvRjBBPjAT3Bwrdr3mw.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Activate the web console with: systemctl enable --now cockpit.socket

Last login: Sat Feb  1 02:48:55 2020
[hadoop@centos8 ~]$

Step 5 – Install Hadoop

First, change the user to hadoop with the following command:

su - hadoop

Next, download the latest version of Hadoop using the wget command:

wget http://apachemirror.wuchna.com/hadoop/common/hadoop-3.2.1/hadoop-3.2.1.tar.gz

Once downloaded, extract the downloaded file:

tar -xvzf hadoop-3.2.1.tar.gz

Next, rename the extracted directory to hadoop:

mv hadoop-3.2.1 hadoop

Next, you will need to configure Hadoop and Java Environment Variables on your system.

Open the ~/.bashrc file in your favorite text editor:

nano ~/.bashrc

Append the following lines:

export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk-1.8.0.232.b09-2.el8_1.x86_64/
export HADOOP_HOME=/home/hadoop/hadoop
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"

Save and close the file. Then, activate the environment variables with the following command:

source ~/.bashrc

Next, open the Hadoop environment variable file:

nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh

Update the JAVA_HOME variable as per your Java installation path:

export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk-1.8.0.232.b09-2.el8_1.x86_64/

Save and close the file when you are finished.

Step 6 – Configure Hadoop

First, you will need to create the namenode and datanode directories inside Hadoop home directory:

Run the following command to create both directories:

mkdir -p ~/hadoopdata/hdfs/namenode
mkdir -p ~/hadoopdata/hdfs/datanode

Next, edit the core-site.xml file and update with your system hostname:

nano $HADOOP_HOME/etc/hadoop/core-site.xml

Change the following name as per your system hostname:

Save and close the file. Then, edit the hdfs-site.xml file:

nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml

Change the NameNode and DataNode directory path as shown below:

Save and close the file. Then, edit the mapred-site.xml file:

nano $HADOOP_HOME/etc/hadoop/mapred-site.xml

Make the following changes:

Save and close the file. Then, edit the yarn-site.xml file:

nano $HADOOP_HOME/etc/hadoop/yarn-site.xml

Make the following changes:

Save and close the file when you are finished.

Step 7 – Start Hadoop Cluster

Before starting the Hadoop cluster. You will need to format the Namenode as a hadoop user.

Run the following command to format the hadoop Namenode:

hdfs namenode -format

You should get the following output:

2020-02-05 03:10:40,380 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
2020-02-05 03:10:40,389 INFO namenode.FSImage: FSImageSaver clean checkpoint: txid=0 when meet shutdown.
2020-02-05 03:10:40,389 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at hadoop.tecadmin.com/45.58.38.202
************************************************************/

After formating the Namenode, run the following command to start the hadoop cluster:

start-dfs.sh

Once the HDFS started successfully, you should get the following output:

Starting namenodes on [hadoop.tecadmin.com]
hadoop.tecadmin.com: Warning: Permanently added 'hadoop.tecadmin.com,fe80::200:2dff:fe3a:26ca%eth0' (ECDSA) to the list of known hosts.
Starting datanodes
Starting secondary namenodes [hadoop.tecadmin.com]

Next, start the YARN service as shown below:

start-yarn.sh

You should get the following output:

Starting resourcemanager
Starting nodemanagers

You can now check the status of all Hadoop services using the jps command:

jps

You should see all the running services in the following output:

7987 DataNode
9606 Jps
8183 SecondaryNameNode
8570 NodeManager
8445 ResourceManager
7870 NameNode

Step 8 – Configure Firewall

Hadoop is now started and listening on port 9870 and 8088. Next, you will need to allow these ports through the firewall.

Run the following command to allow Hadoop connections through the firewall:

firewall-cmd --permanent --add-port=9870/tcp
firewall-cmd --permanent --add-port=8088/tcp

Next, reload the firewalld service to apply the changes:

firewall-cmd --reload

Step 9 – Access Hadoop Namenode and Resource Manager

To access the Namenode, open your web browser and visit the URL http://your-server-ip:9870. You should see the following screen:

To access the Resource Manage, open your web browser and visit the URL http://your-server-ip:8088. You should see the following screen:

Step 10 – Verify the Hadoop Cluster

At this point, the Hadoop cluster is installed and configured. Next, we will create some directories in HDFS filesystem to test the Hadoop.

Let’s create some directory in the HDFS filesystem using the following command:

hdfs dfs -mkdir /test1
hdfs dfs -mkdir /test2

Next, run the following command to list the above directory:

hdfs dfs -ls /

You should get the following output:

Found 2 items
drwxr-xr-x   - hadoop supergroup          0 2020-02-05 03:25 /test1
drwxr-xr-x   - hadoop supergroup          0 2020-02-05 03:35 /test2

You can also verify the above directory in the Hadoop Namenode web interface.

Go to the Namenode web interface, click on the Utilities => Browse the file system. You should see your directories which you have created earlier in the following screen:

Step 11 – Stop Hadoop Cluster

You can also stop the Hadoop Namenode and Yarn service any time by running the stop-dfs.sh and stop-yarn.sh script as a Hadoop user.

To stop the Hadoop Namenode service, run the following command as a hadoop user:

stop-dfs.sh

To stop the Hadoop Resource Manager service, run the following command:

stop-yarn.sh

Conclusion

In the above tutorial, you learned how to set up the Hadoop single node cluster on CentOS 8. I hope you have now enough knowledge to install the Hadoop in the production environment.