Linux
Linux foundation
Linux Philosophy and Concepts
Linux structure and installation
Graphical interface
System configuration from graphical interface
Finding Linux documentation
Command Line Operations
File Operations
User Environment
Text Editors
Local Security Principles
Network operations
Manipulating text
Printing
Bash Shell Scripting
Advanced Bash scripting
Processes
Common Applications
Linux foundation
Linus Torvalds started Linux in 1991
- Linux seen everywhere
- is open source
- installation used to be very difficult, requiring lots of floppy disks; this got better with CD-ROM installation
Some choices that were made for installation included:
- desktop manager (GNOME or KDE), which controls the look and feel of the desktop
- software selection - wide range of options
Basic steps in installation are:
- deciding Linux distribution to use
- downloading the image
- deciding if you want a native or virtual machine installation, or just to use a live (CD/USB) method
"Linux" refers to the OS kernel (the basic program that underlies everything else that communicates c hardware and applications running on a computer
- when people say "Linux System", really mean "Linux-based system"
Is the job of Linux distribution (distro)to assemble and integrate into a complete entity all components stacked above the Linux kernel
- many different distros exist, such as Fedora, Debian and SUSE
First download the image from ubuntu.com, then use a tool like PowerISO to put it on a USB drive for a first-time install; YouTube videos are helpful :)
- you must partition the disk
Linux is constantly being updated
- powers the NYSE, cell phones, supercomputers, etc
Linux Philosophy and Concepts
Linux is an open source OS originally made for x86-based PCs
- Linus Torvalds was a student in Helsinki, Finland when he started designing his own OS kernel in 1991
- IBM and Oracle began Linux support in 1998; Linux powers many internet servers, smartphones, and supercomputers
- largest internet collaboration in history
- patches must ultimately be approved by Linus Torvalds before being merged
- released under a special license (the GPL) that guarantees users the freedom to use, study, modify, and redistribute it
Linux borrows a lot from the UNIX OS because it was written to be a free, open source version of UNIX
- files are organized in a built-in hierarchy, with the top node of the system being the root, or "/"
- it is a multiuser, multitasking system, with service processes known as daemons in UNIX
- wherever possible, Linux makes its components available via files; processes, devices, and network sockets are represented as files and can be manipulated with the same utilities as regular files
Linux community has vendors, developers and active users that communicate via Internet Relay Chat (IRC), newsgroups and mailing lists
Linux terminology
kernel - glue between the hardware and applications (e.g. the Linux kernel)
distribution - collection of software making up a Linux-based OS (e.g. Ubuntu)
- a full Linux distribution consists of the kernel plus a number of other software tools for file-related operations, user management, and software package management
boot loader - program that boots the OS (e.g. GRUB and ISOLINUX)
service - program that runs as a background process (e.g. httpd, nfsd, ntpd, ftpd and named)
filesystem - method for storing and organizing files (e.g. ext3, Btrfs)
X Window system - graphical subsystem on nearly all Linux systems
desktop environment - graphical user interface on top of the OS (e.g. GNOME, KDE, Xfce and Fluxbox)
command line - interface for typing commands for OS to execute
shell - command line interpreter that interprets the command line input and instructs the OS to perform any necessary tasks and commands (e.g. bash, tcsh, and zsh, the "Z shell")
The Linux kernel sits at the center, surrounded by support services, manuals for apps, distribution-specific apps like package managers and configuration utilities, and generic free and proprietary apps
- RHEL 6 uses an older version of the Linux kernel (2.6.32) which is still updated
- tools like the C/C++ compiler and the gdb debugger have been freely distributed by distributions
Commercial organizations generally rely on commercially supported distributions, such as Red Hat, SUSE and Canonical (Ubuntu), which operate on some kind of fee basis
- CentOS is a free alternative to Red Hat Enterprise Linux (RHEL)
- Ubuntu and Fedora are popular in education
- Scientific Linux is favored by scientific researchers for its compatibility with scientific and mathematical software packages
CentOS and Scientific Linux are binary-compatible with RHEL (binary software packages install properly across these distributions)
Linux structure and installation
Linux filesystems - similar to a fridge with lots of shelves; filesystems likewise organize data into human-usable form
- conventional filesystems = ext2, ext3, ext4, Btrfs, NTFS
- flash storage filesystems = ubifs, YAFFS
- database filesystems
- special purpose filesystems = procfs, sysfs, tmpfs, debugfs
Partitions are logical parts of the disk, while filesystems are methods of storing/finding files on a hard disk (usually in a partition)
- can think of filesystems like hierarchy of family tree, where partitions are similar to different families (each has its own tree)
Linux systems store their important files according to a standard layout called the Filesystem Hierarchy Standard, or FHS.
- standardization ensures that users can move between distributions without having to relearn how the system is organized
Linux uses the ‘/’ character to separate paths (unlike Windows, which uses ‘\’), and does not have drive letters. New drives are mounted as directories in the single filesystem, often under /media (so, for example, a CD-ROM disc labeled FEDORA might end up being found at /media/FEDORA, and a file README.txt on that disc would be at /media/FEDORA/README.txt)
Linux filesystem names are case-sensitive, so /boot, /Boot, and /BOOT represent three different directories (or folders). Many distributions distinguish between core utilities needed for proper system operation and other programs, and place the latter in directories under /usr (think "user"). To get a sense for how the other programs are organized, compare the subdirectories under /usr with those that exist directly under the system root directory (/).
- A possible correct full path for a "photo.jpg" file on a Linux system is /data/user1/images/photo.jpg
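You can compare the two directly from a terminal (a quick check; exact contents vary by distribution):
$ ls / - top-level system directories (bin, boot, etc, home, usr, var, ...)
$ ls /usr - where non-core programs and data are organized (bin, lib, local, share, ...)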
The Linux boot process is the procedure for initializing the system. It consists of everything that happens from when the computer power is first switched on until the user interface is fully operational.
Step 1. BIOS
Starting an x86-based Linux system involves a number of steps. When the computer is powered on, the Basic Input/Output System (BIOS) initializes the hardware, including the screen and keyboard, and tests the main memory. This process is also called POST (Power On Self Test).
The BIOS software is stored on a ROM chip on the motherboard. After this, the remainder of the boot process is completely controlled by the operating system.
Master Boot Records (MBR) and Boot Loader
Once the POST is completed, the system control passes from the BIOS to the boot loader. The boot loader is usually stored on one of the hard disks in the system, either in the boot sector (for traditional BIOS/MBR systems) or the EFI partition (for more recent (Unified) Extensible Firmware Interface or EFI/UEFI systems). Up to this stage, the machine does not access any mass storage media. Thereafter, information on the date, time, and the most important peripherals are loaded from the CMOS values (after a technology used for the battery-powered memory store - which allows the system to keep track of the date and time even when it is powered off).
A number of boot loaders exist for Linux; the most common ones are GRUB (for GRand Unified Boot loader) and ISOLINUX (for booting from removable media). Most Linux boot loaders can present a user interface for choosing alternative options for booting Linux, and even other operating systems that might be installed. When booting Linux, the boot loader is responsible for loading the kernel image and the initial RAM disk (which contains some critical files and device drivers needed to start the system) into memory
Boot Loader in Action
The boot loader has two distinct stages:
First Stage:
For systems using the BIOS/MBR method, the boot loader resides at the first sector of the hard disk, also known as the Master Boot Record (MBR). The size of the MBR is just 512 bytes. In this stage, the boot loader examines the partition table and finds a bootable partition. Once it finds a bootable partition, it then searches for the second stage boot loader, e.g., GRUB, and loads it into RAM (Random Access Memory).
For systems using the EFI/UEFI method, UEFI firmware reads its Boot Manager data to determine which UEFI application is to be launched and from where (i.e., from which disk and partition the EFI partition can be found). The firmware then launches the UEFI application, for example, GRUB, as defined in the boot entry in the firmware's boot manager. This procedure is more complicated but more versatile than the older MBR methods.
Second Stage:
The second stage boot loader resides under: /boot. A splash screen is displayed which allows us to choose which Operating System (OS) to boot. After choosing the OS, the boot loader loads the kernel of the selected operating system into RAM and passes control to it.
The boot loader loads the selected kernel image (in the case of Linux) and passes control to it. Kernels are almost always compressed, so its first job is to uncompress itself. After this, it will check and analyze the system hardware and initialize any hardware device drivers built into the kernel
The Linux Kernel
The boot loader loads both the kernel and an initial RAM–based file system (initramfs) into memory so it can be used directly by the kernel.
When the kernel is loaded in RAM, it immediately initializes and configures the computer’s memory and also configures all the hardware attached to the system. This includes all processors, I/O subsystems, storage devices, etc. The kernel also loads some necessary user space applications
Initial RAM Disk
The initramfs filesystem image contains programs and binary files that perform all actions needed to mount the proper root filesystem, like providing kernel functionality for the needed filesystem and device drivers for mass storage controllers with a facility called udev (for User Device) which is responsible for figuring out which devices are present, locating the drivers they need to operate properly, and loading them. After the root filesystem has been found, it is checked for errors and mounted.
The mount program instructs the operating system that a filesystem is ready for use, and associates it with a particular point in the overall hierarchy of the filesystem (the mount point). If this is successful, the initramfs is cleared from RAM and the init program on the root filesystem (/sbin/init) is executed.
init handles the mounting and pivoting over to the final real root filesystem. If special hardware drivers are needed before the mass storage can be accessed, they must be in the initramfs image.
init is the parent of all non-kernel processes in Linux and is responsible for starting system and network services at boot time.
/sbin/init and Services
Once the kernel has set up all its hardware and mounted the root filesystem, the kernel runs the /sbin/init program. This then becomes the initial process, which then starts other processes to get the system running. Most other processes on the system trace their origin ultimately to init; the exceptions are kernel processes, started by the kernel directly for managing internal operating system details.
Traditionally, this process startup was done using conventions that date back to System V UNIX, with the system passing through a sequence of runlevels containing collections of scripts that start and stop services. Each runlevel supports a different mode of running the system. Within each runlevel, individual services can be set to run, or to be shut down if running. Newer distributions are moving away from the System V standard, but usually support the System V conventions for compatibility purposes.
Besides starting the system, init is responsible for keeping the system running and for shutting it down cleanly. It acts as the "manager of last resort" for all non-kernel processes, cleaning up after them when necessary, and restarts user login services as needed when users log in and out.
Text-Mode Login
Near the end of the boot process, init starts a number of text-mode login prompts (done by a program called getty). These enable you to type your username, followed by your password, and to eventually get a command shell.
Usually, the default command shell is bash (the GNU Bourne Again Shell), but there are a number of other advanced command shells available. The shell prints a text prompt, indicating it is ready to accept commands; after the user types the command and presses Enter, the command is executed, and another prompt is displayed after the command is done.
As you'll learn in the chapter 'Command Line Operations', the terminals which run the command shells can be accessed using the ALT key plus a function key. Most distributions start six text terminals and one graphics terminal starting with F1 or F2. If the graphical environment is also started, switching to a text console requires pressing CTRL-ALT + the appropriate function key (with F7 or F1 being the GUI). As you'll see shortly, you may need to run the startx command in order to start or restart your graphical desktop after you have been in pure text mode.
X Window System
Generally, in a Linux desktop system, the X Window System is loaded as the final step in the boot process.
A service called the display manager keeps track of the displays being provided, and loads the X server (so-called because it provides graphical services to applications, sometimes called X clients). The display manager also handles graphical logins, and starts the appropriate desktop environment after a user logs in.
A desktop environment consists of a session manager, which starts and maintains the components of the graphical session, and the window manager, which controls the placement and movement of windows, window title-bars, and controls.
Although these can be mixed, generally a set of utilities, session manager, and window manager are used together as a unit, and together provide a seamless desktop environment.
If the display manager is not started by default in the default runlevel, you can start X a different way, after logging on to a text-mode console, by running startx from the command line
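For example, after logging in at a text-mode console:
$ startx - starts the graphical desktop manually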
Linux Distribution Installation
Your family is planning to buy its first car. What are the factors you need to consider while purchasing a car? Your planning depends a lot on your requirements. For instance, your budget, available finances, size of the car, type of engine, after-sales services, etc.
Similarly, determining which distribution to deploy also requires some planning. There are many choices of distribution, and standard embedded Linux systems are mostly neither Android nor Tizen, but slimmed-down standard distributions.
Some questions worth thinking about before deciding on a distribution include:
What is the main function of the system? (server or desktop)
What types of packages are important to the organization? For example, web server, word processing, etc.
How much hard disk space is available? For example, when installing Linux on an embedded device, there will be space limitations.
How often are packages updated?
How long is the support cycle for each release? For example, LTS releases have long term support.
Do you need kernel customization from the vendor?
What hardware are you running the Linux distribution on? For example, X86, ARM, PPC, etc.
Do you need long-term stability or short-term experimental software?
Linux Installation: Planning
A partition layout needs to be decided at the time of installation because Linux systems handle partitions by mounting them at specific points in the filesystem. You can always modify the design later, but it is always easier to try and get it right to begin with.
Nearly all installers provide a reasonable filesystem layout by default, with either all space dedicated to normal files on one big partition and a smaller swap partition, or with separate partitions for some space-sensitive areas like /home and /var. You may need to override the defaults and do something different if you have special needs, or if you want to use more than one disk.
All installations include the bare minimum software for running a Linux distribution.
Most installers also provide options for adding categories of software. Common applications (such as the Firefox web browser and LibreOffice office suite), developer tools (like the vi and emacs text editors which we will explore later in this course), and other popular services, (such as the Apache web server tools or MySQL database) are usually included. In addition, a desktop environment is installed by default.
All installers secure the system being installed as part of the installation. Usually, this consists of setting the password for the superuser (root) and setting up an initial user. In some cases (such as Ubuntu), only an initial user is set up; direct root login is disabled and root access requires logging in first as a normal user and then using sudo as we will describe later. Some distributions will also install more advanced security frameworks, such as SELinux or AppArmor.
Install Source
Like other operating systems, Linux distributions are provided on optical media such as CDs or DVDs. USB media is also a popular option. Most Linux distributions support booting a small image and downloading the rest of the system over the network; these small images are usable on media or as network boot images, making it possible to install without any local media at all.
Many installers can do an installation completely automatically, using a configuration file to specify installation options. This file is called a Kickstart file for Fedora-based systems, an AutoYAST profile for SUSE-based systems, and a preseed file for the Debian-based systems.
Each distribution provides its own documentation and tools for creating and managing these files.
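As a rough sketch, a minimal Kickstart fragment might look like the following (directives vary by release; the values here are illustrative, not a working profile):
# illustrative values only
lang en_US.UTF-8
keyboard us
timezone America/New_York
rootpw --plaintext changeme
%packages
@core
%end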
Graphical interface
You can use either a Command Line Interface (CLI) or a Graphical User Interface (GUI) when using Linux. To work at the CLI, you have to remember which programs and commands are used to perform tasks, and how to quickly and accurately obtain more information about their use and options. On the other hand, using the GUI is often quick and easy. It allows you to interact with your system through graphical icons and screens. For repetitive tasks the CLI is often more efficient, while the GUI is easier to navigate if you don't remember all the details or do something only rarely.
In this section you will learn how to manage sessions using the GUI for the three Linux distribution families that we explicitly cover in this course: CentOS (Fedora family), openSUSE (SUSE family) and Ubuntu (Debian family). As you'll see shortly, openSUSE uses KDE instead of GNOME as the default desktop manager. However, since in many cases we use just a single distro for illustration, we've used GNOME for the openSUSE visuals throughout this course. If you are using KDE your experience will vary somewhat from what is shown.
GNOME desktop environment
GNOME is a popular desktop environment with an easy to use graphical user interface. It is bundled as the default desktop environment for many distributions including Red Hat Enterprise Linux, Fedora, CentOS, SUSE Linux Enterprise, and Debian. GNOME has menu-based navigation and is sometimes an easy transition for at least some Windows users. However, as you'll see, the look and feel can be quite different across distributions, even if they are all using GNOME.
Another common desktop environment very important in the history of Linux and also widely used is KDE, which is used by default in openSUSE.
Other alternatives for a desktop environment include Unity (from Ubuntu, based on GNOME), Xfce, and LXDE. Most desktop environments follow a similar structure to GNOME.
GUI Startup
When you install a desktop environment, the X display manager starts at the end of the boot process. This X display manager is responsible for starting the graphics system, logging in the user, and starting the user’s desktop environment. You can often select from a choice of desktop environments when logging in to the system.
The default display manager for GNOME is called gdm. Other popular display managers include lightdm (used on Ubuntu) and kdm (associated with KDE).
Lock screen = Ctrl + Alt + L
Shutting down and restarting
Besides normal daily starting and stopping of the computer, a system restart may be required as part of certain major system updates, generally only those involving installing a new Linux kernel.
The init process is responsible for implementing both restarts and shut downs. On systems using System V init, run level 0 is usually used for shutting down, and run level 6 is used to reboot the system.
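For example, on a System V init system (root privileges required; a sketch of the idea):
$ sudo telinit 0 - switch to runlevel 0 (shut down)
$ sudo telinit 6 - switch to runlevel 6 (reboot)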
Suspend
Most modern computers support suspend mode or sleep mode when you stop using your computer for a short while. Suspend mode saves the current system state and allows you to resume your session more quickly while remaining on but using very little power. It works by keeping your system’s applications, desktop, and so on in system RAM, but turning off all of the other hardware. The suspend mode bypasses the time for a full system start-up and continues to use minimal power.
Basic Operations
Even experienced users can forget the precise command that launches an application, or exactly what options and arguments it requires. Fortunately, Linux allows you to quickly open applications using the graphical interface.
- the Applications menu has multiple submenus
Finding Applications
Unlike other operating systems, the initial install of Linux usually comes with a wide range of applications and software archives that contain thousands of programs that enable you to accomplish a wide variety of tasks with your computer. For most key tasks a default application is usually already installed. However, you can always install more applications and try different options.
For example, Firefox is popular as the default browser in many Linux distributions, while Epiphany, Konqueror, and Chromium (the open-source base for Google Chrome) are usually available for install from software repositories. Proprietary web browsers, such as Opera and Chrome are also available.
Default Directories
Every user with an account on the system will have a home directory, usually created under /home and named the same as the username (such as /home/student). By default, files the user saves will be placed in a directory tree starting there. Account creation, whether during system installation or at a later time when a new user is added, also induces default directories to be created under the user's home directory, such as Documents, Desktop, and Downloads.
On the next few screens, you will learn more about the default directories in CentOS, openSUSE, and Ubuntu.
Viewing files
Nautilus (the name of the File Manager or file browser) allows you to view files and directories in several different formats.
To view files in the Icons, List, or Compact formats, click the View drop-down and select your view, or press CTRL-1, CTRL-2, or CTRL-3, respectively.
In addition you can also arrange the files and directories by Name, Size, Type, or Modification Date for further sorting. To do so, click View and select Arrange Items.
Another useful option is to show hidden files (sometimes imprecisely called system files), which are usually configuration files that are hidden by default and whose names start with a dot. To show hidden files, click View and select Show Hidden Files or press CTRL-H.
The file browser provides multiple ways to customize your window view to facilitate easy drag and drop file operations. You can also alter the size of the icons by selecting Zoom In and Zoom Out under the View menu.
Searching for Files
Nautilus includes a great search tool inside the file browser window.
Click Search in the toolbar (to bring up a text box).
Enter the keyword in the text box.
Nautilus will perform a recursive search from the current directory for any file or directory which contains a part of this keyword.
To open Nautilus from the command line, simply type nautilus
To open Nautilus in graphical mode, press ALT-F2 and search for Nautilus. Click the icon that appears.
Note: Both of the above methods will open the graphical interface for the program.
The shortcut key to get to the search text box is CTRL-F. You can exit the search text box view by clicking the Search button again.
Another quick way to access a specific directory is to press CTRL-L, which will give you a Location text box to type in a path to a directory.
Editing
Editing any text file through the graphical interface is easy in the GNOME desktop environment. Simply double-click the file on the desktop or in the Nautilus file browser window to open the file with the default text editor.
The default text editor in GNOME is gedit. It is simple yet powerful, ideal for editing documents, making quick notes, and programming. Although gedit is designed as a general purpose text editor, it offers additional features for spell checking, highlighting, file listings, and statistics.
Deleting stuff
Deleting a file in Nautilus will automatically move the deleted files to the .local/share/Trash/files/ directory (a trash can of sorts) under the user's HOME directory. There are several ways to delete files and directories using Nautilus.
Select all the files and directories that you want to delete, then do one of the following:
- Press Delete (in Unity/KDE) or CTRL-Delete (in GNOME) on your keyboard
- Right-click the file and select Move to Trash
- Highlight the file, then click Edit and Move to Trash from the menu
You can empty the trash to permanently delete the files
System configuration from graphical interface
The X server, which actually provides the GUI, uses the /etc/X11/xorg.conf file as its configuration file if it exists. In modern Linux distributions, this file is usually present only in unusual circumstances, such as when certain less common graphic drivers are in use. Changing this configuration file directly is usually for more advanced users.
Linux always uses Coordinated Universal Time (UTC) for its own internal time-keeping. Displayed or stored time values rely on the system time zone setting to get the proper time. UTC is similar to, but more accurate than, Greenwich Mean Time (GMT).
The Network Time Protocol (NTP) is the most popular and reliable protocol for setting the local time via Internet servers. Most Linux distributions include a working NTP setup which refers to specific time servers run by the distribution. This means that no setup, beyond "on or off", is required for network time synchronization. If desired, more detailed configuration is possible by editing the standard NTP configuration file (/etc/ntp.conf) for Linux NTP utilities.
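A typical /etc/ntp.conf consists mainly of server lines, for example (the server names here are illustrative):
server 0.pool.ntp.org
server 1.pool.ntp.org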
Wired connections usually do not require complicated or manual configuration. The hardware interface and signal presence are automatically detected, and then Network Manager sets the actual network settings via DHCP (Dynamic Host Configuration Protocol).
For static configurations that don't use DHCP, manual setup can also be done easily through Network Manager. You can also change the Ethernet Media Access Control (MAC) address if your hardware supports it. (The MAC address is a unique hexadecimal number of your network card.)
Network Manager
All Linux distributions have network configuration files, but file formats and locations can differ from one distribution to another. Hand editing of these files can handle quite complicated setups, but is not very dynamic or easy to learn and use. The Network Manager utility was developed to make things easier and more uniform across distributions. It can list all available networks (both wired and wireless), allow the choice of a wired, wireless or mobile broadband network, handle passwords, and set up Virtual Private Networks (VPNs). Except for unusual situations, it’s generally best to let the Network Manager establish your connections and keep track of your settings.
Wireless networks are not connected to the machine by default. You can view the list of available wireless networks and see which one you are connected to by using Network Manager. You can then add, edit, or remove known wireless networks, and also specify which ones you want connected by default when present.
You can set up a mobile broadband connection with Network Manager, which will launch a wizard to set up the connection details for each connection.
- It supports many VPN technologies, such as native IPSec, Cisco OpenConnect (via either the Cisco client or a native open-source client), Microsoft PPTP, and OpenVPN.
Installing and Updating Software
Each package in a Linux distribution provides one piece of the system, such as the Linux kernel, the C compiler, the shared software code for interacting with USB devices, or the Firefox web browser.
Packages often depend on each other; for example, because Firefox can communicate using SSL/TLS, it will depend on a package which provides the ability to encrypt and decrypt SSL and TLS communication, and will not install unless that package is also installed at the same time.
One utility handles the low-level details of unpacking a package and putting the pieces in the right places. Most of the time, you will be working with a higher-level utility which knows how to download packages from the Internet and can manage dependencies and groups for you.
The Debian Family System
Let’s look at Package Management in the Debian Family System.
dpkg is the underlying package manager for these systems; it can install, remove, and build packages. Unlike higher-level package management systems, it does not automatically download and install packages and satisfy their dependencies.
For Debian-based systems, the higher-level package management system is the apt (Advanced Package Tool) system of utilities. Generally, while each distribution within the Debian family uses apt, it creates its own user interface on top of it (for example, apt-get, aptitude, synaptic, Ubuntu Software Center, Update Manager, etc). Although apt repositories are generally compatible with each other, the software they contain generally isn’t. Therefore, most apt repositories target a particular distribution (like Ubuntu), and often software distributors ship with multiple repositories to support multiple distributions.
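For example, some common apt commands on a Debian-family system (the package name is hypothetical):
$ sudo apt-get update - refresh the package lists from the repositories
$ sudo apt-get install some-package - install a package and its dependencies
$ sudo apt-get upgrade - upgrade all installed packages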
The Red Hat Package Manager (RPM)
Red Hat Package Manager (RPM) is the other package management system popular on Linux distributions. It was developed by Red Hat and adopted by a number of other distributions, including openSUSE, Mandriva, CentOS, Oracle Linux, and others.
The high-level package manager differs between distributions; most use the basic repository format used in yum (Yellowdog Updater, Modified - the package manager used by Fedora and Red Hat Enterprise Linux), but with enhancements and changes to fit the features they support. Recently, the GNOME project has been developing PackageKit as a unified interface; this is now the default interface for Fedora.
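The day-to-day yum equivalents look like this (again, the package name is hypothetical):
$ sudo yum install some-package - install a package and its dependencies
$ sudo yum update - update all installed packages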
Finding Linux documentation
Important Linux documentation sources include:
- The man pages (short for manual pages)
- GNU Info
- The help command and --help option
- the Internet (e.g. https://www.gentoo.org/doc/en/)
The man Pages
The man pages are the most often-used source of Linux documentation. They provide in-depth documentation about many programs and utilities as well as other topics, including configuration files, system calls, library routines, and the kernel.
Typing man with a topic name as an argument retrieves the information stored in the topic's man pages. Some Linux distributions require every installed program to have a corresponding man page, which explains the depth of coverage. (Note: man is actually an abbreviation for manual.) The man page structure was first introduced in the early UNIX versions of the early 1970s.
The man pages are often converted to:
Web pages
Published books
Graphical help
Other formats
man
The man program searches, formats, and displays the information contained in the man pages. Because many topics have a lot of information, output is piped through a terminal pager program such as less to be viewed one page at a time; at the same time the information is formatted for a good visual display.
When no options are given, by default one sees only the dedicated page specifically about the topic. You can broaden this to view all man pages containing a string in their name by using the -f option. You can also view a list of man pages that discuss a specified subject (even if the specified subject is not present in the name) by using the -k option.
man -f generates the same result as typing whatis.
man -k generates the same result as typing apropos.
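For example:
$ man -f ls - same as: whatis ls
$ man -k copy - same as: apropos copy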
Manual Chapters
The man pages are divided into nine numbered chapters (1 through 9). Sometimes, a letter is appended to the chapter number to identify a specific topic. For example, many pages describing part of the X Window API are in chapter 3X.
The chapter number can be used to force man to display the page from a particular chapter; it is common to have multiple pages across multiple chapters with the same name, especially for names of library functions or system calls.
With the -a parameter, man will display all pages with the given name in all chapters, one after the other.
$ man 3 printf - the chapter number selects the page from that chapter
$ man -a printf - shows the pages with that name from all chapters
GNU Info System
The next source of Linux documentation is the GNU Info System.
This is the GNU project's standard documentation format (info) which it prefers as an alternative to man. The info system is more free-form and supports linked sub-sections.
Functionally, the GNU Info System resembles man in many ways. However, topics are connected using links (even though its design predates the World Wide Web). Information can be viewed through a command line interface or a graphical help utility, and it can also be printed or viewed online.
Command Line Info Browser
Typing info with no arguments in a terminal window displays an index of available topics. You can browse through the topic list using the regular movement keys: arrows, Page Up, and Page Down.
You can view help for a particular topic by typing info <topic name>. The system then searches for the topic in all available info files.
Some useful keys are: q to quit, h for help, and Enter to select a menu item.
Info page structure
The topic which you view in the info page is called a node.
Nodes are similar to sections and subsections in written documentation. You can move between nodes or view each node sequentially. Each node may contain menus and linked subtopics, or items.
Items can be compared to Internet hyperlinks. They are identified by an asterisk (*) at the beginning of the item name. Named items (outside a menu) are identified with double-colons (::) at the end of the item name. Items can refer to other nodes within the file or to other files. The table lists the basic keystrokes for moving between nodes.
n = go to the next node
p = go to the previous node
u = move up one node in the index
Intro to the help Option
The third source of Linux documentation is use of the help option.
Most commands have an available short description which can be viewed using the --help or the -h option along with the command or application. For example, to learn more about the man command, you can run the following command:
$ man --help
The --help option is useful as a quick reference and it displays information faster than the man or info pages.
Some popular commands (such as echo) when run in a bash command shell silently run their own built-in versions of system programs or utilities, because it is more efficient to do so. To view a synopsis of these built-in commands, you can simply type help.
For these built-in commands, help performs the same basic function as the -h and --help arguments perform for stand-alone programs.
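For example, in a bash shell:
$ help - list all of bash's built-in commands
$ help cd - show help for the cd built-in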
Other Documentation Sources
In addition to the man pages, the GNU Info System, and the help command, there are other sources of Linux documentation.
Desktop Help Systems
All Linux desktop systems have a graphical help application. This application is usually displayed as a question-mark icon or an image of a ship’s life-preserver. These programs usually contain custom help for the desktop itself and some of its applications, and will often also include graphically rendered info and man pages.
You can also start the graphical help system from a graphical terminal using the following commands:
GNOME: gnome-help
KDE: khelpcenter
Package Documentation
Linux documentation is also available as part of the package management system. Usually this documentation is directly pulled from the upstream source code, but it can also contain information about how the distribution packaged and set up the software.
Such information is placed under the /usr/share/doc directory in a subdirectory named after the package, perhaps including the version number in the name.
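For example, to browse what package documentation is installed (the subdirectory names will vary by system):
$ ls /usr/share/doc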
- online resources are also available
Exercises
Lab 1: Finding Info Documentation
From the command line, bring up the info page for the grep command. Bring up the usage section.
$ info grep
Lab 2: Finding man pages
From the command line, bring up the man page for the man command. Scroll
down to the EXAMPLES section.
$ man man
Lab 3: Finding man pages by Topic
What man pages are available that document file compression?
man -k compress or apropos compress will bring up a long list of commands, including possibly gzip, bzip2, xz, and a number of file utilities that work with compressed files (zless, zgrep, bzcat, xzdiff, etc.)
Lab 4: Finding man pages by Section
From the command line, bring up the man page for the printf library function.
In which manual page section are library functions found?
$ man 3 printf
Lab 5: Command-Line Help
List the available options for the mkdir command. How can you do this? There is
more than just one way.
$ mkdir --help
or
$ man mkdir
Command Line Operations
Linux system administrators spend a significant amount of their time at a command line prompt. They often automate and troubleshoot tasks in this text environment. There is a saying, "graphical user interfaces make easy tasks easier, while command line interfaces make difficult tasks possible." Linux relies heavily on the abundance of command line tools. The command line interface provides the following advantages:
- No GUI overhead.
- Virtually every task can be accomplished using the command line.
- You can script tasks and series of procedures.
- You can log on remotely to networked machines anywhere on the Internet.
- You can initiate graphical apps directly from the command line.
Using a text terminal on the Graphical Desktop
A terminal emulator program emulates (simulates) a stand alone terminal within a window on the desktop. By this we mean it behaves essentially as if you were logging into the machine at a pure text terminal with no running graphical interface. Most terminal emulator programs support multiple terminal sessions by opening additional tabs or windows.
By default, on GNOME desktop environments, the gnome-terminal application is used to emulate a text-mode terminal in a window. Other available terminal programs include:
xterm
rxvt
konsole
terminator
To open a terminal in CentOS:
On the CentOS desktop, in the upper-left corner, click Applications.
From the System Tools menu, select Terminal.
To open a terminal in openSUSE:
On the openSUSE desktop, in the upper-left corner of the screen, click Activities.
From the left pane, click Show Applications.
Scroll-down and select the required terminal.
To open a terminal in Ubuntu:
In the left panel, click the Ubuntu icon.
Type terminal in the Search box.
If the nautilus-open-terminal package is installed on any of these distributions, you can always open a terminal by right clicking anywhere on the desktop background and selecting Open in Terminal.
The X window system
The customizable nature of Linux allows you to drop (temporarily or permanently) the X Window graphical interface, or to start it up after the system has been running. Certain Linux distributions distinguish versions of the install media between desktop (with X) and server (usually without X); Linux production servers are usually installed without X and even if it is installed, usually do not launch it during system start up. Removing X from a production server can be very helpful in maintaining a lean system which can be easier to support and keep secure.
Virtual Terminals
Virtual Terminals (VT) are console sessions that use the entire display and keyboard outside of a graphical environment. Such terminals are considered "virtual" because although there can be multiple active terminals, only one terminal remains visible at a time. A VT is not quite the same as a command line terminal window; you can have many of those visible at once on a graphical desktop.
One virtual terminal (usually number one or seven) is reserved for the graphical environment, and text logins are enabled on the unused VTs. Ubuntu uses VT 7, but CentOS/RHEL and openSUSE use VT 1 for the graphical display.
An example of a situation where using the VTs is helpful when you run into problems with the graphical desktop. In this situation, you can switch to one of the text VTs and troubleshoot.
To switch between the VTs, press CTRL-ALT-corresponding function key for the VT. For example, press CTRL-ALT-F6 for VT 6. (Actually you only have to press ALT-F6 key combination if you are in a VT not running X and want to switch to another VT.)
The Command Line
Most input lines entered at the shell prompt have three basic elements:
1. Command
2. Options
3. Arguments
The command is the name of the program you are executing. It may be followed by one or more options (or switches) that modify what the command may do. Options usually start with one or two dashes, for example, -p or --print, in order to differentiate them from arguments, which represent what the command operates on.
However, plenty of commands have no options, no arguments, or neither. You can also type other things at the command line besides issuing commands, such as setting environment variables.
Turning off the Graphical Desktop
Linux distributions can start and stop the graphical desktop in various ways. For Debian-based systems, the Desktop Manager runs as a service which can be simply stopped. For RPM-based systems, the Desktop Manager is run directly by init when set to run level 5; switching to a different runlevel stops the desktop.
Use the sudo service gdm stop or sudo service lightdm stop commands to stop the graphical user interface in Debian-based systems. On RPM-based systems, typing sudo telinit 3 may have the same effect of killing the GUI.
sudo
All the demonstrations created have a user configured with sudo capabilities to provide the user with administrative (admin) privileges when required. sudo allows users to run programs using the security privileges of another user, generally root (superuser). The functionality of sudo is similar to that of run as in Windows.
On your own systems, you may need to set up and enable sudo to work correctly. To do this, you need to follow some steps that we won’t explain in much detail now, but you will learn about later in this course. When running on Ubuntu, sudo is already always set up for you during installation. If you are running something in the Fedora or openSUSE families of distributions, you will likely need to set up sudo to work properly for you after initial installation.
e.g. $ shutdown now will shut down the computer if you are logged in with admin privileges
Next, you will learn the steps to setup and run sudo on your system.
Steps for setting up sudo
If your system does not already have sudo set up and enabled, you need to do the following steps:
1. You will need to make modifications as the administrative or super user, root. While sudo will become the preferred method of doing this, we don't have it set up yet, so we will use su (which we will discuss later in detail) instead. At the command line prompt, type su and press Enter. You will then be prompted for the root password, so enter it and press Enter. You will notice that nothing is printed; this is so others cannot see the password on the screen. You should end up with a different looking prompt, often ending with '#'. For example:
$ su
Password:
#
2. Now you need to create a configuration file to enable your user account to use sudo. Typically, this file is created in the /etc/sudoers.d/ directory with the name of the file the same as your username. For example, for this demo, let's say your username is "student". After doing step 1, you would then create the configuration file for "student" by doing this:
# echo "student ALL=(ALL) ALL" > /etc/sudoers.d/student
3. Finally, some Linux distributions will complain if you don't also change permissions on the file by doing:
# chmod 440 /etc/sudoers.d/student
Basic Operations
In this section we will discuss how to accomplish basic operations from the command line. These include how to log in and log out from the system, restart or shutdown the system, locate applications, access directories, identify the absolute and relative paths, and explore the filesystem.
Logging In and Out
An available text terminal will prompt for a username (with the string login:) and password. When typing your password, nothing is displayed on the terminal (not even a * to indicate that you typed in something) to prevent others from seeing your password. After you have logged in to the system, you can perform basic operations.
Once your session is started (either by logging in to a text terminal or via a graphical terminal program) you can also connect and log in to remote systems via the Secure Shell (SSH) utility. For example, by typing ssh username@remote-server.com, SSH would connect securely to the remote machine and give you a command line terminal window, using passwords (as with regular logins) or cryptographic keys (a topic we won't discuss) to prove your identity.
Rebooting and Shutting Down
The preferred method to shut down or reboot the system is to use the shutdown command. This sends a warning message and then prevents further users from logging in. The init process will then control shutting down or rebooting the system. It is important to always shut down properly; failure to do so can result in damage to the system and/or loss of data.
The halt and poweroff commands issue shutdown -h to halt the system; reboot issues shutdown -r and causes the machine to reboot instead of just shutting down. Both rebooting and shutting down from the command line requires superuser (root) access.
When administering a multiuser system, you have the option of notifying all users prior to shutdown as in:
$ sudo shutdown -h 10:00 "Shutting down for scheduled maintenance."
Locating Applications
Depending on the specifics of your particular distribution's policy, programs and software packages can be installed in various directories. In general, executable programs should live in the /bin, /usr/bin, /sbin, or /usr/sbin directories, or under /opt.
One way to locate programs is to employ the which utility. For example, to find out exactly where the diff program resides on the filesystem:
$ which diff
If which does not find the program, whereis is a good alternative because it looks for packages in a broader range of system directories:
$ whereis diff
Accessing directories
When you first log into a system or open a terminal, the default directory should be your home directory; you can print the exact path of this by typing echo $HOME. (Note that some Linux distributions actually open new graphical terminals in $HOME/Desktop.) The following commands are useful for directory navigation.
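A few standard examples:
$ pwd - print the present working directory
$ cd - change to your home directory (with no arguments)
$ cd .. - change to the parent directory
$ cd - - change to the previous directory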
Absolute and Relative Paths
There are two ways to identify paths:
- Absolute pathname: An absolute pathname begins with the root directory and follows the tree, branch by branch, until it reaches the desired directory or file. Absolute paths always start with /.
- Relative pathname: A relative pathname starts from the present working directory. Relative paths never start with /.
Multiple slashes (/) between directories and files are allowed, but all but one slash between elements in the pathname are ignored by the system. ////usr//bin is valid, but is seen as /usr/bin by the system.
Most of the time it is most convenient to use relative paths, which require less typing. Usually you take advantage of the shortcuts provided by: . (present directory), .. (parent directory) and ~ (your home directory).
For example, suppose you are currently working in your home directory and wish to move to the /usr/bin directory. The following two ways will bring you to the same directory from your home directory:
Absolute pathname method: $ cd /usr/bin
Relative pathname method: $ cd ../../usr/bin
In this case, the absolute pathname method is less typing.
The Filesystem
Traversing up and down the filesystem tree can get tedious. The tree command is a good way to get a bird’s-eye view of the filesystem tree. Use tree -d to view just the directories and to suppress listing file names.
Hard and Soft (Symbolic) Links
ln can be used to create hard links and (with the -s option) soft links, also known as symbolic links or symlinks. These two kinds of links are very useful in UNIX-based operating systems. The advantages of symbolic links are discussed on the following screen.
Suppose that file1 already exists. A hard link, called file2, is created with the command:
$ ln file1 file2
Note that two files now appear to exist. However, a closer inspection of the file listing shows that this is not quite true.
$ ls -li file1 file2
The -i option to ls prints out in the first column the inode number, which is a unique quantity for each file object. This field is the same for both of these files; what is really going on here is that it is only one file but it has more than one name associated with it, as is indicated by the 3 that appears in the ls output. Thus, there already was another object linked to file1 before the command was executed.
Symbolic Links
Symbolic (or Soft) links are created with the -s option as in:
$ ln -s file1 file4
$ ls -li file1 file4
Notice file4 no longer appears to be a regular file, and it clearly points to file1 and has a different inode number.
Symbolic links take no extra space on the filesystem (unless their names are very long). They are extremely convenient as they can easily be modified to point to different places. An easy way to create a shortcut from your home directory to long pathnames is to create a symbolic link.
Unlike hard links, soft links can point to objects even on different filesystems (or partitions) which may or may not be currently available or even exist. In the case where the link does not point to a currently available or existing object, you obtain a dangling link.
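For instance, to create a shortcut in your home directory to a long pathname (the path here is made up):
$ ln -s /mnt/storage/projects/2016/reports ~/reports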
Hard links are very useful and they save space, but you have to be careful with their use, sometimes in subtle ways. For one thing if you remove either file1 or file2 in the example on the previous screen, the inode object (and the remaining file name) will remain, which might be undesirable as it may lead to subtle errors later if you recreate a file of that name.
If you edit one of the files, exactly what happens depends on your editor; most editors including vi and gedit will retain the link by default but it is possible that modifying one of the names may break the link and result in the creation of two objects.
Navigating the Directory History
The cd command remembers where you were last, and lets you get back there with cd -. For remembering more than just the last directory visited, use pushd to change the directory instead of cd; this pushes your starting directory onto a list. Using popd will then send you back to those directories, walking in reverse order (the most recent directory will be the first one retrieved with popd). The list of directories is displayed with the dirs command.
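A quick illustration:
$ pushd /tmp - go to /tmp, saving the current directory on the stack
$ pushd /var/log - go to /var/log, saving /tmp on the stack
$ dirs - display the directory stack
$ popd - return to /tmp
$ popd - return to where you started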
Searching For Files
Standard File Streams
When commands are executed, by default there are three standard file streams (or descriptors) always open for use: standard input (standard in or stdin), standard output (standard out or stdout) and standard error (or stderr). Usually, stdin is your keyboard, stdout and stderr are printed on your terminal; often stderr is redirected to an error logging file. stdin is often supplied by directing input to come from a file or from the output of a previous command through a pipe. stdout is also often redirected into a file. Since stderr is where error messages are written, often nothing will go there.
In Linux, all open files are represented internally by what are called file descriptors. Simply put, these are represented by numbers starting at zero. stdin is file descriptor 0, stdout is file descriptor 1, and stderr is file descriptor 2. Typically, if other files are opened in addition to these three, which are opened by default, they will start at file descriptor 3 and increase from there.
I/O Redirection
Through the command shell we can redirect the three standard filestreams so that we can get input from either a file or another command instead of from our keyboard, and we can write output and errors to files or send them as input for subsequent commands.
For example, if we have a program called do_something that reads from stdin and writes to stdout and stderr, we can change its input source by using the less-than sign ( < ) followed by the name of the file to be consumed for input data:
$ do_something < input-file
If you want to send the output to a file, use the greater-than sign (>) as in:
$ do_something > output-file
Because stderr is not the same as stdout, error messages will still be seen on the terminal windows in the above example.
If you want to redirect stderr to a separate file, you use stderr’s file descriptor number (2), the greater-than sign (>), followed by the name of the file you want to hold everything the running command writes to stderr:
$ do_something 2> error-file
A special shorthand notation can be used to put anything written to file descriptor 2 (stderr) in the same place as file descriptor 1 (stdout): 2>&1
$ do_something > all-output-file 2>&1
bash permits an easier syntax for the above:
$ do_something >& all-output-file
Pipes
The UNIX/Linux philosophy is to have many simple and short programs (or commands) cooperate together to produce quite complex results, rather than have one complex program with many possible options and modes of operation. In order to accomplish this, extensive use of pipes is made; you can pipe the output of one command or program into another as its input.
In order to do this we use the vertical-bar, |, (pipe symbol) between commands as in:
$ command1 | command2 | command3
The above represents what we often call a pipeline and allows Linux to combine the actions of several commands into one. This is extraordinarily efficient because command2 and command3 do not have to wait for the previous pipeline commands to complete before they can begin hacking at the data in their input streams; on multiple CPU or core systems the available computing power is much better utilized and things get done quicker. In addition there is no need to save output in (temporary) files between the stages in the pipeline, which saves disk space and reduces reading and writing from disk, which is often the slowest bottleneck in getting something done.
locate
The locate utility program performs a search through a previously constructed database of files and directories on your system, matching all entries that contain a specified character string. This can sometimes result in a very long list.
To get a shorter more relevant list we can use the grep program as a filter; grep will print only the lines that contain one or more specified strings as in:
$ locate zip | grep bin
which will list all files and directories with both "zip" and "bin" in their name. Notice the use of | to pipe the two commands together.
locate utilizes the database created by another program, updatedb. Most Linux systems run this automatically once a day. However, you can update it at any time by just running updatedb from the command line as the root user.
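For example, to refresh the database immediately:
$ sudo updatedb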
Wildcards and Matching File Names
You can search for a filename containing specific characters using wildcards.
To search for files using the ? wildcard, replace each unknown character with ?, e.g. if you know only the first 2 letters are 'ba' of a 3-letter filename with an extension of .out, type ls ba?.out .
To search for files using the * wildcard, replace the unknown string with *, e.g. if you remember only that the extension was .out, type ls *.out
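For instance, assuming a directory that contains just the files bar.out, baz.out and data.out:
$ ls ba?.out
bar.out  baz.out
$ ls *.out
bar.out  baz.out  data.out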
Finding files in a directory
find is an extremely useful and often-used utility program in the daily life of a Linux system administrator. It recurses down the filesystem tree from any particular directory (or set of directories) and locates files that match specified conditions. The default pathname is always the present working directory.
For example, administrators sometimes scan for large core files (which contain diagnostic information after a program fails) that are more than several weeks old in order to remove them. It is also common to remove files in /tmp (and other temporary directories, such as those containing cached files) that have not been accessed recently. Many distros use automated scripts that run periodically to accomplish such house cleaning.
Using find
When no arguments are given, find lists all files in the current directory and all of its subdirectories. Commonly used options to shorten the list include -name (only list files with a certain pattern in their name), -iname (also ignore the case of file names), and -type (which will restrict the results to files of a certain specified type, such as d for directory, l for symbolic link or f for a regular file, etc).
Searching for files and directories named "gcc":
$ find /usr -name gcc
Searching only for directories named "gcc":
$ find /usr -type d -name gcc
Searching only for regular files named "test1":
$ find /usr -type f -name test1
Advanced find options
Another good use of find is being able to run commands on the files that match your search criteria. The -exec option is used for this purpose.
To find and remove all files that end with .swp:
$ find -name "*.swp" -exec rm {} ';'
The {} (squiggly brackets) is a place holder that will be filled with all the file names that result from the find expression, and the preceding command will be run on each one individually.
Note that you have to end the command with either ';' (including the single quotes) or \;. Both forms are fine.
One can also use the -ok option which behaves the same as -exec except that find will prompt you for permission before executing the command. This makes it a good way to test your results before blindly executing any potentially dangerous commands.
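A minimal sketch of -ok in action (the file name and exact prompt text here are illustrative); find asks before each removal and proceeds only if you answer y:
$ find . -name "*.swp" -ok rm {} ';'
< rm ... ./junk.swp > ? y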
Finding Files based on time and size
It is sometimes the case that you wish to find files according to attributes such as when they were created, last used, etc, or based on their size. Both are easy to accomplish.
Finding based on time:
$ find / -ctime 3
Here, -ctime refers to when the inode metadata (i.e., file ownership, permissions, etc.) last changed; this is often, but not necessarily, when the file was first created. You can also search for accessed/last read (-atime) or modified/last written (-mtime) times. The number is the number of days and can be expressed as either a number (n) that means exactly that value, +n which means greater than that number, or -n which means less than that number. There are similar options for times in minutes (as in -cmin, -amin, and -mmin).
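For example, to list files under /var/log that were modified (written) less than 7 days ago:
$ find /var/log -mtime -7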
Finding based on sizes:
$ find / -size 0
Note the size here is in 512-byte blocks, by default; you can also specify bytes (c), kilobytes (k), megabytes (M), gigabytes (G), etc. As with the time numbers above, file sizes can also be exact numbers (n), +n or -n. For details consult the man page for find.
For example, to find files greater than 10 MB in size and run a command on each of them:
$ find / -size +10M -exec command {} ';'
Working with Files
Linux provides many commands that help you in viewing the contents of a file, creating a new file or an empty file, changing the timestamp of a file, and removing and renaming a file or directory. These commands help you in managing your data and files and in ensuring that the correct data is available at the correct location.
touch and mkdir
touch is often used to set or update the access, change, and modify times of files. By default it resets a file's time stamp to match the current time.
However, you can also create an empty file using touch:
$ touch <filename>
This is normally done to create an empty file as a placeholder for a later purpose.
touch provides several options, but here is one of interest:
The -t option allows you to set the date and time stamp of the file.
To set the time stamp to a specific time:
$ touch -t 03201600 myfile
This sets myfile's time stamp to 4 p.m., March 20th (03 20 1600).
mkdir is used to create a directory.
To create a sample directory named sampdir under the current directory, type mkdir sampdir.
To create a sample directory called sampdir under /usr, type mkdir /usr/sampdir.
Removing a directory is simply done with rmdir. The directory must be empty or the command will fail. To remove a directory and all of its contents you have to use rm -rf <dirname>.
Renaming or Removing a Directory
rmdir works only on empty directories; otherwise you get an error.
While typing rm -rf is a fast and easy way to remove a whole filesystem tree recursively, it is extremely dangerous and should be used with the utmost care, especially when used by root (recall that recursive means drilling down through all sub-directories, all the way down a tree).
Modifying the Command Line Prompt
The PS1 variable is the character string that is displayed as the prompt on the command line. Most distributions set PS1 to a known default value, which is suitable in most cases. However, users may want custom information to show on the command line. For example, some system administrators require the user and the host system name to show up on the command line as in:
student@quad32 $
This could prove useful if you are working in multiple roles and want to be always reminded of who you are and what machine you are on. The prompt above could be implemented by setting the PS1 variable to: \u@\h \$
For example:
$ echo $PS1
\$
$ PS1="\u@\h \$ "
coop@quad64 $ echo $PS1
\u@\h \$
coop@quad64 $
Installing Software
Package Management Systems in Linux
The core parts of a Linux distribution and most of its add-on software are installed via the Package Management System. Each package contains the files and other instructions needed to make one software component work on the system. Packages can depend on each other. For example, a package for a Web-based application written in PHP can depend on the PHP package.
There are two broad families of package managers: those based on Debian and those which use RPM as their low-level package manager. The two systems are incompatible, but provide the same features at a broad level.
Package Managers: 2 Levels
Both package management systems provide two tool levels: a low-level tool (such as dpkg or rpm) that takes care of the details of unpacking individual packages, running scripts, and getting the software installed correctly, and a high-level tool (such as apt-get, yum, or zypper) that works with groups of packages, downloads packages from the vendor, and figures out dependencies.
Most of the time users need to work only with the high-level tool, which will take care of calling the low-level tool as needed. Dependency tracking is a particularly important feature of the high-level tool, as it handles the details of finding and installing each dependency for you. Be careful, however, as installing a single package could result in many dozens or even hundreds of dependent packages being installed.
Working with different Package Management Systems
The Advanced Packaging Tool (apt) is the underlying package management system that manages software on Debian-based systems. While it forms the backend for graphical package managers, such as the Ubuntu Software Center and synaptic, its native user interface is at the command line, with programs that include apt-get and apt-cache.
- Yellowdog Updater Modified (yum) is an open-source command-line package-management utility for RPM-compatible Linux systems, basically what we have called the Fedora family. yum has both command line and graphical user interfaces.
- zypper is a package management system for openSUSE that is based on RPM. zypper also allows you to manage repositories from the command line. zypper is fairly straightforward to use and resembles yum quite closely.
Managing package software
To search for the lynx package:
$ sudo apt-cache search lynx (Ubuntu)
$ sudo yum search lynx (CentOS)
$ sudo zypper search lynx (openSUSE)
To install the lynx package:
$ sudo apt-get install lynx (Ubuntu)
$ sudo yum install lynx (CentOS)
$ sudo zypper install lynx (openSUSE)
To check the status of the lynx package:
$ sudo apt-cache policy lynx (Ubuntu)
$ sudo yum info lynx (CentOS)
$ sudo zypper info lynx (openSUSE)
To remove the lynx package:
$ sudo apt-get remove lynx (Ubuntu)
$ sudo yum remove lynx (CentOS)
$ sudo zypper remove lynx (openSUSE)
To get a list of installed packages:
$ sudo dpkg -l (Ubuntu)
$ sudo yum list installed (CentOS)
$ sudo rpm -qa (openSUSE)
To update installed packages:
$ sudo apt-get upgrade (Ubuntu)
$ sudo yum update (CentOS)
$ sudo zypper update (openSUSE)
File Operations
Filesystems
In Linux (and all UNIX-like operating systems) it is often said “Everything is a file”, or at least everything is treated as such. This means whether you are dealing with normal data files and documents, or with devices such as sound cards and printers, you interact with them through the same kind of Input/Output (I/O) operations. This simplifies things: you open a “file” and perform normal operations like reading from and writing to it (which is one reason why text editors, which you will learn about in an upcoming section, are so important).
On many systems (including Linux), the filesystem is structured like a tree. The tree is usually portrayed as inverted, and starts at what is most often called the root directory, which marks the beginning of the hierarchical filesystem and is also sometimes referred to as the trunk, or simply denoted by /. The root directory is not the same as the root user. The hierarchical filesystem also contains other elements in the path (directory names), which are separated by forward slashes (/), as in /usr/bin/awk, where the last element is the actual file name.
Filesystem Hierarchy Standard
The Filesystem Hierarchy Standard (FHS) grew out of historical standards from early versions of UNIX, such as the Berkeley Software Distribution (BSD) and others. The FHS provides Linux developers and system administrators with a standard directory structure for the filesystem, which provides consistency between systems and distributions.
Visit http://www.pathname.com/fhs/ for a list of the main directories and their contents in Linux systems.
Linux supports various filesystem types created for Linux, along with compatible filesystems from other operating systems such as Windows and MacOS. Many older, legacy filesystems, such as FAT, are supported.
Some examples of filesystem types that Linux supports are:
1. ext3, ext4, btrfs, xfs (native Linux filesystems)
2. vfat, ntfs, hfs (filesystems from other operating systems)
Partitions in Linux
Each filesystem resides on a hard disk partition. Partitions help to organize the contents of disks according to the kind of data contained and how it is used. For example, important programs required to run the system are often kept on a separate partition (known as root or /) than the one that contains files owned by regular users of that system (/home). In addition, temporary files created and destroyed during the normal operation of Linux are often located on a separate partition; in this way, using all available space on a particular partition may not fatally affect the normal operation of the system.
Mount Points
Before you can start using a filesystem, you need to mount it to the filesystem tree at a mount point. This is simply a directory (which may or may not be empty) where the filesystem is to be attached (mounted). Sometimes you may need to create the directory if it doesn't already exist.
Warning: If you mount a filesystem on a non-empty directory, the former contents of that directory are covered-up and not accessible until the filesystem is unmounted. Thus mount points are usually empty directories.
The mount command is used to attach a filesystem (which can be local to the computer or, as we shall discuss, on a network) somewhere within the filesystem tree. Arguments include the device node and mount point. For example,
$ mount /dev/sda5 /home
will attach the filesystem contained in the disk partition associated with the /dev/sda5 device node, into the filesystem tree at the /home mount point. (Note that unless the system is otherwise configured only the root user has permission to run mount.) If you want it to be automatically available every time the system starts up, you need to edit the file /etc/fstab accordingly (the name is short for Filesystem Table). Looking at this file will show you the configuration of all pre-configured filesystems. man fstab will display how this file is used and how to configure it.
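For illustration, a corresponding /etc/fstab entry might look like the following; the fields are device, mount point, filesystem type, mount options, and the dump and fsck-order flags (the exact values here are assumptions):
/dev/sda5   /home   ext4   defaults   0   2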
Typing mount without any arguments will show all presently mounted filesystems; the output also shows whether each is mounted read-only or read-write.
The command df -Th (disk-free) will display information about mounted filesystems including usage statistics about currently used and available space.
The Network Filesystem
Using NFS (the Network Filesystem) is one of the methods used for sharing data across physical systems. Many system administrators mount remote users' home directories on a server in order to give them access to the same files and configuration files across multiple client systems. This allows the users to log in to different computers yet still have access to the same files and resources.
NFS on the Server
On the server machine, NFS daemons (built-in networking and service processes in Linux) and other system servers are typically started with the following command: sudo service nfs start
The text file /etc/exports contains the directories and permissions that a host is willing to share with other systems over NFS. An entry in this file may look like the following:
/projects *.example.com(rw)
This entry allows the directory /projects to be mounted using NFS with read and write (rw) permissions and shared with other hosts in the example.com domain. As we will detail in the next chapter, every file in Linux has 3 possible permissions: read (r), write (w) and execute (x).
After modifying the /etc/exports file, you can use the exportfs -av command to notify Linux about the directories you are allowing to be remotely mounted using NFS (restarting NFS with sudo service nfs restart will also work, but is heavier as it halts NFS for a short while before starting it up again).
NFS on the Client
On the client machine, if it is desired to have the remote filesystem mounted automatically upon system boot, the /etc/fstab file is modified to accomplish this. For example, an entry in the client's /etc/fstab file might look like the following:
servername:/projects /mnt/nfs/projects nfs defaults 0 0
You can also mount the remote filesystem without a reboot or as a one-time mount by directly using the mount command:
$ mount servername:/projects /mnt/nfs/projects
Remember, if /etc/fstab is not modified, this remote mount will not be present the next time the system is restarted.
proc Filesystem
Certain filesystems like the one mounted at /proc are called pseudo filesystems because they have no permanent presence anywhere on disk.
The /proc filesystem contains virtual files (files that exist only in memory) that permit viewing constantly varying kernel data. This filesystem contains files and directories that mimic kernel structures and configuration information. It doesn't contain real files but runtime system information (e.g. system memory, devices mounted, hardware configuration, etc). Some important files in /proc are:
/proc/cpuinfo
/proc/interrupts
/proc/meminfo
/proc/mounts
/proc/partitions
/proc/version
/proc has subdirectories as well, including:
/proc/<Process-ID-#>
/proc/sys
The first example shows there is a directory for every process running on the system which contains vital information about it. The second example shows a virtual directory that contains a lot of information about the entire system, in particular its hardware and configuration. The /proc filesystem is very useful because the information it reports is gathered only as needed and never needs storage on disk.
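For example, you can view the running kernel's version string directly; the output is system-specific:
$ cat /proc/version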
Filesystem Architecture
Home Directories overview
Now that you know about the basics of filesystems, let's learn about the filesystem architecture and directory structure in Linux.
Each user has a home directory, usually placed under /home. The /root (slash-root) directory on modern Linux systems is no more than the root user's home directory.
The /home directory is often mounted as a separate filesystem on its own partition, or even exported (shared) remotely on a network through NFS.
Sometimes you may group users based on their department or function. You can then create subdirectories under the /home directory for each of these groups. For example, a school may organize /home with something like the following:
/home/faculty/
/home/staff/
/home/students/
/bin and /sbin Directories
The /bin directory contains executable binaries: essential commands used in single-user mode, and essential commands required by all system users.
To view a list of programs in the /bin directory, type: ls /bin
Commands that are not essential for the system in single-user mode are placed in the /usr/bin directory, while the /sbin directory is used for essential binaries related to system administration, such as ifconfig and shutdown. There is also a /usr/sbin directory for less essential system administration programs.
Sometimes /usr is a separate filesystem that may not be available/mounted in single-user mode. This was why essential commands were separated from non-essential commands. However, in some of the most modern Linux systems this distinction is considered obsolete, and /usr/bin and /bin are actually just linked together as are /usr/sbin and /sbin
/dev Directory
The /dev directory contains device nodes, a type of pseudo-file used by most hardware and software devices, except for network devices. This directory is:
- empty on the disk partition when it is not mounted
- populated with entries created by the udev system, which creates and manages device nodes on Linux, generating them dynamically as devices are found.
The /dev directory contains items such as:
/dev/sda1 (first partition on the first hard disk)
/dev/lp1 (second printer)
/dev/dvd1 (first DVD drive)
/var and /etc Directories
The /var directory contains files that are expected to change in size and content as the system is running (var stands for variable) such as the entries in the following directories:
-System log files: /var/log
- Packages and database files: /var/lib
- Print queues: /var/spool
- Temp files: /var/tmp
The /var directory may be put in its own filesystem so that growth of the files can be accommodated and the file sizes do not fatally affect the system. Network services directories such as /var/ftp (the FTP service) and /var/www (the HTTP web service) are also found under /var.
The /etc directory is the home for system configuration files. It contains no binary programs, although there are some executable scripts. For example, the file resolv.conf tells the system where to go on the network to obtain host name to IP address mappings (DNS). Files like passwd, shadow and group for managing user accounts are found in the /etc directory. System run level scripts are found in subdirectories of /etc. For example, /etc/rc2.d contains links to scripts for entering and leaving run level 2. The rc directory historically stood for Run Commands. Some distros extend the contents of /etc. For example, Red Hat adds the sysconfig subdirectory that contains more configuration files.
/boot Directory
The /boot directory contains the few essential files needed to boot the system. For every alternative kernel installed on the system there are four files:
1. vmlinuz: the compressed Linux kernel, required for booting
2. initramfs: the initial RAM filesystem, required for booting, sometimes called initrd instead of initramfs
3. config: the kernel configuration file, only used for debugging and bookkeeping
4. System.map: kernel symbol table, only used for debugging
Each of these files has a kernel version appended to its name.
The Grand Unified Bootloader (GRUB) files (such as /boot/grub/grub.conf or /boot/grub2/grub.cfg) are also found under the /boot directory.
/lib and /media Directories
/lib contains libraries (common code shared by applications and needed for them to run) for the essential programs in /bin and /sbin. These library filenames either start with ld or lib, for example, /lib/libncurses.so.5.7.
Most of these are what are known as dynamically loaded libraries (also known as shared libraries or Shared Objects (SO)). On some Linux distributions there exists a /lib64 directory containing 64-bit libraries, while /lib contains 32-bit versions.
Kernel modules (kernel code, often device drivers, that can be loaded and unloaded without re-starting the system) are located in /lib/modules/<kernel-version-number>.
The /media directory is typically where removable media, such as CDs, DVDs and USB drives, are mounted. Unless configuration prohibits it, Linux automatically mounts removable media in the /media directory when they are detected.
Comparing Files and File Types
Comparing Files
diff is used to compare files and directories. This often-used utility program has many useful options (see man diff).
To compare two files, at the command prompt, type diff <filename1> <filename2>
Using diff3 and patch
You can compare three files at once using diff3, which uses one file as the reference basis for the other two. For example, suppose you and a co-worker both have made modifications to the same file working at the same time independently. diff3 can show the differences based on the common file you both started with. The syntax for diff3 is as follows:
$ diff3 MY-FILE COMMON-FILE YOUR-FILE
Many modifications to source code and configuration files are distributed utilizing patches, which are applied, not surprisingly, with the patch program. A patch file contains the deltas (changes) required to update an older version of a file to the new one. The patch files are actually produced by running diff with the correct options, as in:
$ diff -Nur originalfile newfile > patchfile
Distributing just the patch is more concise and efficient than distributing the entire file. For example, if only one line needs to change in a file that contains 1,000 lines, the patch file will be just a few lines long.
To apply a patch you can just do either of the two methods below:
$ patch -p1 < patchfile
$ patch originalfile patchfile
The first usage is more common as it is often used to apply changes to an entire directory tree, rather than just one file as in the second example. To understand the use of the -p1 option and many others, see the man page for patch.
Using the 'file' utility
In Linux, a file's extension often does not categorize it the way it might in other operating systems. One cannot assume that a file named file.txt is a text file and not an executable program. In Linux a file name is generally more meaningful to the user of the system than the system itself; in fact, most applications directly examine a file's contents to see what kind of object it is rather than relying on an extension. This is very different from the way Windows handles filenames, where a filename ending with .exe, for example, represents an executable binary file.
The real nature of a file can be ascertained by using the file utility. For the file names given as arguments, it examines the contents and certain characteristics to determine whether the files are plain text, shared libraries, executable programs, scripts, or something else.
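For example (the exact output depends on your system, but typically looks something like this):
$ file /bin/ls /etc/passwd
/bin/ls:     ELF 64-bit LSB executable, x86-64 ...
/etc/passwd: ASCII text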
Backing Up and Compressing Data
Backing Up Data
There are many ways you can back up data or even your entire system. Basic ways to do so include use of simple copying with cp and use of the more robust rsync.
Both can be used to synchronize entire directory trees. However, rsync is more efficient because it checks if the file being copied already exists. If the file exists and there is no change in size or modification time, rsync will avoid an unnecessary copy and save time. Furthermore, because rsync copies only the parts of files that have actually changed, it can be very fast.
cp can only copy files to and from destinations on the local machine (unless you are copying to or from a filesystem mounted using NFS), but rsync can also be used to copy files from one machine to another. Locations are designated in the target:path form where target can be in the form of [user@]host. The user@ part is optional and used if the remote user is different from the local user.
rsync is very efficient when recursively copying one directory tree to another, because only the differences are transmitted over the network. One often synchronizes the destination directory tree with the origin, using the -r option to recursively walk down the directory tree copying all files and directories below the one listed as the source.
rsync
rsync is a very powerful utility. For example, a very useful way to back up a project directory might be to use the following command:
$ rsync -r project-X archive-machine:archives/project-X
Note that rsync can be very destructive! Accidental misuse can do a lot of harm to data and programs by inadvertently copying changes to where they are not wanted. Take care to specify the correct options and paths. It is highly recommended that you first test your rsync command using the --dry-run option to ensure that it provides the results that you want.
To use rsync at the command prompt, type rsync sourcefile destinationfile, where either file can be on the local machine or on a networked machine.
The contents of sourcefile are copied to destinationfile.
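For example, to preview what the earlier backup command would do without actually copying anything (reusing the same hypothetical names):
$ rsync -r --dry-run project-X archive-machine:archives/project-X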
Compressing Data
File data is often compressed to save disk space and reduce the time it takes to transmit files over networks.
Linux uses a number of methods to perform this compression, including gzip, bzip2, xz and zip, each covered below.
These techniques vary in the efficiency of the compression (how much space is saved) and in how long they take to compress; generally the more efficient techniques take longer. Decompression time doesn't vary as much across different methods.
In addition the tar utility is often used to group files in an archive and then compress the whole archive at once.
gzip
gzip is the most frequently used Linux compression utility. It compresses very well and is very fast.
bzip2
bzip2 has syntax similar to gzip, but it uses a different compression algorithm and produces significantly smaller files, at the price of taking longer to do its work. Thus, it is more likely to be used to compress larger files. Examples of common usage are similar to those for gzip.
xz
xz is the most space efficient compression utility used in Linux and is now used by www.kernel.org to store archives of the Linux kernel. Once again it trades a slower compression speed for an even higher compression ratio.
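Typical usage of the three utilities looks like the following; myfile is a placeholder, each line is an independent example (not a sequence), and each command replaces the original file with the compressed version:
$ gzip myfile       # produces myfile.gz
$ bzip2 myfile      # produces myfile.bz2
$ xz myfile         # produces myfile.xz
$ gunzip myfile.gz  # decompress (bunzip2 and unxz work similarly)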
Handling files using zip
The zip program is not often used to compress files in Linux, but it is often required to examine and decompress archives coming from other operating systems, particularly Windows. It is largely a legacy program on Linux.
Archiving and Compressing using tar
Historically, tar stood for "tape archive" and was used to archive files to a magnetic tape. It allows you to create or extract files from an archive file, often called a tarball. At the same time you can optionally compress while creating the archive, and decompress while extracting its contents.
You can separate out the archiving and compression stages, as in:
$ tar cvf mydir.tar mydir ; gzip mydir.tar
$ gunzip mydir.tar.gz ; tar xvf mydir.tar
but this is slower and wastes space by creating an unneeded intermediary .tar file.
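The one-step equivalent uses tar's built-in compression options; a brief sketch using z for gzip (j selects bzip2 and J selects xz):
$ tar zcvf mydir.tar.gz mydir   # archive and compress in one step
$ tar zxvf mydir.tar.gz         # decompress and extract in one step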
Disk-to-Disk Copying
The dd program is very useful for making copies of raw disk space. For example, to back up your Master Boot Record (MBR) (the first 512 byte sector on the disk that contains a table describing the partitions on that disk), you might type:
dd if=/dev/sda of=sda.mbr bs=512 count=1
To use dd to make a copy of one disk onto another, (WARNING!) deleting everything that previously existed on the second disk, type:
dd if=/dev/sda of=/dev/sdb
An exact copy of the first disk device is created on the second disk device.
Do not experiment with this command as written above as it can erase a hard disk!
Exactly what the name dd stands for is an often-argued item. The words data definition is the most popular theory and has roots in early IBM history. Often people joke that it means disk destroyer and other variants such as delete data!
Labs
Lab 1: Mounted Filesystems
To see the filesystems Linux is configured to mount at boot:
cat /etc/fstab
To see the filesystems currently mounted:
mount
A way to show the mounted filesystems besides the mount command is:
cat /proc/mounts
Lab 2: Archive / back up your login directory
To back-up the login directory using tar:
tar -cvf /tmp/backup.tar ~
The same kind of backup using tar with gzip compression (-z) looks like:
tar -zcvf /tmp/backup.tgz ~
The file /tmp/backup.tgz should be smaller than the file /tmp/backup.tar
Lab 3: diff and patching
1. change to /tmp directory
cd /tmp
2. Copy a text file to /tmp (e.g., the file /etc/group):
cp /etc/group /tmp
3. The dd command can copy regular files as well as disk devices. dd can also perform conversions (e.g., conv=ucase converts all characters to upper-case letters). To copy the text file in /tmp to another file in /tmp with upper-case conversion:
dd if=/tmp/group of=/tmp/GROUP conv=ucase
4. To compare 2 files using diff:
diff -au /tmp/group /tmp/GROUP > /tmp/group.diff
5. Use the patch command to patch your original file (/tmp/group) to be the same as the new file (the file after using dd to convert it to all upper-case):
patch /tmp/group /tmp/group.diff
6. Prove that original file is patched to be the same as the converted to all upper-case file, using the diff command between the 2 files:
diff /tmp/group /tmp/GROUP
(nothing should print because there are no differences)
User Environment
Accounts
Identifying the Current User(s)
As you know, Linux is a multiuser operating system; i.e., more than one user can log on at the same time.
- To list the currently logged-on users, type who
- To identify the current user, type whoami
Giving who the -a option will give more detailed information.
Basics of Users and Groups
Linux uses groups for organizing users. Groups are collections of accounts with certain shared permissions. Control of group membership is administered through the /etc/group file, which shows a list of groups and their members. By default, every user belongs to a default or primary group. When a user logs in, the group membership is set for their primary group and all the members enjoy the same level of access and privilege. Permissions on various files and directories can be modified at the group level.
All Linux users are assigned a unique user ID (uid), which is just an integer, as well as one or more group IDs (gid), including a default one which is the same as the user ID.
Historically Fedora-family systems start uid's at 500; other distributions begin at 1000.
These numbers are associated with names through the files /etc/passwd and /etc/group.
For example, the first file might contain:
george:x:1002:1002:George Metesky:/home/george:/bin/bash
and the second might contain: george:x:1002
Groups are used to establish a set of users who have common interests for the purposes of access rights, privileges, and security considerations. Access rights to files (and devices) are granted on the basis of the user and the group they belong to.
Adding and Removing Users
Distributions have straightforward graphical interfaces for creating and removing users and groups and manipulating group membership. However, it is often useful to do it from the command line or from within shell scripts. Only the root user can add and remove users and groups.
Adding a new user is done with useradd and removing an existing user is done with userdel. In the simplest form, an account for the new user turkey would be created with:
$ sudo useradd turkey
Note that for openSUSE, useradd is not in the normal user's PATH so the command should be:
$ sudo /usr/sbin/useradd turkey
which by default sets the home directory to /home/turkey, populates it with some basic files (copied from /etc/skel) and adds a line to /etc/passwd such as:
turkey:x:502:502::/home/turkey:/bin/bash
and sets the default shell to /bin/bash. Removing a user account is as easy as typing userdel turkey. However, this will leave the /home/turkey directory intact. This might be useful if it is a temporary deactivation. To remove the home directory while removing the account one needs to use the -r option to userdel.
Typing id with no argument gives information about the current user, as in:
$ id
uid=500(george) gid=500(george) groups=106(fuse),500(george)
If given the name of another user as an argument, id will report information about that other user.
Adding and Removing Groups
Adding a new group is done with groupadd:
$ sudo /usr/sbin/groupadd anewgroup
The group can be removed with
$ sudo /usr/sbin/groupdel anewgroup
Adding a user to an already existing group is done with usermod. For example, you would first look at what groups the user already belongs to:
$ groups turkey
turkey : turkey
and then add the new group:
$ sudo /usr/sbin/usermod -G anewgroup turkey
$ groups turkey
turkey : turkey anewgroup
These utilities update /etc/group as necessary. groupmod can be used to change group properties such as the Group ID (gid) with the -g option or its name with the -n option.
Removing a user from a group is somewhat trickier. The -G option to usermod must give a complete list of groups. Thus, if you do:
$ sudo /usr/sbin/usermod -G turkey turkey
$ groups turkey
turkey : turkey
only the turkey group will be left.
root Account
The root account is very powerful and has full access to the system. Other operating systems often call this the administrator account; in Linux it is often called the superuser account. You must be extremely cautious before granting full root access to a user; it is rarely if ever justified. External attacks often consist of tricks used to elevate to the root account.
However, you can use the sudo feature to assign more limited privileges to user accounts:
- on only a temporary basis
- only for a specific subset of commands.
su and sudo
When assigning elevated privileges, you can use the command su (switch or substitute user) to launch a new shell running as another user (you must type the password of the user you are becoming). Most often this other user is root, and the new shell allows the use of elevated privileges until it is exited. It is almost always a bad (dangerous for both security and stability) practice to use su to become root. Resulting errors can include deletion of vital files from the system and security breaches.
Granting privileges using sudo is less dangerous and is preferred. By default, sudo must be enabled on a per-user basis. However, some distributions (such as Ubuntu) enable it by default for at least one main user, or give this as an installation option.
Elevating to root Account
To fully become root, one merely types su and then is prompted for the root password.
To execute just one command with root privilege type sudo <command>. When the command is complete you will return to being a normal unprivileged user.
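For example (assuming a normal user named student who has been granted sudo rights):
$ whoami
student
$ sudo whoami
root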
sudo configuration files are stored in the /etc/sudoers file and in the /etc/sudoers.d/ directory. By default, the sudoers.d directory is empty.
Startup Files
In Linux, the command shell program (generally bash) uses one or more startup files to configure the environment. Files in the /etc directory define global settings for all users, while initialization files in the user's home directory can include and/or override the global settings.
The startup files can do anything the user would like to do in every command shell, such as:
Customizing the user's prompt
Defining command-line shortcuts and aliases
Setting the default text editor
Setting the path for where to find executable programs
Order of the Startup Files
When you first login to Linux, /etc/profile is read and evaluated, after which the following files are searched (if they exist) in the listed order:
~/.bash_profile
~/.bash_login
~/.profile
The Linux login shell evaluates whatever startup file that it comes across first and ignores the rest. This means that if it finds ~/.bash_profile, it ignores ~/.bash_login and ~/.profile. Different distributions may use different startup files.
However, every time you create a new shell or terminal window, etc., you do not perform a full system login; only the ~/.bashrc file is read and evaluated. Although this file is not read and evaluated along with the login shell, most distributions and/or users include the ~/.bashrc file from within one of the three user-owned startup files. In the Ubuntu, openSUSE, and CentOS distros, the user must make appropriate changes in the ~/.bash_profile file to include the ~/.bashrc file.
The .bash_profile will have certain extra lines, which in turn will collect the required customization parameters from .bashrc.
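A common convention (details vary by distribution) is for ~/.bash_profile to contain lines such as the following, so that login shells also pick up the ~/.bashrc settings:
# in ~/.bash_profile
if [ -f ~/.bashrc ]; then
    . ~/.bashrc    # read and execute ~/.bashrc if it exists
fi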
Environmental Variables
Environment variables are simply named quantities that have specific values and are understood by the command shell, such as bash. Some of these are pre-set (built-in) by the system, and others are set by the user either at the command line or within startup and other scripts. An environment variable is actually no more than a character string that contains information used by one or more applications.
There are a number of ways to view the values of currently set environment variables; one can type set, env, or export. Depending on the state of your system, set may print out many more lines than the other two methods.
$ set
BASH=/bin/bash
BASHOPTS=checkwinsize:cmdhist:expand_aliases:extglob:extquote:force_fignore
BASH_ALIASES=()
...
$ env
SSH_AGENT_PID=1892
GPG_AGENT_INFO=/run/user/me/keyring-Ilf3vt/gpg:0:1
TERM=xterm
SHELL=/bin/bash
...
$ export
declare -x COLORTERM=gnome-terminal
declare -x COMPIZ_BIN_PATH=/usr/bin/
declare -x COMPIZ_CONFIG_PROFILE=ubuntu
Setting Environmental Variables
By default, variables created within a script are only available to the current shell; child processes (sub-shells) will not have access to values that have been set or modified. Allowing child processes to see the values requires use of the export command.
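A quick sketch of the difference (MYVAR and MYEXPORTED are made-up names); the un-exported variable is invisible to the child shell, while the exported one comes through:
$ MYVAR=10
$ export MYEXPORTED=20
$ bash -c 'echo "$MYVAR / $MYEXPORTED"'
 / 20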
HOME variable
HOME is an environment variable that represents the home (or login) directory of the user. cd without arguments will change the current working directory to the value of HOME. Note the tilde character (~) is often used as an abbreviation for $HOME. Thus cd $HOME and cd ~ are completely equivalent statements.
PATH variable
PATH is an ordered list of directories (the path) which is scanned when a command is given to find the appropriate program or script to run. Each directory in the path is separated by colons (:). A null (empty) directory name (or ./) indicates the current directory at any given time.
:path1:path2
path1::path2
In the example :path1:path2, there is null directory before the first colon (:). Similarly, for path1::path2 there is null directory between path1 and path2.
To prefix a private bin directory to your path:
$ export PATH=$HOME/bin:$PATH
$ echo $PATH
/home/me/bin:/usr/local/bin:/usr/bin:/bin
PS1 variable
Prompt Statement (PS) is used to customize your prompt string in your terminal windows to display the information you want.
PS1 is the primary prompt variable which controls what your command line prompt looks like. The following special characters can be included in PS1 :
\u - User name
\h - Host name
\w - Current working directory
\! - History number of this command
\d - Date
They must be surrounded by single quotes when they are used, as in the following example:
$ echo $PS1
$
$ export PS1='\u@\h:\w$ '
me@example.com:~$ # new prompt
me@example.com:~$
To revert the changes:
me@example.com:~$ export PS1='$ '
$
Even better practice would be to save the old prompt first and then restore, as in:
$ OLD_PS1=$PS1
change the prompt, and eventually change it back with:
$ PS1=$OLD_PS1
$
SHELL variable
The environment variable SHELL points to the user's default command shell (the program that is handling whatever you type in a command window, usually bash) and contains the full pathname to the shell:
$ echo $SHELL
/bin/bash
$
Recalling Previous Commands
bash keeps track of previously entered commands and statements in a history buffer; you can recall previously used commands simply by using the Up and Down cursor keys. To view the list of previously executed commands, you can just type history at the command line.
The list of commands is displayed with the most recent command appearing last in the list. This information is stored in ~/.bash_history.
Using History Environment Variables
Several associated environment variables can be used to get information about the history file.
HISTFILE stores the location of the history file.
HISTFILESIZE stores the maximum number of lines in the history file.
HISTSIZE stores the maximum number of lines in the history file for the current session.
Finding and Using Previous Commands
If you want to recall a command in the history list, but do not want to press the arrow key repeatedly, you can press CTRL-R to do a reverse intelligent search.
As you start typing the search goes back in reverse order to the first command that matches the letters you've typed. By typing more successive letters you make the match more and more specific.
The following is an example of how you can use CTRL-R to search through the command history:
$ ^R # This all happens on 1 line
(reverse-i-search)'s': sleep 1000 # Searched for 's'; matched "sleep"
$ sleep 1000 #Pressed Enter to execute the searched command
$
Executing previous commands
All history substitutions start with !. For example, after the command line $ ls -l /bin /etc /var, typing !$ refers to /var, the last argument of that line.
Here are more examples:
$ history
1 echo $SHELL
2 echo $HOME
3 echo $PS1
4 ls -a
5 ls -l /etc/passwd
6 sleep 1000
7 history
$ !1 # Execute command #1 above
echo $SHELL
/bin/bash
$ !sl # Execute the command beginning with "sl"
sleep 1000
$
Command Aliases
Creating Aliases
You can create customized commands or modify the behavior of already existing ones by creating aliases. Most often these aliases are placed in your ~/.bashrc file so they are available to any command shells you create.
Typing alias with no arguments will list currently defined aliases.
Please note there should not be any spaces on either side of the equal sign and the alias definition needs to be placed within either single or double quotes if it contains any spaces.
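For example, a commonly seen (but here just illustrative) alias; note the lack of spaces around the equal sign and the quotes around the definition:
$ alias ll='ls -lF'
$ ll /tmp      # now runs ls -lF /tmp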
File Permissions
File Ownership
In Linux and other UNIX-based operating systems, every file is associated with a user who is the owner. Every file is also associated with a group (a subset of all users) which has an interest in the file and certain rights, or permissions: read, write, and execute.
File Permission Modes and chmod
Files have three kinds of permissions: read (r), write (w), execute (x). These are generally represented as in rwx. These permissions affect three groups of owners: user/owner (u), group (g), and others (o).
As a result, you have the following three groups of three permissions:
rwx rwx rwx
 u   g   o
There are a number of different ways to use chmod. For instance, to give the owner and others execute permission and remove the group write permission:
$ ls -l test1
-rw-rw-r-- 1 coop coop 1601 Mar 9 15:04 test1
$ chmod uo+x,g-w test1
$ ls -l test1
-rwxr--r-x 1 coop coop 1601 Mar 9 15:04 test1
where u stands for user (owner), o stands for other (world), and g stands for group.
This kind of syntax can be difficult to type and remember, so one often uses a shorthand which lets you set all the permissions in one step. This is done with a simple algorithm, and a single digit suffices to specify all three permission bits for each entity. This digit is the sum of:
4 if read permission is desired.
2 if write permission is desired.
1 if execute permission is desired.
Thus 7 means read/write/execute, 6 means read/write, and 5 means read/execute.
When you apply this to the chmod command, you give three digits, one for each of user, group, and other, as in:
$ chmod 755 test1
$ ls -l test1
-rwxr-xr-x 1 coop coop 1601 Mar 9 15:04 test1
Example of chown
The listing below shows the original ownership and permissions of the files, followed by the change in ownership of file-1 after running chown.
$ ls -l
total 4
-rw-rw-r--. 1 bob bob 0 Mar 16 19:04 file-1
-rw-rw-r--. 1 bob bob 0 Mar 16 19:04 file-2
drwxrwxr-x. 2 bob bob 4096 Mar 16 19:04 temp
$ sudo chown root file-1
[sudo] password for bob:
$ ls -l
total 4
-rw-rw-r--. 1 root bob 0 Mar 16 19:04 file-1
-rw-rw-r--. 1 bob bob 0 Mar 16 19:04 file-2
drwxrwxr-x. 2 bob bob 4096 Mar 16 19:04 temp
Example of changing group ownership: chgrp
Continuing from the listing above, the following shows the change in group ownership of file-2 after running chgrp.
$ sudo chgrp bin file-2
$ ls -l
total 4
-rw-rw-r--. 1 root bob 0 Mar 16 19:04 file-1
-rw-rw-r--. 1 bob bin 0 Mar 16 19:04 file-2
drwxrwxr-x. 2 bob bob 4096 Mar 16 19:04 temp
Text Editors
At some point you will need to manually edit text files. You might be composing an email off-line, writing a script to be used for bash or other command interpreters, altering a system or application configuration file, or developing source code for a programming language such as C or Java.
Linux administrators often sidestep text editors by using graphical utilities for creating and modifying system and application configuration files. However, this can be far more laborious than directly using a text editor. Note that word processing applications (including those that are part of common office suites) are not really basic text editors; they add a lot of extra (usually invisible) formatting information that will probably render system configuration files unusable for their intended purpose. So using text editors really is essential in Linux.
By now you have certainly realized Linux is packed with choices; when it comes to text editors, there are many choices ranging from quite simple to very complex, including:
- nano
- gedit
- vi
- emacs
Creating Files without Using an Editor
Sometimes you may want to create a short file and don't want to bother invoking a full text editor. In addition, doing so can be quite useful when used from within scripts, even when creating longer files. You'll no doubt find yourself using this method when you start on the later chapters that cover bash scripting!
If you want to create a file without using an editor there are two standard ways to create one from the command line and fill it with content.
The first is to use echo repeatedly:
$ echo line one > myfile
$ echo line two >> myfile
$ echo line three >> myfile
Earlier we learned that a single greater-than sign (>) will send the output of a command to a file. Two greater-than signs (>>) will append new output to an existing file.
The second way is to use cat combined with redirection:
$ cat << EOF > myfile
> line one
> line two
> line three
> EOF
$
Both the above techniques produce a file with the following lines in it:
line one
line two
line three
and are extremely useful when employed by scripts.
nano and gedit
There are some text editors that are pretty obvious; they require no particular experience to learn and are actually quite capable if not robust. One particularly easy one to use is the text-terminal based editor nano. Just invoke nano by giving a file name as an argument. All the help you need is displayed at the bottom of the screen, and you should be able to proceed without any problem.
As a graphical editor, gedit is part of the GNOME desktop system (kwrite is associated with KDE). The gedit and kwrite editors are very easy to use and are extremely capable. They are also very configurable. They look a lot like Notepad in Windows. Other variants such as kedit and kate are also supported by KDE.
nano
nano is easy to use, and requires very little effort to learn. To open a file in nano, type nano <filename> and press Enter. If the file doesn't exist, it will be created.
nano provides a two line “shortcut bar” at the bottom of the screen that lists the available commands. Some of these commands are:
CTRL-G: Display the help screen
CTRL-O: Write to a file
CTRL-X: Exit a file
CTRL-R: Insert contents from another file to the current buffer
CTRL-C: Cancels previous commands
gedit
gedit (pronounced 'g-edit') is a simple-to-use graphical editor that can only be run within a Graphical Desktop environment. It is visually quite similar to the Notepad text editor in Windows, but is actually far more capable and very configurable and has a wealth of plugins available to extend its capabilities further.
To open a new file in gedit, find the program in your desktop's menu system, or from the command line type gedit <filename>. If the file doesn't exist it will be created.
Using gedit is pretty straightforward and doesn't require much training. Its interface is composed of quite familiar elements.
vi and emacs
Developers and administrators experienced in working on UNIX-like systems almost always use one of the two venerable editing options: vi and emacs. Both are present or easily available on all distributions and are completely compatible with the versions available on other operating systems.
Both vi and emacs have a basic purely text-based form that can run in a non-graphical environment. They also have one or more X-based graphical forms with extended capabilities; these may be friendlier for a less experienced user. While vi and emacs can have significantly steep learning curves for new users, they are extremely efficient when one has learned how to use them.
You need to be aware that fights among seasoned users over which editor is better can be quite intense and are often described as a holy war.
Usually the actual program installed on your system is vim which stands for vi Improved, and is aliased to the name vi. The name is pronounced as “vee-eye”.
Even if you don’t want to use vi, it is good to gain some familiarity with it: it is a standard tool installed on virtually all Linux distributions. Indeed, there may be times where there is no other editor available on the system.
GNOME extends vi with a very graphical interface known as gvim and KDE offers kvim. Either of these may be easier to use at first.
When using vi, all commands are entered through the keyboard; you don’t need to keep moving your hands to use a pointer device such as a mouse or touchpad, unless you want to do so when using one of the graphical versions of the editor.
vimtutor
Typing vimtutor launches a short but very comprehensive tutorial for those who want to learn their first vi commands. This tutorial is a good place to start learning vi. Even though it provides only an introduction and just seven lessons, it has enough material to make you a very proficient vi user because it covers a large number of commands. After learning these basic ones, you can look up new tricks to incorporate into your list of vi commands because there are always more optimal ways to do things in vi with less typing.
Modes in vi
vi provides three modes, described below. It is vital to not lose track of which mode you are in; many keystrokes and commands behave quite differently in different modes.
- Command mode: the default mode when vi starts; keystrokes are interpreted as commands for moving the cursor, deleting, copying, etc.
- Insert mode: entered by typing i (or a, o, etc.); typed text is inserted into the file; press Esc to return to Command mode
- Line (ex) mode: entered by typing : from Command mode; used for operations such as saving (:w), quitting (:q) and search/replace
Searching for texts in vi
/pattern Search forward for a pattern
?pattern Search backward for pattern
n Move to next occurrence of search pattern
N Move to prev occurrence of search pattern
You can go here for a list of vi commands:
https://d37djvu3ytnwxt.cloudfront.net/assets/courseware/v1/dd0b84d079c38cca37826462d16a904e/asset-v1:LinuxFoundationX+LFS101x+1T2016+type@asset+block/VI_Editor.pdf
Using External Commands
Typing :sh command opens an external command shell. When you exit the shell, you will resume your vi editing session.
Typing :! executes a command from within vi. The command follows the exclamation point. This technique is best suited for non-interactive commands, such as:
:! wc %
Typing this will run the wc (word count) command on the file; the character % represents the file currently being edited.
The fmt command does simple formatting of text. If you are editing a file and want the file to look nice, you can run the file through fmt. One way to do this while editing is by typing :%!fmt, which runs the entire file (the % part) through fmt and replaces the file with the results.
Introduction to emacs
The emacs editor is a popular competitor for vi. Unlike vi, it does not work with modes. emacs is highly customizable and includes a large number of features. It was initially designed for use on a console, but was soon adapted to work with a GUI as well. emacs has many other capabilities other than simple text editing; it can be used for email, debugging, etc.
Rather than having different modes for command and insert, like vi, emacs uses the CTRL and Meta (Alt or Esc) keys for special commands.
Changing cursor position in emacs
arrow keys Use the arrow keys for up, down, left and right
CTRL-n One line down
CTRL-p One line up
CTRL-f One character forward/right
CTRL-b One character back/left
CTRL-a Move to beginning of line
CTRL-e Move to end of line
Meta-f Move to beginning of next word
Meta-b Move back to beginning of preceding word
Meta-< Move to beginning of file
Meta-g-g-n Move to line n (can also use 'Esc-x Goto-line n')
Meta-> Move to end of file
CTRL-v or Page Down Move forward one page
Meta-v or Page Up Move backward one page
CTRL-l Refresh and center screen
CTRL-o Insert a blank line
CTRL-d Delete character at current position
CTRL-k Delete the rest of the current line
CTRL-_ Undo the previous operation
CTRL-SPACE (or CTRL-@) Mark the beginning of the selected region. The end will be at the cursor position
CTRL-w Delete the current marked text and write it to the buffer
CTRL-y Insert at current cursor location whatever was most recently deleted
Complete(r) list of emacs commands
https://d37djvu3ytnwxt.cloudfront.net/assets/courseware/v1/2d9f15c41340c037c6bc9f335108994f/asset-v1:LinuxFoundationX+LFS101x+1T2016+type@asset+block/emacs.pdf
Local Security Principles
The Linux kernel allows properly authenticated users to access files and applications. While each user is identified by a unique integer (the user id or UID), a separate database associates a username with each UID. Upon account creation, new user information is added to the user database and the user's home directory must be created and populated with some essential files. Command line programs such as useradd and userdel as well as GUI tools are used for creating and removing accounts.
For each user, the following seven fields are maintained in the /etc/passwd file: username, the password placeholder (usually x, with the encrypted password kept in /etc/shadow), UID, GID, a comment field (often the user's full name), the home directory, and the login shell.
Types of accounts
By default, Linux distinguishes between several account types in order to isolate processes and workloads. Linux has four types of accounts:
root
System
Normal
Network
For a safe working environment, it is advised to grant the minimum privileges possible and necessary to accounts, and to remove inactive accounts. The last utility, which shows the last time each user logged into the system, can be used to help identify potentially inactive accounts which are candidates for removal.
Keep in mind that practices you use on multi-user business systems are more strict than practices you can use on personal desktop systems that only affect the casual user. This is especially true with security. We hope to show you practices applicable to enterprise servers that you can use on all systems, but understand that you may choose to relax these rules on your own personal system.
Understanding the root account
root is the most privileged account on a Linux/UNIX system. This account has the ability to carry out all facets of system administration, including adding accounts, changing user passwords, examining log files, installing software, etc. Utmost care must be taken when using this account. It has no security restrictions imposed upon it.
When you are signed in as, or acting as root, the shell prompt displays '#' (if you are using bash and you haven’t customized the prompt as we discuss elsewhere in this course). This convention is intended to serve as a warning to you of the absolute power of this account.
Understanding the usage of the root account
Operations that require root privileges
root privileges are required to perform operations such as:
Creating, removing and managing user accounts.
Managing software packages.
Removing or modifying system files.
Restarting system services.
Regular account users of Linux distributions may be allowed to install software packages, update some settings, and apply various kinds of changes to the system. However, root privilege is required for performing administration tasks such as restarting services, manually installing packages and managing parts of the filesystem that are outside the normal user’s directories.
Creating a new user
To create a new user account:
At the command prompt, as root type useradd <username> and press the ENTER key.
To set the initial password, type passwd <username> and press the ENTER key. The New password: prompt is displayed.
Enter the password and press the ENTER key.
To confirm the password, the prompt Retype new password: is displayed.
Enter the password again and press the ENTER key.
The message passwd: all authentication tokens updated successfully. is displayed.
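For example, creating a hypothetical user alice while logged in as root (a sketch; the exact prompts may vary by distribution):
# useradd alice
# passwd alice
New password:
Retype new password:
passwd: all authentication tokens updated successfully.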
Operations that do not require root privileges
A regular account user can perform some operations requiring special permissions; however, the system configuration must allow such abilities to be exercised.
SUID (Set owner User ID upon execution—similar to the Windows "run as" feature) is a special kind of file permission given to a file. SUID provides temporary permissions to a user to run a program with the permissions of the file owner (which may be root) instead of the permissions held by the user.
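For example, the passwd program is typically installed with SUID set, so that ordinary users can change their own passwords even though that requires writing to files owned by root. The s in the owner permissions below marks the SUID bit (exact path, size and date elided, and permissions may vary by distribution):
$ ls -l /usr/bin/passwd
-rwsr-xr-x 1 root root ... /usr/bin/passwd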
Using sudo, the importance of process isolation, limiting hardware access and keeping systems current
Comparing sudo and su
In Linux you can use either su or sudo to temporarily grant root access to a normal user; these methods are actually quite different.
sudo Features
sudo has the ability to keep track of unsuccessful attempts at gaining root access. Users' authorization for using sudo is based on configuration information stored in the /etc/sudoers file and in the /etc/sudoers.d directory.
A message such as the following would appear in a system log file (usually /var/log/secure) when trying to execute sudo bash without successfully authenticating the user:
authentication failure; logname=op uid=0 euid=0 tty=/dev/pts/6 ruser=op rhost= user=op
conversation failed
auth could not identify password for [op]
op : 1 incorrect password attempt ; TTY=pts/6 ; PWD=/var/log ; USER=root ; COMMAND=/bin/bash
the sudoers File
Whenever sudo is invoked, a trigger will look at /etc/sudoers and the files in /etc/sudoers.d to determine if the user has the right to use sudo and what the scope of their privilege is. Unknown user requests and requests to do operations not allowed to the user even with sudo are reported. You can edit the sudoers file by using visudo, which ensures that only one person is editing the file at a time, has the proper permissions, and refuses to write out the file and exit if there is an error in the changes made.
The basic structure of an entry is:
who where = (as_whom) what
The file has a lot of documentation in it about how to customize. Most Linux distributions now prefer you add a file in the directory /etc/sudoers.d with a name the same as the user. This file contains the individual user's sudo configuration, and one should leave the master configuration file untouched except for changes that affect all users.
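For example, a minimal sketch of such a file, granting a hypothetical user student the right to run any command as any user (following the who where = (as_whom) what structure):
# /etc/sudoers.d/student
student ALL=(ALL) ALL
Remember to create or edit such files with visudo (e.g., visudo -f /etc/sudoers.d/student) so that the syntax is checked before the file is written out.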
Command logging
By default, sudo commands and any failures are logged in /var/log/auth.log under the Debian distribution family, and in /var/log/messages or /var/log/secure on other systems. This is an important safeguard to allow for tracking and accountability of sudo use. A typical entry of the message contains:
Calling username
Terminal info
Working directory
User account invoked
Command with arguments
Running a command such as sudo whoami results in a log file entry such as:
Dec 8 14:20:47 server1 sudo: op : TTY=pts/6 PWD=/var/log USER=root COMMAND=/usr/bin/whoami
Process isolation
Linux is considered to be more secure than many other operating systems because processes are naturally isolated from each other. One process normally cannot access the resources of another process, even when that process is running with the same user privileges. Linux thus makes it difficult (though certainly not impossible) for viruses and security exploits to access and attack random resources on a system.
Additional security mechanisms that have been recently introduced in order to make risks even smaller are:
Control Groups (cgroups): Allows system administrators to group processes and associate finite resources to each cgroup.
Linux Containers (LXC): Makes it possible to run multiple isolated Linux systems (containers) on a single system by relying on cgroups.
Virtualization: Hardware is emulated in such a way that not only processes can be isolated, but entire systems are run simultaneously as isolated and insulated guests (virtual machines) on one physical host.
Hardware device access
Linux limits user access to non-networking hardware devices in a manner that is extremely similar to regular file access. Applications interact by engaging the filesystem layer (which is independent of the actual device or hardware the file resides on). This layer then opens a device special file (often called a device node) under the /dev directory that corresponds to the device being accessed. Each device special file has standard owner, group and world permission fields. Security is naturally enforced just as it is when standard files are accessed.
Hard disks, for example, are represented as /dev/sd*. While a root user can read and write to the disk in a raw fashion, for example by doing something like:
$ echo hello world > /dev/sda1
the standard permissions on these device nodes make it impossible for regular users to do so. Writing to a device in this fashion can easily obliterate the filesystem stored on it in a way that cannot be repaired without great effort, if at all. The normal reading and writing of files on the hard disk by applications is done at a higher level through the filesystem, and never through direct access to the device node.
Keeping current
When security problems in either the Linux kernel or applications and libraries are discovered, Linux distributions have a good record of reacting quickly and pushing out fixes to all systems by updating their software repositories and sending notifications to update immediately. The same thing is true with bug fixes and performance improvements that are not security related.
However, it is well known that many systems do not get updated frequently enough and problems which have already been cured are allowed to remain on computers for a long time; this is particularly true with proprietary operating systems where users are either uninformed or distrustful of the vendor's patching policy as sometimes updates can cause new problems and break existing operations. Many of the most successful attack vectors come from exploiting security holes for which fixes are already known but not universally deployed.
So the best practice is to take advantage of your Linux distribution's mechanism for automatic updates and never postpone them. It is extremely rare that such an update will cause new problems.
How Passwords are Stored
The system verifies authenticity and identity using user credentials. Originally, encrypted passwords were stored in the /etc/passwd file, which was readable by everyone. This made it rather easy for passwords to be cracked. On modern systems, passwords are actually stored in an encrypted format in a secondary file named /etc/shadow. Only those with root access can modify/read this file.
Password encryption
Protecting passwords has become a crucial element of security. Most Linux distributions rely on a modern password encryption algorithm called SHA-512 (Secure Hashing Algorithm 512 bits), developed by the U.S. National Security Agency (NSA) to encrypt passwords.
The SHA-512 algorithm is widely used for security applications and protocols. These security applications and protocols include TLS, SSL, PHP, SSH, S/MIME and IPSec. SHA-512 is one of the most tested hashing algorithms.
For example, if you wish to experiment with SHA-512 encoding, the word "test" can be encoded using the program sha512sum to produce its SHA-512 form:
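A minimal sketch (the -n option keeps echo from appending a newline, which would change the hash; the output is a 128-character hexadecimal string):
$ echo -n test | sha512sum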
Good password practices
IT professionals follow several good practices for securing the data and the password of every user.
Password aging is a method to ensure that users get prompts that remind them to create a new password after a specific period. This can ensure that passwords, if cracked, will only be usable for a limited amount of time. This feature is implemented using chage, which configures the password expiry information for a user.
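A hedged sketch of typical chage usage (alice is a hypothetical user):
$ sudo chage -l alice     # list the current password aging settings for alice
$ sudo chage -M 90 alice  # require a new password every 90 days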
Another method is to force users to set strong passwords using Pluggable Authentication Modules (PAM). PAM can be configured to automatically verify that a password created or modified using the passwd utility is sufficiently strong. PAM configuration is implemented using a library called pam_cracklib.so, which can also be replaced by pam_passwdqc.so for more options.
One can also install password cracking programs, such as John The Ripper, to secure the password file and detect weak password entries. It is recommended that written authorization be obtained before installing such tools on any system that you do not own.
Requiring Boot Loader Passwords
You can secure the boot process with a password to prevent someone from bypassing the user authentication step. For systems using the older GRUB version 1 boot loader, you can invoke grub-md5-crypt, which will prompt you for a password and then print its encrypted form.
You then must edit /boot/grub/grub.conf by adding the following line below the timeout entry:
password --md5 $1$Wnvo.1$qz781HRVG4jUnJXmdSCZ30
You can also force passwords for only certain boot choices rather than all.
For the now more common GRUB version 2, things are more complicated, but you also have more flexibility, and can do things like use user-specific passwords, which can be their normal login passwords. Also, you never edit the configuration file, /boot/grub/grub.cfg, directly; rather, you edit system configuration files in /etc/grub.d and then run update-grub. One explanation of this can be found at https://help.ubuntu.com/community/Grub2/Passwords
Hardware vulnerability
When hardware is physically accessible, security can be compromised by:
Key logging: Recording the real-time activity of a computer user, including the keys they press. The captured data can either be stored locally or transmitted to remote machines
Network sniffing: Capturing and viewing the network packet level data on your network
Booting with a live or rescue disk
Remounting and modifying disk content
Your IT security policy should start with requirements on how to properly secure physical access to servers and workstations. Physical access to a system makes it possible for attackers to easily leverage several attack vectors, in a way that makes all operating system level recommendations irrelevant.
The basic security guidelines are:
Lock down workstations and servers
Protect your network links so that they cannot be accessed by people you do not trust
Protect your keyboards where passwords are entered to ensure the keyboards cannot be tampered with
Ensure a password protects the BIOS in such a way that the system cannot be booted with a live or rescue DVD or USB key
For single-user computers and those in a home environment, some of the above features (like preventing booting from removable media) can be excessive, and you can avoid implementing them. However, if sensitive information is on your system and requires careful protection, either it shouldn't be there or it should be better protected by following the above guidelines.
As with all software, hackers occasionally find weaknesses in the Linux ecosystem. The strength of Linux (and the open source community in general) is the speed with which such vulnerabilities are exposed and remediated.
Network operations
Introduction to networking
A network is a group of computers and computing devices connected together through communication channels, such as cables or wireless media. The computers connected over a network may be located in the same geographical area or spread across the world.
A network is used to:
Allow the connected devices to communicate with each other.
Enable multiple users to share devices over the network, such as printers and scanners.
Share and manage information across computers easily.
Most organizations have both an internal network and an Internet connection for users to communicate with machines and people outside the organization. The Internet is the largest network in the world and is often called "the network of networks".
IP Addresses
Devices attached to a network must have at least one unique network address identifier known as the IP (Internet Protocol) address. The address is essential for routing packets of information through the network.
Exchanging information across the network requires using streams of bite-sized packets, each of which contains a piece of the information going from one machine to another. These packets contain data buffers together with headers which contain information about where the packet is going to and coming from, and where it fits in the sequence of packets that constitute the stream. Networking protocols and software are rather complicated due to the diversity of machines and operating systems they must deal with, as well as the fact that even very old standards must be supported.
IPv4 and IPv6
There are two different types of IP addresses available: IPv4 (version 4) and IPv6 (version 6). IPv4 is older and by far the more widely used, while IPv6 is newer and is designed to get past the limitations of the older standard and furnish many more possible addresses.
IPv4 uses 32-bits for addresses; there are only 4.3 billion unique addresses available. Furthermore, many addresses are allotted and reserved but not actually used. IPv4 is becoming inadequate because the number of devices available on the global network has significantly increased over the past years.
IPv6 uses 128-bits for addresses; this allows for 3.4 X 10^38 unique addresses. If you have a larger network of computers and want to add more, you may want to move to IPv6, because it provides more unique addresses. However, it is difficult to move to IPv6 as the two protocols do not inter-operate. Due to this, migrating equipment and addresses to IPv6 requires significant effort and hasn't been as fast as was originally intended.
Decoding IPv4 Addresses
A 32-bit IPv4 address is divided into four 8-bit sections called octets. For example, in the address 192.168.0.1, the octets are 192, 168, 0, and 1.
Network addresses are divided into five classes: A, B, C, D, and E. Classes A, B, and C are classified into two parts: Network addresses (Net ID) and Host address (Host ID). The Net ID is used to identify the network, while the Host ID is used to identify a host in the network. Class D is used for special multicast applications (information is broadcast to multiple computers simultaneously) and Class E is reserved for future use.
Class A Network Addresses
Class A addresses use the first octet of an IP address as their Net ID and use the other three octets as the Host ID. The first bit of the first octet is always set to zero. So you can use only 7-bits for unique network numbers. As a result, there are a maximum of 126 Class A networks available (the addresses 0000000 and 1111111 are reserved). Not surprisingly, this was only feasible when there were very few unique networks with large numbers of hosts. As the use of the Internet expanded, Classes B and C were added in order to accommodate the growing demand for independent networks.
Each Class A network can have up to 16.7 million unique hosts on its network. The range of host address is from 1.0.0.0 to 127.255.255.255.
Note: The value of an octet, or 8-bits, can range from 0 to 255.
Class B Network Addresses
Class B addresses use the first two octets of the IP address as their Net ID and the last two octets as the Host ID. The first two bits of the first octet are always set to binary 10, so there are a maximum of 16,384 (14-bits) Class B networks. The first octet of a Class B address has values from 128 to 191. The introduction of Class B networks expanded the number of networks but it soon became clear that a further level would be needed.
Each Class B network can support a maximum of 65,536 unique hosts on its network. The range of host address is from 128.0.0.0 to 191.255.255.255.
Class C Network Addresses
Class C addresses use the first three octets of the IP address as their Net ID and the last octet as their Host ID. The first three bits of the first octet are set to binary 110, so almost 2.1 million (21-bits) Class C networks are available. The first octet of a Class C address has values from 192 to 223. These are most common for smaller networks which don't have many unique hosts.
Each Class C network can support up to 256 (8-bits) unique hosts. The range of host address is from 192.0.0.0 to 223.255.255.255.
IP Address Allocation
Typically, a range of IP addresses are requested from your Internet Service Provider (ISP) by your organization's network administrator. Often your choice of which class of IP address you are given depends on the size of your network and expected growth needs.
You can assign IP addresses to computers over a network manually or dynamically. When you assign IP addresses manually, you add static (never changing) addresses to the network. When you assign IP addresses dynamically (they can change every time you reboot or even more often), the Dynamic Host Configuration Protocol (DHCP) is used to assign IP addresses.
Manually Allocating an IP Address
Before an IP address can be allocated manually, one must identify the size of the network by determining the host range; this determines which network class (A, B, or C) can be used. The ipcalc program can be used to ascertain the host range.
Note: The version of ipcalc supplied in the Fedora family of distributions does not behave as described below; it is really a different program.
Assume that you have a Class C network. The first three octets of the IP address are 192.168.0. As it uses 3 octets (i.e. 24 bits) for the network mask, the shorthand for this type of address is 192.168.0.0/24. To determine the host range of the address you can use for this new host, at the command prompt, type: ipcalc 192.168.0.0/24 and press Enter.
From the result, you can check the HostMin and HostMax values to manually assign a static address available from 1 to 254 (192.168.0.1 to 192.168.0.254).
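On distributions shipping the ipcalc described here (e.g., the Debian family), the output looks something like the following abbreviated sketch:
$ ipcalc 192.168.0.0/24
Address:   192.168.0.0
Netmask:   255.255.255.0 = 24
Network:   192.168.0.0/24
HostMin:   192.168.0.1
HostMax:   192.168.0.254
Broadcast: 192.168.0.255
Hosts/Net: 254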
Name Resolution
Name Resolution is used to convert between human-readable hostnames and numerical IP addresses. For example, 140.211.169.4 is the numerical IP address that refers to the linuxfoundation.org hostname. Hostnames are easier to remember.
Given an IP address, you can obtain its corresponding hostname. Accessing the machine over the network becomes easier when you can type the hostname instead of the IP address.
You can view your system’s hostname simply by typing hostname with no argument.
Note: If you give an argument, the system will try to change its hostname to match it; however, only root can do that.
The special hostname localhost is associated with the IP address 127.0.0.1, and describes the machine you are currently on (which normally has additional network-related IP addresses).
Network interfaces
Network interfaces are a connection channel between a device and a network. Physically, network interfaces can proceed through a network interface card (NIC) or can be more abstractly implemented as software. You can have multiple network interfaces operating at once. Specific interfaces can be brought up (activated) or brought down (de-activated) at any time.
A list of currently active network interfaces is reported by the ifconfig utility which you may have to run as the superuser, or at least, give the full path, i.e., /sbin/ifconfig, on some distributions.
Network Configuration Files
Network configuration files are essential to ensure that interfaces function correctly.
For Debian family configuration, the basic network configuration file is /etc/network/interfaces. You can type /etc/init.d/networking start to start the networking configuration.
For Fedora family system configuration, the routing and host information is contained in /etc/sysconfig/network. The network interface configuration script is located at /etc/sysconfig/network-scripts/ifcfg-eth0.
For SUSE family system configuration, the routing and host information and network interface configuration scripts are contained in the /etc/sysconfig/network directory.
You can type /etc/init.d/network start to start the networking configuration for Fedora and SUSE families.
Network Configuration Commands
To view the IP address:
$ /sbin/ip addr show
To view the routing information:
$ /sbin/ip route show
ip is a very powerful program that can do many things. Older (and more specific) utilities such as ifconfig and route are often used to accomplish similar tasks. A look at the relevant man pages can tell you much more about these utilities.
ping
ping is used to check whether or not a machine attached to the network can receive and send data; i.e., it confirms that the remote host is online and is responding.
To check the status of the remote host, at the command prompt, type ping <hostname>.
ping is frequently used for network testing and management; however, its usage can increase network load unacceptably. Hence, you can abort the execution of ping by typing CTRL-C, or by using the -c option, which limits the number of packets that ping will send before it quits. When execution stops, a summary is displayed.
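For example, to send exactly three packets and then print a summary:
$ ping -c 3 localhost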
route
A network requires the connection of many nodes. Data moves from source to destination by passing through a series of routers and potentially across multiple networks. Servers maintain routing tables containing the addresses of each node in the network. The IP Routing protocols enable routers to build up a forwarding table that correlates final destinations with the next hop addresses.
route is used to view or change the IP routing table. You may want to change the IP routing table to add, delete or modify specific (static) routes to specific hosts or networks. The table explains some commands that can be used to manage IP routing.
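A hedged sketch of typical usage (the gateway address is hypothetical):
$ route -n                                 # display the routing table in numeric form
$ sudo route add default gw 192.168.0.1    # add a default gateway
$ sudo route del default gw 192.168.0.1    # remove it again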
traceroute
traceroute is used to inspect the route which the data packet takes to reach the destination host, which makes it quite useful for troubleshooting network delays and errors. By using traceroute, you can isolate connectivity issues between hops, which helps resolve them faster.
To print the route taken by the packet to reach the network host, at the command prompt, type traceroute <domain>.
Browsers
Graphical and Non-Graphical Browsers
Browsers are used to retrieve, transmit, and explore information resources, usually on the World Wide Web. Linux users commonly use both graphical and non-graphical browser applications.
The common graphical browsers used in Linux are:
Firefox
Google Chrome
Chromium
Epiphany
Opera
Sometimes you either do not have a graphical environment to work in (or have reasons not to use it) but still need to access web resources. In such a case, you can use non-graphical browsers such as Lynx, Links, and w3m.
wget
Sometimes you need to download files and information but a browser is not the best choice, either because you want to download multiple files and/or directories, or you want to perform the action from a command line or a script. wget is a command line utility that can capably handle the following types of downloads:
Large file downloads
Recursive downloads, where a web page refers to other web pages and all are downloaded at once
Password-required downloads
Multiple file downloads
To download a webpage, you can simply type wget <url>, and then you can read the downloaded page as a local file using a graphical or non-graphical browser.
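A minimal sketch of these modes (the URLs other than linuxfoundation.org are hypothetical):
$ wget http://www.linuxfoundation.org              # download a single page
$ wget -r -l 1 http://www.example.com/docs/        # recursive download, one level deep
$ wget -c http://www.example.com/big-file.iso      # resume (continue) a large interrupted download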
curl
Besides downloading you may want to obtain information about a URL, such as the source code being used. curl can be used from the command line or a script to read such information. curl also allows you to save the contents of a web page to a file as does wget.
You can read a URL using curl <URL>. For example, if you want to read http://www.linuxfoundation.org, type curl http://www.linuxfoundation.org.
To get the contents of a web page and store it to a file, type curl -o saved.html http://www.mysite.com. The contents of the main index file at the website will be saved in saved.html.
FTP (File Transfer Protocol)
When you are connected to a network, you may need to transfer files from one machine to another. File Transfer Protocol (FTP) is a well-known and popular method for transferring files between computers using the Internet. This method is built on a client-server model. FTP can be used within a browser or with standalone client programs.
FTP Clients
FTP clients enable you to transfer files with remote computers using the FTP protocol. These clients can be either graphical or command line tools. Filezilla, for example, allows use of the drag-and-drop approach to transfer files between hosts. All web browsers support FTP; all you have to do is give a URL like ftp://ftp.kernel.org, where the usual http:// becomes ftp://.
Some command line FTP clients are:
ftp
sftp
ncftp
yafc (Yet Another FTP Client)
sftp is a very secure mode of connection, which uses the Secure Shell (ssh) protocol, which we will discuss shortly. sftp encrypts its data and thus sensitive information is transmitted more securely. However, it does not work with so-called anonymous FTP (guest user credentials). Both ncftp and yafc are also powerful FTP clients which work on a wide variety of operating systems including Windows and Linux.
SSH: Executing Commands Remotely
Secure Shell (SSH) is a cryptographic network protocol used for secure data communication. It is also used for remote services and other secure services between two devices on the network and is very useful for administering systems which are not easily available to physically work on but to which you have remote access.
To run my_command on a remote system via SSH, at the command prompt, type ssh <remotesystem> my_command and press Enter. ssh then prompts you for the remote password. You can also configure ssh to securely allow your remote access without typing a password each time.
Copying Files Securely with scp
We can also move files securely using Secure Copy (scp) between two networked hosts. scp uses the SSH protocol for transferring data.
To copy a local file to a remote system, at the command prompt, type scp <localfile> <user@remotesystem>:/home/user/ and press Enter.
You will receive a prompt for the remote password. You can also configure scp so that it does not prompt for a password for each transfer.
Summary
The IP (Internet Protocol) address is a unique logical network address that is assigned to a device on a network.
IPv4 uses 32-bits for addresses and IPv6 uses 128-bits for addresses.
Every IP address contains both a network and a host address field.
There are five classes of network addresses available: A, B, C, D & E.
DNS (Domain Name System) is used for converting Internet domain and host names to IP addresses.
The ifconfig program is used to display current active network interfaces.
The commands ip addr show and ip route show can be used to view IP address and routing information.
You can use ping to check if the remote host is alive and responding.
You can use the route utility program to manage IP routing.
You can monitor and debug network problems using networking tools.
Firefox, Google Chrome, Chromium, and Epiphany are the main graphical browsers used in Linux.
Non-graphical or text browsers used in Linux are Lynx, Links, and w3m.
You can use wget to download webpages.
You can use curl to obtain information about URL's.
FTP (File Transfer Protocol) is used to transfer files over a network.
ftp, sftp, ncftp, and yafc are command line FTP clients used in Linux.
You can use ssh to run commands on remote systems.
Manipulating text
Command line tools
Irrespective of the role you play with Linux (system administrator, developer, or user), you often need to browse through and parse text files and/or extract data from them. These are file manipulation operations. Thus, it is essential for the Linux user to become adept at performing certain operations on files.
Most of the time, such file manipulation is done at the command line, which allows users to perform tasks more efficiently than while using a GUI. Furthermore, the command line is more suitable for automating frequently executed tasks.
Indeed, experienced system administrators write customized scripts to accomplish such repetitive tasks, standardized for each particular environment.
cat
cat is short for concatenate and is one of the most frequently used Linux command line utilities. It is often used to read and print files as well as for simply viewing file contents. To view a file, use the following command:
$ cat <filename>
For example, cat readme.txt will display the contents of readme.txt on the terminal. Often the main purpose of cat, however, is to combine (concatenate) multiple files together. You can perform the actions listed in the following table using cat:
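For example (file names are hypothetical):
$ cat file1 file2              # print both files, one after the other
$ cat file1 file2 > newfile    # combine file1 and file2 into newfile
$ cat file1 >> newfile         # append the contents of file1 to newfile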
The tac command (cat spelled backwards) prints the lines of a file in reverse order. (Each line remains the same but the order of lines is inverted.) The syntax of tac is exactly the same as for cat as in:
$ tac file
$ tac file1 file2 > newfile
Using cat interactively
cat can be used to read from standard input (such as the terminal window) if no files are specified. You can use the > operator to create and add lines into a new file, and the >> operator to append lines (or files) to an existing file.
To create a new file, at the command prompt type cat > <filename> and press the Enter key.
This command creates a new file and waits for the user to edit/enter the text. After you finish typing the required text, press CTRL-D at the beginning of the next line to save and exit the editing.
Another way to create a file at the terminal is cat > <filename> << EOF. A new file is created and you can type the required input. To exit, enter EOF at the beginning of a line.
Note that EOF is case sensitive. (One can also use another word, such as STOP.)
echo
echo simply displays (echoes) text. It is used simply as in:
$ echo string
echo can be used to display a string on standard output (i.e., the terminal) or to place in a new file (using the > operator) or append to an already existing file (using the >> operator).
The -e option, along with the following switches, is used to enable special character sequences, such as the newline character or horizontal tab:
\n represents newline
\t represents horizontal tab
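For example (the file name is hypothetical):
$ echo Hello > newfile                 # create newfile containing one line
$ echo World >> newfile                # append a second line to newfile
$ echo -e "Name\tCity\nBob\tAnytown"   # -e makes \t act as a tab and \n as a newline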
echo is particularly useful for viewing the values of environment variables (built-in shell variables). For example, echo $USERNAME will print the name of the user who has logged into the current terminal.
The table lists echo commands and their usage
Introduction to sed and awk
It is very common to create and then repeatedly edit and/or extract contents from a file. Let’s learn how to use sed and awk to easily perform such operations.
Note that many Linux users and administrators will write scripts using more comprehensive language utilities, such as python and perl, rather than use sed and awk (and some other utilities we'll discuss later). Using such utilities is certainly fine in most circumstances; one should always feel free to use the tools one is experienced with. However, the utilities that are described here are much lighter; i.e., they use fewer system resources, and execute faster. There are times (such as during booting the system) where a lot of time would be wasted using the more complicated tools, and the system may not even be able to run them. So the simpler tools will always be needed.
sed
sed is a powerful text processing tool and is one of the oldest and most popular UNIX utilities. It is used to modify the contents of a file, usually placing the contents into a new file. Its name is an abbreviation for stream editor.
sed can filter text as well as perform substitutions in data streams, working like a churn-mill.
Data from an input source/file (or stream) is taken and moved to a working space. The entire list of operations/modifications is applied over the data in the working space and the final contents are moved to the standard output space (or stream).
The -e command option allows you to specify multiple editing commands simultaneously at the command line.
sed Basic Operations
Now that you know that you can perform multiple editing and filtering operations with sed, let's explain some of them in more detail. The table explains some basic operations, where pattern is the current string and replace_string is the new string.
You must use the -i option (which modifies the original file in place) with care, because the action is not reversible. It is always safer to use sed without the -i option and then replace the file yourself, as shown in the following example:
$ sed s/pattern/replace_string/g file1 > file2
The above command will replace all occurrences of pattern with replace_string in file1 and move the contents to file2. The contents of file2 can be viewed with cat file2. If you approve you can then overwrite the original file with mv file2 file1.
Example: To convert 01/02/… to JAN/FEB/…
sed -e 's/01/JAN/' -e 's/02/FEB/' -e 's/03/MAR/' -e 's/04/APR/' -e 's/05/MAY/' \
-e 's/06/JUN/' -e 's/07/JUL/' -e 's/08/AUG/' -e 's/09/SEP/' -e 's/10/OCT/' \
-e 's/11/NOV/' -e 's/12/DEC/'
awk
awk is used to extract and then print specific contents of a file and is often used to construct reports. It was created at Bell Labs in the 1970s and derived its name from the last names of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan.
awk has the following features:
- powerful utility and interpreted programming language.
- manipulates data files, retrieving, and processing text.
- works well with fields (containing a single piece of data, essentially a column) and records (a collection of fields, essentially a line in a file).
awk is invoked as shown in the following table
As with sed, short awk commands can be specified directly at the command line, but a more complex script can be saved in a file that you can specify using the -f option
awk Basic Operations
The table explains the basic tasks that can be performed using awk. The input file is read one line at a time, and for each line, awk matches the given pattern in the given order and performs the requested action. The -F option allows you to specify a particular field separator character. For example, the /etc/passwd file uses : to separate the fields, so the -F: option is used with the /etc/passwd file.
The command/action in awk needs to be surrounded by single quotes ('). awk can be used as in the table.
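A few hedged sketches of typical one-liners (the plain file names are hypothetical):
$ awk '{print $0}' somefile               # print every line of the file
$ awk '{print $1}' somefile               # print the first field of every line
$ awk -F: '{print $1, $7}' /etc/passwd    # print the username and shell; fields are separated by :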
File Manipulation Utilities
In managing your files you may need to perform many tasks, such as sorting data and copying data from one location to another. Linux provides several file manipulation utilities that you can use while working with text files. In this section, you will learn about the following file manipulation programs:
sort
uniq
paste
join
split
You will also learn about regular expressions and search patterns.
sort
sort is used to rearrange the lines of a text file either in ascending or descending order, according to a sort key. You can also sort by particular fields of a file. The default sort key is the order of the ASCII characters (i.e., essentially alphabetically).
When used with the -u option, sort checks for unique values after sorting the records (lines). It is equivalent to running uniq (which we shall discuss) on the output of sort.
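For example (the file name is hypothetical):
$ sort somefile        # sort lines in ascending order
$ sort -r somefile     # sort lines in descending (reverse) order
$ sort -u somefile     # sort and suppress duplicate lines in one step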
uniq
uniq is used to remove duplicate lines in a text file and is useful for simplifying text display. uniq requires that the duplicate entries to be removed are consecutive. Therefore one often runs sort first and then pipes the output into uniq; if sort is passed the -u option it can do all this in one step. In the example shown, the file is called names and was originally Ted, Bob, Alice, Bob, Carol, Alice.
To remove duplicate entries from some files, use the following command: sort file1 file2 | uniq > file3
OR
sort -u file1 file2 > file3
To count the number of duplicate entries, use the following command: uniq -c filename
paste
Suppose you have a file that contains the full name of all employees and another file that lists their phone numbers and Employee IDs. You want to create a new file that contains all the data listed in three columns: name, employee ID, and phone number. How can you do this effectively without investing too much time?
paste can be used to create a single file containing all three columns. The different columns are identified based on delimiters (spacing used to separate two fields). For example, delimiters can be a blank space, a tab, or an Enter. In the image provided, a single space is used as the delimiter in all files.
paste accepts the following options:
-d delimiters, which specify a list of delimiters to be used instead of tabs for separating consecutive values on a single line. Each delimiter is used in turn; when the list has been exhausted, paste begins again at the first delimiter.
-s, which causes paste to append the data in series rather than in parallel; that is, in a horizontal rather than vertical fashion.
Using paste
paste can be used to combine fields (such as name or phone number) from different files as well as combine lines from multiple files. For example, line one from file1 can be combined with line one of file2, line two from file1 can be combined with line two of file2, and so on.
To paste contents from two files one can do:
$ paste file1 file2
The syntax to use a different delimiter is as follows:
$ paste -d, file1 file2
Common delimiters are 'space', 'tab', '|', 'comma', etc.
join
Suppose you have two files with some similar columns. You have saved employees’ phone numbers in two files, one with their first name and the other with their last name. You want to combine the files without repeating the data of common columns. How do you achieve this?
The above task can be achieved using join, which is essentially an enhanced version of paste. It first checks whether the files share common fields, such as names or phone numbers, and then joins the lines in two files based on a common field.
Using join
To combine two files on a common field, at the command prompt type join file1 file2 and press the Enter key.
For example, the common field (i.e., it contains the same values) among the phonebook and directory files is the phone number, as shown by the output of the following cat commands:
$ cat phonebook
555-123-4567 Bob
555-231-3325 Carol
555-340-5678 Ted
555-289-6193 Alice
$ cat directory
555-123-4567 Anytown
555-231-3325 Mytown
555-340-5678 Yourtown
555-289-6193 Youngstown
The result of joining these two files is shown in the output of the following command:
$ join phonebook directory
555-123-4567 Bob Anytown
555-231-3325 Carol Mytown
555-340-5678 Ted Yourtown
555-289-6193 Alice Youngstown
split
split is used to break up (or split) a file into equal-sized segments for easier viewing and manipulation, and is generally used only on relatively large files. By default, split breaks up a file into 1,000-line segments. The original file remains unchanged, and a set of new files with the same name plus an added prefix is created. By default, the x prefix is added. To split a file into segments, use the command split infile.
To split a file into segments using a different prefix, use the command split infile <Prefix>.
Using split
To demonstrate the use of split, we'll apply it to an american-english dictionary file of over 99,000 lines:
$ wc -l american-english
99171 american-english
where we have used the wc program (soon to be discussed) to report on the number of lines in the file. Then typing:
$ split american-english dictionary
will split the american-english file into equal-sized segments whose names begin with dictionary:
$ ls -l dictionary*
-rw-rw-r-- 1 me me 8653 Mar 23 20:19 dictionaryaa
-rw-rw-r-- 1 me me 8552 Mar 23 20:19 dictionaryab
. . .
Regular Expressions and Search Patterns
Regular expressions are text strings used for matching a specific pattern, or to search for a specific location, such as the start or end of a line or a word. Regular expressions can contain both normal characters and so-called metacharacters, such as * and $.
Many text editors and utilities such as vi, sed, awk, find and grep work extensively with regular expressions. Some of the popular computer languages that use regular expressions include Perl, Python and Ruby. It can get rather complicated and there are whole books written about regular expressions; we'll only skim the surface here.
These regular expressions are different from the wildcards (or "metacharacters") used in filename matching in command shells such as bash (which were covered in the earlier Chapter on Command Line Operations). The table lists search patterns and their usage.
Using Regular Expressions and Search Patterns
For example, consider the following sentence:
the quick brown fox jumped over the lazy dog
Some of the patterns that can be applied to this sentence are listed in the table; a few are demonstrated with grep below.
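A hedged sketch using grep (discussed next) against this sentence, saved in a hypothetical file named sentence:
$ echo "the quick brown fox jumped over the lazy dog" > sentence
$ grep "^the" sentence     # ^ anchors the match to the start of the line
$ grep "dog$" sentence     # $ anchors the match to the end of the line
$ grep "qu.ck" sentence    # . matches any single character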
grep
grep is extensively used as a primary text searching tool. It scans files for specified patterns and can be used with regular expressions as well as simple strings as shown in the table.
tr
In this section, you will learn about some additional text utilities that you can use for performing various actions on your Linux files, such as changing the case of letters or determining the count of words, lines, and characters in a file.
The tr utility is used to translate specified characters into other characters or to delete them. The general syntax is as follows:
$ tr [options] set1 [set2]
The items in the square brackets are optional. tr requires at least one argument and accepts a maximum of two. The first, designated set1 in the example, lists the characters in the text to be replaced or removed. The second, set2, lists the characters that are to be substituted for the characters listed in the first argument. Sometimes these sets need to be surrounded by single quotes (') so that the shell does not interpret characters that are special to it. It is usually safe (and may be required) to use the single quotes around each of the sets, as you will see in the examples below.
For example, suppose you have a file named city containing several lines of text in mixed case. To translate all lower case characters to upper case, at the command prompt type cat city | tr a-z A-Z and press the Enter key.
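A couple of further sketches:
$ echo Mytown | tr a-z A-Z       # prints MYTOWN
$ echo "a, b, c" | tr -d ','     # -d deletes the listed characters, printing: a b c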
tee
tee takes the output from any command, and while sending it to standard output, it also saves it to a file. In other words, it "tees" the output stream from the command: one stream is displayed on the standard output and the other is saved to a file.
For example, to list the contents of a directory on the screen and save the output to a file, at the command prompt type ls -l | tee newfile and press the Enter key.
Typing cat newfile will then display the output of ls -l.
wc
wc (word count) counts the number of lines, words, and characters in a file or list of files. Its most common options are -l (count lines), -w (count words), and -c (count characters/bytes).
By default, all three of these counts are displayed.
For example, to print the number of lines contained in a file, at the command prompt type wc -l filename and press the Enter key.
cut
cut is used for manipulating column-based files and is designed to extract specific columns. The default column separator is the tab character. A different delimiter can be given as a command option.
For example, to display the third column delimited by a blank space, at the command prompt type ls -l | cut -d" " -f3 and press the Enter key.
Working with Large Files
System administrators need to work with configuration files, text files, documentation files, and log files. Some of these files may be large or become quite large as they accumulate data with time. These files will require both viewing and administrative updating. In this section, you will learn how to manage such large files.
For example, a banking system might maintain one simple large log file to record details of all of one day's ATM transactions. Due to a security attack or a malfunction, the administrator might be forced to check for some data by navigating within the file. In such cases, directly opening the file in an editor will cause issues, due to high memory utilization, as an editor will usually try to read the whole file into memory first. However, one can use less to view the contents of such a large file, scrolling up and down page by page without the system having to place the entire file in memory before starting. This is much faster than using a text editor.
Viewing the file can be done by typing either of the two following commands:
$ less <filename>
$ cat <filename> | less
By default, manual (i.e., the man command) pages are sent through the less command.
head
head reads the first few lines of each named file (10 by default) and displays it on standard output. You can give a different number of lines in an option.
For example, If you want to print the first 5 lines from atmtrans.txt, use the following command:
$ head -n 5 atmtrans.txt
(You can also just say head -5 atmtrans.txt.)
tail
tail prints the last few lines of each named file and displays it on standard output. By default, it displays the last 10 lines. You can give a different number of lines as an option. tail is especially useful when you are troubleshooting any issue using log files as you probably want to see the most recent lines of output.
For example, to display the last 15 lines of atmtrans.txt, use the following command:
$ tail -n 15 atmtrans.txt
(You can also just say tail -15 atmtrans.txt.) To continually monitor new output in a growing log file:
$ tail -f atmtrans.txt
This command will continuously display any new lines of output in atmtrans.txt as soon as they appear. Thus it enables you to monitor any current activity that is being reported and recorded.
strings
strings is used to extract all printable character strings found in the file or files given as arguments. It is useful in locating human-readable content embedded in binary files; for text files, one can just use grep.
For example, to search for the string my_string in a spreadsheet:
$ strings book1.xls | grep my_string
The z Command Family
When working with compressed files, many standard commands cannot be used directly. For many commonly-used file and text manipulation programs, there is also a version especially designed to work directly with compressed files. These associated utilities have the letter z prefixed to their name. For example, we have utility programs such as zcat, zless, zdiff, and zgrep.
Note that if you run zless on an uncompressed file, it will still work and ignore the decompression stage. There are also equivalent utility programs for other compression methods besides gzip; i.e., we have bzcat and bzless associated with bzip2, and xzcat and xzless associated with xz.
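For example (the file name is hypothetical):
$ zcat logfile.gz | head      # view the start of a compressed file without decompressing it on disk
$ zless logfile.gz            # page through the compressed file
$ zgrep error logfile.gz      # search the compressed file for the string 'error'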
Lab 1: Searching with grep
Searching for username
$ grep your-username /etc/passwd
Lab 2: Parsing files with awk
The field in /etc/passwd that holds the shell is #7. To display the field holding the shell in /etc/passwd using awk and produce a unique list:
$ awk -F: '{print $7}' /etc/passwd | sort -u
or
$ awk -F: '{print $7}' /etc/passwd | sort | uniq
Lab 3: Searching and substituting with sed
To search and replace using sed:
$ sed 's/false$/bash/' /etc/passwd
Summary
The command line often allows users to perform tasks more efficiently than the GUI.
cat, short for concatenate, is used to read, print and combine files.
echo displays a line of text on standard output, or can place it in a file.
sed is a popular stream editor often used to filter and perform substitutions on files and text data streams.
awk is an interpreted programming language typically used as a data extraction and reporting tool.
sort is used to sort text files and output streams in either ascending or descending order.
uniq eliminates duplicate entries in a text file.
paste combines fields from different files and can also extract and combine lines from multiple sources.
join combines lines from two files based on a common field. It works only if files share a common field.
split breaks up a large file into equal-sized segments.
Regular expressions are text strings used for pattern matching. The pattern can be used to search for a specific location, such as the start or end of a line or a word.
grep searches text files and data streams for patterns and can be used with regular expressions.
tr translates characters, copies standard input to standard output, and handles special characters.
tee saves a copy of standard output to a file while still displaying it at the terminal.
wc (word count) displays the number of lines, words and characters in a file or group of files.
cut extracts columns from a file.
less views files a page at a time and allows scrolling in both directions.
head displays the first few lines of a file or data stream on standard output. By default it displays 10 lines.
tail displays the last few lines of a file or data stream on standard output. By default it displays 10 lines.
strings extracts printable character strings from binary files.
The z command family is used to read and work with compressed files.
Printing
To manage printers and print directly from a computer or across a networked environment, you need to know how to configure and install a printer. Printing itself requires software that converts information from the application you are using to a language your printer can understand. The Linux standard for printing software is the Common UNIX Printing System (CUPS).
CUPS Overview
CUPS is the software that is used behind the scenes to print from applications like a web browser or LibreOffice. It converts page descriptions produced by your application (put a paragraph here, draw a line there, and so forth) and then sends the information to the printer. It acts as a print server for local as well as network printers.
Printers manufactured by different companies may use their own particular print languages and formats. CUPS uses a modular printing system which accommodates a wide variety of printers and also processes various data formats. This makes the printing process simpler; you can concentrate more on printing and less on how to print.
Generally, the only time you should need to configure your printer is when you use it for the first time. In fact, CUPS often figures things out on its own by detecting and configuring any printers it locates.
How Does CUPS Work?
CUPS carries out the printing process with the help of its various components:
Configuration Files
Scheduler
Job Files
Log Files
Filter
Printer Drivers
Backend
Scheduler
CUPS is designed around a print scheduler that manages print jobs, handles administrative commands, allows users to query the printer status, and manages the flow of data through all CUPS components.
As you'll see shortly, CUPS has a browser-based interface which allows you to view and manipulate the order and status of pending print jobs.
Configuration Files
The print scheduler reads server settings from several configuration files, the two most important of which are cupsd.conf and printers.conf. These and all other CUPS related configuration files are stored under the /etc/cups/ directory.
cupsd.conf is where most system-wide settings are located; it does not contain any printer-specific details. Most of the settings available in this file relate to network security, i.e. which systems can access CUPS network capabilities, how printers are advertised on the local network, what management features are offered, and so on.
printers.conf is where you will find the printer-specific settings. For every printer connected to the system, a corresponding section describes the printer’s status and capabilities. This file is generated only after adding a printer to the system and should not be modified by hand.
You can view the full list of configuration files by typing: ls -l /etc/cups/.
Job Files
CUPS stores print requests as files under the /var/spool/cups directory (these can actually be accessed before a document is sent to a printer). Data files are prefixed with the letter d while control files are prefixed with the letter c. After a printer successfully handles a job, data files are automatically removed. These data files belong to what is commonly known as the print queue.
Log Files
Log files are placed in /var/log/cups and are used by the scheduler to record activities that have taken place. These files include access, error, and page records.
To view what log files exist, type:
sudo ls -l /var/log/cups
(Note on some distributions permissions are set such that you don't need the sudo.) You can view the log files with the usual tools.
Filters, Printer Drivers, and Backends
CUPS uses filters to convert job file formats to printable formats. Printer drivers contain descriptions for currently connected and configured printers, and are usually stored under /etc/cups/ppd/. The print data is then sent to the printer through a filter and via a backend that helps to locate devices connected to the system.
In short, when you execute a print command, the scheduler validates the command and processes the print job, creating job files according to the settings specified in the configuration files. Simultaneously, the scheduler records activities in the log files. Job files are processed with the help of the filter, printer driver, and backend, and then sent to the printer.
Installing CUPS
Because printing is a relatively important and fundamental feature of any Linux distribution, most Linux systems come with CUPS preinstalled. In some cases, especially for Linux server setups, CUPS may have been left uninstalled. This may be fixed by installing the corresponding package manually. To install CUPS, please ensure that your system is connected to the Internet.
You can use the commands shown below to manually install CUPS:
- CentOS: $ sudo yum install cups
- OpenSUSE: $ sudo zypper install cups
- Ubuntu: $ sudo apt-get install cups
The procedure is essentially the same for all three distribution families once the correct install command is used.
Note: CUPS features are also supported by other packages, such as cups-common and libcups2, which contain the core CUPS libraries. The above install command will make sure any needed packages are also installed.
Managing CUPS
After installing CUPS, you'll need to start and manage the CUPS daemon so that CUPS is ready for configuring a printer. Managing the CUPS daemon is simple; all management features are wrapped around the cups init script, which can be easily started, stopped, and restarted.
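A hedged sketch of managing the daemon (the exact form depends on whether your distribution uses systemd or traditional init scripts):
$ sudo systemctl start cups      # systemd-based systems
$ sudo systemctl status cups
$ sudo service cups restart      # init-script based systems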
Configuring a Printer from the GUI
Each Linux distribution has a GUI application that lets you add, remove, and configure local or remote printers. Using this application, you can easily set up the system to use both local and network printers. The following screens show how to find and use the appropriate application in each of the distribution families covered in this course.
When configuring a printer, make sure the device is currently turned on and connected to the system; if so it should show up in the printer selection menu. If the printer is not visible, you may want to troubleshoot using tools that will determine if the printer is connected. For common USB printers, for example, the lsusb utility will show a line for the printer. Some printer manufacturers also require some extra software to be installed in order to make the printer visible to CUPS, however, due to the standardization these days, this is rarely required.
Adding Printers from the CUPS Web Interface
A fact that few people know is that CUPS also comes with its own web server, which makes a configuration interface available via a set of CGI scripts.
This web interface allows you to:
Add and remove local/remote printers
Configure printers:
– Local/remote printers
– Share a printer as a CUPS server
Control print jobs:
– Monitor jobs
– Show completed or pending jobs
– Cancel or move jobs
The CUPS web interface is available on your browser at: http://localhost:631
Some pages require a username and password to perform certain actions, for example to add a printer. For most Linux distributions, you must use the root password to add, modify, or delete printers or classes.
Printing from the Graphical Interface
Many graphical applications allow users to access printing features using the CTRL-P shortcut. To print a file, you first need to specify the printer (or a file name and location if you are printing to a file instead) you want to use; and then select the page setup, quality, and color options. After selecting the required options, you can submit the document for printing. The document is then submitted to CUPS. You can use your browser to access the CUPS web interface at http://localhost:631/ to monitor the status of the printing job. Now that you have configured the printer, you can print using either the Graphical or Command Line interfaces.
Printing from the Command-Line Interface
CUPS provides two command-line interfaces, descended from the System V and BSD flavors of UNIX. This means that you can use either lp (System V) or lpr (BSD) to print. You can use these commands to print text, PostScript, PDF, and image files.
These commands are useful in cases where printing operations must be automated (from shell scripts, for instance, which contain multiple commands in one file). You will learn more about the shell scripts in the upcoming chapters on bash scripts.
lp is just a command line front-end to the lpr utility that passes input to lpr. Thus, we will discuss only lp in detail. In the example shown here, the task is to print the file called test1.txt.
Using lp
lp and lpr accept command line options that help you perform all operations that the GUI can accomplish. lp is typically used with a file name as an argument.
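For example, to print the test1.txt file mentioned above, commands such as the following might be used (the printer name printer1 is just an illustration):
$ lp test1.txt              # print the file on the default printer
$ lp -d printer1 test1.txt  # print the file on the printer named printer1
$ lp -n 2 test1.txt         # print two copies of the file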
The lpoptions utility can be used to set printer options and defaults. Each printer has a set of tags associated with it, such as the default number of copies and authentication requirements. You can execute the command lpoptions help to obtain a list of supported options. lpoptions can also be used to set system-wide values, such as the default printer.
Managing Print Jobs
You send a file to the shared printer. But when you go there to collect the printout, you discover another user has just started a 200 page job that is not time sensitive. Your file cannot be printed until this print job is complete. What do you do now?
In Linux, command line print job management commands allow you to monitor job state, list all printers and check their status, and cancel or move print jobs to another printer.
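As a sketch of how this looks in practice (the job number and printer name here are illustrative):
$ lpstat -p -d        # list available printers and show the default
$ lpstat -o           # show the queued print jobs
$ cancel 157          # cancel print job number 157
$ lpmove 157 printer2 # move job 157 to the queue for printer2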
Working with PostScript
PostScript is a standard page description language. It effectively manages scaling of fonts and vector graphics to provide quality printouts. It is purely a text format that contains the data fed to a PostScript interpreter. The format itself is a language that was developed by Adobe in the early 1980s to enable the transfer of data to printers.
Features of PostScript are:
It can be used on any printer that is PostScript-compatible; i.e., any modern printer
Any program that understands the PostScript specification can print to it
Information about page appearance, etc. is embedded in the page
Working with enscript
enscript is a tool that is used to convert a text file to PostScript and other formats. It also supports Rich Text Format (RTF) and HyperText Markup Language (HTML). For example, you can convert a text file to two-column (-2) formatted PostScript using the command: enscript -2 -r -p psfile.ps textfile.txt. This command will also rotate (-r) the output so the width of the paper is greater than the height (i.e., landscape mode), thereby reducing the number of pages required for printing.
Viewing PDF Content
Linux has many standard programs that can read PDF files as well as many applications that can easily create them, including all available office suites such as LibreOffice.
The most common Linux PDF readers are:
1. Evince is available on virtually all distributions and the most widely used program.
2. Okular is based on the older kpdf and available on any distribution that provides the KDE environment.
3. GhostView is one of the first open source PDF readers and is universally available.
4. Xpdf is one of the oldest open source PDF readers and still has a good user base.
All of these open source PDF readers can also read files following the PostScript standard, unlike the proprietary Adobe Acrobat Reader. Acrobat Reader was once widely used on Linux systems, but with the growth of these excellent programs, very few Linux users use it today.
Modifying PDFs with pdftk
At times, you may want to merge, split, or rotate PDF files; not all of these operations can be achieved while using a PDF viewer. A great way to do this is to use the "PDF Toolkit", pdftk, to perform a very large variety of sophisticated tasks. Some of these operations include:
Merging/Splitting/Rotating PDF documents
Repairing corrupted PDF pages
Pulling single pages from a file
Encrypting and decrypting PDF files
Adding, updating, and exporting a PDF’s metadata
Exporting bookmarks to a text file
Filling out PDF forms
In short, there’s very little pdftk cannot do when it comes to working with PDF files; it is indeed the Swiss Army knife of PDF tools.
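As a brief sketch of a few of these operations (the file names are illustrative, and the exact syntax can differ between pdftk versions):
$ pdftk 1.pdf 2.pdf cat output combined.pdf  # merge two files into one
$ pdftk big.pdf cat 1-5 output first5.pdf    # pull pages 1 through 5 into a new file
$ pdftk doc.pdf burst                        # split the document into one file per page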
Installing pdftk on Different Family Systems
To install pdftk on Ubuntu, use the following command:
$ sudo apt-get install pdftk
On CentOS:
$ sudo yum install pdftk
On openSUSE:
$ sudo zypper install pdftk
You may find that CentOS (and RHEL) don't have pdftk in their packaging system, but you can obtain the PDF Toolkit directly from the PDF Lab’s website by downloading from:
http://www.pdflabs.com/docs/install-pdftk-on-redhat-or-centos/
Encrypting PDF Files
If you’re working with PDF files that contain confidential information and you want to ensure that only certain people can view the PDF file, you can apply a password to it using the user_pw option. One can do this by issuing a command such as:
$ pdftk public.pdf output private.pdf user_pw PROMPT
When you run this command, you will receive a prompt to set the required password, which can have a maximum of 32 characters. A new file, private.pdf, will be created with the identical content as public.pdf, but anyone will need to type the password to be able to view it.
Using Additional Tools
You can use other tools, such as pdfinfo, flpsed, and pdfmod to work with PDF files.
pdfinfo can extract information about PDF files, especially when the files are very large or when a graphical interface is not available.
flpsed can add data to a PostScript document. This tool is specifically useful for filling in forms or adding short comments into the document.
pdfmod is a simple application that provides a graphical interface for modifying PDF documents. Using this tool, you can reorder, rotate, and remove pages; export images from a document; edit the title, subject, and author; add keywords; and combine documents using drag-and-drop action.
For example, to collect the details of a document, you can use the following command:
$ pdfinfo /usr/share/doc/readme.pdf
Converting Between PostScript and PDF
Most users today are far more accustomed to working with files in PDF format, viewing them easily either on the Internet through their browser or locally on their machine. The PostScript format is still important for various technical reasons that the general user will rarely have to deal with.
From time to time you may need to convert files from one format to the other, and there are very simple utilities for accomplishing that task. ps2pdf and pdf2ps are part of the ghostscript package installed on or available on all Linux distributions. As an alternative, there are pstopdf and pdftops which are usually part of the poppler package which may need to be added through your package manager. Unless you are doing a lot of conversions or need some of the fancier options (which you can read about in the man pages for these utilities) it really doesn't matter which ones you use.
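Basic usage is very simple; assuming files named input.ps and input.pdf:
$ ps2pdf input.ps    # produces input.pdf
$ pdf2ps input.pdf   # produces input.ps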
Bash Shell Scripting
Features and Capabilities
Intro to Scripts
Suppose you want to look up a filename, check if the associated file exists, and then respond accordingly, displaying a message confirming or not confirming the file's existence. If you only need to do it once, you can just type a sequence of commands at a terminal. However, if you need to do this multiple times, automation is the way to go. In order to automate sets of commands you’ll need to learn how to write shell scripts, the most common of which are used with bash. The graphic illustrates several of the benefits of deploying scripts.
Intro to Shell Scripts
Remember from our earlier discussion, a shell is a command line interpreter which provides the user interface for terminal windows. It can also be used to run scripts, even in non-interactive sessions without a terminal window, as if the commands were being directly typed in. For example typing: find . -name "*.c" -ls at the command line accomplishes the same thing as executing a script file containing the lines:
#!/bin/bash
find . -name "*.c" -ls
The #!/bin/bash in the first line should be recognized by anyone who has developed any kind of script in UNIX environments. The first line of the script, that starts with #!, contains the full path of the command interpreter (in this case /bin/bash) that is to be used on the file. As we will see on the next screen, you have a few choices depending upon which scripting language you use.
Command Shell Choices
The command interpreter is tasked with executing statements that follow it in the script. Commonly used interpreters include: /usr/bin/perl, /bin/bash, /bin/csh, /usr/bin/python and /bin/sh.
Typing a long sequence of commands at a terminal window can be complicated, time consuming, and error prone. By deploying shell scripts, using the command-line becomes an efficient and quick way to launch complex sequences of steps. The fact that shell scripts are saved in a file also makes it easy to use them to create new script variations and share standard procedures with several users.
Linux provides a wide choice of shells; exactly what is available on the system is listed in /etc/shells. Typical choices are:
/bin/sh
/bin/bash
/bin/tcsh
/bin/csh
/bin/ksh
Most Linux users use the default bash shell, but those with long UNIX backgrounds with other shells may want to override the default.
bash Scripts
Let's write a simple bash script that displays a two-line message on the screen. Either type
$ cat > exscript.sh
#!/bin/bash
echo "HELLO"
echo "WORLD"
and press ENTER and CTRL-D to save the file, or just create exscript.sh in your favorite text editor. Then, type chmod +x exscript.sh to make the file executable. (The chmod +x command makes the file executable for all users.) You can then run it by simply typing ./exscript.sh or by doing:
$ bash exscript.sh
HELLO
WORLD
Note if you use the second form, you don't have to make the file executable.
Interactive Examples Using bash Scripts
Now, let's see how to create a more interactive example using a bash script. The user will be prompted to enter a value, which is then displayed on the screen. The value is stored in a temporary variable, sname. We can reference the value of a shell variable by using a $ in front of the variable name, such as $sname. To create this script, you need to create a file named ioscript.sh in your favorite editor with the following content:
#!/bin/bash
# Interactive reading of variables
echo "ENTER YOUR NAME"
read sname
# Display of variable values
echo $sname
Once again, make it executable by doing chmod +x ioscript.sh.
In the above example, when the script ./ioscript.sh is executed, the user will receive a prompt ENTER YOUR NAME. The user then needs to enter a value and press the Enter key. The value will then be printed out.
Additional note: The hash-tag/pound-sign/number-sign (#) is used to start comments in the script and can be placed anywhere in the line (the rest of the line is considered a comment).
Return Values
All shell scripts generate a return value upon finishing execution; the value can be set with the exit statement. Return values permit a process to monitor the exit state of another process often in a parent-child relationship. This helps to determine how this process terminated and take any appropriate steps necessary, contingent on success or failure.
Viewing Return Values
As a script executes, one can check for a specific value or condition and return success or failure as the result. By convention, success is returned as 0, and failure is returned as a non-zero value. An easy way to demonstrate success and failure completion is to execute ls on a file that exists and one that doesn't, as shown in the following example, where the return value is stored in the environment variable represented by $?:
$ ls /etc/passwd
/etc/passwd
$ echo $?
0
In this example, the system is able to locate the file /etc/passwd and returns a value of 0 to indicate success; the return value is always stored in the $? environment variable. Applications often translate these return values into meaningful messages easily understood by the user.
Basic Syntax and Special Characters
Scripts require you to follow a standard language syntax. Rules delineate how to define variables and how to construct and format allowed statements, etc. The table lists some special character usages within bash scripts:
Splitting Long Commands Over Multiple Lines
Users sometimes need to combine several commands and statements and even conditionally execute them based on the behaviour of operators used in between them. This method is called chaining of commands.
The concatenation operator (\) is used to concatenate large commands over several lines in the shell.
For example, you want to copy the file /var/ftp/pub/userdata/custdata/read from server1.linux.com to the /opt/oradba/master/abc directory on server3.linux.co.in. To perform this action, you can write the command using the \ operator as:
scp abc@server1.linux.com:\
/var/ftp/pub/userdata/custdata/read \
abc@server3.linux.co.in:\
/opt/oradba/master/abc/
The command is divided into multiple lines to make it look readable and easier to understand. The \ operator at the end of each line combines the commands from multiple lines and executes it as one single command.
Multiple Commands on a Single Line
Sometimes you may want to group multiple commands on a single line. The ; (semicolon) character is used to separate these commands and execute them sequentially as if they had been typed on separate lines.
The three commands in the following example will all execute even if the ones preceding them fail:
$ make ; make install ; make clean
However, you may want to abort subsequent commands if one fails. You can do this using the && (and) operator as in:
$ make && make install && make clean
If the first command fails the second one will never be executed. A final refinement is to use the || (or) operator as in:
$ cat file1 || cat file2 || cat file3
In this case, you proceed until something succeeds and then you stop executing any further steps.
Functions
A function is a code block that implements a set of operations. Functions are useful for executing procedures multiple times perhaps with varying input variables. Functions are also often called subroutines. Using functions in scripts requires two steps:
1. Declaring a function
2. Calling a function
The function declaration requires a name which is used to invoke it. The proper syntax is:
function_name () {
command...
}
For example, the following function is named display:
display () {
echo "This is a sample function"
}
The function can be as long as desired and have many statements. Once defined, the function can be called later as many times as necessary. In the full example shown in the figure, we are also showing an often-used refinement: how to pass an argument to the function. The first argument can be referred to as $1, the second as $2, etc.
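A minimal sketch of a function that uses an argument:
#!/bin/bash
# declare a function that uses its first argument
display () {
    echo "This is a sample function that prints: $1"
}
# call the function twice with different arguments
display Hello
display World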
Built-in Shell Commands
Shell scripts are used to execute sequences of commands and other types of statements. Commands can be divided into the following categories:
Compiled applications
Built-in bash commands
Other scripts
Compiled applications are binary executable files that you can find on the filesystem. The shell script always has access to compiled applications such as rm, ls, df, vi, and gzip.
bash has many built-in commands, which can only be used from within a shell or shell script. Sometimes these commands have the same name as executable programs on the system, such as echo, which can lead to subtle problems. bash built-in commands include cd, pwd, echo, read, logout, printf, let, and ulimit.
A complete list of bash built-in commands can be found in the bash man page, or by simply typing help.
Command Substitution
At times, you may need to substitute the result of a command as a portion of another command. It can be done in two ways:
By enclosing the inner command with backticks (`)
By enclosing the inner command in $( )
No matter the method, the innermost command will be executed in a newly launched shell environment, and its standard output will be inserted where the command substitution was done.
Virtually any command can be executed this way. Both of these methods enable command substitution; however, the $( ) method allows command nesting. New scripts should always use this more modern method. For example:
$ cd /lib/modules/$(uname -r)/
In the above example, the output of the command uname -r becomes the argument for the cd command.
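To illustrate the nesting that $( ) permits (backticks cannot be nested without awkward escaping), consider this sketch:
$ echo $(basename $(pwd))
Here the inner command substitution produces the full path of the current directory, and the outer one strips it down to just the directory's name.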
Environment Variables
Almost all scripts use variables containing a value, which can be used anywhere in the script. These variables can either be user or system defined. Many applications use such environment variables (covered in the "User Environment" chapter) for supplying inputs, validation, and controlling behaviour.
Some examples of standard environment variables are HOME, PATH, and HOST. When referenced, environment variables must be prefixed with the $ symbol as in $HOME. You can view and set the value of environment variables. For example, the following command displays the value stored in the PATH variable:
$ echo $PATH
However, no prefix is required when setting or modifying the variable value. For example, the following command sets the value of the MYCOLOR variable to blue:
$ MYCOLOR=blue
You can get a list of environment variables with the env, set, or printenv commands.
Exporting Variables
By default, the variables created within a script are available only to the subsequent steps of that script. Any child processes (sub-shells) do not have automatic access to the values of these variables. To make them available to child processes, they must be promoted to environment variables using the export statement as in:
export VAR=value
or
VAR=value ; export VAR
While child processes are allowed to modify the value of exported variables, the parent will not see any changes; exported variables are not shared, but only copied.
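A quick sketch demonstrating this copy semantics at the command line:
$ export VAR=parent
$ bash -c 'echo $VAR; VAR=child; echo $VAR'
parent
child
$ echo $VAR
parent
The child shell inherits a copy of VAR and can modify it, but the parent's value is untouched.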
Script Parameters
Users often need to pass parameter values to a script, such as a filename, date, etc. Scripts will take different paths or arrive at different values according to the parameters (command arguments) that are passed to them. These values can be text or numbers as in:
$ ./script.sh /tmp
$ ./script.sh 100 200
Within a script, the parameter or an argument is represented with a $ and a number. The table lists some of these parameters.
Using Script Parameters
Using your favorite text editor, create a new script file named script3.sh with the following contents:
#!/bin/bash
echo "The name of this program is: $0"
echo "The first argument passed from the command line is: $1"
echo "The second argument passed from the command line is: $2"
echo "The third argument passed from the command line is: $3"
echo "All of the arguments passed from the command line are : $*"
echo
echo "All done with $0"
Make the script executable with chmod +x. Run the script giving it three arguments as in: script3.sh one two three, and the script is processed as follows:
$0 prints the script name: script3.sh
$1 prints the first parameter: one
$2 prints the second parameter: two
$3 prints the third parameter: three
$* prints all parameters: one two three
The final statement becomes: All done with script3.sh
Output Redirection
Most operating systems accept input from the keyboard and display the output on the terminal. However, in shell scripting you can send the output to a file. The process of diverting the output to a file is called output redirection.
The > character is used to write output to a file. For example, the following command sends the output of free to the file /tmp/free.out:
$ free > /tmp/free.out
To check the contents of the /tmp/free.out file, at the command prompt type cat /tmp/free.out.
Two > characters (>>) will append output to a file if it exists, and act just like > if the file does not already exist.
Input redirection
Just as the output can be redirected to a file, the input of a command can be read from a file. The process of reading input from a file is called input redirection and uses the < character. If you create a file called script8.sh with the following contents:
#!/bin/bash
echo "Line count"
wc -l < /tmp/free.out
and then execute it with chmod +x script8.sh ; ./script8.sh, it will count the number of lines in the /tmp/free.out file and display the result.
Constructs
if Statements
Conditional decision making, using an if statement, is a basic construct that any useful programming or scripting language must have.
When an if statement is used, the ensuing actions depend on the evaluation of specified conditions such as:
- Numerical or string comparisons
- Return value of a command (0 for success)
- File existence or permissions
In compact form, the syntax of an if statement is:
if TEST-COMMANDS; then CONSEQUENT-COMMANDS; fi
A more general definition is:
if condition
then
statements
else
statements
fi
Using if Statements
The following if statement checks whether the /etc/passwd file exists, and if it is found, displays the message "/etc/passwd exists.":
if [ -f /etc/passwd ]
then
echo "/etc/passwd exists."
fi
Notice the use of the square brackets ( [ ] ) to delineate the test condition. There are many other kinds of tests you can perform, such as checking whether two numbers are equal to, greater than, or less than each other and make a decision accordingly; we will discuss these other tests.
In modern scripts you may see doubled brackets as in [[ -f /etc/passwd ]]. This is not an error. It is never wrong to do so and it avoids some subtle problems such as referring to an empty environment variable without surrounding it in double quotes; we won't talk about this here.
Testing for Files
You can use the if statement to test for file attributes such as:
- File or directory existence
- Read or write permission
- Executable permission
For example, in the following:
if [ -f /etc/passwd ] ; then
ACTION
fi
the if statement checks if the file /etc/passwd is a regular file.
Note the very common practice of putting “; then” on the same line as the if statement.
bash provides a set of file conditionals that can be used with the if statement, including those in the table.
You can view the full list of file conditions using the command man 1 test.
Example of string testing
You can use the if statement to compare strings using the operator == (two equal signs). The syntax is as follows:
if [ string1 == string2 ] ; then
ACTION
fi
Let’s now consider an example of testing strings.
In the example illustrated in the figure, the if statement is used to compare the input provided by the user and accordingly display the result.
Numerical Tests
You can use specially defined operators with the if statement to compare numbers. The various operators that are available are listed in the table.
The syntax for comparing numbers is as follows:
exp1 -op exp2
Arithmetic Expressions
Arithmetic expressions can be evaluated in the following three ways (spaces are important!):
Using the expr utility: expr is a standard but somewhat deprecated program. The syntax is as follows:
expr 8 + 8
echo $(expr 8 + 8)
Using the $((...)) syntax: This is the built-in shell format. The syntax is as follows:
echo $((x+1))
Using the built-in shell command let. The syntax is as follows:
let "x = ( 1 + 2 )"; echo $x
In modern shell scripts the use of expr is better replaced with var=$((...))
Labs
Lab 1: Exit Status Code Usage
Write a script which does an ls for a nonexistent file, followed by a display of the exit status code. Then create a file and display the new exit status code. In each task, send the ls output to /dev/null.
#!/bin/bash
# exitlab
#
# example of exit status
# check for non-existent file
# exit status will be 2
# create file and check for it
# exit status will be 0
#
ls xyzzy.345 > /dev/null 2>&1
status=$?
echo "status is $status"
# create the file and check again
# status will now be 0
touch xyzzy.345
ls xyzzy.345 > /dev/null 2>&1
status=$?
echo "status is $status"
# remove the file
rm xyzzy.345
Lab 2: Working with Files
Write a script that will ask the user for a directory name, which the script will create. Change the working directory to the new directory and tell the user where you are using the pwd command.
Next use touch to create a few files followed by displaying the filenames. Use echo and redirection to put some content into the files and show the user the content. Finally, say goodbye to the user to end the script.
#!/bin/bash
# fileslab
# demos several simple commands to create a directory,
# cd to that directory, echo $(pwd),
# create several files,
# put data into the files and list the files
echo "Welcome to File Creator"
echo "What name do you want to give to the new directory?"
# Get the new name from the user
read dirName
# make the directory
mkdir $dirName
# change the current working directory to the new directory
cd $dirName
# announce where we are
echo "This directory is called 'pwd'"
# create some files
touch file1 file2 file3
# put content into the files, using the directory and file name
echo "This is $dirName/file1" > file1
echo "This 1s $dirName/file2" > file2
echo "This 1s $dirName/file3" > file3
# announce the file names
echo "The files in $dirName are: "
ls -hl
# show the content of the files
echo "The content of the files are: "
cat file1
cat file2
cat file3
echo "Goodbye"
Lab 3: Environment Variables
Write a script which asks the user for a number (1 or 2) which the script will use to determine whether an environment variable will be set to yes or no, then set the environment variable and export it. Tell the user what needs to be entered when the script starts. Check to see if the user put in a number. If not, set the environment variable to unknown
#!/bin/bash
# iflab demonstrate the use of environmental variables
# and the if-then-else clause
# Does not check to see if the user puts in
# something other than a number
# It takes a number (1 or 2) then sets the
# variable MYANS to yes or no
# If 1 or 2 is not entered, the variable is set to unknown
# declare the variables
no="No" # 1
yes="Yes" # 2
unknown="Unknown" # default
# set a default value
dValue=$unknown
echo""
echo "This program accepts a number used to set the environmental"
echo "variable MYANS to yes (1) or no (2)."
echo ""
echo "Enter the number 1 or 2:"
# this variable contains the number input by the user
read aValue
# check to see if the user input a value
# set the value to the user's input, if there is one;
# otherwise set it to a default value
if [ $aValue -eq 1 ]
then
MYANS=$yes
else
if [ $aValue -eq 2 ]
then
MYANS=$no
else
MYANS=$dValue
fi
fi
export MYANS
echo "The value of MYANS 1s: $MYANS"
Lab 4: Functions
Write a script that asks the user for a number (1, 2 or 3) which is used to call a function with that number in its name. The function then displays a message with the function number within it, for example, "This message is from function 3."
#!/bin/bash
# functionlab
# demonstrates functions and script parameters
#
# Functions
func1() {
echo " This message is from function 1"
}
#-----------------------------
func2() {
echo " This message is from function 2"
}
#-----------------------------
func3() {
echo " This message is from function 3"
}
# Main script
# prompt the user
echo "Enter a number from 1 to 3"
# get the user's choice
read choice
# the number chosen by the user is appended to
# the name func to select
# func1, func2 or func3
# the function call is simply the name
# of the function
func$choice
#put out a line feed
echo " "
Lab 5: Arithmetic
Write a script that will act as a simple calculator for add, subtract, multiply, and divide; each operation should be in a function of its own. Any of the three methods for bash arithmetic (let, expr, or $(( ... ))) may be used. The user should pass as arguments on the command line a letter (a, s, m, or d) and two numbers. The letter is used to determine which operation will be performed on the two numbers. If one or more of the three parameters is left out, display a usage message. Finally, display the answer.
#!/bin/bash
#
# arithmeticlab
# demonstrates arithmetic, functions and simple if clauses
# three methods are used for arithmetic.
# the exercise requires only one.
# the three methods are:
# 1) let
# 2) expr
# 3) $(( ... ))
# The user will input a letter and two numbers.
# the letter will
# be a(dd), s(subtract), m(ultiply) or d(ivide)
# to select an
# arithmetic operation.
# Functions must be defined before the main part of the script
#
adder() {
# method 1: use let
let "answer1 = $fNumber + $sNumber"
# method 2: use expr
answer2=`expr $fNumber + $sNumber`
# method 3: use $(( ... ))
answer3=$(($fNumber + $sNumber))
} # end adder function
#---------------------------
subtracter () {
# method 1: use let
let "answer1 = $fNumber - $sNumber"
# method 2: use expr
answer2=`expr $fNumber - $sNumber`
# method 3: use $(( ... ))
answer3=$(($fNumber - $sNumber))
} # end subtracter function
#---------------------------
multiplyer () {
# method 1: use let
let "answer1 = $fNumber * $sNumber"
# method 2: use expr
answer2=`expr $fNumber \* $sNumber`
# method 3: use $(( ... ))
answer3=$(($fNumber * $sNumber))
} # end multiplyer function
#---------------------------
divider() {
# method 1: use let
let "answer1 = $fNumber / $sNumber"
# method 2: use expr
answer2=`expr $fNumber / $sNumber`
# method 3: use $(( ... ))
answer3=$(($fNumber / $sNumber))
} # end divider function
# End of functions
#
# Main part of the script
# check that user provided a letter and two numbers
# does not check to see if the user put in
# an incorrect letter
# which will simply display messages without an answer
if [ $# -lt 3 ]
then
echo ""
echo "Usage: Provide an operation (a,s,m,d) and two numbers
echo " "
exit 1
fi
#----------------------------------------------------------
# set the input numbers to variables to pass to the functions
#
fNumber=$2
sNumber=$3
if [ $1 == "a" ]
then
adder
fi
if [ $1 == "s" ] then
subtracter
fi
if [ $1 == "m" ] then
multiplyer
fi
if [ $1 == "d" ]
then
divider
fi
#-----------------------------------------------
# Present the answers for all three methods
#
echo "Method 1 Answer is $answer1"
echo "Method 2 Answer is $answer2"
echo "Method 3 Answer is $answer3"
History of Command Shells
sh was written by Steve Bourne at AT&T in 1977, and is often known as the Bourne Shell. All other shells are descended from it in some fashion and it is available on all systems that have a UNIX bloodline.
csh was written by Bill Joy at UC Berkeley and released in 1978. The internal syntax is quite different than sh and is designed to resemble the C programming language, and hence the name.
tcsh was originally developed by Ken Greer at Carnegie Mellon University in the late 1970s; the t in tcsh stands for TENEX, an operating system that was used on some DEC PDP-10s. It has many additional features compared with csh, and on virtually all modern systems csh is just a link to tcsh.
ksh was written by David Korn at AT&T and appeared in 1982, and is often known as the Korn shell. It was designed to be a major upgrade to sh and is backward compatible with it, and brings in some of the features of tcsh, such as command line history recall. This shell has long been a favorite of many system administrators.
bash is a product of the GNU project and was created in 1987. It was designed as a major upgrade of sh; the name stands for Bourne Again Shell. It has full backward compatibility with sh and partial compatibility with ksh.
On all Linux systems sh is just a link to bash, but scripts invoked as sh will run without the bash extensions. A similar relationship exists between csh and tcsh.
Advanced Bash scripting
String Manipulation
Let’s go deeper and find out how to work with strings in scripts.
A string variable contains a sequence of text characters. It can include letters, numbers, symbols and punctuation marks. Some examples: abcde, 123, abcde 123, abcde-123, &acbde=%123
String operators include those that do comparison, sorting, and finding the length. The following table demonstrates the use of some basic string operators.
Parts of a String
At times, you may not need to compare or use an entire string. To extract the first character of a string we can specify:
${string:0:1}
Here, 0 is the offset in the string (i.e., which character to begin from) where the extraction needs to start, and 1 is the number of characters to be extracted.
To extract all characters in a string after a dot (.), use the following expression: ${string#*.}
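A short sketch of these operators in action (the variable contents are just an example):
$ string="filename.txt"
$ echo ${string:0:1}    # first character: f
$ echo ${string:0:4}    # first four characters: file
$ echo ${string#*.}     # everything after the dot: txt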
Boolean Expressions
Boolean expressions evaluate to either TRUE or FALSE, and results are obtained using the various Boolean operators listed in the table.
Note that if you have multiple conditions strung together with the && operator processing stops as soon as a condition evaluates to false. For example if you have A && B && C and A is true but B is false, C will never be executed.
Likewise if you are using the || operator, processing stops as soon as anything is true. For example if you have A || B || C and A is false and B is true, you will also never execute C.
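A quick way to see this short-circuit behaviour at the command line:
$ false && echo "never printed: the first command failed"
$ true || echo "never printed: the first command succeeded"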
Tests in Boolean Expression
Boolean expressions return either TRUE or FALSE. We can use such expressions when working with multiple data types including strings or numbers as well as with files. For example, to check if a file exists, use the following conditional test:
[ -e <filename> ]
Similarly, to check if the value of number1 is greater than the value of number2, use the following conditional test:
[ $number1 -gt $number2 ]
The operator -gt returns TRUE if number1 is greater than number2.
The case Statement
The case statement is used in scenarios where the actual value of a variable can lead to different execution paths. case statements are often used to handle command-line options.
Below are some of the advantages of using the case statement:
-It is easier to read and write.
- It is a good alternative to nested, multi-level if-then-else-fi code blocks.
- It enables you to compare a variable against several values at once.
- It reduces the complexity of a program.
Structure of the case Statement
Here is the basic structure of the case statement:
case expression in
pattern1) execute commands;;
pattern2) execute commands;;
pattern3) execute commands;;
pattern4) execute commands;;
* ) execute some default commands or nothing ;;
esac
Looping Constructs
By using looping constructs, you can execute one or more lines of code repetitively. Usually you do this until a conditional test returns either true or false as is required.
Three type of loops are often used in most programming languages:
for
while
until
All these loops are easily used for repeating a set of statements until the exit condition is true.
The 'for' Loop
The for loop operates on each element of a list of items. The syntax for the for loop is:
for variable-name in list
do
execute one iteration for each item in the
list until the list is finished
done
In this case, variable-name and list are substituted by you as appropriate (see examples). As with other looping constructs, the statements that are repeated should be enclosed by do and done.
The screenshots here show an example of the for loop to print the sum of numbers 1 to 4.
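A minimal sketch of such a script:
#!/bin/bash
# sum the numbers 1 through 4 using a for loop
sum=0
for i in 1 2 3 4
do
    sum=$((sum + i))
done
echo "The sum of 1 to 4 is $sum"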
The 'while' Loop
The while loop repeats a set of statements as long as the control command returns true. The syntax is:
while condition is true
do
Commands for execution
----
done
The set of commands that need to be repeated should be enclosed between do and done. You can use any command or operator as the condition. Often it is enclosed within square brackets ( [ ] ).
The screenshots here show an example of the while loop that calculates the factorial of a number.
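One minimal way to write such a script:
#!/bin/bash
# compute the factorial of a number using a while loop
echo "Enter a number:"
read n
factorial=1
while [ $n -gt 0 ]
do
    factorial=$((factorial * n))
    n=$((n - 1))
done
echo "The factorial is $factorial"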
The 'until' Loop
The until loop repeats a set of statements as long as the control command is false. Thus it is essentially the opposite of the while loop. The syntax is:
until condition is true
do
Commands for execution
----
done
Similar to the while loop, the set of commands that need to be repeated should be enclosed between do and done. You can use any command or operator as the condition.
The screenshot here shows an example of the until loop that displays the odd numbers between 1 and 10.
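A minimal sketch of such a script:
#!/bin/bash
# display the odd numbers between 1 and 10 using an until loop
i=1
until [ $i -gt 10 ]
do
    echo $i
    i=$((i + 2))
done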
Intro to Script Debugging
While working with scripts and commands, you may run into errors. These may be due to an error in the script, such as incorrect syntax, or other ingredients such as a missing file or insufficient permission to do an operation. These errors may be reported with a specific error code, but often just yield incorrect or confusing output. So how do you go about identifying and fixing an error?
Debugging helps you troubleshoot and resolve such errors, and is one of the most important tasks a system administrator performs.
More About Script Debugging
Before fixing an error (or bug), it is vital to know its source.
In bash shell scripting, you can run a script in debug mode by doing bash –x ./script_file. Debug mode helps identify the error because:
- It traces and prefixes each command with the + character.
- It displays each command before executing it.
- It can debug only selected parts of a script (if desired) with:
set -x # turns on debugging
...
set +x # turns off debugging
Redirecting Errors to File and Screen
In UNIX/Linux, all programs that run are given three open file streams when they are started, as listed in the table.
Using redirection, we can save the stdout and stderr output streams to one file or to two separate files for later analysis after a program or command is executed.
On the left screen is a buggy shell script. On the right screen the buggy script is executed and the errors are redirected to the file "error.txt". Using "cat" to display the contents of "error.txt" shows the errors of executing the buggy shell script (presumably for further debugging).
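As a sketch, assuming the buggy script is named buggy.sh, the streams can be captured separately or together:
$ ./buggy.sh > output.txt 2> error.txt  # stdout and stderr to separate files
$ ./buggy.sh > all.txt 2>&1             # both streams to the same file
$ cat error.txt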
Creating Temporary Files and Directories
Consider a situation where you want to retrieve 100 records from a file with 10,000 records. You will need a place to store the extracted information, perhaps in a temporary file, while you do further processing on it.
Temporary files (and directories) are meant to store data for a short time. Usually one arranges it so that these files disappear when the program using them terminates. While you can also use touch to create a temporary file, this may make it easy for hackers to gain access to your data.
The best practice is to create random and unpredictable filenames for temporary storage. One way to do this is with the mktemp utility as in these examples:
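Typical invocations look like this (the name templates are just examples):
TEMP=$(mktemp /tmp/tempfile.XXXXXXXX)       # create a temporary file
TEMPDIR=$(mktemp -d /tmp/tempdir.XXXXXXXX)  # create a temporary directory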
The XXXXXXXX is replaced by the mktemp utility with random characters to ensure the name of the temporary file cannot be easily predicted and is only known within your program.
Example of Creating a Temporary File and Directory
First, the danger: If someone creates a symbolic link from a known temporary file used by root to the /etc/passwd file, like this:
$ ln -s /etc/passwd /tmp/tempfile
There could be a big problem if a script run by root has a line in it like this:
echo $VAR > /tmp/tempfile
The password file will be overwritten by the temporary file contents.
To prevent such a situation make sure you randomize your temporary filenames by replacing the above line with the following lines:
TEMP=$(mktemp /tmp/tempfile.XXXXXXXX)
echo $VAR > $TEMP
Discarding Output with /dev/null
Certain commands like find will produce voluminous amounts of output which can overwhelm the console. To avoid this, we can redirect the large output to a special file (a device node) called /dev/null. This file is also called the bit bucket or black hole.
It discards all data that gets written to it and never returns a failure on write operations. Using the proper redirection operators, it can make the output disappear from commands that would normally generate output to stdout and/or stderr:
$ find / > /dev/null
In the above command, the entire standard output stream is ignored, but any errors will still appear on the console.
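To silence the error messages as well, both streams can be discarded:
$ find / > /dev/null 2>&1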
Random Numbers and Data
It is often useful to generate random numbers and other random data when performing tasks such as:
Performing security-related tasks.
Reinitializing storage devices.
Erasing and/or obscuring existing data.
Generating meaningless data to be used for tests.
Such random numbers can be generated by using the $RANDOM environment variable, which is derived from the Linux kernel's built-in random number generator, or by the OpenSSL library function, which uses the FIPS140 algorithm to generate random numbers for encryption.
To read more about FIPS140, see http://en.wikipedia.org/wiki/FIPS_140-2
The example shows you how to easily use the environmental variable method to generate random numbers.
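A minimal sketch using the environment variable method:
$ echo $RANDOM               # a pseudo-random integer between 0 and 32767
$ echo $(($RANDOM % 100))    # a pseudo-random integer between 0 and 99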
How the Kernel Generates Random Numbers
Some servers have hardware random number generators that take as input different types of noise signals, such as thermal noise and photoelectric effect. A transducer converts this noise into an electric signal, which is again converted into a digital number by an A-D converter. This number is considered random. However, most common computers do not contain such specialized hardware and instead rely on events created during booting to create the raw data needed.
Regardless of which of these two sources is used, the system maintains a so-called entropy pool of these digital numbers/random bits. Random numbers are created from this entropy pool.
The Linux kernel offers the /dev/random and /dev/urandom device nodes which draw on the entropy pool to provide random numbers which are drawn from the estimated number of bits of noise in the entropy pool.
/dev/random is used where very high quality randomness is required, such as one-time pad or key generation, but it is relatively slow to provide values. /dev/urandom is faster and suitable (good enough) for most cryptographic purposes.
Furthermore, when the entropy pool is empty, /dev/random is blocked and does not generate any number until additional environmental noise (network traffic, mouse movement, etc.) is gathered whereas /dev/urandom reuses the internal pool to produce more pseudo-random bits.
Labs
Lab 1: Putting Case Statements into Practice
Write a script that will be given a month number as the argument and will translate this number into a month name. The result will be printed to stdout.
#!/bin/bash
# caseExample will accept a number between 1 and 12 as
# an argument to this script, then return the
# name of the month that corresponds to that number.
# Demonstrates the use the of the case statement.
# Check to see if the user passed a parameter.
if [ $# -eq 0 ]
then
    echo "Error. Please pass an argument that is a"
    echo "number between 1 and 12."
    exit 1
fi
# set numb equal to argument passed for use in the script
numb=$1
################################################
# The example of a case statement:
case $numb in
1)
echo "January"
;;
2)
echo "February"
;;
3)
echo "March"
;;
4)
echo "April"
;;
5)
echo "May"
;;
6)
echo "June"
;;
7)
echo "July"
;;
8)
echo "August"
;;
9)
echo "September"
;;
10)
echo "October"
;;
11)
echo "November"
;;
12)
echo "December"
;;
*)
echo "Error. No month matches that number"
echo "Please pass an arguement that is a"
echo "number between 1 and 12."
;;
esac
##########################################
exit 0
Lab 2: Script Arguments and Usage Information
Write a script that takes exactly one argument, a directory name. The script should print that argument back to standard output.
Make sure the script generates a usage message if needed and that it handles errors with a message.
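No solution script was included for this lab; the following is one possible minimal sketch (the script name dirlab.sh is arbitrary):
#!/bin/bash
# dirlab.sh: takes exactly one argument, a directory name,
# and prints it back to standard output
if [ $# -ne 1 ]
then
    echo "Usage: $0 directory-name"
    exit 1
fi
if [ ! -d "$1" ]
then
    echo "Error: $1 is not a directory"
    exit 1
fi
echo "$1"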
Lab 3: Randomness
Create a script that takes a word as an argument from the user, then appends a random number to the word and displays the result to the user. Put in a check to make sure the user passed in a word, displaying a usage statement if a word was not passed as an argument.
#!/bin/bash
##
# randomExample will accept a word then return it
# back to the user with a random number as part
# of the word. If no argument is given, a usage message
# is returned.
## Can be used, for example, to provide random file names
#
# check to see if the user put in the parameter.
if [ $# -eq 0 ]
then echo "Usage: randomExample word"
exit 1
fi
randNumb=$RANDOM
echo "$1-$randNumb"
##############################################
exit 0
Lab 4: Strings
Write a script that will read two strings from the user. The script will perform three operations on the two strings:
(1) Use the test command to see if one of the strings is of zero length and if the other is of non-zero length, telling the user of both results.
(2) Determine the length of each string and tell the user which is longer or if they are of equal length.
(3) Compare the strings to see if they are the same. Let the user know the result.
#!/bin/bash
##
## stringsLab
## demonstrates some strings operations
## does not test the input for blank lines
##########################################
##
## get two strings from the user
##
echo "Enter in a string: "
read mystring1
echo "Enter in a second string: "
read mystring2
#------------------------------------
## test command
echo "Is string 1 zero length? Value of 1 means FALSE"
test -z "$mystring1"
echo $?
echo "Is string 2 nonzero length? Value of 0 means TRUE;"
test -n "$mystring2"
echo $?
####################################
## demonstrates comparing the lengths of two string
##
myLen1=${#mystring1}
myLen2=${#mystring2}
if [ $myLen1 -gt $myLen2 ]
then
echo "String 1 is longer than string 2"
else
if [ $myLen2 -gt $myLen1 ]
then
echo "String 2 is longer than string 1"
else
echo "String 1 is the same length as string 2"
fi
fi
#############################################
## compare the two strings to see if they are the same
##
if [ "$mystring1" == "$mystring2" ]
then
echo "String 1 is the same as string 2"
else
if [ "$mystring1" != "$mystring2" ]
then
echo "String 1 is not the same as string 2"
fi
fi
Processes
A process is simply an instance of one or more related tasks (threads) executing on your computer. It is not the same as a program or a command; a single program may actually start several processes simultaneously. Some processes are independent of each other and others are related. A failure of one process may or may not affect the others running on the system.
Processes use many system resources, such as memory, CPU (central processing unit) cycles, and peripheral devices such as printers and displays. The operating system (especially the kernel) is responsible for allocating a proper share of these resources to each process and ensuring overall optimum utilization.
Process Types
A terminal window (one kind of command shell), is a process that runs as long as needed. It allows users to execute programs and access resources in an interactive environment. You can also run programs in the background, which means they become detached from the shell.
Processes can be of different types according to the task being performed. Here are some different process types along with their descriptions and examples.
Process Scheduling and States
When a process is in a so-called running state, it means it is either currently executing instructions on a CPU, or is waiting for a share (or time slice) so it can run. A critical kernel routine called the scheduler constantly shifts processes in and out of the CPU, sharing time according to relative priority, how much time is needed and how much has already been granted to a task. All processes in this state reside on what is called a run queue and on a computer with multiple CPUs, or cores, there is a run queue on each.
However, sometimes processes go into what is called a sleep state, generally when they are waiting for something to happen before they can resume, perhaps for the user to type something. In this condition a process is sitting in a wait queue.
There are some other less frequent process states, especially when a process is terminating. Sometimes a child process completes but its parent process has not asked about its state. Amusingly such a process is said to be in a zombie state; it is not really alive but still shows up in the system's list of processes.
Process and Thread IDs
At any given time there are always multiple processes being executed. The operating system keeps track of them by assigning each a unique process ID (PID) number. The PID is used to track process state, CPU usage, memory use, precisely where resources are located in memory, and other characteristics.
New PIDs are usually assigned in ascending order as processes are born. Thus PID 1 denotes the init process (initialization process), and succeeding processes are gradually assigned higher numbers.
The table explains the PID types and their descriptions.
User and Group IDs
Many users can access a system simultaneously, and each user can run multiple processes. The operating system identifies the user who starts the process by the Real User ID (RUID) assigned to the user.
The Effective UID (EUID) determines the access rights of the process; it may or may not be the same as the RUID.
Users can be categorized into various groups. Each group is identified by the Real Group ID, or RGID. The access rights of the group are determined by the Effective Group ID, or EGID. Each user can be a member of one or more groups.
Most of the time we ignore these details and just talk about the User ID (UID).
More About Priorities
At any given time, many processes are running (i.e., in the run queue) on the system. However, a CPU can actually accommodate only one task at a time, just like a car can have only one driver at a time. Some processes are more important than others so Linux allows you to set and manipulate process priority. Higher priority processes are granted more time on the CPU.
The priority for a process can be set by specifying a nice value, or niceness, for the process. The lower the nice value, the higher the priority. Low values are assigned to important processes, while high values are assigned to processes that can wait longer. A process with a high nice value simply allows other processes to be executed first. In Linux, a nice value of -20 represents the highest priority and 19 represents the lowest. (This does sound kind of backwards, but this convention, the nicer the process, the lower the priority, goes back to the earliest days of UNIX.)
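For example, the standard nice and renice utilities set and change a process's niceness (the command name and PID here are illustrative):
$ nice -n 10 some_long_job &  # start a job with lowered priority (niceness 10)
$ renice -n 5 -p 3077         # change the niceness of the process with PID 3077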
You can also assign a so-called real-time priority to time-sensitive tasks, such as controlling machines through a computer or collecting incoming data. This is just a very high priority and is not to be confused with what is called hard real time which is conceptually different, and has more to do with making sure a job gets completed within a very well-defined time window.
Listing Processes
The ps Command (System V Style)
ps provides information about currently running processes, keyed by PID. If you want a repetitive update of this status, you can use top or commonly installed variants such as htop or atop from the command line, or invoke your distribution's graphical system monitor application.
ps has many options for specifying exactly which tasks to examine, what information to display about them, and precisely what output format should be used.
Without options ps will display all processes running under the current shell. You can use the -u option to display information of processes for a specified username. The command ps -ef displays all the processes in the system in full detail. The command ps -eLf goes one step further and displays one line of information for every thread (remember, a process can contain multiple threads).
The ps Command (BSD Style)
ps has another style of option specification which stems from the BSD variety of UNIX, where options are specified without preceding dashes. For example, the command ps aux displays all processes of all users. The command ps axo allows you to specify which attributes you want to view.
The following tables show sample output of ps with the aux and axo qualifiers.
The Process Tree
At some point one of your applications may stop working properly. How might you terminate it?
pstree displays the processes running on the system in the form of a tree diagram showing the relationship between a process and its parent process and any other processes that it created. Repeated entries of a process are not displayed, and threads are displayed in curly braces.
To terminate a process you can type kill -SIGKILL <pid> or kill -9 <pid>. Note however, you can only kill your own processes: those belonging to another user are off limits unless you are root.
top
While a static view of what the system is doing is useful, monitoring the system performance live over time is also valuable. One option would be to run ps at regular intervals, say, every two minutes. A better alternative is to use top to get constant real-time updates (every two seconds by default) until you exit by typing q. top clearly highlights which processes are consuming the most CPU cycles and memory (using appropriate commands from within top).
First Line of the top Output
The first line of the top output displays a quick summary of what is happening in the system including:
How long the system has been up
How many users are logged on
What is the load average
The load average determines how busy the system is. A load average of 1.00 per CPU indicates a fully subscribed, but not overloaded, system. If the load average goes above this value, it indicates that processes are competing for CPU time. If the load average is very high, it might indicate that the system is having a problem, such as a runaway process (a process in a non-responding state).
Second Line of the top Output
The second line of the top output displays the total number of processes, the number of running, sleeping, stopped and zombie processes. Comparing the number of running processes with the load average helps determine if the system has reached its capacity or perhaps a particular user is running too many processes. The stopped processes should be examined to see if everything is running correctly.
Third Line of the top Output
The third line of the top output indicates how the CPU time is being divided between the users (us) and the kernel (sy) by displaying the percentage of CPU time used for each.
The percentage of user jobs running at a lower priority (niceness - ni) is then listed. Idle mode (id) should be low if the load average is high, and vice versa. The percentage of jobs waiting (wa) for I/O is listed. Interrupts include the percentage of hardware (hi) vs. software interrupts (si). Steal time (st) is generally relevant for virtual machines, which have some of their idle CPU time taken for other uses.
Fourth and Fifth Lines of the top Output
The fourth and fifth lines of the top output indicate memory usage, which is divided in two categories:
Physical memory (RAM) – displayed on line 4.
Swap space – displayed on line 5.
Both categories display total memory, used memory, and free space.
You need to monitor memory usage very carefully to ensure good system performance. Once the physical memory is exhausted, the system starts using swap space (temporary storage space on the hard drive) as an extended memory pool, and since accessing disk is much slower than accessing memory, this will negatively affect system performance.
If the system starts using swap often, you can add more swap space. However, adding more physical memory should also be considered.
Process List of the top Output
Each line in the process list of the top output displays information about a process. By default, processes are ordered by highest CPU usage. The following information about each process is displayed:
Process Identification Number (PID)
Process owner (USER)
Priority (PR) and nice values (NI)
Virtual (VIRT), physical (RES), and shared memory (SHR)
Status (S)
Percentage of CPU (%CPU) and memory (%MEM) used
Execution time (TIME+)
Command (COMMAND)
Interactive Keys with top
Besides reporting information, top can be utilized interactively for monitoring and controlling processes. While top is running in a terminal window you can enter single-letter commands to change its behaviour. For example, you can view the top-ranked processes based on CPU or memory usage. If needed, you can alter the priorities of running processes or you can stop/kill a process.
Commonly used keys include h (help), M (sort by memory usage), P (sort by CPU usage), k (kill a process), r (renice a process), and q (quit).
Process Metrics and Process Control
Load Averages
Load average is the average of the load number for a given period of time. It takes into account processes that are:
Actively running on a CPU.
Considered runnable, but waiting for a CPU to become available.
In uninterruptible sleep: i.e., waiting for some kind of resource (typically I/O) to become available.
The load average can be obtained by running w, top or uptime.
Interpreting Load Averages
The load average is displayed using three different sets of numbers, as shown in the following example:
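A representative uptime output looks like this (the numbers are illustrative and match the values discussed below):
$ uptime
 10:35:01 up 10 days,  3:17,  2 users,  load average: 0.25, 0.12, 0.15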
The last piece of information in the output is the system load average, given as three numbers covering the past 1, 5, and 15 minutes. Assuming our system is a single-CPU system, the 0.25 means that for the past minute, on average, the system has been 25% utilized; 0.12 in the next position means 12% utilization over the past 5 minutes; and 0.15 in the final position means 15% utilization over the past 15 minutes. A value of 1.00 in the second position would imply that the single-CPU system was 100% utilized, on average, over the past 5 minutes; this is good if we want to fully use the system. A value over 1.00 for a single-CPU system implies that the system was over-subscribed: there were more processes needing CPU time than there was CPU time available.
If we had more than one CPU, say a quad-CPU system, we would divide the load average numbers by the number of CPUs. In this case, for example, seeing a 1 minute load average of 4.00 implies that the system as a whole was 100% (4.00/4) utilized during the last minute.
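To perform this normalization yourself, you can count the CPUs with nproc; the load values below are illustrative:
$ nproc
4
$ uptime
 10:35:01 up 10 days,  3:17,  2 users,  load average: 4.00, 3.50, 2.25
Here 4.00 / 4 CPUs = 1.00, i.e. the system was fully utilized over the last minute.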
Short-term increases are usually not a problem; a single high peak is likely a burst of activity, not a new sustained level. For example, at start-up, many processes start and then activity settles down. If a high value persists in the 5- and 15-minute load averages, it may be cause for concern.
Background and Foreground Processes
Linux supports background and foreground job processing. (A job in this context is just a command launched from a terminal window.) Foreground jobs run directly from the shell, and when one foreground job is running, other jobs need to wait for shell access (at least in that terminal window if using the GUI) until it is completed. This is fine when jobs complete quickly. But this can have an adverse effect if the current job is going to take a long time (even several hours) to complete.
In such cases, you can run the job in the background and free the shell for other tasks. The background job will be executed at lower priority, which, in turn, will allow smooth execution of the interactive tasks, and you can type other commands in the terminal window while the background job is running. By default all jobs are executed in the foreground. You can put a job in the background by suffixing & to the command, for example: updatedb &
You can use CTRL-Z to suspend a foreground job or CTRL-C to terminate it, and you can always use the bg and fg commands to move a job to the background or foreground, respectively.
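A minimal illustration of these commands (updatedb is used only as an example of a long-running job):
$ updatedb &        # start the job in the background
$ fg %1             # bring job 1 to the foreground
(press CTRL-Z to suspend it)
$ bg %1             # resume it in the background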
Managing Jobs
The jobs utility displays all jobs running in the background. The display shows the job ID, state, and command name, as shown below.
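Illustrative output (job numbers and commands will vary):
$ jobs
[1]-  Running                 updatedb &
[2]+  Running                 sleep 1000 &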
jobs -l provides the same information as jobs, plus the PID of each background job.
The background jobs are connected to the terminal window, so if you log off, the jobs utility will not show the ones started from that window.
Scheduling Future Processes using at
Suppose you need to perform a task on a specific day sometime in the future. However, you know you will be away from the machine on that day. How will you perform the task? You can use the at utility program to execute any non-interactive command at a specified time, as illustrated in the example below.
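A typical interactive session looks like this (the job number and date in the confirmation line are illustrative):
$ at now + 2 days
at> cat file1.txt
at> (press CTRL-D to finish entering commands)
job 1 at Wed Apr  9 14:00:00 2025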
cron
cron is a time-based scheduling utility program. It can launch routine background jobs at specific times and/or days on an on-going basis. cron is driven by a configuration file called /etc/crontab (cron table) which contains the various shell commands that need to be run at the properly scheduled times. There are both system-wide crontab files and individual user-based ones. Each line of a crontab file represents a job, and is composed of a so-called CRON expression, followed by a shell command to execute.
The crontab -e command will open the crontab editor to edit existing jobs or to create new jobs. Each line of the crontab file contains six fields: minute (0-59), hour (0-23), day of month (1-31), month (1-12), day of week (0-6, with 0 being Sunday), and the command to execute.
Examples:
1. The entry "* * * * * /usr/local/bin/execute/this/script.sh" will schedule a job to execute script.sh every minute of every hour, every day of the month, every month, and every day of the week.
2. The entry "30 08 10 06 * /home/sysadmin/full-backup" will schedule a full backup at 8:30 a.m. on June 10, irrespective of the day of the week.
sleep
Sometimes a command or job must be delayed or suspended. Suppose, for example, an application has read and processed the contents of a data file and then needs to save a report on a backup system. If the backup system is currently busy or unavailable, the application can be made to sleep (wait) until it can complete its work. Such a delay might give time to mount the backup device and prepare it for writing.
sleep suspends execution for at least the specified period of time, which can be given as the number of seconds (the default), minutes, hours or days. After that time has passed (or an interrupting signal has been received) execution will resume.
Syntax:
sleep NUMBER[SUFFIX]...
where SUFFIX may be:
1. s for seconds (the default)
2. m for minutes
3. h for hours
4. d for days
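For example:
$ sleep 30                        # pause for 30 seconds
$ sleep 5m                        # pause for 5 minutes
$ sleep 1h; echo "an hour has passed"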
sleep and at are quite different; sleep delays execution for a specific period while at starts execution at a later time.
Labs
Lab 1: Background and Foreground Jobs
Create a job that writes the date to an output file three times, with gaps of 60 seconds and 180 seconds. Check whether the job is running and bring it to the foreground. Stop the foreground job and make it run in the background again. Finally, kill the background job and verify its status.
Step-1: [test1@localhost ~]$ (date; sleep 60; date; sleep 180; date) > date.out &
Step-2: [test1@localhost ~]$ cat date.out
Step-3: [test1@localhost ~]$ jobs
Step-4: [test1@localhost ~]$ fg %1
(press CTRL-Z to suspend the foreground job)
Step-5: [test1@localhost ~]$ bg %1
Step-6: [test1@localhost ~]$ jobs
Step-7: [test1@localhost ~]$ kill %1
Step-8: [test1@localhost ~]$ jobs
Lab 2: Scheduling a One-Time Backup
Create a job using at to back up files in one directory to another 10 minutes from now.
$ at now + 10 minutes
at> cp dir1/* dir2/
at> CTRL-D
Lab 3: Scheduling Repeated Backups
Set up a cron job to back up the files in one directory to another every day at 10 am. Put the commands in a file called mycron.
$ nano mycron
0 10 * * * /root/bin/mybackup
CTRL-X
$ nano bin/mybackup
cp dir1/* dir2/
CTRL-X
$ chmod +x bin/mybackup
$ crontab mycron
$ crontab -l
Lab 4: Using the sleep Command
Create a small script or use the command line to create a reminder that a meeting is starting in 15 minutes. The reminder should appear 10 minutes from now.
$ sleep 600; echo "The meeting is starting in 15 minutes"
Common Applications
Internet Applications
The Internet is a global network that allows users around the world to perform multiple tasks such as searching for data, communicating through emails and online shopping. Obviously, you need to use network-aware applications to take advantage of the Internet. These include:
Web browsers
Email clients
Online media applications
Other applications
Web Browsers
As discussed in the earlier chapter on Network Operations, Linux offers a wide variety of web browsers, both graphical and text based, including:
Firefox
Google Chrome
Chromium
Epiphany
Konqueror
w3m
lynx
Email Applications
Email applications allow for sending, receiving, and reading messages over the Internet. Linux systems offer a wide variety of email clients, both graphical and text-based. In addition, many users simply use their browsers to access their email accounts.
Most email clients use the Internet Message Access Protocol (IMAP) or the older Post Office Protocol (POP) to access emails stored on a remote mail server. Most email applications can also display HTML (HyperText Markup Language) formatted emails, which can contain objects such as pictures and hyperlinks. Advanced email applications can also import address books/contact lists, configuration information, and emails from other email applications.
Linux supports the following types of email applications:
Graphical email clients, such as Thunderbird (produced by Mozilla), Evolution, and Claws Mail
Text mode email clients such as mutt and mail
Other Internet Applications
Linux systems provide many other applications for performing Internet-related tasks.
Productivity and Development Applications
Office Applications
Most day-to-day computer systems have productivity applications (sometimes called office suites) available or installed. Each suite is a collection of closely coupled programs used to create and edit different kinds of files, such as:
Text (articles, books, reports etc.)
Spreadsheet
Presentation
Graphical objects
Most Linux distributions offer LibreOffice, an open source office suite that started in 2010 as a fork of OpenOffice.org. While other office suites are available, LibreOffice is the most mature, widely used, and actively developed.
The component applications included in LibreOffice are Writer, Calc, Impress, and Draw
Development Applications
Linux distributions come with a complete set of applications and tools that are needed by those developing or maintaining both user applications and the kernel itself.
These tools are tightly integrated and include:
Advanced editors customized for programmers' needs, such as vi and emacs.
Compilers (such as gcc for programs in C and C++) for nearly every programming language.
Debuggers such as gdb and various graphical front ends to it and many other debugging tools (such as valgrind).
Performance measuring and monitoring programs, some with easy to use graphical interfaces, others more arcane and meant to be used only by serious experienced development engineers.
Complete Integrated Development Environments (IDEs), such as Eclipse, that put all of these tools together.
On other operating systems these tools have to be obtained and installed separately, often at a high cost, while on Linux they are all available at no cost through standard package installation systems.
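As a minimal sketch of this toolchain in action, assuming a C source file named hello.c:
$ gcc -g -o hello hello.c    # compile with debugging information
$ gdb ./hello                # step through it in the debugger
$ valgrind ./hello           # check it for memory errors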
Movie Players
Movie (video) players can portray input from many different sources, either local to the machine or on the Internet.
Linux systems offer a number of movie players including:
VLC
MPlayer
Xine
Totem
Movie Editors
Movie editors are used to edit videos or movies. Linux systems offer a number of movie editors including:
Kino
Cinepaint
Blender
Cinelerra
FFmpeg
GIMP (GNU Image Manipulation Program)
Graphic editors allow you to create, edit, view, and organize images of various formats like Joint Photographic Experts Group (JPEG or JPG), Portable Network Graphics (PNG), Graphics Interchange Format (GIF), and Tagged Image File Format (TIFF).
GIMP (GNU Image Manipulation Program) is a feature-rich image retouching and editing tool similar to Adobe Photoshop and is available on all Linux distributions. Some features of the GIMP are:
It can handle any image file format.
It has many special purpose plugins and filters.
It provides extensive information about the image, such as layers, channels, and histograms.