What is Bash?
Bash is currently the most common shell on Linux and other UNIX flavors. You’ll find it almost everywhere, from servers and desktop computers to smartphones and home appliances.
What is a shell?
A shell is an interface between the user and the operating system kernel. It enables high-level access to the computer’s resources, such as:
- CPU
- Memory
- Processes
- Devices
- File systems
A shell can be either text-based, like Bash, Korn shell, C shell, and PowerShell, or graphical (e.g. Windows Shell).
Shell scripts
Apart from regular, manual interaction with the operating system’s resources, shells often provide a way to automate complex or frequently performed tasks. In the case of text-based shells, such automation takes the form of sequences of shell commands called scripts.
Scripts are used in many places and for various purposes: system administration, software test automation, and CI/CD build pipelines.
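To make the idea of a script concrete, here is a minimal sketch of one. The task (counting regular files in a directory) and the names `target_dir` and `file_count` are hypothetical and chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical task: report how many regular files a directory contains.

# Directory to inspect; defaults to the current one when no argument is given.
target_dir=${1:-.}

# find lists the regular files directly inside the directory, wc -l counts them.
file_count=$(find "$target_dir" -maxdepth 1 -type f | wc -l)

echo "$target_dir contains $file_count regular file(s)"
```

Saved to a file and made executable, these few commands can be rerun with a single invocation instead of being typed by hand each time.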
Speed of shell scripts
Shell scripts are usually short and written by a single person to perform a very specific task. They are executed only from time to time, so it often doesn’t matter whether they run for 5 or 10 seconds.
However, sometimes they can grow into very sophisticated tools, or become a part of a bigger system, in which they are called hundreds or even thousands of times per day (e.g. a CI pipeline for a big monolithic system with dozens of teams developing it every day). Saving a second or two on a single invocation can save a lot of time and money on a larger scale.
Even if the script is small and standalone, we can still gain from performance improvements if we need to run it on a large data set (one very big file or many smaller files and directories). Small savings on a single iteration can sometimes reduce the execution time from hours to minutes.
Measurement and analysis
Now that we know why we should care about the speed of our scripts, we need some tools to tell us how fast they actually run. In subsequent experiments, we will use the following tools to measure the speed and then to understand the reasons for the results we got.
time
- Measures the time spent on the execution of a program
  - real - wall clock time, from start to end
  - user - time spent in the user space
  - sys - time spent in the kernel space (on system calls)
time [-p] command [arguments]
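As a rough illustration of the synopsis above, the sketch below times a stand-in workload and captures the report. The `workload` function and the `sleep 0.2` inside it are assumptions made for the example, not commands from the experiments. In Bash, `time` is a shell keyword and writes its report to standard error, which is why the redirection is needed.

```shell
#!/usr/bin/env bash
# Hypothetical workload: sleep stands in for the command being measured.
workload() {
    sleep 0.2
}

# The time keyword reports on stderr. Grouping the timed command in braces and
# redirecting stderr lets us capture the report in a variable.
# -p selects the portable POSIX output format: one "name value" pair per line.
report=$( { time -p workload; } 2>&1 )

printf '%s\n' "$report"
```

Because `sleep` mostly waits, the real value should be noticeably larger than user and sys here. A CPU-bound command would typically show the opposite pattern.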
for
- Executes a command a given number of times (to increase the scale)
for ((i=0; i<number_of_iterations; i++)); do
    command [arguments]
done
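The loop can be combined with `time` so that the whole run is measured at once. The sketch below assumes 1000 iterations and uses a redirected `echo` as a stand-in for the command under test. Both are placeholders.

```shell
#!/usr/bin/env bash
# Assumed iteration count; pick a value large enough for the total to be measurable.
number_of_iterations=1000

# time accepts a compound command, so the report covers the entire loop
# rather than a single iteration.
time for ((i = 0; i < number_of_iterations; i++)); do
    echo "iteration $i" > /dev/null   # stand-in for the command being measured
done
```

Dividing the reported total by the number of iterations gives an average cost per call, which tends to be more stable than timing one very short run.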
strace
- Tracks system calls and signals of a given program or process
strace [options] command [arguments]
strace [options] -p pid
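One possible way to use the first form is sketched below. The traced one-liner is a hypothetical target. The options used are `-f` (follow child processes), `-c` (print a per-system-call summary instead of every call) and `-o` (write the output to a file). The sketch also assumes that `strace` may be missing or that tracing may be restricted, as it sometimes is inside containers.

```shell
#!/usr/bin/env bash
# Hypothetical target: a tiny Bash one-liner that starts one external process.
target=(bash -c 'ls / > /dev/null')

if command -v strace > /dev/null 2>&1; then
    summary_file=$(mktemp)
    # -f follows child processes, -c produces a summary table,
    # -o sends strace's own output to a file instead of stderr.
    if strace -f -c -o "$summary_file" "${target[@]}"; then
        cat "$summary_file"
        strace_status=traced
    else
        echo "strace could not trace the command (ptrace may be restricted)" >&2
        strace_status=failed
    fi
    rm -f "$summary_file"
else
    echo "strace is not installed" >&2
    strace_status=missing
fi
```

The summary table lists each system call with its call count and the time spent in it, which helps explain a high sys value reported by `time`.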
pmap
- Reports memory map of a process
pmap [options] pid
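A minimal sketch of the synopsis in use: the script inspects its own shell process, whose PID Bash exposes as `$$`. The variable name `pid` and the choice to show only the last few lines are assumptions made for brevity.

```shell
#!/usr/bin/env bash
# $$ expands to the PID of the current shell, so the script examines itself.
pid=$$

if command -v pmap > /dev/null 2>&1; then
    # -x selects the extended format, which adds resident set size (RSS)
    # and dirty-page columns. tail keeps only the end of the report,
    # including the totals line.
    pmap -x "$pid" | tail -n 5
else
    echo "pmap is not installed" >&2
fi
```

For a long-running script, the same command can be run from another terminal with that script's PID to see how its memory use changes over time.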