Cheatsheet - Shell Commands

Getting help

--help

Most commands have built-in help info, usually with the --help or -h option, e.g. $ ls --help will give you more info about the command ls.

man

# Display manual of the command.
$ man $COMMAND

# Specify section number.
$ man 2 mount # the system call
$ man 8 mount # the admin command

# Display the manual of man itself. Section info is included.
$ man man

# Search the man pages
$ man -k mount

# List all sections that have the term.
$ man -f mount
mount (8)            - mount a filesystem
mount (2)            - mount filesystem
...
  • Section 1: user commands
  • Section 2: system calls
  • Section 3: library functions
  • Section 4: special files
  • Section 5: file formats
  • Section 6: games
  • Section 7: conventions and miscellany
  • Section 8: administration and privileged commands
  • Section L: math library functions
  • Section N: tcl functions

Current Working Directory

ls lists the CWD if given no other parameters.

To check the CWD, these two are equivalent:

$ pwd # Print Working Directory
$ echo $PWD

More or Less

  • more: can only go forward, bash autocompletion uses more
  • less: can go backward, search, and more; almost everything else uses less

dd

The dd Unix utility program reads octet streams from a source to a destination, possibly performing data conversions in the process.

Create File With Zeroes

Creating a 1 MiB file, called foobar, filled with zeroes:

dd if=/dev/zero of=foobar count=1024 bs=1024
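
The size arithmetic (1024 blocks of 1024 bytes) can be checked by creating the file in a scratch directory and counting its bytes. This is a minimal sketch; tmpdir and size are illustrative variable names, not part of dd.

```shell
# Work in a throwaway directory so nothing in the CWD is overwritten.
tmpdir=$(mktemp -d)

# 1024 blocks x 1024 bytes = 1 MiB; dd prints its statistics on stderr.
dd if=/dev/zero of="$tmpdir/foobar" count=1024 bs=1024 2>/dev/null

# wc -c reports the size in bytes (tr strips padding some wc versions add).
size=$(wc -c < "$tmpdir/foobar" | tr -d ' ')
echo "$size"    # 1048576

rm -r "$tmpdir"
```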

Note: With GNU dd, the block size accepts unit suffixes: MB and GB are decimal (SI) units, while M and G are binary (MiB, GiB). To create a 1 GB file:

dd if=/dev/zero of=foobar count=1 bs=1GB

Test IO Performance

$ dd if=/dev/zero of=test bs=64k count=16k conv=fdatasync
16384+0 records in
16384+0 records out
1073741824 bytes (1.1 GB) copied, 3.94954 s, 272 MB/s

Read operations from /dev/zero return as many null characters (0x00) as requested in the read operation.

Unlike /dev/null, /dev/zero may be used as a source, not only as a sink for data. For example, to overwrite a partition with zeroes (this destroys its contents):

dd if=/dev/zero of=/dev/<destination partition>

diff

diff
sdiff
vimdiff
colordiff

env

Filter:

$ env | grep HOME
$ env | grep -E 'HOME|USER|VERSION|SHELL|PWD'

ldd

ldd is a powerful command-line tool that allows users to view an executable file's shared object dependencies.

ldd = List Dynamic Dependencies

$ ldd ./my-program
not a dynamic executable
$ ldd /usr/bin/gzip
linux-vdso.so.1 =>  (0x00007fff39fff000)
    libc.so.6 => /lib64/libc.so.6 (0x00007fb842afa000)
/lib64/ld-linux-x86-64.so.2 (0x00007fb842ea0000)
$ ldd /usr/bin/ssh
linux-vdso.so.1 =>  (0x00007fffd0164000)
libfipscheck.so.1 => /lib64/libfipscheck.so.1 (0x00007fb62de63000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007fb62dc43000)
libcrypto.so.10 => /usr/lib64/libcrypto.so.10 (0x00007fb62d8a9000)
...

If a dependency is reported as "not found", add its directory to LD_LIBRARY_PATH so the application can run successfully.

type

Shell Builtin

Shell built-in commands:

$ type pwd cd
pwd is a shell builtin
cd is a shell builtin

type itself is also a shell builtin:

$ type type
type is a shell builtin

Shell Keyword

Shell keywords are used in shell scripts

$ type if fi
if is a shell keyword
fi is a shell keyword

Alias

If you have an alias defined in .bash_profile or .bashrc, for example:

alias ls="ls -G"

where -G enables colorized output (BSD/macOS ls), then ls is reported as an alias:

$ type ls
ls is aliased to `ls -G'

Executable

If you have docker installed, you will see something like this:

$ type docker
docker is /usr/local/bin/docker

Show all python executables

$ type -a python

xargs

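Similar to map: runs a command on each item in the input list.

e.g. kill all ssh connections ([s]sh keeps grep from matching itself, and awk handles the padded PID column that cut -d ' ' would miss):

$ ps -ax | grep '[s]sh' | awk '{print $1}' | xargs kill -9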
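
To see the map-like behaviour without killing anything, here is a harmless sketch that uses echo as the command. The item names are made up for illustration.

```shell
# Each input line becomes one argument; -n 1 runs the command once per item.
result=$(printf '%s\n' alpha beta gamma | xargs -n 1 echo item:)
printf '%s\n' "$result"
# item: alpha
# item: beta
# item: gamma
```

Without -n 1, xargs would pass all three items to a single echo invocation.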

tee

Reads the standard input and writes it to both the standard output and one or more files.

$ echo "test" | tee foo.txt
test
$ cat foo.txt
test
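
By default tee overwrites the target file; the -a option appends instead. A small sketch, using a temporary file as a stand-in for foo.txt:

```shell
tmpfile=$(mktemp)

echo "first"  | tee "$tmpfile"    > /dev/null   # overwrite the file
echo "second" | tee -a "$tmpfile" > /dev/null   # append to the file

contents=$(cat "$tmpfile")
printf '%s\n' "$contents"
# first
# second

rm -f "$tmpfile"
```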

Locate a command

Use type or command to locate exact path for the env command:

$ type env
env is hashed (/usr/bin/env)

$ command -V env
env is hashed (/usr/bin/env)

which vs whereis

which: based on PATH

$ which hadoop
/Users/myname/lib/hadoop

whereis: based on standard binary directories

$ whereis hadoop
hadoop: /usr/bin/hadoop /etc/hadoop /usr/lib/hadoop /usr/share/man/man1/hadoop.1.gz

Check If Command Exists

$ command -v foo >/dev/null 2>&1
$ type foo >/dev/null 2>&1
$ hash foo 2>/dev/null

Example: add HADOOP_CLASSPATH if hadoop exists

command -v hadoop > /dev/null && {
    HADOOP_CLASSPATH=$(hadoop classpath)
    CLASSPATH=$HADOOP_CLASSPATH:$CLASSPATH
}
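
The check can be wrapped in a small function so scripts read more naturally. This is a sketch only: have_cmd is a hypothetical helper name, and no-such-command-xyz is assumed not to exist on the system.

```shell
# Hypothetical helper: succeeds if the named command can be resolved.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd ls; then
    echo "ls: found"
fi

if ! have_cmd no-such-command-xyz; then
    echo "no-such-command-xyz: missing"
fi
```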

TTY

TTY: teletype, now refers to any device that opens a physical or virtual terminal session.

Serial Port Terminals

Each serial port is considered to be a "device", e.g. /dev/ttyS0 on Linux.

Pseudo Terminals

Pairs of devices such as /dev/ptyp3 and /dev/ttyp3; no physical device directly associated with either.

Controlling Terminal

/dev/tty

SSH to a Linux server (Ubuntu)

$ tty
/dev/pts/1

On a Mac

$ tty
/dev/ttys001

Shortcut

Change to different tty terminals: Ctrl + Alt + F1-F6

cd

cd to symlinked folder

If /bin is a symlink to /usr/bin, cd -P /bin resolves the link and puts you in /usr/bin.

tree

Print the directory tree

$ tree -dlL 2
  • -d : show only the directories
  • -l : follow symbolic links
  • -L 2 : descend at most two levels of directories

Combine multiple commands

Change file extension (e.g. from .md to .mdx):

$ find . -type f -name "*.md" -exec rename 's/\.md$/.mdx/' '{}' \;

How to download multiple packages?

Create a downloads.txt file that looks like this, one URL per line:

https://github.com/containerd/containerd/releases/download/v1.7.23/containerd-1.7.23-linux-amd64.tar.gz
https://github.com/etcd-io/etcd/releases/download/v3.4.27/etcd-v3.4.27-linux-amd64.tar.gz

Then use wget:

$ wget -q --show-progress \
  --https-only \
  --timestamping \
  -P downloads \
  -i downloads.txt

Text Processing

Useful tools:

  • grep, egrep and fgrep: match patterns (egrep and fgrep are deprecated spellings of grep -E and grep -F).
  • sed (stream editor) is for programmatically editing files line by line.
  • awk is for text processing, especially useful for table-like text files such as CSV.

Replace characters

$ cat foo.txt | tr "," "_" > bar.txt

For an unprintable character, e.g. \u0007, press Ctrl-V Ctrl-G.

Delete character

Use tr -d, e.g. to remove every o:

$ echo "Hello World" | tr -d o
Hell Wrld

Change Everything to Uppercase

$ cat foo.txt | tr 'a-z' 'A-Z'

Count Rows

$ cat foo.txt | wc -l

Extract a Column

$ echo 'a b c' | cut -d ' ' -f1
a

$ echo 'a b c' | cut -d ' ' -f2
b

Add - to list everything to the right

$ echo 'a b c' | cut -d ' ' -f2-
b c
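
Fields can also be picked as a comma-separated list, and the delimiter can be any single character. A sketch with a made-up CSV line:

```shell
line='alice,30,engineer'

# -d ',' splits on commas; -f 1,3 keeps the first and third fields.
picked=$(printf '%s\n' "$line" | cut -d ',' -f 1,3)
printf '%s\n' "$picked"    # alice,engineer
```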

split

Split data into chunks

Split by number of lines: split myfile, each chunk has 500 lines, prefixed by segment_, i.e. segment_aa, segment_ab, segment_ac...

$ split -l 500 myfile segment_

Split by size: split myfile, each chunk is 40k

$ split -b 40k myfile segment_
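
A quick way to convince yourself that splitting is lossless is a round trip: split a file, then concatenate the chunks and compare. This sketch assumes seq is available; the 1200-line sample file is made up.

```shell
tmpdir=$(mktemp -d)

# Generate a 1200-line sample file.
seq 1 1200 > "$tmpdir/myfile"

# 500 lines per chunk: segment_aa, segment_ab, segment_ac.
split -l 500 "$tmpdir/myfile" "$tmpdir/segment_"
chunks=$(ls "$tmpdir"/segment_* | wc -l | tr -d ' ')
echo "$chunks"    # 3

# Concatenating the chunks in glob order restores the original file.
if cat "$tmpdir"/segment_* | cmp -s - "$tmpdir/myfile"; then
    restored=yes
else
    restored=no
fi
echo "$restored"  # yes

rm -r "$tmpdir"
```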

sort

  • -k: (key) column number
  • -t: delimiter

$ cat file | sort -nr -t \| -k 2 | head
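
The same options applied to a tiny inline data set make the effect easier to see. The pipe-delimited name|score rows are invented for illustration.

```shell
# Hypothetical pipe-delimited data: name|score
data='alice|30
bob|120
carol|7'

# -t '|' sets the delimiter, -k 2 sorts on the second field,
# -n compares numerically, -r puts the largest value first.
top=$(printf '%s\n' "$data" | sort -nr -t '|' -k 2 | head -n 1)
printf '%s\n' "$top"    # bob|120
```

Without -n the comparison would be lexical, and 7 would sort above 120.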

dmesg

  • dmesg: show the kernel ring buffer, including logs from systemd and systemd-journald
  • dmesg -T: human readable timestamp

Does it survive a reboot?

No. Because it lives in memory, the raw dmesg buffer is volatile, meaning it is completely wiped out when the power is cut or the system reboots.

What happens if the buffer is full?

The buffer is a ring: once it is full, the newest messages overwrite the oldest ones. If the buffer has already wrapped around and you need to see those missing early boot messages, check these persistent sources:

  • /var/log/dmesg: A snapshot of the buffer taken immediately after boot.
  • /var/log/kern.log or /var/log/syslog: Files where a logging daemon (like rsyslog) saves kernel messages permanently.
  • journalctl -k: On systems with systemd, this command often has a much larger, persistent history of kernel logs than the raw dmesg buffer.

How to check the size of the buffer

Grep your boot configuration: grep CONFIG_LOG_BUF_SHIFT /boot/config-$(uname -r). The size is 2^SHIFT bytes. (dmesg -S does not report the size; it makes dmesg read through the syslog() interface instead of /dev/kmsg.)

E.g. if CONFIG_LOG_BUF_SHIFT is 17, the ring buffer size is 2^17 bytes = 128 KiB.

While that sounds tiny compared to modern RAM, it is a very common default size for the kernel ring buffer.

In a healthy system, the systemd-journald or rsyslog service reads these messages almost instantly and saves them to the disk, so the 128 KB buffer only needs to act as a "waiting room."
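
The size calculation can be reproduced with shell arithmetic. In this sketch shift_bits is a stand-in for whatever value CONFIG_LOG_BUF_SHIFT has on a given system; 17 is just the example used above.

```shell
shift_bits=17                          # example value, not read from the system
size_bytes=$((1 << shift_bits))        # 2^17
size_kib=$((size_bytes / 1024))
echo "$size_bytes bytes = $size_kib KiB"    # 131072 bytes = 128 KiB
```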

How to write to the kernel ring buffer?

The primary way a developer sends a message to this buffer is by using the kernel function printk().

What about kmsg?

cat /dev/kmsg

The dmesg command is essentially a formatter. It reads that raw, ugly data from /dev/kmsg, cleans it up, makes the timestamps pretty, and prints it to your screen.

On some older systems (or if /dev/kmsg isn't available), dmesg uses a direct system call called syslog() (not to be confused with the syslogd service). This tells the kernel: "Copy the entire log buffer into this chunk of memory I've provided." The syslog() system call is still available in modern Linux kernels, but it is rarely used directly by modern applications.

FAQ

How to pretty print JSON?

$ cat data.json | python -m json.tool

How to add new line to the result?

Sometimes when decoding a base64 string, there's no newline at the end; the solution is to add && echo:

$ ... | base64 -d && echo

How to append multiple lines to a file?

Use Here Document syntax: <<EOF means the multi-line text ends at the line that contains only EOF.

$ cat <<EOF
> foo
> bar
> EOF
foo
bar

e.g. add multiple lines to a file, like ~/.bashrc:

$ cat <<EOF >> ~/.bashrc
export KUBECONFIG=/path/to/kubeconfig
export FOO=foo
export BAR=bar
EOF
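
One detail worth knowing when writing shell config this way: with an unquoted delimiter, variables inside the here document are expanded at write time, while a quoted delimiter ('EOF') keeps the text literal. A minimal sketch, with NAME as a made-up variable:

```shell
NAME=world

# Unquoted delimiter: $NAME is expanded.
expanded=$(cat <<EOF
hello $NAME
EOF
)

# Quoted delimiter: the text is taken literally.
literal=$(cat <<'EOF'
hello $NAME
EOF
)

printf '%s\n' "$expanded"   # hello world
printf '%s\n' "$literal"    # hello $NAME
```

Use the quoted form when the appended lines should contain a literal $VAR to be evaluated later, e.g. when ~/.bashrc is sourced.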

How to set and unset an env variable?

# set
$ export FOO="hello"

# unset
$ unset FOO

How to show history with timestamp?

Set HISTTIMEFORMAT, then run history: export HISTTIMEFORMAT="%F %T "

  • %F: show the date in YYYY-MM-DD format.
  • %T: show time in HH:MM:SS format.

How to check glibc version?

# Check ldd version.
$ ldd --version
ldd (Debian GLIBC 2.37-6+gl0) 2.37

# Run the libc that a binary such as ls links against; it prints its version.
$ $(ldd /bin/ls | grep 'libc\.so' | awk '{print $3}')
GNU C Library (Debian GLIBC 2.37-6+gl0) stable release version 2.37.

How to check file type?

Use file command:

$ file /dev/dm-0
/dev/dm-0: block special (253/0)

chroot

chroot: changes the root file system directory as seen by a job, so that one program cannot access files outside of its directory tree. (for isolation)

chroot is both a system call and a wrapper program.

Cowsay


$ brew install cowsay
$ cowsay hello

 _______
< hello >
 -------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Also try:

$ fortune | cowsay
$ fortune | cowsay -f tux
$ cowsay -l

Shortcuts

grep | wc -l -> grep -c
sort | uniq | wc -l -> sort -u | wc -l
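
A short sketch showing that the long and short forms give the same counts. The fruit list is made-up sample data.

```shell
data='apple
banana
apple
cherry'

# Count matching lines: grep | wc -l versus grep -c.
a=$(printf '%s\n' "$data" | grep apple | wc -l | tr -d ' ')
b=$(printf '%s\n' "$data" | grep -c apple)
echo "$a $b"    # 2 2

# Count distinct lines: sort | uniq versus sort -u.
c=$(printf '%s\n' "$data" | sort | uniq | wc -l | tr -d ' ')
d=$(printf '%s\n' "$data" | sort -u | wc -l | tr -d ' ')
echo "$c $d"    # 3 3
```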

List the largest files/directories

To find and print the 10 largest files (not directories) in a directory and its subdirectories (-printf requires GNU find):

$ find . -type f -printf '%s %p\n'|sort -nr|head

To restrict the search to the present directory use -maxdepth 1 with find.

$ find . -maxdepth 1 -type f -printf '%s %p\n'|sort -nr|head

Print the top 10 largest files and directories:

$ du -a . | sort -nr | head

e.g. check which log file/directory uses the most space:

$ du -sh /var/log/* | sort -rh |less

Global Search And Replacement

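# Find files by name.
$ find / -name game
# Replace "this" with "that" in all .txt files (BSD/macOS sed; with GNU sed use -i without the '').
$ find . -type f -name '*.txt' -exec sed -i '' 's/this/that/g' {} +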

Count lines in markdown files

find ../src -name '*.md' -exec wc -l {} \; | sort -n > ../_tmp/result.txt

Change -l to -w to count words.

What is the double dash?

-- signifies the end of the options.

A double dash (--) is used in most Bash built-in commands and many other commands to signify the end of command options, after which only positional ("non-option") arguments are accepted.

For example: after the --, -v is no longer considered an option for kubectl

$ kubectl exec -ti $POD_NAME -- nginx -v
nginx version: nginx/1.25.3
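
The same convention helps whenever an argument itself starts with a dash. As a sketch, searching a file for the literal string -v with grep (the temporary file and its contents are made up):

```shell
tmpfile=$(mktemp)
printf '%s\n' '-v' 'verbose' > "$tmpfile"

# Without --, grep would parse -v as its "invert match" option.
# With --, -v is treated as the pattern to search for.
match=$(grep -- -v "$tmpfile")
printf '%s\n' "$match"    # -v

rm -f "$tmpfile"
```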

What is a POSIX Shell?

The POSIX (Portable Operating System Interface) shell is the standard Unix shell: it is formally defined in a published standard, and it has many competing implementations on many different operating systems that are all compatible with one another.

The POSIX shell is basically the Bourne shell, and it lives at the standardized location /bin/sh.

The POSIX standard does not recognize long flags like grep --file=FILE, only short flags like grep -f FILE (because it defines only the getopt function, not getopt_long).
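
In shell scripts, the corresponding POSIX tool is the getopts builtin, which likewise handles only short options. A minimal sketch; parse_args and the -f/-v options are hypothetical names chosen for illustration.

```shell
# Hypothetical function that accepts -f FILE and -v.
parse_args() {
    file=""
    verbose=0
    OPTIND=1                      # reset so the function can be called again
    while getopts "f:v" opt; do   # "f:" means -f takes an argument
        case "$opt" in
            f) file=$OPTARG ;;
            v) verbose=1 ;;
            *) return 1 ;;        # unknown option
        esac
    done
}

parse_args -v -f patterns.txt
echo "file=$file verbose=$verbose"    # file=patterns.txt verbose=1
```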

Other Commands

  • sysctl - configure kernel parameters at runtime, not to be confused with systemctl.
  • timedatectl - to query and change the system's date, time, timezone, and network time synchronization settings. Replaces date and hwclock.