Dealing With Text

Updated: 2018-11-30

Replace characters

$ cat foo.txt | tr "," "_" > bar.txt

For an unprintable character, e.g. \u0007 (BEL), press ctrl-V ctrl-G to type it literally on the command line.
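
ctrl-V ctrl-G works interactively; in a bash script the same character can be written as $'\a'. A small sketch with made-up data:

```shell
# Replace the BEL character (\u0007) with an underscore.
# $'\a' is bash syntax for the literal control character.
printf 'a\ab\ac\n' | tr $'\a' '_'
# prints a_b_c
```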

Change Everything to Uppercase

$ cat foo.txt | tr "[a-z]" "[A-Z]"
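
A quick check:

```shell
echo "hello world" | tr "[a-z]" "[A-Z]"
# prints HELLO WORLD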

Count Rows

$ cat foo.txt | wc -l

Count Columns

If the delimiter is ,

$ cat foo.txt | awk -F, '{print NF}'

or

$ awk 'BEGIN {FS=","} {print NF}' file.txt

where

  • NF = Number of Fields
  • FS = Field Separator
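
For example, a three-column row gives NF = 3:

```shell
echo "a,b,c" | awk -F, '{print NF}'
# prints 3
```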

If the delimiter is \u0007 (enter it with ctrl-V ctrl-G)

$ cat foo.txt | awk -F'^G' '{print NF}'
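
In bash, the same field separator can be written as $'\a' instead of typing the literal control character (sample data made up):

```shell
# Three fields separated by the BEL character.
printf 'a\ab\ac\n' | awk -F$'\a' '{print NF}'
# prints 3
```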

Get Column Number

Replace <pattern> with the column name or pattern; set RS (the record separator) to the file's delimiter, e.g. , for a CSV:

$ head -1 foo.csv | awk -v RS="," '/<pattern>/{print NR;}'
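
For example, with a hypothetical comma-delimited header id,name,age, the column number of name is:

```shell
# Each comma-separated field becomes its own record, so NR is the column number.
printf 'id,name,age\n' | awk -v RS=',' '/name/{print NR}'
# prints 2
```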

Print Specific Rows

Print the second row:

$ awk 'NR==2' filename

Print lines 2 to 10:

$ awk 'NR==2,NR==10' filename
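
For example, lines 2 through 4 of a five-line input:

```shell
seq 5 | awk 'NR==2,NR==4'
# prints:
# 2
# 3
# 4
```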

Match

Show ssh processes

$ ps aux | grep ssh

Limit the number of matches with -m:

$ cat foo.log | grep -m 10 ERROR
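
grep stops reading after the given number of matching lines. For example, the first three lines containing a 1:

```shell
seq 100 | grep -m 3 1
# prints:
# 1
# 10
# 11
```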

Add One Line To The Beginning

$ sed -i '1s/^/line to insert\n/' /path/to/file
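
A quick way to sanity-check the substitution without editing a file, via a pipe (note: this assumes GNU sed; BSD/macOS sed needs -i '' for in-place edits and handles \n in the replacement differently):

```shell
# Prepend a line to a two-line input.
printf 'b\nc\n' | sed '1s/^/a\n/'
# prints:
# a
# b
# c
```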

Remove 1st line in place

$ sed -i 1d filename

Extract a Column

awk

$ cat file | awk '{print $2}'

cut

$ echo 'a b c' | cut -d ' ' -f1
a

$ echo 'a b c' | cut -d ' ' -f2
b

Add - to list everything to the right

$ echo 'a b c' | cut -d ' ' -f2-
b c

Add a comma to the end of each line

$ cat foo.txt | sed 's/$/,/'
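
For example:

```shell
printf 'one\ntwo\n' | sed 's/$/,/'
# prints:
# one,
# two,
```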

split

Split data into chunks

Split by number of lines: split myfile into chunks of 500 lines each, with output files named with the prefix segment_, i.e. segment_aa, segment_ab, segment_ac...

$ split -l 500 myfile segment_

Split by size: split myfile into 40 KB chunks

$ split -b 40k myfile segment_
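
A small end-to-end sketch in a temporary directory (file names and contents are made up): 1000 lines split into two 500-line chunks.

```shell
# Work in a fresh temp directory so nothing is overwritten.
cd "$(mktemp -d)"
seq 1000 > myfile
split -l 500 myfile segment_
ls segment_*
# prints segment_aa and segment_ab
wc -l < segment_aa
# prints 500
```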

sort

  • -k: (key) column number to sort by
  • -t: field delimiter
  • -n: numeric sort
  • -r: reverse (descending) order

$ cat file | sort -nr -t \| -k 2 | head
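
For example, sorting pipe-delimited rows by the numeric second field, descending (sample data made up):

```shell
printf 'a|3\nb|10\nc|1\n' | sort -t '|' -k2 -nr
# prints:
# b|10
# a|3
# c|1
```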

Pretty print JSON

$ cat data.json | python -m json.tool
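
For example, on systems where Python 3 is installed as python3:

```shell
echo '{"a": 1, "b": [2, 3]}' | python3 -m json.tool
# prints:
# {
#     "a": 1,
#     "b": [
#         2,
#         3
#     ]
# }
```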

Append Multiple Lines To File

Use here-document syntax: << "EOF" means the multi-line text ends at the line EOF, and everything in between is appended to the file. Quoting "EOF" also prevents variable expansion inside the text.

cat >> path/to/file/to/append-to.txt << "EOF"
Some text here
And some text here
EOF