Manipulating

  1. Display some random text:

    $ echo "foo"
    

    Like a lot of commands, echo generates output that by default displays on the screen, a target called standard output.

    • standard output
  2. Put the output into a file, aka redirect it:

    $ echo "foo" > file.txt
    $ cat file.txt
    
    • redirect
  3. The foo text above clobbers whatever's in that file. Changing the > to >> appends it to whatever's already in the file. Run each of these commands and see what you get:

    $ echo "bar" >> file.txt
    $ echo "foo" >> file.txt
    $ cat file.txt
    
    • append
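
    If file.txt still held the single foo line from the previous step, the two appends leave it holding three lines, which cat displays in the order they were written:

    $ cat file.txt
    foo
    bar
    foo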
  4. Type a bunch of those echo commands to put a bunch of random words into the file, and make sure to repeat a few of them. Now sort them:

    $ sort file.txt
    

    That takes the file's contents as input, runs them through the sort command, and sends the results to standard output, the screen by default.
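
    Note that sort only writes the sorted result to standard output; the file itself is left untouched, which you can confirm by displaying it again:

    $ sort file.txt
    $ cat file.txt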

  5. That command is exactly equivalent to this pipe:

    $ cat file.txt | sort
    

    Like a redirect, a pipe diverts the output of the cat command, but in this case it sends it as input to another command rather than dumping it to a file.

    • pipe
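
    Pipes and redirects can also be combined on one command line; this pipes the file through sort, then redirects the sorted result into a new file:

    $ cat file.txt | sort > sorted.txt
    $ cat sorted.txt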
  6. Copy the file, then try to sort it and put the results back into the same file:

    $ cp file.txt fail.txt
    $ sort fail.txt > fail.txt
    $ cat fail.txt
    

    What went wrong? The file is now empty. In many programming environments, that second command would be interpreted as: read the file, sort the contents, then write the result to the file. The shell is actually a bit dumber, and sees the file-clobbering > as the very first and most important thing to do. So when it comes time to read the file, it's already empty. To make it work, you need to direct the output to a different file, then use that to clobber the original:

    $ sort file.txt > temp.txt
    $ mv temp.txt file.txt
    $ cat file.txt
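
    As an aside, sort's -o option writes the output to a named file, and unlike the shell redirect, sort reads all of its input before it opens that file, so it can safely sort a file back onto itself:

    $ sort -o file.txt file.txt
    $ cat file.txt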
    
  7. This sorts the file but filters out any repeated lines, displaying only the unique ones:

    $ sort -u file.txt
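
    This is equivalent to piping the sorted output through uniq, which collapses runs of repeated adjacent lines; that adjacency requirement is also why the sort has to come first:

    $ sort file.txt | uniq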
    
  8. This finds the five most popular words:

    $ sort file.txt | uniq -c | sort -rn | head -5
    

    It may help to slowly build this chain of commands from the left to the right to fully understand it. After the initial sort command, uniq -c counts sequences of repeated adjacent lines and prepends that number to a single collapsed line of output. The second sort uses -n to sort numerically rather than by string. (If you sort 1, 5, and 10 using default string sorting, 5 comes last because it sorts from the left one character at a time.) The sort's -r option reverses the order to display larger values on top, and the final head command displays only the first five lines.
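
    For example, you might run the chain one stage at a time, watching how each command transforms the previous output:

    $ sort file.txt
    $ sort file.txt | uniq -c
    $ sort file.txt | uniq -c | sort -rn
    $ sort file.txt | uniq -c | sort -rn | head -5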

  9. Here's how to send the results of any Unix command through email:

    $ cat file.txt | mail -s "subject line" [email protected]
    

    If you're in an administered work environment, you may be able to run lpr to send the text directly to your default printer:

    $ cat file.txt | lpr
    
  10. Command chains and redirects tend to flow from left to right, so this alternative file input redirect syntax can be a bit confusing:

    $ mail -s "subject" [email protected] < file.txt
    

    This may help clarify how it works, or not:

    $ sort < unsorted.txt > sorted.txt
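
    All three of these produce the same sorted.txt; they differ only in whether sort reads the file as an argument, from redirected standard input, or from a pipe:

    $ sort unsorted.txt > sorted.txt
    $ sort < unsorted.txt > sorted.txt
    $ cat unsorted.txt | sort > sorted.txt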
    
  11. Capture error logs. Commands often send out errors along with legitimate output, and this can appear all jumbled and confusing on the screen. You can redirect the standard error stream separately from standard output, using the same clobber/append distinction. These both send standard output to output.txt, and standard error to errlog.txt:

    $ some_command > output.txt 2>  errlog.txt
    $ some_command > output.txt 2>> errlog.txt
    

    A variation like this keeps standard output displaying on the screen, but removes any errors from view.

    $ some_command 2>  errlog.txt
    
    • standard error
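
    If you'd rather capture both streams in a single file, redirect standard error to wherever standard output is already going; the 2>&1 has to come after the first redirect:

    $ some_command > combined.txt 2>&1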
  12. Finally, dump a bunch of filenames into a list:

    $ echo file1.txt >> list
    $ echo file2.txt >> list
    $ echo file3.txt >> list
    $ echo file4.txt >> list
    $ echo file5.txt >> list
    $ echo file6.txt >> list
    $ echo file7.txt >> list
    $ echo file8.txt >> list
    $ echo file9.txt >> list
    

    This runs the cat command, takes all the lines of its output, and passes them as arguments to the touch command, creating nine empty files. The technique is called command interpolation (you'll also see it called command substitution):

    $ touch `cat list`
    
    • interpolation
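
    The backticks are the traditional syntax; modern shells also accept the equivalent $( ) form, which is easier to read and to nest:

    $ touch $(cat list)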
  13. Here's how you might use this in real life to build a complex command. Suppose you have a directory tree full of JavaScript files, and you want to search recursively for the various @-prefixed tags used in documentation comments, while making sure not to include non-JavaScript files. The find command generates a full recursive list of files, including directories:

    $ find . -print
    

    The find command is pretty unusual. Unlike most Unix utilities, it traditionally had to be told to produce output with the -print option; modern versions print by default, but spelling it out does no harm.

    This narrows the output to JavaScript files ending with js. (The $ in the regular expression means at the end of the line.)

    $ find . -print | grep "js$"
    

    Search each one of those files for the pattern of interest, an @ followed by a lowercase letter. (The -h hides filenames from the output.)

    $ grep -h '@[a-z]' `find . -print | grep "js$"`
    

    We may want to see these items clumped together, to make sure they're handled consistently. This sorts and pages the output:

    $ grep -h '@[a-z]' `find . -print | grep "js$"` | sort -u | more
    

    You may see output such as @description presents a window that..., so you can pipe to another search for descriptions that don't start with an uppercase letter. (The character class works much the same way in the regexp as in shell wildcards.)

    $ grep -h '@[a-z]' `find . -print | grep "js$"` | sort -u | grep "@description [^A-Z]"
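
    A rough equivalent uses find's own -name test to do the filtering and xargs to hand the file list to grep, though like the backtick version it assumes the filenames contain no spaces:

    $ find . -name '*.js' | xargs grep -h '@[a-z]' | sort -u | more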