You know the scenario. You're trying to fix a problem in Linux or macOS. You have 40 Stack Overflow tabs open. So far, everything has failed, and you're psyching yourself up for your last-ditch, Hail Mary technique: update the OS to the latest release and hope that somehow fixes the problem. Not so fast! You forgot the add-to-PATH step in browser tab 1! If you had ever bothered to learn anything about the Unix shell, you would have realized that was the problem hours ago!
One of the most useful and most ignored tools in data science is the humble Unix shell. It is the glue that allows all of our tools to work together, yet most data scientists know little more about it than a handful of commands they have picked up along the way. If you take the small amount of time needed to become really proficient with the Unix shell, you can boost your productivity by automating your workflow and by no longer losing hours to trivial mistakes.
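As an illustration of the add-to-PATH step from the scenario above, here is a minimal sketch in POSIX shell. The directory `$HOME/.local/bin` and the variable name `tool_dir` are assumptions chosen for the example; substitute whatever directory your installation instructions name.

```shell
# Hypothetical directory holding a tool you just installed.
tool_dir="$HOME/.local/bin"

# Prepend it to PATH only if it is not already listed,
# so repeated runs do not pile up duplicate entries.
case ":$PATH:" in
  *":$tool_dir:"*) ;;              # already present, nothing to do
  *) PATH="$tool_dir:$PATH" ;;     # otherwise put it at the front
esac
export PATH

# Print the search path one directory per line to check the result.
printf '%s\n' "$PATH" | tr ':' '\n'
```

A change made this way lasts only for the current shell session. To make it permanent, the same lines usually go in a shell startup file such as `~/.bashrc` for bash, though the right file depends on your shell and system.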
Recommended Books
Linux Command Line and Shell Scripting Bible
Richard Blum
Key Topics
- Alternative Shells
- Basic Scripting
- Control Flow
- Creating Functions
- Filesystem Navigation
- GNOME Terminal
- Installing Software
- Konsole Terminal
- Linux Environment Variables
- Linux File Permissions
- Managing Filesystems
- Monitoring Disk Space
- Monitoring Programs
- Package Management
- Presenting Data
- Regular Expressions
- User Input
- Working with Data Files
- bash Shell Commands
- sed and gawk
- xterm Terminal
Description
This is a gentle but thorough guide to the Linux command line (most of the information applies to macOS too). Blum focuses on the bash shell, the default on most Linux distributions. We found the chapters on Linux environment variables and gawk to be almost excessively useful, and we regret not reading a book like this when we started using the command line. Blum's book is also a great reference if you need a refresher on a particular aspect of the shell.
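To give a flavor of why environment variables and awk matter for data work, here is a small sketch that is not taken from the book. The sample data, the file created with `mktemp`, and the variable name `MIN_SCORE` are all made up for illustration. It uses plain `awk`, and the same program should also run under gawk.

```shell
# Hypothetical sample data: a tiny CSV file with a header row.
data_file="$(mktemp)"
printf '%s\n' 'name,score' 'ada,90' 'grace,85' 'linus,95' > "$data_file"

# An exported variable is inherited by child processes,
# and awk can read it through its ENVIRON array.
export MIN_SCORE=88

# Skip the header, then average the second column.
mean="$(awk -F, 'NR > 1 { sum += $2; n++ } END { if (n) print sum / n }' "$data_file")"

# List the names whose score meets the threshold (adding 0 forces a numeric comparison).
high_scorers="$(awk -F, 'NR > 1 && $2 + 0 >= ENVIRON["MIN_SCORE"] + 0 { print $1 }' "$data_file")"

echo "mean score: $mean"
echo "at or above $MIN_SCORE:"
echo "$high_scorers"

# Remove the temporary file.
rm -f "$data_file"
```

Passing the threshold through the environment rather than hard-coding it means the same awk program can be reused with a different cutoff without editing it.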