Terminal Primer – Part 5 – Managing Files

It is vacation time here in the Scripting OS X headquarters and I will be traveling with family for most of August. To compensate for that I will be posting a draft of parts of my next book on macOS Terminal and bash. Let me know how you like them or if you think something important is wrong or missing. You can give feedback in the comments, over Twitter or in the MacAdmins forum (also scriptingosx). Thank you for your interest and feedback!

Managing Files

We already know how to navigate and read the file system with cd and ls.
Now we want to actually do something to the files.

Create an Empty File

Sometimes it can be useful to quickly create an empty file. You can use the touch command to do this.

$ touch emptyfile

If you use the touch command on a file that already exists it will update the file’s modification date and change nothing else.
macOS (and other Unix-like operation systems) sometimes uses the existence of a file in a certain directory as a flag for configuration.
For example, when a file named .hushlogin exists at the root of a user’s home directory, the ‘Last Login: …’ message when you open a new shell (terminal window) is suppressed.

$ touch ~/.hushlogin

will create this file and subsequent new Terminal windows will not show this message. To return to showing the message, you will have to delete the file.

Deleting Files

To delete a file use the rm (remove) command:

$ rm document.txt

You have to use special care with the rm command.

In the Finder, deleted files are moved to the Trash, which is actually the invisible directory ~/.Trash. There the file will remain until the user chooses ‘Empty Trash’. Only then is the file removed from disk.
The command line has no such safety net.

When you delete a file (or directory) with the rm command it is gone. You have to be especially careful with files with special characters. If a filename has a space in it, and you run the rm command without escaping or quotes, then you will get an error or even worse, might delete the wrong file.

For example:

$ rm My Important Document.txt

Will delete the three files My, Important, and Document.txt, if they exist. If they do not exist it will show errors.

Use escape sequences or quotation marks to protect from spaces and other special characters in and directory names:

$ rm 'My Important Document.txt'

Tab-completion will also protect from improperly typed or escaped file names. If the tab-completion will not work, even though you believe you have the right file or path then something went awry and you have to step back and verify your working directory and paths.

To delete the .hushlogin file we created above, you use rm

$ rm ~/.hushlogin

Once the file is removed, new terminal windows will show the ’Last Login: …" message again.

You can add the -i option to the rm command which will ask for confirmation before actually deleting a file.

$ rm -i ~/.hushlogin 
remove /Users/armin/.hushlogin? y

Creating Directories

To create a new empty directory (or folder) you use the mkdir command.

$ mkdir scratchspace

you can give the mkdir command multiple arguments when you want or need to create multiple directories at once

$ mkdir folder1 folder2 folder3

When you create a nested directories, all the directories in between already have to exist:

$ mkdir LevelA
$ mkdir LevelA/LevelB/LevelC
mkdir: LevelA/LevelB: No such file or directory 

When you need to create nested directory hierarchies like this, you can use mkdir’s -p option:

$ mkdir -p LevelA/LevelB/LevelC

This will create all three folders at once, if they do not already exist.

Moving and Renaming

You can move a file or directory using the mv command. The mv command needs two arguments: the source and the destination.

$ touch testfile
$ mkdir testdir
$ mv testfile testdir
$ ls testdir
testfile

This mv command reads as ‘move the file testfile to the directory testdir’. To move it back to the current working directory you can use the ‘.’ short cut.

$ mv testdir/testfile .

The mv command can also rename a file:

$ mv testfile samplefile

Moving and renaming is considered the same in the shell.

Warning: When a file already exists in the destination, mv will mercilessly overwrite the destination file. There is no way to retrieve an overwritten file.

You have to take care to type the proper paths in the shell. It is a very unforgiving environment.
To make mv a bit safer add the -i option which will prompt to confirm when it will overwrite a file:

Warning: when you use mv to move between volumes, the source file will be removed after it is moved to the destination. This is different from the behavior in Finder, where the default drag action between volumes is copy.

Filename Extensions

On macOS and other operating systems it is common to denote the file type with an extension. The extension is a standard alphanumeric code separated from the rest of the file’s name by a dot or period. E.g. .txt, .pdf or .mobileconfig.

In bash, the filename extension is part of the filename. There is no special treatment for the extension.
On macOS, however, Finder usually hides the file extension from the user. You can control the display of the file extension in the ‘Advanced’ tab of Finder Preferences. You can also control this setting for each individual file in its Info panel.

Finder will also warn when you attempt to change the file extension, since it might change which application is used to open a file. (You can also disable this in the Finder Preferences.) bash has no such warning mechanism.

$ mv hello.txt hello.md

Note: A feature specific to macOS is that some directories will have filename extensions and Finder will display them as if they were files, not folders. These folders are called ‘packages’ or ‘bundles.’ The most common example are applications with the .app extension.
Packages and bundles are used to hide complex file and data structures from users. Another example is the ‘Photos Library’ (or ‘iPhotos Library’ on older systems) which hides a big and complex folder structure.
You can choose ‘Show Package Contents’ from the context menu in Finder to drill down further into the internal structure of a package or bundle.
bash and other shells are not really aware of packages or bundles and treat them like normal directories.
You can read more detail on Bundles and Packages on the Apple Developer Page.
Note to the Note: the name ‘packages’ is also used for package installer files (with the .pkg extension). These are different uses of the same word.

Copying

The command to copy is cp. It follows similar syntax as the mv command:

$ cp source destination

So you can copy the samplefile we created earlier:

$ cp samplefile newsamplefile

You can also copy to a directory:

$ cp samplefile testdir

Warning: the cp command will mercilessly overwrite existing files with the same name!

When you run cp again it will overwrite the existing copy:

$ cp samplefile testdir

As with the rm command overwritten files are lost. There is no way to retrieve overwritten files.

The -i option shows a prompt to confirm whenever a file will be overwritten:

$ cp -i samplefile testdir
overwrite testdir/samplefile? (y/n [n]) n
not overwritten

When you try to copy a directory, you will get an error message:

$ cp testdir newdir
cp: testdir is a directory (not copied).

Since the command to copy a file or directory would look exactly the same, cp expects an extra option to be certain you know what you are doing. The -R option (for recursive) will tell cp to recursively copy all files and sub-directories (and their contents) of a folder.

$ cp -R testdir newdir

This will create a copy of testdir and all its contents with the name newdir.

Warning: the option for recursive copying is -R (uppercase R). There is a legacy option -r (lowercase r) which seems to do the same thing. However, there is a difference in behavior mentioned in the cp man page:

Historic versions of the `cp` utility had a `-r` option.  This implementation supports that option; however, its use is strongly discouraged, as it does not correctly copy special files, symbolic links, or fifo's.
If the destination directory already exists the way you write the path of the source directory will influence the behavior of cp.

When the path to the source does not end with a /, cp will create a copy of the directory in the destination directory:

$ mkdir dirA
$ cp -R testdir dirA
$ ls dirA
testdir

When the path of the source directory ends with a /, cp will copy all the contents of the source directory to the destination folder:

$ ls testdir
samplefile
$ mkdir dirB
$ cp -R testdir dirB
$ ls dirB
samplefile

Warning: when you use tab-completion to complete paths to directories the / is always appended! You will need to consider whether you want to keep the trailing / or not.

You can add more source arguments to a cp command, the last argument will be the destination:

$ cp -R samplefile otherfile hello.txt bigfolder

Wildcards (Globbing)

Note: In early versions of Unix wildcard substitution was the responsibility of a program called glob, short for ‘global command.’ Because of this the action of replacing wildcards with actual paths and filenames was and still is called globbing.

When you have to address or manage many files at once, it can be slow, tedious and ineffective to address each file individually. bash provides wildcard characters to make that easier.

There are two commonly used wildcard characters: * and ?

The asterisk or star * will match zero or more characters. It can be placed anywhere in a path.

The question mark ? will match any character, but there has to be a character.

By default filenames that start with a period ‘.’ are not matched, unless you specifically start the string with a dot .*

It is important to keep in mind that bash will build a list of filenames that match the wildcards and substitute this list in place of the name with the wildcard(s) before executing the command.

When you enter

$ cd ~
$ ls D*

The D* will be replaced with the list of filenames that match (Desktop Documents Downloads) and then executed:

$ ls Desktop Documents Downloads

This can lead to some unforeseen consequences. For example say you are in a folder with some files and directories:

$ ls -F
dirA/  dirB/  dirC/  file1  file2  file3

And you run

$ cp file? dir?

The wildcards will be expanded to

$ cp file1 file2 file3 dirA dirB dirC

Which means that the three files as well as dirA and dirB will be copied into dirC, since treats the last argument as the destination and all previous arguments will be copied.

You can use wildcards in paths, so /Users/*/ will expand into all directories in the /Users folders.

However, /Users/*/Desktop will expand into a list of all users’ Desktop folders. Note that the first list contains /Users/Shared while the second does not contain /Users/Shared/Desktop, because that directory does not exist!

Warning: Wildcards can be extremely useful, but also very dangerous. They have to be handled with utmost caution, especially with potentially destructive commands such as mv, cp, and rm.

You can always test the result of wildcard expansion with the echo command:

$ echo /Users/*/
/Users/Guest/ /Users/Shared/ /Users/armin/
$ echo /Users/*/Desktop
/Users/Guest/Desktop /Users/armin/Desktop

You can also hit the escape key twice and bash will show the expansion, if there are any:

$ echo /Users/*<esc><esc>
Guest/  Shared/ armin/

Finally, bash has a third globbing or wildcard character, but it is a bit more complex. You can provide a list of possible characters between square brackets.

[bcr]at

will match

bat, cat or rat, but not Bat, Cat or Rat

Since shell commands are case-sensitive, you may have to provide both cases, if you want to match:

[bBcCrR]at

No matter how many characters are in the square brackets, they will match to exactly one character:

[bB][aei]t matches bat, Bat, bet, Bet, bit, or Bit

Deleting Directories

We have been creating and copying a lot of files. It is time to clean up. We already know the rm command to remove files. However, when you try to use rm to delete a directory you get:

$ rm newdir
rm: newdir: is a directory

There is a command rmdir which is the destructive equivalent of mkdir. However, rmdir can only remove empty directories:

$ rmdir newdir
rmdir: newdir: Directory not empty

You can use the * wildcard to delete all files in newdir:

$ rm newdir/*

Note: the * wildcard will not expand to filenames starting with a period. You may have to explicitly delete dot files as well:

$ touch newdir/.dotfile
$ rm newdir/*
$ rmdir newdir
rmdir: newdir: Directory not empty
$ rm newdir/.*
rm: "." and ".." may not be removed
$ rmdir newdir

This will work as long there are only files or empty directories in newdir. When the directory you want to delete contains an entire hierarchy of files and directories, then this approach will be cumbersome.

For this the rm command has the -R option which will recursively delete all contents and subdirectories.

$ rm -R testdir

Since there is no way to recover a file deleted by rm you should always use this command with care, especially when using the -R option.

You can add the -i option when using -R as well, but then you will be prompted for every single file and subdirectory, which can be very tedious and counter-productive.

Note: unlike the cp command, the -r and -R option for the rm command are synonyms. However, for consistency’s sake and to build muscle memory. I would recommend making a habit of using the -R syntax for both commands.

Leave a Reply

Your email address will not be published. Required fields are marked *