Bash Script Examples (Directory Navigation)

in #bash, 7 years ago (edited)

Herein you'll find an explanation of the current working directory, directory traversal, and various ways to accomplish directory traversal and the effects and side effects as commonly leveraged by shell scripts.

I'm never sure how long to make the first sentence to get it to show up in the synopsis, but what I was trying to convey is that this article is going to discuss file locations and directory navigation as would likely be carried out from a shell script. The target audience is still beginner, so leave a comment below if I gloss over any details. And even though I'm talking about shell scripts, you can obviously use all of these commands in the regular shell. Some of them are even time savers.

This article covers changing directories in Bash only. Other shells may make use of additional variables or have other side effects, so I'm sticking strictly to Bash here.

I'll cover the following commands:

  • cd
  • popd
  • pushd
  • pwd

And I'll demonstrate these in the shell and in functions.

I provide examples with commands that I haven't talked about yet, so here's a brief description of those commands:

  • rmdir Removes the specified directory, but that directory must be empty first.
  • mkdir Creates a new directory.
  • touch Updates a time stamp on an existing file or creates a new file with the specified name and sets its timestamps appropriately.

Definitions

Everything in here has to do with directories and modifying the current working directory, so what's a directory? According to Wikipedia it "is a file system cataloging structure which contains references to other computer files, and possibly other directories." More simply, they're the folders on the computer containing other files and folders.

So what's a working directory? It's the directory in which the current command is executing, or in which the next command you type into the shell will execute. So what's the shell? It's the command interpreter that is reading your commands or executing your shell script. In this article, that's Bash.

So, what's a sub shell? It's a shell being executed under another shell. It follows the same rules as the parent shell, but none of the context from the sub shell (its variables, its current directory) makes it back to the parent, just the return value. More on this later.

Directory Commands

Let me just say up front that the directory changing commands all do the following:

  • change directory
  • set the PWD variable
  • return a non-zero exit status on failure to change directory
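
Here's a minimal sketch of those three points in a script. The path /no/such/dir-demo is made up for illustration and is assumed not to exist on your system:

```shell
#!/usr/bin/env bash
start_dir=$PWD

# A successful cd changes directory and updates the PWD variable.
cd / || exit 1
echo "PWD is now: ${PWD}"

# A failed cd returns a non-zero status and leaves the directory alone.
# (/no/such/dir-demo is a hypothetical path assumed not to exist.)
status=0
cd /no/such/dir-demo 2>/dev/null || status=$?
echo "cd exit status: ${status}, still in: ${PWD}"

# Go back to where the script started.
cd "${start_dir}" || exit 1
```

In a real script you would normally react to that status right away, for example with cd "${dir}" || exit 1, so that later commands never run in the wrong directory.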

Also, it's not specific to any command, but if you use a variable as the destination for switching directories, be sure to put quotes around that variable. Otherwise the results could be unpredictable if a space ends up in the value of the directory name.

For example:

    not-a-bird@nest:/tmp$ export WHERE="dir with space"
    not-a-bird@nest:/tmp$ mkdir "${WHERE}"
    not-a-bird@nest:/tmp$ cd $WHERE
    bash: cd: too many arguments
    not-a-bird@nest:/tmp$ cd "${WHERE}"
    not-a-bird@nest:/tmp/dir with space$

In this example, a variable is set to dir with space. This value is used to create a directory with the mkdir command, using appropriate quoting. Then, a cd command is attempted without proper quoting and the resulting error is:

bash: cd: too many arguments

This error happens because Bash replaces the variable with the value, but the value contains spaces, so the cd command is being handed three arguments instead of one argument with spaces in it. Next in the example, appropriate quoting is used in the cd command and the user is in the new directory.

So be sure to put quotes around your variables when changing directory.

cd

First, a quick explanation of cd. From the help cd command: "Change the current directory to DIR. The default DIR is the value of the HOME shell variable."

There's actually a whole lot more in there, but I suggest not making use of those features in shell scripts without heavily documenting the shell script. Anyway, the general use of cd is to switch directories. If no directory is provided, the command will try to change to whatever path is in ${HOME}. If a variable is passed to the cd command, Bash substitutes the variable's value before cd runs, and cd then tries to change directory to that value.

If you pass - to the cd command it will go to the directory that was the current working directory before the cd command was used to get to the current directory.

For example:

    not-a-bird@nest:/tmp$ cd
    not-a-bird@nest:~$ cd -
    /tmp
    not-a-bird@nest:/tmp$

In this example, the user was in the /tmp directory. cd took them to $HOME (represented by the tilde ~) and then a cd - put them back in /tmp.
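
Behind the scenes, Bash keeps the previous directory in the OLDPWD variable, and cd - behaves like cd "${OLDPWD}" except that it also prints the directory it lands in. A small sketch of that, where the variable names start_dir and previous are just ones I picked for the demo:

```shell
#!/usr/bin/env bash
start_dir=$PWD

cd / || exit 1
# cd has recorded the directory we just left in OLDPWD.
previous=$OLDPWD
echo "previous directory: ${previous}"

# cd - goes back there; it prints the directory, so discard that output.
cd - >/dev/null || exit 1
echo "back in: ${PWD}"
```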

I recommend not using cd - in a shell script to preserve the current directory; there are better, more predictable ways to do that. In general, you use the cd command in a shell script when you don't care about preserving where you've been and you only care about where you're going. I'm not saying you should avoid cd - in a shell script altogether. It's just fine for navigating, but don't use it to preserve state between invocations of functions.

There are two special files that exist in every directory: . and .., which refer to the current directory and the parent directory, respectively. You can pass both of these as values to the cd command. It may be more obvious as to why you would want to use .. since that will change the current working directory to the parent directory, but why would you want to use cd .?

Well, it turns out if you happen to remove the current working directory while you are in it, and then you put that directory back, you won't be able to do anything in that directory unless you change directory again. Sound confusing?

Here's an example to illustrate both the cd . and the types of errors you might see:

    not-a-bird@nest:/tmp$ mkdir foo
    not-a-bird@nest:/tmp$ cd foo
    not-a-bird@nest:/tmp/foo$ touch bar
    not-a-bird@nest:/tmp/foo$ ls
    bar
    not-a-bird@nest:/tmp/foo$ rm bar
    not-a-bird@nest:/tmp/foo$ rmdir /tmp/foo
    not-a-bird@nest:/tmp/foo$ mkdir /tmp/foo
    not-a-bird@nest:/tmp/foo$ touch bar
    touch: cannot touch 'bar': No such file or directory
    not-a-bird@nest:/tmp/foo$ cd .
    not-a-bird@nest:/tmp/foo$ touch bar
    not-a-bird@nest:/tmp/foo$ ls
    bar
    not-a-bird@nest:/tmp/foo$ exit

So what happened here? The user starts in the /tmp directory and creates a new directory foo.
The user then moves into foo, and creates a file called bar with the touch command. They list the files to see that bar was created successfully. So far, so good?
Next the user removes the file they just created with the rm command. Then they remove the current directory with the rmdir command.

Now, at this point in the example it's obviously contrived, but this scenario could happen in two different terminals, in one where you are cleaning up directories and in the other where you happen to be sitting in that directory that was just cleaned up, but you want to recreate that directory and use it for other stuff. Back to the example...

Next the directory is recreated with the mkdir command. And then the user tries to touch a file called bar, but it fails. Note that the directory exists, but the user just isn't in it, the user is in the deleted handle for an older directory of the same name. So, to get back to the correct directory, or the new directory that was created with the same name as the old one, the user does a cd . and then they are able to run the touch command and then the ls command to confirm that everything is happening as is expected.

pushd/popd

The most common use of pushd is to save the current directory to the stack and then switch to the specified directory so that later on, the current directory can be restored from the value in the stack. For example:

    not-a-bird@nest:/tmp$ pushd /var
    /var /tmp
    not-a-bird@nest:/var$ popd
    /tmp
    not-a-bird@nest:/tmp$

So, in this example the user starts out in /tmp, they use pushd /var which then drops them in the /var directory. Notice the output? It displays the entire directory stack (after it switched directories). (You can also view the directory stack by typing in dirs.) Then the user uses popd to go back to the directory they were in.
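
In a script, the stack listing that pushd and popd print is usually just noise, so it's common to redirect it. Here's a rough sketch of the same round trip done quietly, using dirs -p (one stack entry per line) to peek at the stack. The variable names are only for the demo:

```shell
#!/usr/bin/env bash
start_dir=$PWD

# Send the stack listing that pushd prints to /dev/null.
pushd / >/dev/null || exit 1
echo "working in: ${PWD}"

# dirs -p prints one stack entry per line, so counting lines
# gives the stack depth (current directory included).
depth=$(dirs -p | wc -l)
echo "stack depth: ${depth}"

# popd removes the top entry and returns to the saved directory.
popd >/dev/null || exit 1
echo "restored: ${PWD}"
```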

There's a second use of pushd, but I suggest never ever ever putting it in a shell script because no one will understand what it's doing, except for you. The second use of pushd is that, when called with no arguments, it swaps the top two directories on the stack without adding more elements to the stack. Incidentally, this is a great source of bugs when the command is invoked with a variable that contains no value. But for now, here's an example of how it works without arguments:

    not-a-bird@nest:/tmp$ pushd /var
    /var /tmp
    not-a-bird@nest:/var$ pushd /usr
    /usr /var /tmp
    not-a-bird@nest:/usr$ pushd
    /var /usr /tmp
    not-a-bird@nest:/var$ popd
    /usr /tmp
    not-a-bird@nest:/usr$ popd
    /tmp

  1. Starting in /tmp, a pushd /var goes to /var.
  2. From /var, a pushd /usr goes to /usr.
  3. While in /usr, a pushd with no arguments moves back to /var and leaves /usr on the stack.
  4. Now that the user is back in /var, a popd moves the user to /usr.
  5. From /usr, a popd goes to /tmp.

So, a pushd with no arguments swaps the current directory with the one on top of the stack. The same number of popd calls is still needed to get back to the start, so this can be useful in shell scripts, but it's not commonly used, and so I would suggest either shying away from it or putting in a giant comment block to explain to the reader what's happening.
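
About that empty-variable bug: if a variable is empty and unquoted, it vanishes from the command line entirely, so pushd $dest quietly turns into a bare pushd and swaps directories instead of failing. One possible guard is sketched below. Note that safe_pushd is a hypothetical helper name I made up, not something Bash provides:

```shell
#!/usr/bin/env bash
# safe_pushd is a hypothetical helper: it refuses an empty destination
# so that pushd never silently falls back to its swap behaviour.
safe_pushd() {
    local dest=$1
    if [ -z "${dest}" ]; then
        echo "safe_pushd: empty destination" >&2
        return 1
    fi
    # Quote the destination and hide the stack listing.
    pushd "${dest}" >/dev/null
}

start_dir=$PWD
empty_dest=""

# The empty value is rejected and the directory does not change.
if ! safe_pushd "${empty_dest}"; then
    echo "refused, still in: ${PWD}"
fi

# A real destination works as usual.
safe_pushd / && echo "now in: ${PWD}"
popd >/dev/null || exit 1
```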

pwd

The pwd command prints the working directory. You can also access this value from the ${PWD} shell environment variable, but if you're running the shell interactively, it's easier and faster to type pwd. In a shell script there's not much difference between using the value of pwd or the value of ${PWD}.
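
To see that the two really do agree in a script, here's a tiny sketch (the variable names are just for illustration). The only practical difference is that $(pwd) goes through a command substitution while ${PWD} is a plain variable expansion:

```shell
#!/usr/bin/env bash
# $(pwd) runs the pwd builtin inside a command substitution.
from_command=$(pwd)
# ${PWD} is a plain variable expansion, no command is run.
from_variable=${PWD}

if [ "${from_command}" = "${from_variable}" ]; then
    echo "pwd and PWD agree: ${from_variable}"
fi
```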

Preserving the current directory

Generally, when you enter a function that needs to switch directories, you want to leave that function with the current working directory being the same as when the function was entered. There are a few strategies for this:

  • save the current directory in a variable
  • save the current directory to the stack
  • execute commands in a sub shell so that the current directory is never modified

All of these are valid approaches and you'll just have to decide for yourself which one is appropriate in which scenario.

Saving the current directory to a variable

Generally this is done in a function, so this example uses the local keyword so that the variable doesn't get saved in the global context.

Using the pwd command:

    foo() {
        local SAVED_DIR=$(pwd)
        # other commands here
        cd "${SAVED_DIR}"
    }

Using the ${PWD} variable:

    foo() {
        local SAVED_DIR=${PWD}
        # other commands here
        cd "${SAVED_DIR}"
    }

Note that the value of ${PWD} did not need to be quoted when it was assigned to the SAVED_DIR variable, but the SAVED_DIR variable *did* need to be quoted when it was passed to cd in order to handle spaces correctly.
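
Here's a slightly fuller sketch of the same pattern with error handling added and a directory name that contains spaces. The function name make_marker and the file name marker are assumptions I made for the demo, not part of the example above:

```shell
#!/usr/bin/env bash
# make_marker is a hypothetical function: it creates a file named
# "marker" in the target directory, then restores the caller's directory.
make_marker() {
    local target=$1
    local saved_dir=${PWD}

    # Give up early if the directory change fails.
    cd "${target}" || return 1
    touch marker
    # Quotes keep a path containing spaces as a single argument.
    cd "${saved_dir}" || return 1
}

# Build a scratch directory whose name contains spaces.
base_dir=$(mktemp -d)
work_dir="${base_dir}/dir with space"
mkdir "${work_dir}"

start_dir=${PWD}
make_marker "${work_dir}"
echo "still in: ${PWD}"

# Record whether the file was created, then clean up.
if [ -f "${work_dir}/marker" ]; then created=yes; else created=no; fi
echo "marker created: ${created}"
rm -r "${base_dir}"
```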

Saving the current directory to the stack

It's important that the function body, if it uses pushd elsewhere, uses popd the same number of times to ensure the stack isn't left in an unpredictable state.

    foo() {
        pushd "${SOMEWHERE}"
        # other commands here
        popd
    }

The pushd command saves the current working directory and switches to ${SOMEWHERE} and then other commands are used. Before exiting the function, the directory is restored by calling popd. The only real drawback here is that the function must not leave the stack in an unpredictable state by calling popd or pushd such that the last directory is no longer the directory that was saved upon entering the function.
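
One more way the stack can get out of balance: if pushd itself fails, nothing was pushed, so a later popd would either fail or pop a directory that the caller had saved. A minimal sketch that guards against that is below. The function name list_in is made up for illustration:

```shell
#!/usr/bin/env bash
# list_in is a hypothetical function: it lists a directory's contents
# from inside that directory, then pops back to where it started.
list_in() {
    local target=$1
    # Only continue if pushd succeeded; otherwise nothing was pushed
    # and popd would not restore the directory we expect.
    pushd "${target}" >/dev/null || return 1
    ls
    popd >/dev/null || return 1
}

# Scratch directory with one file in it for the demo.
work_dir=$(mktemp -d)
touch "${work_dir}/example.txt"

start_dir=${PWD}
list_in "${work_dir}"
echo "still in: ${PWD}"
rm -r "${work_dir}"
```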

Executing the commands in a subshell

Generally this will be done with only one or two commands. Here, in a rather contrived example, the function just changes directory in a subshell, but hypothetically you would want to do something useful there.

    foo() {
        (
            cd "${SOMEWHERE}"
            # other commands here
        )
    }

Note that this function would need actual commands in place of # other commands here.

So, what does it mean to run something in a sub shell? Well, it's still a Bash shell, and it starts out with a copy of the parent shell's variables (exported or not), but any variables set in it will not be passed back to the parent shell. So if the function needs to set variables, this mode of preserving the current directory isn't going to work because those variables would never live in the parent shell. What use is it? Well, because it cannot change the parent's context, the parent shell will never have changed directory as a result of that first cd "${SOMEWHERE}" line. All the commands in that subshell will be executed in "${SOMEWHERE}", but when the foo function exits, the parent shell will never have changed directories.
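
Here's a small sketch of that isolation in action. The names in_subshell and inner_value are hypothetical ones I chose for the demo, and / is used as the destination only because it exists everywhere:

```shell
#!/usr/bin/env bash
# in_subshell is a hypothetical function whose body runs in a subshell.
in_subshell() {
    (
        # This exit only leaves the subshell, not the calling script.
        cd / || exit 1
        # This assignment only changes the subshell's copy.
        inner_value="set inside"
        echo "subshell is in: ${PWD}"
    )
}

start_dir=${PWD}
inner_value="set outside"

in_subshell
echo "parent is in: ${PWD}"
echo "inner_value: ${inner_value}"

# Data can still come back through standard output.
captured=$(in_subshell)
echo "captured: ${captured}"
```

The parent's directory and its inner_value are untouched after the call. If the function needs to hand something back, printing it and capturing the output with $() is one option.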

Summary

So, now you should have a pretty good idea of what you can use for changing directories and how to either have that affect the parent script or not. I covered the following topics:

  • directories
  • changing directory
  • saving the current directory
  • sub shells

References:

Image source is Pixabay


This is a continuation of my Bash Scripting Example Series, other entries in this series:


Awesome job on the post. I actually didn't know the pushd or popd commands, or I forgot about them, even though I use the command line every day, and use linux as my daily driver.

There are similar commands in python though, for working with variables.

Yeah, I really like getting a chance to see how other people use the CLI for just this reason, they bring up stuff I've forgotten about or haven't even tried before.

I did not know about the subshell. Seems useful, will have to try it out!

Push/pop I also know about but haven't gotten into the habit of using though I do occasionally.

I was actually considering a more in-depth explanation of sub shells would be a good topic for another post. Because along with () there is also $() and `` and <() and stuff | otherstuff that all end up invoking sub shells... and while I'm at it, here-docs might fit in there, too, as you can do stuff like bash << EOF bash <<< stuff...

I almost always forgot to popd after I pushd... I generally end up cd -ing...

Same same. I definitely use $() a lot.

Have you gone over piping in one of your posts yet? Somehow I was just reminded of it and it's one of the most common things I use for scripting tasks. Must be must be...

I've referenced it a little bit when talking about while loops, but I haven't devoted much attention to it. I think between subshells and piping I've more than enough to fill another article.
