Oct 23

One of the things I found confusing about bash was its startup scripts: there were so many of them. Eventually I snapped and sat down with a terminal and the man pages, and worked out how it actually behaves. Here’s a summary.

                     Interactive   Interactive   Non-interactive   Remote
                     login         non-login                       shell
/etc/profile         A
/etc/bash.bashrc                   A†
~/.bashrc                          B                               A
~/.bash_profile      B2
~/.bash_login        B3
~/.profile           B4
~/.bash_logout       C
BASH_ENV                                         A

On startup, bash executes any script labeled A in the table above, followed by the first script B it finds. On exit, it executes any script labeled C above.

Let’s look at the column headings in a little more detail.

  • An interactive login shell is a shell that you are typing into, that is the first such shell you execute on the machine. Typically you will have had to log in immediately before the shell starts. For example, when you SSH to a remote system and type commands to that system, you are typing into an interactive login shell.
  • An interactive non-login shell is a new shell started once you have already logged in; one which doesn’t require that you log in again. For example, if you open a new terminal window in your graphical user interface and get a shell prompt, that’s an interactive non-login shell. Another example would be a sub-shell started from inside a text editor, such as by typing :sh in vi.
  • A non-interactive shell is a shell which doesn’t prompt you; it just runs a program and then exits. The most common example of this is any program written in shell script, such as a configure script, a startup script in /etc/init.d, or any other file marked as executable that has #!/bin/bash on the first line.
  • A remote shell is a shell started by a program such as SSH or rsh in order to run a command on a remote machine. For example, the rsync and scp commands use SSH remote shells in order to copy files between machines.
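
If you’re not sure which kind of shell you’re sitting in, bash can tell you. Here’s a minimal sketch; shell_kind is a name made up for illustration. It checks whether $- contains the letter i (interactive) and whether bash’s login_shell option is set. It can’t tell you whether the shell was started by SSH or rsh to run a remote command.

```shell
# Print the startup category of the shell that runs this function.
# shell_kind is a hypothetical helper name, not something bash provides.
shell_kind() {
  local interactive login

  # bash puts "i" in $- when the shell is interactive.
  case $- in
    *i*) interactive="interactive" ;;
    *)   interactive="non-interactive" ;;
  esac

  # login_shell is a read-only shopt option that bash sets for login shells.
  if shopt -q login_shell; then
    login="login"
  else
    login="non-login"
  fi

  echo "$interactive $login"
}

shell_kind
```

Paste it at a prompt to classify the shell you’re typing into, or put it in a script to see what the script’s shell reports.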

So looking at the “Interactive login” column, an interactive login shell will always execute /etc/profile. It then looks for ~/.bash_profile, ~/.bash_login, and ~/.profile in turn, and executes the first of those it finds. On logout, it executes ~/.bash_logout.
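
To see which of the three per-user files a login shell would pick on a given account, you can walk the same list yourself. This is only a sketch of the lookup order described above, not bash’s own code; first_login_script is a made-up name.

```shell
# Print the first per-user login script found in a directory, checking the
# candidates in the order described above. Returns 1 if none is readable.
# first_login_script is a hypothetical helper name.
first_login_script() {
  local dir="${1:-$HOME}"
  local name
  for name in .bash_profile .bash_login .profile; do
    if [ -r "$dir/$name" ]; then
      echo "$dir/$name"
      return 0
    fi
  done
  return 1
}

first_login_script "$HOME" || echo "no per-user login script found"
```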

The /etc/bash.bashrc entry, marked A†, is special: whether bash looks for it at all depends on a compile-time option.

BASH_ENV is an environment variable which allows you to make non-interactive non-login shells (such as shell scripts) execute a startup script. Set BASH_ENV to the filename of a script and then invoke a sub-shell, and the script will be executed when the sub-shell starts up.
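
Here’s a small sketch of BASH_ENV in action. The file is a throwaway created with mktemp, and GREETING is a variable name invented for the demonstration.

```shell
# Create a throwaway startup script that defines one variable.
# GREETING is a made-up name used only for this demonstration.
env_script=$(mktemp)
echo 'GREETING="set by BASH_ENV"' > "$env_script"

# Run a non-interactive child shell with BASH_ENV pointing at the script.
# The child never assigns GREETING itself; it only sees a value because
# bash executed $env_script when the child started up.
result=$(BASH_ENV="$env_script" bash -c 'echo "$GREETING"')
echo "$result"

rm -f "$env_script"
```

Run it without the BASH_ENV= prefix and the child shell should print an empty line instead, since nothing has defined the variable.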

Problems with bash’s behavior

Given the above table, the short summary is:

  • If you want something executed only when you first log in, put it in ~/.bash_profile
  • If you want something executed only for additional shells (such as xterms), put it in ~/.bashrc

But there are a couple of problems with this arrangement, problems which suggest that bash’s startup behavior wasn’t really thought out with users in mind.

Firstly, if you are anything like me, most of the things you want to put in shell startup scripts are things you always want executed. Command aliases, for example; or environment variables that tell pieces of software where to find their bits (JAVA_HOME, ECLIPSE_HOME).

You could put those in both .bashrc and .bash_login, but that represents a maintenance problem: if you change something, you have to remember to change it in both places. So, you might set up a third file for global stuff, and use the shell command source to read it in from both .bashrc and .bash_login. I’ve seen some Linux distributions set this up as the default. I don’t like it, however, because it means you now have 3 different startup files floating around, and when you want to change something you have to remember which file it’s in (or sit and work it out).

The second issue with bash startup scripts is that the distinction between login shell and non-login shell isn’t a very useful one these days. Most of us use graphical user interfaces, so we rarely see a login shell on the machine we’re using. (For example, an xterm opened from a Linux desktop is normally a non-login shell. OS X’s Terminal is an exception: by default it starts the shell in each new window as a login shell.) Even when I use SSH to shell into a remote system, I don’t generally want that first login to behave differently to any other shell I start (such as shells inside screen).

What I do care about, on the other hand, is whether the shell is interactive. I don’t want my reminder program printing stuff when rsync is trying to connect and transfer files. I don’t want all my custom commands and aliases getting in the way when running scripts to configure or build software. And I don’t want to slow things down loading cdargs unless I’m actually going to be maneuvering around the directory structure by typing.

So what I want is to have a single customization script, and be able to split it into stuff that is always run, and stuff that is only run when I’m using the shell session interactively. Here’s how to do that.

Simple all-purpose bash initialization script

Start off by moving all your current bash startup scripts into a temporary directory, so you have a clean slate. Then, create a skeleton ~/.bashrc that looks like this:

### Start of universal section ###
# Commands in this section will be executed by both interactive and
# non-interactive shells.
# Commands here must produce no output, or they will break commands
# like scp and rsync.

### End of universal section ###
[ -z "$PS1" ] && return
### Start of interactive section ###
# Commands in this section will be executed only by interactive shells.

### End of interactive section ###

Next, cd ~ if you’re not already in your home directory, then run ln -s .bashrc .bash_login so that login shells read the same file.

Now you have a single customization file for all your shell sessions, called ~/.bashrc. You can copy in each command from your old customization files, placing them in the appropriate section according to whether you need them all the time, or just in shells that you’re typing in to.
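
If you’d like to convince yourself that the split works before trusting it, here’s a rough test harness. It writes a throwaway copy of the skeleton to a temporary file (not your real ~/.bashrc), then sources it from a child shell with and without PS1 set. The names try_skeleton, UNIVERSAL_RAN and INTERACTIVE_RAN are invented for the test.

```shell
# Write a throwaway copy of the skeleton with one marker variable per section.
rc_file=$(mktemp)
trap 'rm -f "$rc_file"' EXIT

cat > "$rc_file" <<'EOF'
### Start of universal section ###
UNIVERSAL_RAN=yes
### End of universal section ###
[ -z "$PS1" ] && return
### Start of interactive section ###
INTERACTIVE_RAN=yes
### End of interactive section ###
EOF

# Source the skeleton in a child bash and report which sections ran.
# $1 is the value to give PS1; an empty string means leave PS1 unset.
try_skeleton() {
  bash -c '
    if [ -n "$2" ]; then PS1=$2; else unset PS1; fi
    source "$1"
    echo "universal=${UNIVERSAL_RAN:-no} interactive=${INTERACTIVE_RAN:-no}"
  ' _ "$rc_file" "$1"
}

try_skeleton ""     # PS1 unset, as in a non-interactive shell
try_skeleton '$ '   # PS1 set, as in a shell you are typing into
```

The first call should report that only the universal section ran, and the second that both did.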

If you really care about login shells

If for some reason you do want to have login shells behave differently from non-login, that’s pretty simple too. Instead of the ln -s command above, create the following ~/.bash_login file:

if [ -f ~/.bashrc ]; then
  source ~/.bashrc
fi
# Commands for login shells only go under here

Now you have two places customization commands may be placed, but you get the option of having login-specific stuff.

Dealing with multiple systems

Another trick I use is to examine the host name of the machine. This lets me use the same .bashrc everywhere; my Mac’s .bashrc is the same as the one I use on my Linux box and the System z mainframe at work. Here’s the code:

if [ "$HOSTNAME" = "T41p" ]; then
  # Customizations specific to the ThinkPad laptop go in here
  :  # no-op placeholder; bash rejects an if body containing only comments
fi

You can use code like this in either the interactive or non-interactive section of the .bashrc above.
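
Once you have more than a couple of machines, a case statement tends to read better than a chain of if tests. The following is a sketch only: T41p is the ThinkPad from the example above, while the *.local pattern, the MACHINE_KIND variable and the configure_for_host name are placeholders to replace with your own.

```shell
# Choose per-machine settings from the host name passed as $1.
# Only "T41p" comes from the example above; the other pattern and the
# MACHINE_KIND values are placeholders.
configure_for_host() {
  case "$1" in
    T41p)
      MACHINE_KIND="thinkpad" ;;
    *.local)
      MACHINE_KIND="mac" ;;
    *)
      MACHINE_KIND="generic" ;;
  esac
}

# Produces no output, so it is safe to call from the universal section.
configure_for_host "$HOSTNAME"
```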

Mar 24

Every now and then I read an article about REXX, a scripting language designed at IBM and popularized on the Amiga. The authors of such articles generally enthuse about the language in a low-key kind of way, and I find myself wondering if maybe I should learn it.

Then I go away and find a REXX FAQ and tutorial, and I read for a bit, and I realize that no, I shouldn’t. So for my own benefit (when I later archive and index this part of my journal), here’s a quick list of reasons why I should never go near REXX:

  1. Functions can’t return multiple values, nor can they modify their arguments. If you want to write a function which returns two values (say), you need to use a magic string which you think will never occur in either of them as a separator, concatenate, return the result, and then split it apart again.

  2. Whitespace is significant—it’s interpreted as concatenation. Hence mistyped or syntactically invalid statements are quite likely to be reinterpreted as some kind of concatenation of variables. (And I thought Python was bad.)

  3. Using an undefined variable is not considered an error. Instead, it just defaults to having a value that’s the same as its name, only in upper case. Truly foul, especially when combined with misfeature #2 above.

    So if (for example) you put whitespace between a function name and the brackets surrounding its arguments, it suddenly stops being a function call and becomes a concatenation of strings instead. Pass the barf bag.

  4. REXX normally guesses continuations, by assuming the next line is a continuation of the current line if the current line doesn’t look like a complete statement.

  5. Comma is used both to separate function arguments, and to indicate explicit continuation. So in spite of #4, you can’t just break a long list of function arguments across multiple lines: you have to turn the last comma on each line into a double comma, or you get something completely different from what you intended. Ugh.

  6. You’re allowed to use variables that have the same names as words used in the language itself.

  7. Scoping is dynamic. Functions and procedures are just a hack whereby the system temporarily hides all variables except the listed ones, until it next hits a return. Not that you have to; it’s quite possible to write functions with overlapping scope.

  8. Forget about associative storage; REXX doesn’t even have arrays. You can simulate them with ‘compound variables’, but then there’s no type checking or bounds checking. If you want any, you have to write it yourself.

  9. You can’t pass arguments by reference. In fact, you can’t pass them by value either. Instead, you have to pass constants, and have the function or procedure use those constants to calculate the name of the variables it should use.

Still, I’m sure it’s better than JCL.

Anyone want to convince me of the beauty of REXX, in spite of the above? If so, give it your best shot.

Jun 05

Ch. The performance of a scripting language, with the cleanness and ease of use of C. What could be better?

Well, perhaps someone could build a scripting interpreter for x86 assembler…