Thursday, September 18, 2008

How to debug environments in Unix or Linux

How to tkdiff two environments

This scenario occurs frequently:
Frank can run program Chutney Aggregator, but Bob cannot.

What differs between Frank's and Bob's environment variables such that Frank can run Chutney Aggregator and Bob cannot?

If you run the command env, it prints every environment variable that is set in your shell.


Unfortunately, comparing the raw output of env from Frank's and Bob's shells would be extremely tedious, because env prints the variables in no particular order.

In order to be able to compare the environments, we need to sort the output from env.

env | sort


If we put the output of this one-liner into a file, we can use tkdiff to compare Frank's and Bob's environments visually.

Frank performs this command in his terminal:

 env | sort > ~/tmp.shell_env

Bob does the same command in his terminal:

 env | sort > ~/tmp.shell_env

Now, we can compare the two environments:

 tkdiff ~Frank/tmp.shell_env ~Bob/tmp.shell_env
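When tkdiff is not available (over a plain ssh session, say), the same comparison can be roughed out in the terminal. The sketch below is written for a Bourne-style shell (sh or bash) rather than tcsh, and differing_vars is a hypothetical helper name, not part of the workflow above. It assumes both dumps were produced by env | sort under the same locale, and it prints the name of every variable that is set in only one dump or has a different value in each.

```shell
# differing_vars FILE1 FILE2
# Hypothetical helper: print the names of variables that differ between two
# sorted "env | sort" dumps (set in only one dump, or set to different values).
differing_vars() {
    # comm -3 drops lines common to both files; lines found only in FILE2
    # are indented with a tab, which the first sed expression strips.
    # The second sed expression keeps only the text before the first "=".
    comm -3 "$1" "$2" |
        sed -e 's/^[[:space:]]*//' -e 's/=.*//' |
        sort -u
}
```

For example, differing_vars ~Frank/tmp.shell_env ~Bob/tmp.shell_env lists only the variable names worth looking at. Values that span several lines will show up as stray names, because the dump is processed line by line.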

I find that the above process cracks most environment issues.

How to import a different environment

In tcsh, if you have to copy another person's environment variables wholesale, you can use this one-liner to create a tcsh script from their env output. You can then source that script in another shell to reproduce their issues.

env | sort | perl -ne 's/'"'"'/'"'"'"'"'"'"'"'"'/g; \
s/^([\w_]+)=(.*)/setenv \1 '"'"'\2'"'"'/; print $_' > ~/tmp
source ~/tmp
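To make the quoting in that one-liner easier to follow, here is a sketch of the same transformation written with sed for a Bourne-style shell (sh or bash). env_to_setenv is a hypothetical name. Like the Perl version, it first rewrites every single quote in a line as '"'"' and then wraps the value in single quotes after setenv. Lines that do not look like NAME=value are not turned into setenv commands.

```shell
# env_to_setenv
# Hypothetical helper: read NAME=value lines on stdin and write tcsh
# "setenv NAME 'value'" commands on stdout.
env_to_setenv() {
    # 1st expression: replace each ' with '"'"' so that it survives inside
    #                 a single-quoted tcsh string.
    # 2nd expression: turn NAME=value into setenv NAME 'value'.
    sed -e "s/'/'\"'\"'/g" \
        -e "s/^\([A-Za-z_][A-Za-z0-9_]*\)=\(.*\)\$/setenv \1 '\2'/"
}
```

Used as env | sort | env_to_setenv > ~/tmp, an input line such as GREETING=it's here comes out as setenv GREETING 'it'"'"'s here'. Because the input is handled one line at a time, a value that contains a newline will not convert cleanly.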

If you back up your own environment in the same way and then load parts of their environment iteratively (a binary search), you can pinpoint the difference that causes one environment to succeed and another to fail.

Root-Causing an Environment Issue

If you do have an environment dump from a person who is able to successfully run the program that you want to run, then you can possibly determine the root cause of the issue by iteratively importing their environment and testing whether you are now able to run the program.

To start, make sure that the problem you're encountering is caused by environment variable issues. Back up your environment as a script which you can use to restore it.

 env | sort | perl -ne 's/'"'"'/'"'"'"'"'"'"'"'"'/g; \
s/^([\w_]+)=(.*)/setenv \1 '"'"'\2'"'"'/; print $_' > ~/my_env

Take the dump that the other person created from their environment and turn it into a script, as described in "How to import a different environment" above. Save the script as ~/other_env.

Import their environment by sourcing the script you created.

source ~/other_env

Run the program and see if it passes. If it does, you know that the root cause of your issues is the difference between your environment variables and the other person's environment variables.

Now comes the binary search part.

Restore your environment:

source ~/my_env

*Comment out the first half of the other person's environment script.* Then source their modified environment script:

source ~/other_env

Run the program. If the program runs, then the environment variable that was causing the issue is in the 2nd half of the ~/other_env script. *You have just eliminated the first half of the script as the source of your environment problem!*

If the program doesn't run, the bad environment variable is in the 1st half of the ~/other_env script. *You have just eliminated the 2nd half of the script as the source of your environment problem!* Uncomment the first half of the script and comment out the 2nd half.

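Keep repeating this process with the remaining uncommented section of environment variables: restore your environment ( source ~/my_env ), then overwrite it with a different piece of ~/other_env, until you find the problem environment variable. Keep in mind that sourcing ~/my_env only resets variables you already had; a variable that exists only in ~/other_env stays set until you unsetenv it or start a fresh shell.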
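Commenting out halves by hand gets tedious, so one option is to split the script into two files instead. The sketch below is for a Bourne-style shell (sh or bash), since it only manipulates files. split_env_script and the .first_half and .second_half suffixes are hypothetical names chosen for illustration. It assumes the script holds one setenv command per line.

```shell
# split_env_script FILE
# Hypothetical helper: write the first half of FILE to FILE.first_half and
# the remaining lines to FILE.second_half.
split_env_script() {
    script=$1
    total=$(wc -l < "$script")
    # Round up so that an odd line count puts the extra line in the first half.
    half=$(( (total + 1) / 2 ))
    head -n "$half" "$script" > "$script.first_half"
    # tail -n +N starts printing at line N, i.e. right after the first half.
    tail -n +"$((half + 1))" "$script" > "$script.second_half"
}
```

After running split_env_script ~/other_env, you would restore your environment in tcsh with source ~/my_env, source ~/other_env.second_half, and run the program. Whichever half turns out to hold the culprit can be split again in the same way.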

Identifying and Removing Unused Variables with a 'one-liner'

Unused struct members can clutter and obfuscate Specman code. For the purpose of eliminating this annoyance I have created the following one-liner that may help you identify unused Specman members.

grep -i --perl-regex -h '^ *[a-zA-Z][\w_]+ *:.*;' *e |           \
grep -v var | \
perl -ne \
's/^ *//; \
s/([\w_]+).*/$1/; \
print "$_";' | \
sort -u | \
perl -ne \
'chomp; \
$ret_val = `grep -h $_ *e | \
grep -v --perl-regex '"'"'^\\s*$_\\s*:.*;'"'"' | \
grep -v --perl-regex '"'"'^\\s*keep\\s*(soft\\s*)?$_.*==.*'"'"'`; \
if( $ret_val =~m/^\s*$/ ) { \
print "$_\n"; \
}'

This little 'one-liner' looks through all the *e files in your current directory, collects every struct or unit member, and then checks whether each member is used anywhere else in the code in that directory. It prints the members for which it finds no other use.
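The same two-step idea can be spelled out more readably as a function. This is only a sketch for a Bourne-style shell (sh or bash), not a drop-in replacement: unused_members is a hypothetical name, the files are passed as arguments instead of being globbed, it uses POSIX extended regular expressions rather than grep's Perl mode, and it matches names as whole words (grep -w), which the one-liner above does not do.

```shell
# unused_members FILE...
# Hypothetical helper: print members declared as "name : type;" that appear
# on no line other than their declaration or a keep constraint on them.
unused_members() {
    # Step 1: collect candidate member names from declaration-like lines,
    # skipping local "var" declarations.
    grep -h -E '^ *[A-Za-z][A-Za-z0-9_]* *:.*;' "$@" |
        grep -v -w 'var' |
        sed -E 's/^ *([A-Za-z][A-Za-z0-9_]*).*/\1/' |
        sort -u |
    while read -r name; do
        # Step 2: find lines that mention the name but are neither its
        # declaration nor a "keep [soft] name ... ==" constraint.
        uses=$(grep -h -w -- "$name" "$@" |
            grep -v -E "^[[:space:]]*$name[[:space:]]*:.*;" |
            grep -v -E "^[[:space:]]*keep[[:space:]]*(soft[[:space:]]*)?$name.*==")
        if [ -z "$uses" ]; then
            echo "$name"
        fi
    done
}
```

It would be called as unused_members *.e. Like the one-liner, this is a text search and not a parser, so treat its output as a list of candidates to review by hand.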


Aliases for the C shell

Pull up tkdiff for each file that differs between two directories:

alias diff_dirs 'diff --brief "\!:1" "\!:2" |\\
perl -pne '"'"'if (m/([\w-_\.\/]+) +and +([\w-_\.\/]+) +differ/)\\
{ `tkdiff "$1" "$2"` ;}'"'"
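To see which file pairs the alias would open without launching tkdiff, the parsing step can be tried on its own. The sketch below is for a Bourne-style shell (sh or bash), and differing_files is a hypothetical name. It assumes that diff reports differences as "Files X and Y differ" (LC_ALL=C is set to keep that message in English) and that file names contain neither spaces nor the text " and ".

```shell
# differing_files DIR1 DIR2
# Hypothetical helper: print "left right" for each pair of files that
# diff reports as different between the two directories.
differing_files() {
    # diff exits non-zero when it finds differences; that is expected here,
    # so its exit status is deliberately ignored.
    { LC_ALL=C diff -q "$1" "$2" || true; } |
        sed -n 's/^Files \(.*\) and \(.*\) differ$/\1 \2/p'
}
```

The pairs could then be fed to tkdiff one at a time, for example with differing_files dir1 dir2 | while read -r left right; do tkdiff "$left" "$right" < /dev/null; done. As with the alias, diff is not run recursively, so only files directly inside the two directories are compared.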


Pull up tkdiff for each modified file in the current Subversion working copy, comparing it against its pristine checked-out version:

alias svn_diff_dirs 'svn status |\\
grep M | \\
perl -ne '"'"'s/^M *//; \\
chomp; \\
`tkdiff .svn/text-base/$_.svn-base $_`'"'"
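One thing to watch in this alias is that grep M keeps any status line containing an M anywhere, such as an unversioned file named Makefile. A stricter filter might look like the sketch below, written for a Bourne-style shell (sh or bash). modified_paths is a hypothetical name, and the sketch assumes the path is the last whitespace-separated field on each svn status line, so paths with spaces are not handled.

```shell
# modified_paths
# Hypothetical helper: read "svn status" output on stdin and print the
# path of every entry whose status line starts with M (modified).
modified_paths() {
    # /^M/ anchors the match to the first status column;
    # $NF is the last field on the line, taken here to be the path.
    awk '/^M/ { print $NF }'
}
```

Running svn status | modified_paths prints one modified path per line, which can be checked by eye before handing the files to tkdiff.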