Thursday, September 18, 2008

How to debug environments in Unix or Linux

How to tkdiff two environments

This scenario occurs frequently:
Frank can run program Chutney Aggregator, but Bob cannot.

What differs between Frank's and Bob's environment variables such that Frank can run Chutney Aggregator and Bob cannot?

If you run the env command, it prints all of your environment variables, for example:

KDE_MULTIHEAD=false
SSH_AGENT_PID=5349
DM_CONTROL=/var/run/xdmctl
TERM=xterm
SHELL=/bin/bash
XDM_MANAGED=/var/run/xdmctl/xdmctl-:0,maysd,mayfn,sched,rsvd,method=classic
...


Unfortunately, comparing the raw output of env from Frank's and Bob's shells would be extremely tedious, because env prints the variables in no particular order.

In order to be able to compare the environments, we need to sort the output from env.

env | sort

COLORTERM=
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-aEV1NDyS2z,guid=c31c98d5c8ad88de33dbfa0046ec2f30
DESKTOP_SESSION=default
DISPLAY=:0.0
...


If we put the output of this one-liner into a file, we can use tkdiff to compare Frank's and Bob's environments visually.

Frank performs this command in his terminal:

 env | sort > ~/tmp.shell_env


Bob runs the same command in his terminal:

 env | sort > ~/tmp.shell_env


Now, we can compare the two environments:

 tkdiff ~Frank/tmp.shell_env ~Bob/tmp.shell_env
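
If tkdiff is not available, a plain-text comparison can work too. The following is a minimal sketch for a Bourne-style shell (sh or bash), not tcsh; the function names dump_env and env_only_differences are made up for illustration. It sorts under LC_ALL=C so that both dumps use the same ordering regardless of each user's locale, then uses comm to drop the lines the two dumps share.

```shell
# dump_env FILE: write the current environment to FILE, sorted bytewise.
# (Hypothetical helper name; LC_ALL=C keeps the order identical for every user.)
dump_env() {
    env | LC_ALL=C sort > "$1"
}

# env_only_differences FILE_A FILE_B: print only the lines that differ.
# Lines unique to FILE_A start at column 0; lines unique to FILE_B are
# indented with one tab. Both files must have been sorted as in dump_env.
env_only_differences() {
    LC_ALL=C comm -3 "$1" "$2"
}

# Usage (each user runs the first line, then anyone runs the second):
#   dump_env ~/tmp.shell_env
#   env_only_differences ~Frank/tmp.shell_env ~Bob/tmp.shell_env
```

Because comm only reports whole lines, a variable whose value differs shows up twice: once unindented with Frank's value and once indented with Bob's.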



I find that the above process cracks most environment issues.


How to import a different environment

In tcsh, if you have to copy another person's environment variables wholesale, you can use this one-liner to create a tcsh script of setenv commands from their env, which you can then source in another environment to reproduce their issues.


env | sort | perl -ne 's/'"'"'/'"'"'"'"'"'"'"'"'/g; \
s/^([\w_]+)=(.*)/setenv \1 '"'"'\2'"'"'/; print $_' > ~/tmp
source ~/tmp
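
For readers whose shell is sh or bash rather than tcsh, the same idea can be sketched with export instead of setenv. This is an illustrative sketch, not the one-liner above: the function name env_to_export_script is hypothetical, and it assumes every value fits on one line (lines that do not look like NAME=value are skipped).

```shell
# env_to_export_script: read "NAME=value" lines on stdin and print
# "export NAME='value'" lines that sh or bash can source.
# Assumption: values are single-line; anything else is dropped.
env_to_export_script() {
    # 1st expression: rewrite each ' as '\'' so it survives single-quoting.
    # 2nd expression: wrap the value in single quotes and prefix "export";
    #                 only lines that matched are printed (-n ... p).
    sed -n \
        -e "s/'/'\\\\''/g" \
        -e "s/^\([A-Za-z_][A-Za-z0-9_]*\)=\(.*\)\$/export \1='\2'/p"
}

# Usage (the other person runs the first line, you run the second):
#   env | sort | env_to_export_script > ~/tmp
#   . ~/tmp
```

Single quotes are used around every value so that characters such as $ and backticks are taken literally when the script is sourced.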



If you back up your own environment in the same way and then load parts of their environment iteratively (a binary search works well), you can pinpoint the difference that causes one environment to succeed and the other to fail.


Root-Causing an Environment Issue

If you do have an environment dump from a person who is able to successfully run the program that you want to run, then you can possibly determine the root cause of the issue by iteratively importing their environment and testing whether you are now able to run the program.

To start, confirm that the problem you're encountering really is caused by environment variables. First, back up your environment as a script that you can use to restore it:


 env | sort | perl -ne 's/'"'"'/'"'"'"'"'"'"'"'"'/g; \
s/^([\w_]+)=(.*)/setenv \1 '"'"'\2'"'"'/; print $_' > ~/my_env



Take the dump that the other person created from their environment and turn it into a script, as described in "How to import a different environment" above. The steps below assume that script is saved as ~/other_env.

Import their environment by sourcing the script you created.


source ~/other_env



Run the program and see if it passes. If it does, you know that the root cause of your issues is the difference between your environment variables and the other person's environment variables.

Now comes the binary search part.

Restore your environment:


source ~/my_env



*Comment out the first half of the other person's environment script.* Then source their modified environment script:


source ~/other_env



Run the program. If the program runs, then the environment variable that was causing the issue is in the 2nd half of the ~/other_env script. *You have just eliminated the first half of the script as the source of your environment problem!*

If the program doesn't run, the bad environment variable is in the 1st half of the ~/other_env script. *You have just eliminated the 2nd half of the script as the source of your environment problem!* Uncomment the first half of the script and comment out the 2nd half.

Keep repeating this process on the remaining uncommented variables: restore your environment ( source ~/my_env ), then overwrite it with a different piece of ~/other_env, until you find the problem environment variable. Note that sourcing ~/my_env only resets variables you already had; a variable that exists only in ~/other_env stays set until you unsetenv it or start a fresh shell.
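
The halving procedure above can also be automated. The sketch below is written for sh or bash rather than tcsh, and it rests on several assumptions: the other person's script has one export NAME='value' line per variable, exactly one of those lines is what makes the program work, and the program reports success through its exit status. The name bisect_env is hypothetical. Each trial runs in a subshell, so your own environment is never modified and no restore step is needed.

```shell
# bisect_env FILE CMD [ARGS...]
# FILE: script with one "export NAME='value'" line per variable (assumed).
# CMD:  command that succeeds (exit 0) only when the right variable is set.
# Prints the single line of FILE that makes CMD succeed.
bisect_env() {
    file=$1; shift
    lo=1
    hi=$(( $(wc -l < "$file") ))
    while [ "$lo" -lt "$hi" ]; do
        mid=$(( (lo + hi) / 2 ))
        # Load only lines lo..mid inside a subshell, then run the check there.
        if ( eval "$(sed -n "${lo},${mid}p" "$file")"; "$@" ) >/dev/null 2>&1
        then
            hi=$mid             # the needed line is somewhere in lo..mid
        else
            lo=$(( mid + 1 ))   # otherwise it must be in mid+1..hi
        fi
    done
    sed -n "${lo}p" "$file"
}

# Usage (your_program is a placeholder for the failing program):
#   bisect_env ~/other_env.sh your_program --some-args
```

If more than one variable is required at the same time, this simple halving will not find them; in that case the manual procedure, where you can keep several sections enabled at once, is the safer route.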

Tricks

Identifying and Removing Unused Variables with a 'one-liner'

Unused struct members can clutter and obfuscate Specman code. To help eliminate this annoyance, I have created the following one-liner, which lists struct members that appear to be unused.

grep -i --perl-regex -h '^ *[a-zA-Z][\w_]+ *:.*;' *e |           \
grep -v var | \
perl -ne \
's/^ *//; \
s/([\w_]+).*/$1/; \
print "$_";' | \
sort -u | \
perl -ne \
'chomp; \
$ret_val = `grep -h $_ *e | \
grep -v --perl-regex '"'"'^\\s*$_\\s*:.*;'"'"' | \
grep -v --perl-regex '"'"'^\\s*keep\\s*(soft\\s*)?$_.*==.*'"'"'`; \
if( $ret_val =~m/^\s*$/ ) { \
print "$_\n"; \
}'

This little 'one-liner' looks through all the *e files in your current directory, collects every struct or unit member, and then checks whether each member is used anywhere else in the code in that directory.
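
For readers who find the quoting above hard to follow, here is a simplified sketch of the same two-pass idea as an sh function. It is not a drop-in replacement: the name list_unused_members is made up, it only looks at files ending in .e in the directory you pass in, and it omits the one-liner's special handling of keep constraints, so a member that appears only in a keep line counts as used here.

```shell
# list_unused_members DIR
# Pass 1: collect candidate member names from lines shaped like "name : type;".
# Pass 2: for each name, count lines that mention it other than its declaration.
list_unused_members() {
    dir=$1
    grep -h -E '^ *[A-Za-z][A-Za-z0-9_]* *:.*;' "$dir"/*.e |
        grep -v -w 'var' |
        sed -E 's/^ *([A-Za-z][A-Za-z0-9_]*).*/\1/' |
        sort -u |
        while read -r name; do
            # grep -c prints 0 but exits non-zero when nothing is left,
            # hence the "|| true".
            uses=$(grep -h -w -- "$name" "$dir"/*.e |
                   grep -c -v -E "^ *$name *:.*;" || true)
            if [ "$uses" -eq 0 ]; then
                echo "$name"
            fi
        done
}

# Usage:
#   list_unused_members .
```

Like the original, this is a text search, not a parser, so treat its output as a list of candidates to inspect by hand.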

Aliases

Aliases for the C shell (csh/tcsh).

diff_dirs

Pull up tkdiff for each file that differs between two directories:

alias diff_dirs 'diff --brief "\!:1" "\!:2" |\\
perl -pne '"'"'if (m/([\w-_\.\/]+) +and +([\w-_\.\/]+) +differ/)\\
{ `tkdiff "$1" "$2"` ;}'"'"
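
A rough Bourne-shell counterpart of this alias, for readers not using csh, might look like the following. It is a sketch under assumptions: DIFFTOOL is a hypothetical override variable (defaulting to tkdiff), diff is assumed to print its usual English "Files X and Y differ" lines, and paths that themselves contain " and " will confuse the parsing.

```shell
# diff_dirs DIR_A DIR_B: open a diff tool on every file that differs.
# DIFFTOOL (hypothetical) selects the tool; it defaults to tkdiff.
diff_dirs() {
    tool=${DIFFTOOL:-tkdiff}
    # LC_ALL=C forces the English "Files X and Y differ" wording.
    LC_ALL=C diff -q "$1" "$2" |
        while IFS= read -r line; do
            case $line in
                "Files "*" and "*" differ")
                    pair=${line#Files }      # drop the leading "Files "
                    pair=${pair% differ}     # drop the trailing " differ"
                    # Split at the first " and "; keep stdin away from the tool.
                    "$tool" "${pair%% and *}" "${pair#* and }" </dev/null
                    ;;
            esac
        done
}

# Usage:
#   diff_dirs old_tree new_tree
```

Lines such as "Only in DIR: file" are ignored, since there is nothing to compare them against.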



svn_diff_dirs

Run tkdiff on each modified file in the current directory against its pristine copy from the last checkout or update:

alias svn_diff_dirs 'svn status |\\
grep "^M" | \\
perl -ne '"'"'s/^M *//; \\
chomp; \\
`tkdiff .svn/text-base/$_.svn-base $_`'"'"