Saturday 20 June 2009

MSI PR200WX-058EU sleep - 91a6c462b02d8dc02dbe95e5a407d78078a38d01 is first bad commit

Nailed it!

After a rocky start, I managed to find the commit that broke sleep for my laptop.


91a6c462b02d8dc02dbe95e5a407d78078a38d01 is first bad commit
commit 91a6c462b02d8dc02dbe95e5a407d78078a38d01
Author: H. Peter Anvin
Date: Wed Jul 11 12:18:57 2007 -0700

Use the new x86 setup code for x86-64; unify with i386

This unifies arch/*/boot (except arch/*/boot/compressed) between
i386 and x86-64, and uses the new x86 setup code for x86-64 as well.

Signed-off-by: H. Peter Anvin
Signed-off-by: Linus Torvalds



Simply reverting this commit wouldn't fix the problem entirely, since the screen always stayed blank even after a successful resume. On the other hand, the test script was written to do things after resume that leave traces on the hard disk, so after the next reboot I could tell whether the last sleep/resume cycle had succeeded or not.


Great. Oh, and git bisect rules!




Now, if you want to track down a regression in the kernel and are curious how I automated the process, here are the small scripts I used:

  • linux-build - a wrapper script around make-kpkg that builds .deb packages of the Linux kernels I build; I had been using it long before this bisect, but I modified it so that the kernels are clearly versioned and also indicate the commit they correspond to
  • sleepit - a script that automates the actions needed to test a kernel; it is really trivial and highly specialized for sleep/resume debugging; it expects to be run in the directory where you later want to collect the dmesg outputs
  • sleeptest - a wrapper script that is smart enough to detect whether the current kernel is a kernel to be tested or a stable (regular) one
    • if the kernel is a stable one:
      • it looks for the marks left by the last test kernel and, depending on them, marks that kernel bad or good in the bisect; this results in a new checkout to be processed or, if the bad commit has been identified, the script stops
      • for a new bisect step, the new checkout is cleaned up, patched and built, then the script installs the new linux-image .deb[1] and runs update-grub[2], leaving the reboot command to my discretion in case something went awry; a failure to compile the kernel automatically would have dropped me into an interactive console, where I would have had to do manually the steps needed to get ready to boot into the next kernel
    • if the kernel is a test kernel, it runs the sleepit script
The main script is sleeptest, which is run as root to allow the sleep commands, the installation of the kernel and update-grub; the build itself is done via su as my user.
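
The verdict step described above boils down to checking two marker files. Here is a minimal sketch of that decision, assuming the marker protocol used by sleepit: a "failed" file written before suspend and removed after a successful resume, plus a log of the kernels that did resume. The function name `verdict` and the idea of passing both paths as arguments are invented for this example, so the logic can be tried on temporary files instead of files in /.

```sh
#!/bin/sh
# Sketch only: decide what to tell git bisect about the last test kernel.
# verdict FAILED_FILE RESUMED_FILE
#   prints "bad <kernel>"  if the failure marker is still there,
#   prints "good <kernel>" if the resume log has an entry,
#   prints "unknown" and returns 1 if neither file says anything.
verdict () {
    failed_file=$1
    resumed_file=$2
    if [ -f "$failed_file" ]; then
        # the marker was written before suspend and never removed
        echo "bad $(cat "$failed_file")"
    elif [ -s "$resumed_file" ]; then
        # the last line names the most recent kernel that resumed
        echo "good $(tail -n 1 "$resumed_file")"
    else
        echo "unknown"
        return 1
    fi
}
```

Called as `verdict /failed-resume /resumed`, it would report on the last test kernel; the first word of the output is what would be passed to `git bisect`.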

As a supplemental speed up, I configured libpam-usb to authenticate root and myself through a USB storage device, which is quite cool. I am still pondering if I should keep this enabled or migrate to something like libpam-rsa[*].

Of course, the scripts contain stuff hard-coded into them (my user name for one), but they can easily be modified to remove those limitations (generally they use variables).
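
One way to lift the hard-coded values, sketched under the assumption that environment variables are an acceptable place for them: each setting keeps the value used in the scripts as its default and can be overridden by the caller. `BUILD_USER` and `load_settings` are names invented for this example; they do not appear in the scripts.

```sh
#!/bin/sh
# Sketch: defaults taken from the scripts in this post, overridable from the environment.
# BUILD_USER and load_settings are hypothetical names, not part of the original scripts.
load_settings () {
    # ":=" assigns the default only when the variable is unset or empty
    : "${BUILD_USER:=eddy}"
    : "${SOURCEDIR:=/home/$BUILD_USER/usr/src/linux/linux-2.6}"
    : "${RESULTSDIR:=/root/var/debug/sleep/regression}"
}

load_settings
echo "building as $BUILD_USER in $SOURCEDIR"
```

A script starting with this block could be run with, say, `SOURCEDIR=/tmp/linux` in its environment; the `su -c '...' eddy` calls would then need `"$BUILD_USER"` in place of the literal user name.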


linux-build


#!/bin/sh
# License: GPLv2+/MIT
# Author: Eddy Petrișor
#
# This script must be run from the kernel tree directory with:
# linux-build [--no-headers] [--rebuild]

FATTEMPT=../attempt

TARGETS="kernel-image kernel-headers modules_config modules"
[ "$1" = "--no-headers" ] && shift && TARGETS="$(echo $TARGETS | sed 's#kernel-headers ##')"

if [ -f "$FATTEMPT" ]
then
    ATT=`cat "${FATTEMPT}"`
    if [ $# -eq 0 ]
    then
        ATT=`expr $ATT + 1`
        make-kpkg clean
    else
        if [ $# -eq 1 ] && [ "$1" = '--rebuild' ]
        then
            # nothing to do, we are already set
            echo 'Preparing for rebuild'
        else
            echo 'Illegal parameters'
            exit 2
        fi
    fi
else
    ATT=1
fi

# no problem if is rewritten on rebuild
echo "$ATT" >$FATTEMPT

# must define MODULE_LOC for mol module compilation
DIR=`pwd`
cd ..
MODULE_LOC=$(pwd)/modules
# this didn't work
# export ALL_PATCH_DIR=$(pwd)/linux-patches
cd ${DIR}

echo "Modules should be here: ${MODULE_LOC}"
echo "Stop by ctrl+c, if the independent modules aren't there"

# press ctrl+c, if needed -- disabled for now
#read

export MODULE_LOC
export CONCURRENCY_LEVEL=$(grep -c 'processor' /proc/cpuinfo)

[ -d .git ] && PREFIX="g$(git log --pretty=oneline --max-count=1 | cut -c 1-8)-" || PREFIX=""
APPEND=$PREFIX$(hostname)

#time make-kpkg --rootcmd fakeroot --revision ${ATT} --stem linux --append-to-version -`hostname` --config menuconfig --initrd --uc --us kernel-image kernel-headers modules_config modules
#time make-kpkg --rootcmd fakeroot --revision ${ATT} --stem linux --append-to-version -`hostname` --added-patches 'ata_piix-ich8-fix-map-for-combined-mode.patch,ata_piix-ich8-fix-native-mode-slave-port.patch' --config silentoldconfig --initrd --uc --us kernel-image kernel-headers modules_config modules
time make-kpkg --rootcmd fakeroot --revision ${ATT} --stem linux --append-to-version -$APPEND --config silentoldconfig --initrd --uc --us $TARGETS
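
The commit prefix that linux-build puts into the version string is what sleeptest later reads back from `uname -r`. A small sketch of that round trip, needing neither git nor a kernel tree: the base version "2.6.22" is a made-up example, the hash is the commit from this post, "heidi" is the host name that appears in sleeptest, and the helper names `make_append` and `rev_from_release` are invented here.

```sh
#!/bin/sh
# Sketch of the version-string round trip (hypothetical helper names).

# Build the string passed to --append-to-version (without the leading dash).
make_append () {   # make_append FULL_COMMIT_HASH HOSTNAME
    echo "g$(echo "$1" | cut -c 1-8)-$2"
}

# Recover the short hash from a kernel release string,
# using the same sed expression as get_rev_from_unamer.
rev_from_release () {   # rev_from_release RELEASE HOSTNAME
    echo "$1" | sed 's#.*-g\([0-9a-f]*\)-'"$2"'#\1#'
}

append=$(make_append 91a6c462b02d8dc02dbe95e5a407d78078a38d01 heidi)
release="2.6.22-$append"          # assumed shape of `uname -r` on a test kernel
echo "$release"                   # prints 2.6.22-g91a6c462-heidi
rev_from_release "$release" heidi # prints 91a6c462
```

A release string without the `-g<hash>-<host>` part passes through sed unchanged, so it never compares equal to a short commit hash.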


sleepit


#!/bin/sh

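FAILEDRESUME=/failed-resume
RESUMED=/resumed

modprobe i915
invoke-rc.d acpid stop
# leave a marker naming this kernel; it is removed only after a successful resume
echo "$(uname -r)" > $FAILEDRESUME
dmesg >dmesg_before_$(uname -r)
# suspend to RAM; if the resume hangs, nothing below this line runs
echo mem > /sys/power/state
dmesg >dmesg_after_$(uname -r)
sync
echo 'resumed, oh my god' > resumed
# record the successful resume and drop the failure marker
echo "$(uname -r)" >> $RESUMED
rm -f $FAILEDRESUME
sync
sleep 10
reboot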



sleeptest


#!/bin/sh

RESULTSDIR=/root/var/debug/sleep/regression
UNAMER="$(uname -r)"
FAILEDSLEEPFILE=/failed-resume
RESUMED=/resumed
SOURCEDIR=/home/eddy/usr/src/linux/linux-2.6

check_same_commit ()
{
    local COMMIT
    COMMIT=$(git log --pretty=oneline --max-count=1 | cut -c 1-8)
    [ "$COMMIT" = "$1" ] && return 0 || return 1
}

get_rev_from_unamer ()
{
    echo "$1" | sed 's#.*-g\([0-9a-f]*\)-heidi#\1#'
}

mark_bad ()
{
    cd $SOURCEDIR
    su -c 'git reset --hard HEAD' eddy
    su -c 'git bisect bad' eddy
    cd -
}

mark_good ()
{
    cd $SOURCEDIR
    su -c 'git reset --hard HEAD' eddy
    su -c 'git bisect good' eddy
    cd -
}

compile_next ()
{
    cd $SOURCEDIR
    if [ -f $FAILEDSLEEPFILE ] ; then
        LKVER=$(cat $FAILEDSLEEPFILE)
    else
        LKVER=$(tail -n 1 $RESUMED)
    fi
    PREVCOMMIT=$(get_rev_from_unamer "$LKVER")

    if check_same_commit "$PREVCOMMIT" ; then
        echo "It looks like you got your result!"
        exit 1337 # of course $? isn't 1337, but anyways
    fi

    su -c 'make clean && rm -fr debian && git reset --hard HEAD && patch -p1 < [...]

[... the rest of compile_next and the beginning of the main part of the script are missing here ...]

        LKVER="$(cat [...]
        echo "Marking >>> BAD <<< $LKVER ($(get_rev_from_unamer $LKVER))"
        mark_bad
    else
        LKVER=$(tail -n 1 $RESUMED)
        echo "Marking >>> good <<< $LKVER ($(get_rev_from_unamer $LKVER))"
        mark_good
    fi
    compile_next && \
    cd $SOURCEDIR/.. && \
    echo 'Installing the linux-image and running update-grub && reboot' && \
    dpkg -i $(ls linux-image-*_$(cat attempt)_*.deb) && \
    update-grub
fi


You have my permission to use, modify and redistribute these scripts or modified versions based on these under the terms of the MIT license.


[*] because the libpam-rsa package seems to be unmaintained (especially upstream), while libpam-usb seems to be inactive (maybe upstream considers it finished?)

[1] I didn't automate the removal of the previous test kernel, but that could have been done easily

[2] I didn't create a custom grub section that would make the test kernels boot by default at the next reboot, since I considered that too cumbersome for the moment (although I had /vmlinuz symlinks); selecting the kernel manually was simpler
