Salt * Wet * Bytes

January 18, 2008

Windebug: Windows 2003 BSODing frequently with different errors

Filed under: WinHardware, Windebug — saltwetfish @ 7:54 am

Recently we found a server (HP Proliant DL580G2) rebooting itself almost every day because of BSODs. My first instinct was to get the server patched with all the latest patches. In particular, I suspected the SATA or SCSI drivers, as we had seen other Windows 2003 servers with similar issues: HP Proliant servers running storport were crashing, and the problem was resolved by patching the HP drivers and updating W2k3 with the hotfixes.

This was also the original diagnosis and recommendation from Microsoft support:

1. The stop code is 0x000000C2, which indicates that the current thread is making a bad pool request:

1: kd> .bugcheck
Bugcheck code 000000C2
Arguments 00000007 0000121a 00000800 e9a15d90

2. The 4th parameter is the address of the pool block that got corrupted:
*e9a15d78 size: 288 previous size: 10 (Allocated) *Toke (Protected)

1: kd> dc e9a15d90
e9a15d90 00000000 00000000 00000000 00000001 …………….
e9a15da0 00000000 00000000 bad0b0b0 82100000 …………….
e9a15db0 00000000 00000000 61766441 20206970 ……..Advapi

3. The previous pool block should reach e9a15d78+288=e9a16000:

1: kd> dc e9a16000
e9a16000 e8000000 fffffc84 c01bd8f7 00074992 ………….I..
e9a16010 0054004e 0073005c 00730079 00650074 N.T.\.s.y.s.t.e.
e9a16020 0033006d 005c0032 006c0064 0063006c m.3.2.\.d.l.l.c.
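The arithmetic in step 3 and the text visible in the dump can be checked with a short Python sketch. It assumes that the `size: 288` value is hexadecimal (as the addition in step 3 implies) and that `dc` prints each dword in little-endian order; the variable names are mine, not from the debugger.

```python
import struct

# Values copied from the debugger output above.
block_start = 0xE9A15D78  # address of the pool block
block_size = 0x288        # "size: 288", assumed to be hexadecimal

block_end = block_start + block_size
print(hex(block_end))

# Dwords from the second and third rows of the `dc e9a16000` output.
dwords = [
    0x0054004E, 0x0073005C, 0x00730079, 0x00650074,
    0x0033006D, 0x005C0032, 0x006C0064, 0x0063006C,
]

# Re-pack each dword as little-endian bytes, then read them as UTF-16.
raw = b"".join(struct.pack("<I", d) for d in dwords)
text = raw.decode("utf-16-le")
print(text)
```

The end address falls on a 4 KB page boundary, and the decoded text is a fragment of a UTF-16 file path.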

4. A search of memory shows that the address could be referenced by a pool tagged with $CPH, which should be owned by CPQPHP:

1: kd> !for_each_module s-a @#Base @#End "$CPH"
f34d8869 24 43 50 48 ff 74 24 08-6a 01 ff 15 48 01 4d f3 $CPH.t$.j…H.M.

1: kd> u f34d8869
*** ERROR: Module load completed but symbols could not be loaded for CPQPHP.SYS
CPQPHP+0xc869:
f34d8869 2443 and al,43h
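To make step 4 concrete, here is a small Python sketch of how the four bytes found by the search map to the `$CPH` tag, and how a module base follows from the `CPQPHP+0xc869` line. The base address is arithmetic on the values shown above, not something the debugger printed, and the variable names are my own.

```python
# Address of the search hit and its offset inside CPQPHP.SYS,
# both copied from the debugger output above.
hit_address = 0xF34D8869
offset_in_module = 0xC869

# First four bytes of the hit: 24 43 50 48.
tag_bytes = bytes.fromhex("24435048")
tag = tag_bytes.decode("ascii")

# The same four bytes read as a little-endian 32-bit value.
tag_as_dword = int.from_bytes(tag_bytes, "little")

# Derived, not printed by the debugger: where the module would start.
module_base = hit_address - offset_in_module

print(tag, hex(tag_as_dword), hex(module_base))
```

Note that the `u` output is simply the first two tag bytes (24 43) read as an instruction, `and al,43h`, so the hit is most likely the tag constant embedded in the driver rather than the start of a real instruction.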

I got in contact with the server owner and scheduled an upgrade, but while I was doing the upgrade, the server just kept crashing. This was a bit strange, as it was not running much of anything at that point. (more…)

January 7, 2008

Windebug: BSOD 0x0000001e (Non-paged pool empty)

Filed under: Windebug — saltwetfish @ 8:13 am

One of our application servers had been BSODing for the past 2 weeks with a similar error. Before each crash, the event log filled up with events complaining “The server was unable to allocate from the system nonpaged pool because the pool was empty”.

The BSOD error was 0x0000001e (0xc0000006, 0xa0001d52, 0x00000000, 0x00370cb4).

The application team was quite insistent that their application couldn’t be the cause of the error, so I raised an issue with Microsoft to have them take a look. Here was the result:

> BSOD code was as follows

ExceptionAddress: a0001d52 (win32k!XDCOBJ::bCleanDC+0x00000010)
ExceptionCode: c0000006 (In-page I/O error)
ExceptionFlags: 00000000
NumberParameters: 3
Parameter[0]: 00000000
Parameter[1]: 00370cb4
Parameter[2]: c000009a
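The exception record above can be read with a small lookup, sketched here in Python. The two status names are the standard NTSTATUS names for these values, and the parameter layout (access type, faulting address, underlying status) is the documented one for in-page errors; the function and dictionary names are hypothetical helpers of mine.

```python
# Standard NTSTATUS names for the two codes that appear in the record.
NTSTATUS_NAMES = {
    0xC0000006: "STATUS_IN_PAGE_ERROR",
    0xC000009A: "STATUS_INSUFFICIENT_RESOURCES",
}

# First parameter of an in-page error: the kind of access that faulted.
ACCESS_TYPES = {0: "read", 1: "write", 8: "execute (DEP)"}


def describe_in_page_error(exception_code, parameters):
    """Turn the raw exception fields into readable values."""
    access, address, status = parameters
    return {
        "exception": NTSTATUS_NAMES.get(exception_code, hex(exception_code)),
        "access": ACCESS_TYPES.get(access, hex(access)),
        "address": hex(address),
        "underlying_status": NTSTATUS_NAMES.get(status, hex(status)),
    }


# Values copied from the exception record above.
record = describe_in_page_error(0xC0000006, [0x00000000, 0x00370CB4, 0xC000009A])
for key, value in record.items():
    print(f"{key}: {value}")
```

Read this way, the page-in failed on a read of address 0x370cb4 because the system was out of resources, which fits the nonpaged pool events logged before the crash.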

> A check with virtual memory shows that nonpaged pool was almost used up

0: kd> !vm

*** Virtual Memory Usage ***
Physical Memory: 524165 ( 2096660 Kb)
NonPagedPool Usage: 64660 ( 258640 Kb)
NonPagedPool Max: 68609 ( 274436 Kb)
********** Excessive NonPaged Pool Usage *****

(more…)
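How close to empty the pool was can be worked out from the two NonPagedPool lines. The sketch below assumes 4 KB pages, which matches the page and Kb figures in the `!vm` output; the variable names are mine.

```python
PAGE_KB = 4  # assumed page size; consistent with the Kb figures above

# Page counts copied from the !vm output.
usage_pages = 64660
max_pages = 68609

usage_kb = usage_pages * PAGE_KB
max_kb = max_pages * PAGE_KB
free_kb = max_kb - usage_kb
percent_used = 100 * usage_pages / max_pages

print(f"{usage_kb} Kb of {max_kb} Kb used ({percent_used:.1f}%), {free_kb} Kb free")
```

With roughly 94% of the pool in use and about 15 MB left, it is easy to see why allocations were failing.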

February 4, 2007

How to BSOD Windows with SAV file/folder exclusions

Filed under: Windebug, WindowsAdmin — saltwetfish @ 2:13 am

Symantec Antivirus keeps its list of files/folders to exclude in the registry.

This is no doubt a good and consistent practice; however, it also weakens the server, as users and administrators could unwittingly BSOD their machines.

On a normal Windows machine, this weakness doesn’t manifest so readily, but on machines where some folders contain thousands of files, it can be a problem. A large folder by itself is not the issue, but if there is a need to, say, exclude such a folder, one could accidentally select each individual file in that folder instead of just excluding the contents of the folder.

For example, someone comes to you to get a particular FTP folder excluded. That folder contains a lot of huge files, and the realtime scan is slowing down their process. The files are already prescanned elsewhere, so excluding them is not an issue. So you go into the SAV realtime configuration option and select that folder. The first visual is a plus (+) with a check mark (this creates an inherited exclusion). You click the folder again and see that now it is only a check mark (this creates an individual file/folder exclusion). “Hmmm… not sure which I should apply”, you think to yourself, “Should not matter too much, let’s try and see”. You hit Enter and SAV sort of freezes as it desperately tries to fill up the registry with entries for the thousands of files you just selected!

The next thing you know… Windows BSODs.

Well, it sort of happened to one of the servers I worked with! It was a dumb mistake on my part, because I had seen how SAV wrote file/folder exclusions to the registry, but I was experimenting with exclusions on 2 volumes and clicked OK too fast, before I could remove the selection on that huge ftp folder.
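The difference between the two kinds of exclusion is mostly one of scale, which the following Python sketch tries to show. It is only a model: the two functions are hypothetical and do not reflect how SAV actually stores its entries, and the folder is a temporary one filled with empty placeholder files.

```python
import tempfile
from pathlib import Path


def folder_exclusion(folder):
    # Hypothetical model of an inherited exclusion:
    # a single entry covers the folder and everything in it.
    return [str(folder)]


def per_file_exclusion(folder):
    # Hypothetical model of individual exclusions:
    # one entry for every file currently in the folder.
    return [str(p) for p in folder.iterdir() if p.is_file()]


with tempfile.TemporaryDirectory() as tmp:
    ftp_folder = Path(tmp)
    # Stand-in for a busy FTP folder with thousands of files.
    for i in range(2000):
        (ftp_folder / f"upload_{i:05d}.dat").touch()

    inherited = folder_exclusion(ftp_folder)
    individual = per_file_exclusion(ftp_folder)

print(f"inherited: {len(inherited)} entry, individual: {len(individual)} entries")
```

One extra click turns a single entry into one entry per file, and every one of those has to be written out.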

I think Symantec should move the exclusion list into a text file. This way, the most that can crash is SAV and the text file, not the whole of Windows via writes to the registry.

Incidentally, this happened because I was investigating what seems to be an issue with SAVCE10 where my folder exclusion doesn’t seem to work: SAV still scans the excluded folder. I have found nothing on the newsgroups or the Symantec site about it so far.
