Discussion:
Truncated stack trace in UMDH
Bo McIlvain
2009-11-06 21:07:02 UTC
This, I believe, is a tough one. I have a process control app that is
leaking memory; it communicates with OPC servers via COM. I have used UMDH
and LeakDiag to try to find the source of the leak, but can't get a handle
on it because the stack trace for the offending memory allocation doesn't
go back far enough to identify the caller. The code is written in VB6 and
runs on a Windows 2000 server. All old stuff, I know, but the client has
issues upgrading and I need to find a solution, if possible, that doesn't
entail rebooting every month.

Here's the UMDH traceback:

+ 3dce920 ( 12ae7436 - ed18b16) 20093b allocs BackTrace00072
+ 6a2d6 ( 20093b - 196665) BackTrace00072 allocations

ntdll!RtlDebugAllocateHeap+000000FD
ntdll!RtlAllocateHeapSlowly+0000005A
ntdll!RtlAllocateHeap+000008BE
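
The two summary lines can be cross-checked with a little arithmetic. The following is a minimal Python sketch, assuming the usual UMDH comparison layout in which the first line reports bytes as "delta (new - old)" and the second reports allocation counts the same way, all in hexadecimal. The function name parse_delta is made up for illustration.

```python
import re

# Two summary lines copied from the UMDH comparison output above.
UMDH_LINES = [
    "+ 3dce920 ( 12ae7436 - ed18b16) 20093b allocs BackTrace00072",
    "+ 6a2d6 ( 20093b - 196665) BackTrace00072 allocations",
]

# "+ <delta> ( <new> - <old>) ..." with every number in hexadecimal.
PATTERN = re.compile(r"\+\s*([0-9a-f]+)\s*\(\s*([0-9a-f]+)\s*-\s*([0-9a-f]+)\s*\)")


def parse_delta(line):
    """Return (delta, new, old) as integers, checking that delta == new - old."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized UMDH line: {line!r}")
    delta, new, old = (int(group, 16) for group in match.groups())
    if delta != new - old:
        raise ValueError("delta does not match new - old")
    return delta, new, old


bytes_delta, _, _ = parse_delta(UMDH_LINES[0])   # growth in bytes
count_delta, _, _ = parse_delta(UMDH_LINES[1])   # growth in allocation count
average_size = bytes_delta / count_delta

print(f"bytes added between snapshots: {bytes_delta}")
print(f"allocations added:             {count_delta}")
print(f"average allocation size:       {average_size:.1f} bytes")
```

Under those assumptions the trace accounts for roughly 62 MB of growth across about 435,000 allocations, averaging about 149 bytes each, which is consistent with the 0x90 to 0xB0 block sizes seen in the dumps.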

I've done pretty thorough searches on MSDN, and the only hint I saw was
someone who said this is common for C RTL malloc calls (the OPC interface
DLL, opcdauto.dll, is supplied by the OPC Foundation and is written in C++),
but when I try to use LeakDiag to track the C RTL allocator, it won't log
for me. I've also defined the environment variable to prevent BSTR caching,
as indicated in
http://blogs.msdn.com/larryosterman/archive/2004/09/28/235304.aspx, but that
had no effect.

In the UMDH memory dumps, one of the striking things about the memory being
allocated was that the size was always a multiple of 16 bytes: always
(hex) 90, A0, or B0. Multiples of 16 sounds like a variant array, no?
That would make sense for OPC, which passes back variant arrays through COM.
Could the out-of-process COM server be allocating memory in my process?

Any ideas would be very much appreciated.
Jialiang Ge [MSFT]
2009-11-09 07:47:06 UTC
Hello

According to Pavel Lebedynskiy's comments in this blog:
http://weblogs.asp.net/mdavey/archive/2004/03/09/86569.aspx

The truncated stack trace occurs because umdh doesn't handle FPO-optimized
functions (like msvcrt!malloc) or caching allocators like SysAllocString.
In these cases you either get a truncated stack trace that doesn't tell you
anything useful (FPO) or get a stack that seems to make sense but is wrong
(BSTRs).

Since you have mentioned that LeakDiag does not help, please consider these
windbg commands to diagnose the leaky app. The analysis is based on my
CppResourceLeaks sample in All-In-One Code Framework:
http://cfx.codeplex.com.

* !address

The !address extension command comes in very handy when you want to get a
quick overview of where the memory in your process is really located. The
command gives statistics, such as memory region usage in heaps, stack,
free, and so on.

For example (LeakHeapMemory()),

0:000> !address -summary
ProcessParametrs 00381a18 in range 00380000 0039c000
Environment 00380810 in range 00380000 0039c000

-------------------- Usage SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Pct(Busy) Usage
11c4000 ( 18192) : 00.87% 02.27% : RegionUsageIsVAD
4f132000 ( 1295560) : 61.78% 00.00% : RegionUsageFree
397000 ( 3676) : 00.18% 00.46% : RegionUsageImage
200000 ( 2048) : 00.10% 00.26% : RegionUsageStack
2000 ( 8) : 00.00% 00.00% : RegionUsageTeb
2f760000 ( 777600) : 37.08% 97.01% : RegionUsageHeap
0 ( 0) : 00.00% 00.00% : RegionUsagePageHeap
1000 ( 4) : 00.00% 00.00% : RegionUsagePeb
0 ( 0) : 00.00% 00.00% : RegionUsageProcessParametrs
0 ( 0) : 00.00% 00.00% : RegionUsageEnvironmentBlock
Tot: 7fff0000 (2097088 KB) Busy: 30ebe000 (801528 KB)

-------------------- Type SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Usage
4f132000 ( 1295560) : 61.78% : <free>
398000 ( 3680) : 00.18% : MEM_IMAGE
1be000 ( 1784) : 00.09% : MEM_MAPPED
30968000 ( 796064) : 37.96% : MEM_PRIVATE

-------------------- State SUMMARY --------------------------
TotSize ( KB) Pct(Tots) Usage
2f4cf000 ( 774972) : 36.95% : MEM_COMMIT
4f132000 ( 1295560) : 61.78% : MEM_FREE
19ef000 ( 26556) : 01.27% : MEM_RESERVE

Largest free region: Base 30f00000 - Size 2a970000 (697792 KB)

The Pct(Tots) column gives each entry's share of the total virtual address
space. The Pct(Busy) column gives each entry's share of the busy (non-free)
virtual memory.

RegionUsageIsVAD - memory allocated by VirtualAlloc in VMM
RegionUsageHeap - memory allocated by heap manager

From the output

11c4000 ( 18192) : 00.87% 02.27% : RegionUsageIsVAD
2f760000 ( 777600) : 37.08% 97.01% : RegionUsageHeap

we see that most of the busy memory was allocated by the heap manager rather
than directly through VirtualAlloc, so this is a heap memory leak.
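
The same conclusion can be reached mechanically. Below is a small Python sketch, not part of the original sample, that parses the non-zero Usage rows shown above and reports which region type dominates busy memory. The row layout is taken from this one listing and may differ between debugger versions; the name busy_shares is made up for illustration.

```python
import re

# Usage rows copied from the !address -summary output above.
USAGE_SUMMARY = """\
11c4000 ( 18192) : 00.87% 02.27% : RegionUsageIsVAD
4f132000 ( 1295560) : 61.78% 00.00% : RegionUsageFree
397000 ( 3676) : 00.18% 00.46% : RegionUsageImage
200000 ( 2048) : 00.10% 00.26% : RegionUsageStack
2000 ( 8) : 00.00% 00.00% : RegionUsageTeb
2f760000 ( 777600) : 37.08% 97.01% : RegionUsageHeap
1000 ( 4) : 00.00% 00.00% : RegionUsagePeb
"""

# <hex size> ( <size in KB>) : <pct total> <pct busy> : <region name>
ROW = re.compile(r"^([0-9a-f]+) \(\s*(\d+)\) : \S+ \S+ : (\w+)$")


def busy_shares(summary):
    """Map each non-free region type to its share of busy (non-free) memory."""
    sizes_kb = {}
    for line in summary.splitlines():
        match = ROW.match(line.strip())
        if match and match.group(3) != "RegionUsageFree":
            sizes_kb[match.group(3)] = int(match.group(2))
    busy_kb = sum(sizes_kb.values())
    return {name: kb / busy_kb for name, kb in sizes_kb.items()}


shares = busy_shares(USAGE_SUMMARY)
largest = max(shares, key=shares.get)
print(f"{largest}: {shares[largest]:.2%} of busy memory")
```

Recomputing the share from the KB column, instead of trusting the printed percentage, also confirms that the rows add up to the reported Busy total of 801528 KB.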

* !heap -s, !heap -a, and !heap -x -v

The !heap -s command gives a summary of the heaps in the process and helps
you spot the suspicious ones. Judging from the pattern of allocations in
the !heap extension command output (e.g. there are tons of blocks allocated
with the same user size), chances are good that we can locate the heap
blocks that are leaked. Furthermore, by looking at the heap block contents
(e.g. does it contain ASCII characters? does it correspond to the address
of some function / symbol?) we may see how / why the block was allocated.

Please note that because the heap manager changed a lot in Windows Vista
and later operating systems, the allocation of heap entries may vary. For
example, the allocated block may be bigger than requested, or the
allocations may gradually grow in size.

To prove that this is indeed a leak, you can search for references to the
block in the process's memory space. If these potentially leaked blocks
were being used (perhaps cached), there would need to be a reference
somewhere in memory that points to that heap block. If there are no
references, it means that we definitely have a leak. The !heap -x -v
command allows you to search the entire memory space of the process for the
presence of a specified address.

For example (LeakHeapMemory()),

0:000> !heap -s
  Heap     Flags   Reserv  Commit  Virt   Free  List   UCR  Virt  Lock  Fast
                    (k)     (k)    (k)     (k) length      blocks cont. heap
-----------------------------------------------------------------------------
00150000 00000002   16384  16352  16352      2     0     1    0      0   L
00250000 00008000      64     12     12     10     1     1    0      0
00380000 00001002      64     44     44      9     2     1    0      0   L
-----------------------------------------------------------------------------

Heap 00150000 has reserved and committed far more memory (about 16 MB) than
the other heaps, so it is the one to inspect.

0:000> !heap -a 00150000
...
00246240: 00200 . 00200 [01] - busy (1f4)
00246440: 00200 . 00200 [01] - busy (1f4)
00246640: 00200 . 00200 [01] - busy (1f4)
00246840: 00200 . 00200 [01] - busy (1f4)
00246a40: 00200 . 00200 [01] - busy (1f4)
00246c40: 00200 . 00200 [01] - busy (1f4)
00246e40: 00200 . 00200 [01] - busy (1f4)
00247040: 00200 . 00200 [01] - busy (1f4)
00247240: 00200 . 00200 [01] - busy (1f4)
...
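
When the listing is long, tallying the user sizes makes the pattern easier to see. Here is a minimal Python sketch, assuming the entry layout shown above, where the value in parentheses is the user size in hex. Only a few of the lines are reproduced, and tally_user_sizes is a made-up helper name.

```python
import re
from collections import Counter

# A few entries copied from the !heap -a listing above.
HEAP_ENTRIES = """\
00246240: 00200 . 00200 [01] - busy (1f4)
00246440: 00200 . 00200 [01] - busy (1f4)
00246640: 00200 . 00200 [01] - busy (1f4)
00246840: 00200 . 00200 [01] - busy (1f4)
"""

# <entry address>: <prev size> . <size> [<flags>] - busy (<user size>)
ENTRY = re.compile(
    r"^([0-9a-f]+): ([0-9a-f]+) \. ([0-9a-f]+) \[\w+\] - busy \(([0-9a-f]+)\)"
)


def tally_user_sizes(listing):
    """Count busy blocks per user-requested size (in bytes)."""
    counts = Counter()
    for line in listing.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            counts[int(match.group(4), 16)] += 1
    return counts


sizes = tally_user_sizes(HEAP_ENTRIES)
for size, count in sizes.most_common(3):
    print(f"{count} busy blocks of {size} (0x{size:x}) bytes")
```

A single size that dominates the tally, here 0x1f4 (500 bytes), is the kind of signal worth following up with the content and reference checks.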

We find a large number of blocks with the same user allocation size (1f4).
This is usually a good indicator that they are potentially leaked blocks.
The next step is to find out what these blocks actually contain. If we were
leaking memory, it would be reasonable to expect data related to our
application contained within those blocks:

0:000> db 00246c40+0x8
00246c48 41 6c 6c 2d 49 6e 2d 4f-6e 65 20 43 6f 64 65 20 All-In-One Code
00246c58 46 72 61 6d 65 77 6f 72-6b 00 00 00 00 00 00 00 Framework.......

Before we come to the conclusion that this is in fact a leak, we should
verify it by searching for references to the block in the process's memory
space.

0:000> !heap -x -v 00246c40+0x8
Entry     User      Heap      Segment       Size  PrevSize  Unused    Flags
-----------------------------------------------------------------------------
00246c40  00246c48  00150000  00150640       200       200         c  busy

Search VM for address range 00246c40 - 00246e3f :

The search yielded zero results. As stated before, if a currently allocated
heap block is not referenced anywhere in memory, we can safely say that we
are leaking that block.
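
Conceptually, the reference search looks for any pointer-sized value that falls inside the block's address range. The Python sketch below illustrates that idea on a made-up memory snapshot. It is not how the debugger is implemented, and the aligned 32-bit little-endian scan, the snapshot contents, and the name find_references are all assumptions for illustration.

```python
import struct


def find_references(memory, base_address, block_start, block_end):
    """Return addresses in `memory` holding a 32-bit pointer into the block.

    `memory` is a bytes snapshot that starts at `base_address`; every
    4-byte-aligned little-endian word is treated as a candidate pointer.
    """
    hits = []
    for offset in range(0, len(memory) - 3, 4):
        (value,) = struct.unpack_from("<I", memory, offset)
        if block_start <= value <= block_end:
            hits.append(base_address + offset)
    return hits


# Address range reported by !heap -x -v for the block above.
BLOCK_START, BLOCK_END = 0x00246C40, 0x00246E3F

# Hypothetical 16-byte snapshots located at 0x00380000 (invented data).
leaked = struct.pack("<4I", 0, 0x00150000, 0xDEADBEEF, 0x00246E40)
referenced = struct.pack("<4I", 0, 0x00246C48, 0xDEADBEEF, 0)

print(find_references(leaked, 0x00380000, BLOCK_START, BLOCK_END))
print(find_references(referenced, 0x00380000, BLOCK_START, BLOCK_END))
```

In the first snapshot no word points into the block (0x00246E40 is one byte past the end), which corresponds to the empty result above. In the second, the word at 0x00380004 holds the block's user address, so the block would still be reachable.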

* !heap -l

The !heap -l command causes the debugger to look for leaked heap blocks. It
automates the steps of dumping out the heap blocks (!heap -s and !heap -a)
and systematically searching for references to any potentially leaked
blocks (!heap -x -v). Please note that !heap -l does not work if full page
heap is enabled for the process.

For example (LeakHeapMemory()),

0:000> !heap -l
Searching the memory for potential unreachable busy blocks.
Heap 00150000
Heap 00250000
Heap 00380000
Scanning VM ...
Scanning references from 32822 busy blocks (16 MBytes) ....
Entry     User      Heap      Segment       Size  PrevSize  Unused    Flags
-----------------------------------------------------------------------------
00154640  00154648  00150000  00150000       200       200         c  busy
00154840  00154848  00150000  00150000       200       200         c  busy
00154a40  00154a48  00150000  00150000       200       200         c  busy
00154e40  00154e48  00150000  00150000       200       200         c  busy
00155040  00155048  00150000  00150000       200       200         c  busy
00155240  00155248  00150000  00150000       200       200         c  busy
00155640  00155648  00150000  00150000       200       200         c  busy
00155840  00155848  00150000  00150000       200       200         c  busy
..

29050 potential unreachable blocks were detected.

* Page heap and !heap -p -a

After you have identified a potential leak culprit using the above !heap
commands, it is useful to see which stack trace made the allocation in the
first place. With that, we can find out exactly what the code was doing and
what it was allocating.

First, enable stack tracing using Application Verifier. Second, run
!heap -p -a on the address that appears to be leaking. This shows general
information about the leaked address (such as which heap it's in and the
trace ID) along with the full stack trace of the code that made the
allocation. From there, reviewing the code to find the culprit is usually
straightforward.

Note that while page heap is enabled, !heap -s, !heap -a, !heap -x -v and
!heap -l may not work at all. In that case, find the culprit memory block
and run !heap -p -a on it directly.

For example (LeakHeapMemory()),

0:000> !address 0b768e08
Usage: PageHeap
Base Address: 0b768000
End Address: 0b769000
Region Size: 00001000
Type: 00020000 MEM_PRIVATE
State: 00001000 MEM_COMMIT
Protect: 00000004 PAGE_READWRITE
More info: !heap -p 0x150000
More info: !heap -p -a 0xb768e08

0:000> !heap -p -a 0xb768e08
address 0b768e08 found in
_DPH_HEAP_ROOT @ 151000
    in busy allocation (  DPH_HEAP_BLOCK:   UserAddr   UserSize -   VirtAddr   VirtSize)
                                 b72e700:    b768e08        1f4 -    b768000       2000
7c83d9aa ntdll!RtlAllocateHeap+0x00000e9f
0039fd2c vfbasics!AVrfpRtlAllocateHeap+0x000000b1
00401046 CppResourceLeaks!LeakHeapMemory+0x00000046

Here the stack trace shows that the 0x1f4-byte block was allocated by
CppResourceLeaks!LeakHeapMemory, so that is the function to review for the
leak.
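
With deeper stacks, the frame of interest is usually the first one outside the allocator layers. The following is a small Python sketch of that filtering step, using the frames from the trace above. The set of modules to skip and the name first_caller are assumptions and would need adjusting for a real application.

```python
# Frames copied from the !heap -p -a output above, innermost first.
FRAMES = [
    "7c83d9aa ntdll!RtlAllocateHeap+0x00000e9f",
    "0039fd2c vfbasics!AVrfpRtlAllocateHeap+0x000000b1",
    "00401046 CppResourceLeaks!LeakHeapMemory+0x00000046",
]

# Modules that only forward the allocation; this list is an assumption.
ALLOCATOR_MODULES = {"ntdll", "vfbasics", "verifier", "msvcrt", "kernel32"}


def first_caller(frames, skip_modules=ALLOCATOR_MODULES):
    """Return (module, function) of the first frame outside the allocator layers."""
    for frame in frames:
        _address, symbol = frame.split(maxsplit=1)
        module, _, rest = symbol.partition("!")
        if module.lower() not in skip_modules:
            return module, rest.split("+")[0]
    return None


print(first_caller(FRAMES))
```

For the trace above this picks out LeakHeapMemory in CppResourceLeaks. The same filter could be applied to many leaked addresses to group them by allocating function.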

Regards,
Jialiang Ge
Microsoft Online Community Support

=================================================
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
***@microsoft.com.

This posting is provided "AS IS" with no warranties, and confers no rights.
=================================================
Jialiang Ge [MSFT]
2009-11-13 08:39:16 UTC
Hello

How are you? May I know whether my last reply helped you?

Regards,
Jialiang Ge
Microsoft Online Community Support

