Opened 13 years ago

Last modified 7 years ago

#7 new defect

bdb: shared memory regions not working

Reported by: Yuri Dario Owned by:
Priority: major Milestone: db ports
Component: bdb Version:
Severity: low Keywords:
Cc:

Description

When a BDB database is opened or created without a memory-related flag, BDB uses shared read/write mmap() memory for its regions. mmap() does not fully support this mode; the cause is still under investigation.

See http://www.mathematik.uni-ulm.de/help/BerkeleyDB/ref/env/region.html for details about regions.

Specifying DB_PRIVATE makes BDB use malloc() memory. This works, but database access is then restricted to a single process.
DB_SYSTEM_MEM has not been investigated; it uses system shared memory (shmmem) and should work.

Change History (8)

comment:1 Changed 9 years ago by Silvan Scherrer

see also #32

comment:2 Changed 9 years ago by Silvan Scherrer

Milestone: db ports
Severity: low

comment:3 Changed 7 years ago by Silvan Scherrer

Did you try with the latest mmap() from libcx?

comment:4 Changed 7 years ago by Yuri Dario

JFYI (and for my own reference too), this is related to the following change in the rpm code:

http://trac.netlabs.org/rpm/changeset/589

comment:5 Changed 7 years ago by dmik

Somehow it works even worse with LIBCx than with the old MMAP. Even DB_PRIVATE fails here. Thorough log analysis shows that BDB does something really wrong: it tries to access the mapped region AFTER it has called munmap on it (to be exact, it is the /var/lib/rpm/Name database file that gets accessed this way). This clearly looks like a BDB bug to me.

comment:6 Changed 7 years ago by dmik

Another test that proves it is BDB and not LIBCx: if I comment out the munmap call (so that the file is never unmapped within a single rpm run), the crash "magically" disappears. I have no idea how it could ever work on other platforms, other than through a different execution flow there for some reason. However, on OS/2 it also used to work with the old MMAP, which should also just free the memory right away on munmap and which definitely shows the same execution flow on OS/2 as LIBCx. I will debug against MMAP to find out why.

comment:7 Changed 7 years ago by dmik

Okay, I found a problem with LIBCx mmap. BDB expects that mappings can be nested, i.e. mapping the same file twice requires a pair of munmap calls. However, our LIBCx mmaps are "flat", so an mmap call on the same file remaps it rather than creating a new mapping. I did it this way because I believed this is what the POSIX standard requires. As a result, any munmap call frees the mapping, even after a series of mmap calls on the same file/range. BDB calls mmap for /lib/rpm/Name twice, then calls munmap only once, so it still expects the file to be mapped and accesses it. But it crashes, since LIBCx has already freed the mapping.
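To make the difference concrete, here is a toy model of the two behaviours described above. It is only an illustration and not LIBCx code: the `toy_mapping` struct and the `flat_*` / `nested_*` helpers are hypothetical names, and the reference count is just one possible way to make a pair of mmap calls require a pair of munmap calls.

```c
#include <stddef.h>

/* Toy model, NOT LIBCx code: one record per mapped file/range. */
struct toy_mapping {
    int refs;    /* outstanding mmap calls (used by the nested variant only) */
    int mapped;  /* 1 while the memory would still be valid */
};

/* "Flat" behaviour: a repeated mmap just remaps, so the first munmap
 * frees the mapping no matter how many mmap calls preceded it. */
void flat_mmap(struct toy_mapping *m)   { m->mapped = 1; }
void flat_munmap(struct toy_mapping *m) { m->mapped = 0; }

/* Nested behaviour BDB expects: every mmap must be paired with a munmap,
 * and the memory goes away only when the last pair is closed. */
void nested_mmap(struct toy_mapping *m)
{
    m->refs++;
    m->mapped = 1;
}

void nested_munmap(struct toy_mapping *m)
{
    if (m->refs > 0 && --m->refs == 0)
        m->mapped = 0;
}
```

With two mmap calls followed by one munmap (the sequence BDB performs), the flat variant ends with `mapped == 0` while the nested variant still has `mapped == 1`.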

This nesting is basically the same as overlapping (the only difference being that nesting is a complete overlap, i.e. the file is entirely mapped twice or more). And we already have a ticket for the overlapping problem, https://github.com/bitwiseworks/libcx/issues/29. This is where it has to be fixed.

Note that the old MMAP implementation is free of this problem, since it takes a trivial approach where each mmap call results in a new memory region, so nesting/overlapping comes for free there (but this approach leads to many other problems, like different mappings going out of sync etc).
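For reference, a minimal sketch of the access pattern in question, assuming a POSIX system where each mmap call without MAP_FIXED returns an independent mapping. The function name, the temporary file under /tmp and the payload are made up for the test and have nothing to do with the actual rpm databases.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: maps the same file twice, unmaps only the first
 * mapping, then reads through the second one.
 * Returns 1 if the data is still readable, 0 on mismatch, -1 on setup error. */
int second_mapping_survives_munmap(void)
{
    char path[] = "/tmp/nested_mmap_XXXXXX";  /* assumed scratch location */
    const char payload[] = "bdb-region";
    const size_t len = sizeof(payload);
    char *first = MAP_FAILED;
    char *second = MAP_FAILED;
    int result = -1;

    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);  /* the file stays alive while fd is open */

    if (write(fd, payload, len) == (ssize_t)len) {
        first = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        second = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    }

    if (first != MAP_FAILED && second != MAP_FAILED) {
        /* Same sequence as in the log: two mmap calls, a single munmap,
         * then an access through the mapping that was not released. */
        if (munmap(first, len) == 0) {
            first = MAP_FAILED;
            result = (memcmp(second, payload, len) == 0) ? 1 : 0;
        }
    }

    /* Release whatever is still mapped. */
    if (first != MAP_FAILED)
        munmap(first, len);
    if (second != MAP_FAILED)
        munmap(second, len);
    close(fd);
    return result;
}
```

On a system with independent mappings this should return 1; with "flat" mappings as described above, the memcmp would touch memory that has already been freed.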


comment:8 Changed 7 years ago by dmik

The crash in DB_PRIVATE mode under LIBCx was fixed within https://github.com/bitwiseworks/libcx/issues/29, it works flawlessly here. However, removing DB_PRIVATE makes RPM malfunction. I created https://github.com/bitwiseworks/libcx/issues/32 to track this issue separately.

Note that there is also http://trac.netlabs.org/rpm/changeset/135 that needs to be rolled back once the problem with shared regions is completely fixed (within https://github.com/bitwiseworks/libcx/issues/32 or elsewhere).
