Discussion:
LMDB physical file size
Christian Sell
2015-11-09 17:47:17 UTC
Permalink
Hello,

we are using LMDB as the underlying storage engine for a lightweight + high
performance special-purpose object + mass data database. I have 2 questions
about the size of the physical file used by LMDB:

To create the environment, we are using a mapsize of 1 GiB and the flags
MDB_NOSUBDIR | MDB_NOLOCK. Under Linux, this results in one file with a size
that seems to correspond to the size of the data actually stored. However, under
Windows, the file size is the same as the mapsize, namely 1 GiB. We are
currently using env_copy2 and MDB_COMPACT to push this down every time the env
is closed, but I fear that this will become very slow with large databases.

The same issue surfaced under Linux when we were recently experimenting with the
MDB_WRITEMAP option to improve performance when dealing with very large data
sets. This option also caused the Linux file size to go up to 1 GiB, even though
the actual data was < 50 K.

We'd like to hear if there are ways to improve this.

thanks,
Christian
Howard Chu
2015-11-09 21:04:53 UTC
Permalink
Post by Christian Sell
Hello,
we are using LMDB as the underlying storage engine for a lightweight + high
performance special-purpose object + mass data database. I have 2 questions
To create the environment, we are using a mapsize of 1 GiB and the flags
MDB_NOSUBDIR | MDB_NOLOCK. Under Linux, this results in one file with a size
that seems to correspond to the size of the data actually stored. However, under
Windows, the file size is the same as the mapsize, namely 1 GiB. We are
currently using env_copy2 and MDB_COMPACT to push this down every time the env
is closed, but I fear that this will become very slow with large databases.
The same issue surfaced under Linux when we were recently experimenting with the
MDB_WRITEMAP option to improve performance when dealing with very large data
sets. This option also caused the Linux file size to go up to 1 GiB, even though
the actual data was < 50 K.
We'd like to hear if there are ways to improve this.
No.

This is how memory mapped files work on Windows. There is no way to change that.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa366537%28v=vs.85%29.aspx

Likewise for writable mmaps on POSIX systems. Read your operating system
documentation.
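
To make the point about writable mappings concrete, here is a minimal Python sketch (not LMDB code). The 16 MiB size stands in for the 1 GiB mapsize from the question, and the helper name `map_for_writing` is made up for illustration. On POSIX a shared mapping cannot grow the file by itself, so the file has to be sized to the full map length before it is mapped, and that length is what the OS then reports as the file size.

```python
import mmap
import os

MAP_SIZE = 16 * 1024 * 1024  # illustrative stand-in for the 1 GiB mapsize


def map_for_writing(path, map_size=MAP_SIZE):
    """Create `path`, grow it to `map_size`, and write one byte via a mapping.

    Returns the logical file size reported by the OS afterwards.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # The mapping cannot extend the file on its own, so the file is
        # grown to the full map size before it is mapped.
        os.ftruncate(fd, map_size)
        with mmap.mmap(fd, map_size, access=mmap.ACCESS_WRITE) as view:
            view[0] = 0x4C  # touch only the first page
            view.flush()
        return os.fstat(fd).st_size
    finally:
        os.close(fd)
```

Only one page is written here, yet the reported size is the full map size. How much disk space that actually occupies is a separate question and depends on the filesystem.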
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Hallvard Breien Furuseth
2015-11-09 22:00:50 UTC
Permalink
Post by Christian Sell
To create the environment, we are using a mapsize of 1 GiB and the flags
MDB_NOSUBDIR | MDB_NOLOCK. Under Linux, this results in one file with a size
that seems to correspond to the size of the data actually stored. However, under
Windows, the file size is the same as the mapsize, namely 1 GiB.
(...) The same issue surfaced under Linux (...) with the MDB_WRITEMAP option
That's the logical size, which can be bigger than the physical
size. In lmdb's case, the end of the file doesn't use any disk
space. On filesystems which support this, anyway. Most do.
So, never mind mdb_copy - there is no problem to fix.

On Unix, 'du <file>' shows disk usage. Don't know about Windows.

When you want to copy the file anyway, you should use mdb_copy
rather than plain filecopy. And MDB_COMPACT does shrink the file
somewhat since it drops pages which LMDB has freed and not yet
reused, but that's another matter. The DB would grow again later
anyway; LMDB does need pages it can write to.

Hallvard
Quanah Gibson-Mount
2015-11-09 22:20:05 UTC
Permalink
--On Monday, November 09, 2015 11:00 PM +0100 Hallvard Breien Furuseth
Post by Hallvard Breien Furuseth
To create the environment, we are using a mapsize of 1 GiB and the flags
MDB_NOSUBDIR | MDB_NOLOCK. Under Linux, this results in one file with a
size that seems to correspond to the size of the data actually stored.
However, under Windows, the file size is the same as the mapsize, namely
1 GiB. (...) The same issue surfaced under Linux (...) with the
MDB_WRITEMAP option
That's the logical size, which can be bigger than the physical
size. In lmdb's case, the end of the file doesn't use any disk
space. On filesystems which support this, anyway. Most do.
So, never mind mdb_copy - there is no problem to fix.
On Unix, 'du <file>' shows disk usage. Don't know about Windows.
When you want to copy the file anyway, you should use mdb_copy
rather than plain filecopy. And MDB_COMPACT does shrink the file
somewhat since it drops pages which LMDB has freed and not yet
reused, but that's another matter. The DB would grow again later
anyway; LMDB does need pages it can write to.
Here's a real world example:

[***@ldap01 db]$ ls -l data.mdb
-rw------- 1 zimbra zimbra 17967149056 Nov 9 16:19 data.mdb
[***@ldap01 db]$ du -c -h data.mdb
76M data.mdb
76M total

I.e., real usage is 76MB vs the approximately 17GB configured max size.
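
The same ls-versus-du comparison can be made from a program. This is a rough Python sketch for POSIX systems; the helper names and the file name are invented for illustration, and whether the unwritten tail really takes no space depends on the filesystem supporting sparse files.

```python
import os
import tempfile


def logical_and_physical_size(path):
    """Return (logical_bytes, allocated_bytes) for `path`.

    st_size is the size `ls -l` prints; st_blocks counts the 512-byte
    units actually allocated, which is what `du` reports (POSIX only).
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


def make_sparse_file(path, logical_size, payload=b"x" * 4096):
    """Write `payload` at offset 0, then extend the file to `logical_size`."""
    with open(path, "wb") as f:
        f.write(payload)
        f.truncate(logical_size)  # the tail becomes a hole where supported


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        make_sparse_file(path, 64 * 1024 * 1024)
        logical, physical = logical_and_physical_size(path)
        print(f"logical:  {logical} bytes")
        print(f"physical: {physical} bytes")
```

Calling `logical_and_physical_size("data.mdb")` on an LMDB environment file gives the two numbers shown by `ls -l` and `du` above.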

--Quanah



--

Quanah Gibson-Mount
Platform Architect
Zimbra, Inc.
--------------------
Zimbra :: the leader in open source messaging and collaboration
Ulrich Windl
2015-11-10 07:16:58 UTC
Permalink
Post by Christian Sell
To create the environment, we are using a mapsize of 1 GiB and the flags
MDB_NOSUBDIR | MDB_NOLOCK. Under Linux, this results in one file with a size
that seems to correspond to the size of the data actually stored. However, under
Windows, the file size is the same as the mapsize, namely 1 GiB.
(...) The same issue surfaced under Linux (...) with the MDB_WRITEMAP option
That's the logical size, which can be bigger than the physical
size. In lmdb's case, the end of the file doesn't use any disk
space. On filesystems which support this, anyway. Most do.
So, never mind mdb_copy - there is no problem to fix.
On Unix, 'du <file>' shows disk usage. Don't know about Windows.
When you want to copy the file anyway, you should use mdb_copy
rather than plain filecopy. And MDB_COMPACT does shrink the file
somewhat since it drops pages which LMDB has freed and not yet
reused, but that's another matter. The DB would grow again later
anyway; LMDB does need pages it can write to.
I wonder, as SSDs become more and more common: should LMDB have a way to
signal to the operating system that some parts of the file are no longer in
use, so that the OS -> filesystem -> block device chain could actually
reclaim the space?
Howard Chu
2015-11-10 21:09:43 UTC
Permalink
Post by Ulrich Windl
Post by Christian Sell
To create the environment, we are using a mapsize of 1 GiB and the flags
MDB_NOSUBDIR | MDB_NOLOCK. Under Linux, this results in one file with a size
that seems to correspond to the size of the data actually stored. However, under
Windows, the file size is the same as the mapsize, namely 1 GiB.
(...) The same issue surfaced under Linux (...) with the MDB_WRITEMAP option
That's the logical size, which can be bigger than the physical
size. In lmdb's case, the end of the file doesn't use any disk
space. On filesystems which support this, anyway. Most do.
So, never mind mdb_copy - there is no problem to fix.
On Unix, 'du <file>' shows disk usage. Don't know about Windows.
When you want to copy the file anyway, you should use mdb_copy
rather than plain filecopy. And MDB_COMPACT does shrink the file
somewhat since it drops pages which LMDB has freed and not yet
reused, but that's another matter. The DB would grow again later
anyway; LMDB does need pages it can write to.
I wonder, as SSDs become more and more common: should LMDB have a way to
signal to the operating system that some parts of the file are no longer in
use, so that the OS -> filesystem -> block device chain could actually
reclaim the space?

No.

Pages deleted in one transaction will be reused in a subsequent transaction.
There's no benefit to telling the OS to deallocate them, since they will just
need to be reallocated again shortly after. Doing so would hurt overall
performance by issuing extraneous filesystem ops, and would wear out the SSD
faster by issuing extraneous metadata updates to the device.
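
The reuse argument can be sketched with a toy model in Python. This is not LMDB's actual free-page handling (it ignores reader snapshots and everything else the real implementation has to track); the class and function names are invented for illustration. It only shows the shape of the argument: if pages released by one transaction are handed out again by the next, the high-water mark stops growing after the first update.

```python
class ToyPageAllocator:
    """Toy model: pages freed by one transaction are reused by later ones."""

    def __init__(self):
        self.next_new_page = 0  # high-water mark, stands in for file growth
        self.free_pages = []    # pages released by earlier transactions
        self.pending = []       # pages released by the current transaction

    def allocate(self):
        # Prefer a previously freed page; only grow when none is available.
        if self.free_pages:
            return self.free_pages.pop()
        page = self.next_new_page
        self.next_new_page += 1
        return page

    def release(self, page):
        # Not reusable until the releasing transaction has committed.
        self.pending.append(page)

    def commit(self):
        self.free_pages.extend(self.pending)
        self.pending.clear()


def run_updates(rounds, pages_per_round=4):
    """Rewrite the same `pages_per_round` pages `rounds` times, copy-on-write.

    Returns the high-water mark (number of distinct pages ever used).
    """
    alloc = ToyPageAllocator()
    live = [alloc.allocate() for _ in range(pages_per_round)]
    alloc.commit()
    for _ in range(rounds):
        new = [alloc.allocate() for _ in range(pages_per_round)]
        for page in live:
            alloc.release(page)
        live = new
        alloc.commit()
    return alloc.next_new_page
```

In this model `run_updates(1)` and `run_updates(1000)` return the same high-water mark, which is why handing the freed pages back to the OS in between would only add work.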

LMDB manages pages the way it does *because that is the optimal way to do so*.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/