Discussion:
Binding mdb_set_compare with Go
Bryan Matsuo
2015-10-29 08:27:36 UTC
openldap-technical,

I am working on some Go (golang) bindings[1] for the LMDB library and I
have some interest in exposing the functionality of mdb_set_compare (and
mdb_set_dupsort). But it is proving difficult and I have a question about
the function(s).

Calling mdb_set_compare from the Go runtime is challenging. Using C APIs
with callbacks comes with restrictions[2][3]. I believe it is impossible to
bind these functions in a way that is as flexible as one would expect. A
potential change to LMDB that would make binding drastically easier is having
MDB_cmp_func take a third "context" argument of type void*. Then a
binding could safely use an arbitrary Go function for comparisons.

Is it possible for future versions of LMDB to add a third argument to the
MDB_cmp_func signature? Otherwise would it be acceptable for a variant API
to be added using a different function type, one accepting three arguments?
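
For illustration, here is a minimal sketch of what a three-argument comparator could look like. This is not LMDB's API: the real MDB_cmp_func takes only the two MDB_val pointers. The names val_t, cmp_ctx_func, db_cmp_t and db_compare are hypothetical stand-ins so the example compiles without lmdb.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for LMDB's MDB_val so the sketch compiles without lmdb.h. */
typedef struct { size_t mv_size; void *mv_data; } val_t;

/* Hypothetical three-argument comparator type (not part of LMDB). */
typedef int (cmp_ctx_func)(const val_t *a, const val_t *b, void *ctx);

/* Example comparator: ctx points to an int, +1 for ascending order
 * and -1 for descending order. */
static int cmp_bytes_dir(const val_t *a, const val_t *b, void *ctx)
{
    int dir = *(const int *)ctx;
    size_t n = a->mv_size < b->mv_size ? a->mv_size : b->mv_size;
    int r = memcmp(a->mv_data, b->mv_data, n);
    if (r == 0)
        r = (a->mv_size > b->mv_size) - (a->mv_size < b->mv_size);
    return dir * ((r > 0) - (r < 0));
}

/* A library would store the function and its context side by side
 * and hand the context back on every call. */
typedef struct { cmp_ctx_func *cmp; void *ctx; } db_cmp_t;

static int db_compare(const db_cmp_t *db, const val_t *a, const val_t *b)
{
    assert(db != NULL && db->cmp != NULL);
    return db->cmp(a, b, db->ctx);
}
```

The point of the extra argument is that one compiled function can serve several databases, each with its own configuration, without relying on global state.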

Thanks for the consideration.

Cheers,
- Bryan

[1] Go bindings -- https://github.com/bmatsuo/lmdb-go
[2] Cgo pointer restrictions --
https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md
[3] Cgo documentation -- https://golang.org/cmd/cgo/
Howard Chu
2015-10-29 08:58:38 UTC
Post by Bryan Matsuo
Is it possible for future versions of LMDB to add a third argument to the
MDB_cmp_func signature? Otherwise would it be acceptable for a variant API to
be added using a different function type, one accepting three arguments?
This is a recurrent question.

http://www.openldap.org/its/index.cgi/Incoming?id=8276
http://www.openldap.org/its/index.cgi/Incoming?id=8124
http://www.openldap.org/its/index.cgi/Incoming?id=7980

My response now is the same as before - see ITS#8124 for example.

The compare functions are on the critical path for performance. Jumping through
a bunch of interpreter glue layers to invoke them is a really bad idea.

I might be convinced if someone were to post some benchmarks showing that an
implementation of an MDB_cmp_func in their chosen language is actually tolerable.
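
A rough sketch of how such a benchmark might be structured is below. It only times a plain C comparator. A meaningful measurement for a binding would have to route the call through that binding's actual callback path, which is not shown here. The names val_t, cmp_func, cmp_memcmp and time_compares are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Stand-in for LMDB's MDB_val so the sketch compiles without lmdb.h. */
typedef struct { size_t mv_size; void *mv_data; } val_t;
typedef int (cmp_func)(const val_t *a, const val_t *b);

/* Baseline comparator: byte-wise, shorter key first on a tie.
 * Always returns -1, 0 or 1. */
static int cmp_memcmp(const val_t *a, const val_t *b)
{
    size_t n = a->mv_size < b->mv_size ? a->mv_size : b->mv_size;
    int r = memcmp(a->mv_data, b->mv_data, n);
    if (r == 0)
        r = (a->mv_size > b->mv_size) - (a->mv_size < b->mv_size);
    return (r > 0) - (r < 0);
}

/* Calls cmp n times and returns the elapsed CPU time in seconds.
 * The running sum is stored through sum_out (if non-NULL) so the
 * compiler cannot drop the loop. */
static double time_compares(cmp_func *cmp, const val_t *a, const val_t *b,
                            long n, long *sum_out)
{
    volatile long sink = 0;
    clock_t start, end;
    long i;

    assert(cmp != NULL && n >= 0);
    start = clock();
    for (i = 0; i < n; i++)
        sink += cmp(a, b);
    end = clock();
    if (sum_out != NULL)
        *sum_out = sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Running the same loop once with the C baseline and once with the foreign-language comparator would give the per-call overhead being asked about.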
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Howard Chu
2015-10-29 09:40:05 UTC
Post by Bryan Matsuo
Calling mdb_set_compare from the Go runtime is challenging. Using C APIs with
callbacks comes with restrictions[2][3].
I see nothing in these restrictions that requires extra information to be
passed from Go to C or from C to Go.

There is a vague mention in [2]

"A particular unsafe area is C code that wants to hold on to Go func and
pointer values for future callbacks from C to Go. This works today but is not
permitted by the invariant. It is hard to detect. One safe approach is: Go
code that wants to preserve funcs/pointers stores them into a map indexed by
an int. Go code calls the C code, passing the int, which the C code may store
freely. When the C code wants to call into Go, it passes the int to a Go
function that looks in the map and makes the call."

But it's nonsense in this case - you want to pass a Go function pointer to C,
but the only way for C to use it is to call some *other* Go function? Sorry
but there is no other Go function for the mdb_cmp() function to call, the only
one it knows about is the function pointer that you pass.

If this is what you're referring to, adding a context pointer doesn't achieve
anything. If this isn't what you're referring to, then please explain exactly
what you hope to achieve with this context pointer.
Bryan Matsuo
2015-10-29 21:28:08 UTC
Juerg,

That is an interesting proposal. As an alternative to letting users hook up
an arbitrary Go function for comparison, I have also thought about the
possibility of providing a small set of static C functions usable for
comparison. A flexible compound-key comparison function like this could fit
well into that idea.

Howard,

Sorry I did not find the issues mentioned in previous searches.

I understand the concern about such a hot code path. I'm not sure that Go
would see acceptable performance.

But, Go is not an interpreted language (though there is glue). And while
I'm not positive about the performance of Go in this area you seem to
dismiss comparison functions in any other language. Is it unreasonable to
think that comparison functions written in other compiled languages like
Rust, Nim, or ML variants would also be impractically slow?

I also believe you have misunderstood the practical problems of passing Go
function pointers to C. But to be fair, I think the wording of that quoted
paragraph could be better.
Post by Howard Chu
Sorry but there is no other Go function for the mdb_cmp() function to
call, the only one it knows about is the function pointer that you pass.

It may be of benefit to see how I've used the context argument in a
binding being developed for the mdb_reader_list function.

https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb/lmdbgo.c

The callback passed to mdb_reader_list is always the same static function
because correctly calling a Go function from C requires an annotated static
Go function. The context argument allows dispatch to the correct Go
function that was configured at runtime. I believe that is the "other" Go
function you mentioned.

The implementation would be similar for mdb_set_compare. The callback would
always be the same static function which handles the dynamic dispatch.
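
The dispatch pattern described above can be sketched as follows. This is only an illustration written entirely in C: in a Go binding the registry would live on the Go side and the single static callback would be an exported Go function. The names val_t, user_cmp_func, register_cmp and trampoline are hypothetical, and the three-argument callback shape is the proposed one, not LMDB's current MDB_cmp_func.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for LMDB's MDB_val so the sketch compiles without lmdb.h. */
typedef struct { size_t mv_size; void *mv_data; } val_t;
typedef int (user_cmp_func)(const val_t *a, const val_t *b);

#define MAX_CMP 16
static user_cmp_func *registry[MAX_CMP]; /* plays the role of the map indexed by an int */
static int registry_len;

/* Store a comparator and return its integer handle, or -1 if the table is full. */
static int register_cmp(user_cmp_func *f)
{
    assert(f != NULL);
    if (registry_len == MAX_CMP)
        return -1;
    registry[registry_len] = f;
    return registry_len++;
}

/* The one static callback handed to the library. The context carries
 * the integer handle, never a pointer into managed memory. */
static int trampoline(const val_t *a, const val_t *b, void *ctx)
{
    intptr_t h = (intptr_t)ctx;
    assert(h >= 0 && h < registry_len);
    return registry[h](a, b);
}

/* Two sample comparators chosen at run time. */
static int by_length(const val_t *a, const val_t *b)
{
    return (a->mv_size > b->mv_size) - (a->mv_size < b->mv_size);
}

static int by_bytes(const val_t *a, const val_t *b)
{
    size_t n = a->mv_size < b->mv_size ? a->mv_size : b->mv_size;
    int r = memcmp(a->mv_data, b->mv_data, n);
    return r != 0 ? r : by_length(a, b);
}
```

Every comparison pays for one table lookup and one extra indirect call on top of the comparator itself.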

Cheers,
- Bryan
Post by Jürg Bircher
Actually I’m not commenting on binding Go but I’m voting for a context
passed to the compare function.
I fully agree that the compare function is part of the critical path. But
as I need to define custom indexes with compound keys, the compare function
varies, and it would be impractical to predefine a C function for every
compound key combination.
The compare context would be stored on the struct MDB_dbx.
typedef struct MDB_dbx {
    MDB_val       md_name;    /**< name of the database */
    MDB_cmp_func *md_cmp;     /**< function for comparing keys */
    void         *md_cmpctx;  /** user-provided context for md_cmp **/
    MDB_cmp_func *md_dcmp;    /**< function for comparing data items */
    void         *md_dcmpctx; /** user-provided context for md_dcmp **/
    MDB_rel_func *md_rel;     /**< user relocate function */
    void         *md_relctx;  /**< user-provided context for md_rel */
} MDB_dbx;
The following is a draft (not tested yet) of a generic compare function.
The context contains a compare specification, which is a null-terminated
list of <type><order> pairs.
// compareSpec <type><order>...<null>
int key_comp_generic(const MDB_val *a, const MDB_val *b, char *compareSpec) {
    int result = 0;
    char *pa = a->mv_data;
    char *pb = b->mv_data;
    while (1) {
        switch (*compareSpec++) {
            /* [case label lost in the archived message] */
            break;
            /* [case label lost in the archived message] */
            {
                unsigned int va = *(unsigned int *)pa;
                unsigned int vb = *(unsigned int *)pb;
                if (*compareSpec++ == ASCENDING_ORDER) {
                    result = (va < vb) ? -1 : va > vb;
                }
                else {
                    result = (va > vb) ? -1 : va < vb;
                }
                if (result != 0) {
                    break;
                }
                else {
                    pa += 4;
                    pb += 4;
                }
            }
            /* [case label lost in the archived message] */
            {
                unsigned long long va = *(unsigned long long *)pa;
                unsigned long long vb = *(unsigned long long *)pb;
                if (*compareSpec++ == ASCENDING_ORDER) {
                    result = (va < vb) ? -1 : va > vb;
                }
                else {
                    result = (va > vb) ? -1 : va < vb;
                }
                if (result != 0) {
                    break;
                }
                else {
                    pa += 8;
                    pb += 8;
                }
            }
            /* [case label lost in the archived message] */
            {
                unsigned int la = *(unsigned int *)pa;
                unsigned int lb = *(unsigned int *)pb;
                pa += 4;
                pb += 4;
                if (*compareSpec++ == ASCENDING_ORDER) {
                    result = strncmp(pa, pb, (la < lb) ? la : lb);
                    if (result != 0) {
                        break;
                    }
                    else {
                        result = (la < lb) ? -1 : la > lb;
                    }
                }
                else {
                    result = strncmp(pb, pa, (la < lb) ? la : lb);
                    if (result != 0) {
                        break;
                    }
                    else {
                        result = (la > lb) ? -1 : la < lb;
                    }
                }
                if (result != 0) {
                    break;
                }
                else {
                    pa += la;
                    pb += lb;
                }
            }
        }
    }
    return result;
}
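
The draft above, as archived, has lost its case labels and has no exit from the while (1) loop. Below is a hedged sketch of how the same idea could be written so that it terminates. The type and order codes (KEY_U32, ORDER_ASC and so on) are hypothetical, val_t stands in for MDB_val, memcmp replaces the draft's strncmp so embedded zero bytes are compared, and the code assumes well-formed keys in host byte order with no bounds checking against mv_size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for LMDB's MDB_val so the sketch compiles without lmdb.h. */
typedef struct { size_t mv_size; void *mv_data; } val_t;

/* Hypothetical spec codes: <type><order> pairs terminated by KEY_END. */
enum { KEY_END = 0, KEY_U32 = 1, KEY_U64 = 2, KEY_STR = 3 };
enum { ORDER_ASC = 1, ORDER_DESC = 2 };

static int cmp3(uint64_t x, uint64_t y) { return (x > y) - (x < y); }

static int key_comp_generic(const val_t *a, const val_t *b, const char *spec)
{
    const unsigned char *pa = a->mv_data;
    const unsigned char *pb = b->mv_data;

    assert(spec != NULL);
    while (*spec != KEY_END) {
        int type = *spec++;
        int sign = (*spec++ == ORDER_ASC) ? 1 : -1;
        int r;

        switch (type) {
        case KEY_U32: {
            uint32_t va, vb;
            memcpy(&va, pa, 4);            /* memcpy avoids unaligned reads */
            memcpy(&vb, pb, 4);
            r = cmp3(va, vb);
            pa += 4; pb += 4;
            break;
        }
        case KEY_U64: {
            uint64_t va, vb;
            memcpy(&va, pa, 8);
            memcpy(&vb, pb, 8);
            r = cmp3(va, vb);
            pa += 8; pb += 8;
            break;
        }
        case KEY_STR: {                    /* 4-byte length prefix, then bytes */
            uint32_t la, lb;
            memcpy(&la, pa, 4);
            memcpy(&lb, pb, 4);
            pa += 4; pb += 4;
            r = memcmp(pa, pb, la < lb ? la : lb);
            if (r == 0)
                r = cmp3(la, lb);
            pa += la; pb += lb;
            break;
        }
        default:
            return 0;                      /* unknown type code: stop comparing */
        }
        if (r != 0)                        /* first differing field decides */
            return r < 0 ? -sign : sign;
    }
    return 0;                              /* all fields equal */
}
```

Returning from inside the loop as soon as a field differs is what gives the function a guaranteed exit.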
Regards
Juerg
Howard Chu
2015-10-30 02:37:43 UTC
Post by Bryan Matsuo
The implementation would be similar for mdb_set_compare. The callback would
always be the same static function which handles the dynamic dispatch.
And this is the part that I really don't understand - why not let the user
pass their own static Go function? Bouncing through a dispatcher like this
will be even slower.
Bryan Matsuo
2015-10-30 06:10:31 UTC
After digging it seems that it will continue to be possible for users to
safely pass static Go function references. It is quite a burden on the
user. But I will continue to think about it.

Jay, noted. I am open to exploring that direction. Though, as was pointed
out earlier, a library of static functions can be made more useful (if
somewhat slower) when a context object can configure their behavior. Before
reading that suggestion I was uncertain how much useful functionality could
be exposed as a library. I am writing general-purpose bindings, so I would
prefer that a function library be fairly generic.

Howard, do you have thoughts on the proposal from Juerg regarding a
compound-key comparison function implemented using a context value?
Post by Jay
From the peanut gallery: A small set of static C functions is probably the
way to go. If I understand correctly, which I probably don't, the
mismatch between green threads and OS threads means there's a lot of
expensive stack-switching involved in go->C->go execution.
Howard Chu
2015-10-30 07:43:30 UTC
Post by Bryan Matsuo
Howard, do you have thoughts on the proposal from Juerg regarding a
compound-key comparison function implemented using a context value?
I remain unconvinced. key-comparison is still per-DB; a comparison specifier
saves some space but at the expense of time - more compare ops, more branching
per key compare. Dedicated functions are still the better way to go. Plus,
naive constructs like in the emailed example are easy to get wrong - his
example will never terminate because he only breaks from the switch statement,
nothing breaks from the while(1) loop. It is attempting to be too clever, when
a more straightforward approach will be faster and obviously bug-free.
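
As an illustration of the dedicated-function approach, here is a minimal sketch of a comparator written for one fixed compound key. The key layout (a 4-byte id followed by an 8-byte timestamp, host byte order) and the names val_t and cmp_id_asc_ts_desc are assumptions made for the example. The two-argument shape matches what mdb_set_compare expects, with val_t standing in for MDB_val.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for LMDB's MDB_val so the sketch compiles without lmdb.h. */
typedef struct { size_t mv_size; void *mv_data; } val_t;

/* Hypothetical key layout: uint32 id, then uint64 timestamp.
 * Sort by id ascending, then by timestamp descending. */
static int cmp_id_asc_ts_desc(const val_t *a, const val_t *b)
{
    uint32_t ida, idb;
    uint64_t tsa, tsb;

    assert(a->mv_size == 12 && b->mv_size == 12);

    memcpy(&ida, a->mv_data, 4);
    memcpy(&idb, b->mv_data, 4);
    if (ida != idb)
        return ida < idb ? -1 : 1;

    memcpy(&tsa, (const char *)a->mv_data + 4, 8);
    memcpy(&tsb, (const char *)b->mv_data + 4, 8);
    if (tsa != tsb)
        return tsa > tsb ? -1 : 1;      /* larger timestamp sorts first */

    return 0;
}
```

There is no spec to parse and no loop, so the only branches are the field comparisons themselves.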
From the peanut gallery: Small set of static C functions is probably the
way to go. If I understand correctly, which I probablay don't, the
mismatch between green threads and OS threads means there's a lot of
expensive stack-switching involved in go->C->go execution.
Juerg,
That is is interesting proposal. As an alternative to letting users
hook up arbitrary Go function for comparison, I have also thought
about the possibility of providing a small set of static C functions
usable for comparison. A flexible compound key comparison function
like this could fit well into that idea.
Howard,
Sorry I did not find the issues mentioned in previous searches.
I understand the concern about such a hot code path. I'm not sure that
Go would see acceptable performance.
But, Go is not an interpreted language (though there is glue). And
while I'm not positive about the performance of Go in this area you
seem to dismiss comparison functions in any other language. Is it
unreasonable to think that comparison functions written in other
compiled languages like Rust, Nim, or ML variants would also be
impractically slow?
I also believe you have misunderstood the practical problems of
passing Go function pointers to C. But to be fair, I think the wording
of that quoted paragraph could be better.
Post by Bryan Matsuo
Sorry but there is no other Go function for the mdb_cmp() function to
call, the only one it knows about is the function pointer that you pass.
It may be of benefit to see how I've used the context argument in
a binding being developed for the mdb_reader_list function.
https://github.com/bmatsuo/lmdb-go/blob/bmatsuo/reader-list-context-fix/lmdb/lmdbgo.c
The callback passed to mdb_reader_list is always the same static
function because correctly calling a Go function from C requires an
annotated static Go function. The context argument allows dispatch to
the correct Go function that was configured at runtime. I believe that
is the "other" Go function you mentioned.
The implementation would be similar for mdb_set_compare. The callback
would always be the same static function which handles the dynamic
dispatch.
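The dispatch described above can be sketched in plain C. This is only an illustration under stated assumptions: `kv_val`, `cmp_ctx_func`, `user_cmp`, `register_cmp` and `cmp_trampoline` are hypothetical names, and LMDB's actual MDB_cmp_func takes two arguments, not three. In a real binding the table would presumably live on the Go side and the trampoline would call an exported Go function; here ordinary C functions stand in for the Go comparators so the sketch compiles on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for LMDB's MDB_val so the sketch compiles without lmdb.h. */
typedef struct { size_t mv_size; void *mv_data; } kv_val;

/* Hypothetical three-argument comparison type (not part of LMDB). */
typedef int cmp_ctx_func(const kv_val *a, const kv_val *b, void *ctx);

/* Comparators chosen at run time; they stand in for Go functions. */
typedef int user_cmp(const kv_val *a, const kv_val *b);

#define MAX_CMP 16
static user_cmp *registry[MAX_CMP];
static size_t registry_len;

/* Declared through the typedef so the compiler checks the signature. */
static cmp_ctx_func cmp_trampoline;

/* Store a comparator and return an integer handle packed into a pointer.
 * Handles start at 1 so that no valid handle is NULL. */
static void *register_cmp(user_cmp *fn)
{
    assert(registry_len < MAX_CMP);
    registry[registry_len] = fn;
    return (void *)(uintptr_t)(++registry_len);
}

/* The one static callback that would be handed to the library for every
 * database; the context selects the comparator configured at run time. */
static int cmp_trampoline(const kv_val *a, const kv_val *b, void *ctx)
{
    size_t handle = (size_t)(uintptr_t)ctx;
    assert(handle >= 1 && handle <= registry_len);
    return registry[handle - 1](a, b);
}

/* Plain bytewise order, shorter key first on a common prefix. */
static int cmp_bytes(const kv_val *a, const kv_val *b)
{
    size_t n = a->mv_size < b->mv_size ? a->mv_size : b->mv_size;
    int r = memcmp(a->mv_data, b->mv_data, n);
    if (r != 0)
        return r;
    return (a->mv_size > b->mv_size) - (a->mv_size < b->mv_size);
}

/* Reverse of cmp_bytes, to show two comparators behind one callback. */
static int cmp_bytes_reverse(const kv_val *a, const kv_val *b)
{
    return cmp_bytes(b, a);
}
```

Because the context carries only an integer, no Go pointer or Go function value would ever be stored by C code, which is the pattern the quoted cgo proposal recommends.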
Cheers,
- Bryan
On Thu, Oct 29, 2015 at 3:12 AM Jürg Bircher
Actually I’m not commenting on binding Go but I’m voting for a
context passed to the compare function.
I fully agree that the compare function is part of the critical
path. But as I need to define custom indexes with compound keys,
the compare function varies, and it would be impractical to
predefine a C function for every compound key combination.
The compare context would be stored on the struct MDB_dbx.
typedef struct MDB_dbx {
    MDB_val       md_name;    /**< name of the database */
    MDB_cmp_func *md_cmp;     /**< function for comparing keys */
    void         *md_cmpctx;  /**< user-provided context for md_cmp */
    MDB_cmp_func *md_dcmp;    /**< function for comparing data items */
    void         *md_dcmpctx; /**< user-provided context for md_dcmp */
    MDB_rel_func *md_rel;     /**< user relocate function */
    void         *md_relctx;  /**< user-provided context for md_rel */
} MDB_dbx;
The following is a draft (not tested yet) of a generic compare
function. The context contains a compare specification which is a
null terminated list of <type><order> pairs.
// compareSpec <type><order>...<null>
int key_comp_generic(const MDB_val *a, const MDB_val *b, char *compareSpec) {
    int result = 0;
    char *pa = a->mv_data;
    char *pb = b->mv_data;
    while (1) {
        switch (*compareSpec++) {
            break;
            {
                unsigned int va = *(unsigned int *)pa;
                unsigned int vb = *(unsigned int *)pb;
                if (*compareSpec++ == ASCENDING_ORDER) {
                    result = (va < vb) ? -1 : va > vb;
                }
                else {
                    result = (va > vb) ? -1 : va < vb;
                }
                if (result != 0) {
                    break;
                }
                else {
                    pa += 4;
                    pb += 4;
                }
            }
            {
                unsigned long long va = *(unsigned long long *)pa;
                unsigned long long vb = *(unsigned long long *)pb;
                if (*compareSpec++ == ASCENDING_ORDER) {
                    result = (va < vb) ? -1 : va > vb;
                }
                else {
                    result = (va > vb) ? -1 : va < vb;
                }
                if (result != 0) {
                    break;
                }
                else {
                    pa += 8;
                    pb += 8;
                }
            }
            {
                unsigned int la = *(unsigned int *)pa;
                unsigned int lb = *(unsigned int *)pb;
                pa += 4;
                pb += 4;
                if (*compareSpec++ == ASCENDING_ORDER) {
                    result = strncmp(pa, pb, (la < lb) ? la : lb);
                    if (result != 0) {
                        break;
                    }
                    else {
                        result = (la < lb) ? -1 : la > lb;
                    }
                }
                else {
                    result = strncmp(pb, pa, (la < lb) ? la : lb);
                    if (result != 0) {
                        break;
                    }
                    else {
                        result = (la > lb) ? -1 : la < lb;
                    }
                }
                if (result != 0) {
                    break;
                }
                else {
                    pa += la;
                    pb += lb;
                }
            }
        }
    }
    return result;
}
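For comparison, a spec-driven comparator of this kind can be written so that it always terminates: the loop returns as soon as a field differs or the end marker is read. The following is a minimal sketch, not the draft's author's code. The names `kv_val`, `key_cmp_spec`, the `FIELD_*` and `ORDER_*` constants and the `CMP3` macro are assumptions made for illustration; it swaps `strncmp` for `memcmp`, reads integers in native byte order via `memcpy`, and does no bounds checking against `mv_size`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for LMDB's MDB_val so the sketch compiles without lmdb.h. */
typedef struct { size_t mv_size; void *mv_data; } kv_val;

/* Hypothetical spec encoding: <type><order> byte pairs ended by FIELD_END. */
enum { FIELD_END = 0, FIELD_U32, FIELD_U64, FIELD_STR };
enum { ORDER_ASC = 1, ORDER_DESC };

/* Three-way compare yielding -1, 0 or 1. */
#define CMP3(x, y) (((x) > (y)) - ((x) < (y)))

static int key_cmp_spec(const kv_val *a, const kv_val *b, const char *spec)
{
    const char *pa = a->mv_data;
    const char *pb = b->mv_data;

    assert(spec != NULL);
    for (;;) {
        int type = *spec++;
        int result = 0;
        int desc;

        if (type == FIELD_END)
            return 0;                      /* every field compared equal */
        desc = (*spec++ == ORDER_DESC);

        switch (type) {
        case FIELD_U32: {
            uint32_t va, vb;
            memcpy(&va, pa, sizeof va);    /* memcpy avoids unaligned reads */
            memcpy(&vb, pb, sizeof vb);
            result = CMP3(va, vb);
            pa += sizeof va;
            pb += sizeof vb;
            break;
        }
        case FIELD_U64: {
            uint64_t va, vb;
            memcpy(&va, pa, sizeof va);
            memcpy(&vb, pb, sizeof vb);
            result = CMP3(va, vb);
            pa += sizeof va;
            pb += sizeof vb;
            break;
        }
        case FIELD_STR: {
            /* A 32-bit length prefix followed by that many bytes. */
            uint32_t la, lb;
            int r;
            memcpy(&la, pa, sizeof la);
            memcpy(&lb, pb, sizeof lb);
            pa += sizeof la;
            pb += sizeof lb;
            r = memcmp(pa, pb, la < lb ? la : lb);
            result = r != 0 ? CMP3(r, 0) : CMP3(la, lb);
            pa += la;
            pb += lb;
            break;
        }
        default:
            assert(!"unknown field type in compare spec");
            return 0;
        }
        /* Leave the loop, not just the switch, on the first difference. */
        if (result != 0)
            return desc ? -result : result;
    }
}
```

Whether the extra branching per field is acceptable on LMDB's compare path is exactly the point under dispute in this thread; the sketch only shows that the control flow itself can be kept simple.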
Regards
Juerg
On 29/10/15 10:40, "openldap-technical on behalf of Howard Chu"
openldap-technical,
I am working on some Go (golang) bindings[1] for the LMDB library and I have
some interest in exposing the functionality of mdb_set_compare (and
mdb_set_dupsort). But it is proving difficult and I have a question about the
function(s).
Calling mdb_set_compare from the Go runtime is challenging. Using C APIs with
callbacks comes with restrictions[2][3]. I believe it impossible to bind these
functions way that is flexible, as one would expect. A potential change to
LMDB that would make binding drastically easier is having MDB_cmp_func to take
a third "context" argument with type void*. Then a binding could safely use an
arbitrary Go function for comparisons.
Is it possible for future versions of LMDB to add a third argument to the
MDB_cmp_func signature? Otherwise would it be acceptable for a variant API to
be added using a different function type, one accepting three arguments?
Thanks for the consideration.
Cheers,
- Bryan
[1] Go bindings -- https://github.com/bmatsuo/lmdb-go
[2] Cgo pointer restrictions --
https://github.com/golang/proposal/blob/master/design/12416-cgo-pointers.md
[3] Cgo documentation -- https://golang.org/cmd/cgo/

I see nothing in these restrictions that requires extra information to be
passed from Go to C or from C to Go.
There is a vague mention in [2]
"A particular unsafe area is C code that wants to hold on to Go func and
pointer values for future callbacks from C to Go. This works today but is not
permitted by the invariant. It is hard to detect. One safe approach is: Go
code that wants to preserve funcs/pointers stores them into a map indexed by
an int. Go code calls the C code, passing the int, which the C code may store
freely. When the C code wants to call into Go, it passes the int to a Go
function that looks in the map and makes the call."
But it's nonsense in this case - you want to pass a Go function pointer to C,
but the only way for C to use it is to call some *other* Go function? Sorry
but there is no other Go function for the mdb_cmp() function to call, the only
one it knows about is the function pointer that you pass.
If this is what you're referring to, adding a context pointer doesn't achieve
anything. If this isn't what you're referring to, then please explain exactly
what you hope to achieve with this context pointer.
--
-- Howard Chu
CTO, Symas Corp. http://www.symas.com
Director, Highland Sun http://highlandsun.com/hyc/
Chief Architect, OpenLDAP http://www.openldap.org/project/
Jürg Bircher
2015-10-30 08:29:48 UTC
The example is just a draft. It is a suggestion to support compound keys in a generic way with the intention to keep the extra cost as low as possible. I need a generic way as I have to support user-defined compound indices.

Could we not compromise on a #define, so that those who need the context could compile with SUPPORT_CMP_CONTEXT? If it is not needed, there is no performance penalty.
The code base would therefore remain the same.
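One way such a compile-time switch could look is sketched below. Only the macro name SUPPORT_CMP_CONTEXT comes from the message; `kv_val`, `cmp_func`, `db_info`, `CALL_CMP`, `cmp_size` and `find_key` are hypothetical names invented for the illustration and are not LMDB identifiers. When the macro is undefined, the context field and the third argument disappear entirely, so the default build pays nothing for the feature.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for LMDB's MDB_val so the sketch compiles without lmdb.h. */
typedef struct { size_t mv_size; void *mv_data; } kv_val;

#ifdef SUPPORT_CMP_CONTEXT
typedef int cmp_func(const kv_val *a, const kv_val *b, void *ctx);
#define CALL_CMP(db, a, b) ((db)->cmp((a), (b), (db)->cmp_ctx))
#else
typedef int cmp_func(const kv_val *a, const kv_val *b);
#define CALL_CMP(db, a, b) ((db)->cmp((a), (b)))
#endif

/* Per-database settings; the context field exists only when enabled. */
typedef struct {
    cmp_func *cmp;
#ifdef SUPPORT_CMP_CONTEXT
    void *cmp_ctx;
#endif
} db_info;

/* Example comparator ordering keys by length; it builds in both modes. */
#ifdef SUPPORT_CMP_CONTEXT
static int cmp_size(const kv_val *a, const kv_val *b, void *ctx)
{
    (void)ctx;                             /* unused in this example */
    return (a->mv_size > b->mv_size) - (a->mv_size < b->mv_size);
}
#else
static int cmp_size(const kv_val *a, const kv_val *b)
{
    return (a->mv_size > b->mv_size) - (a->mv_size < b->mv_size);
}
#endif

/* Binary search over sorted keys; every call site goes through CALL_CMP,
 * so the source stays the same whichever mode is compiled. */
static int find_key(const db_info *db, const kv_val *keys, size_t n,
                    const kv_val *key)
{
    size_t lo = 0, hi = n;

    assert(db->cmp != NULL);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int r = CALL_CMP(db, key, &keys[mid]);
        if (r == 0)
            return (int)mid;
        if (r < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return -1;                             /* not found */
}
```

The cost of this approach is that applications and the library must be built with the same setting, since the function type differs between the two modes.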


Thanks for considering…
Bryan Matsuo
2015-11-01 04:52:32 UTC
I think arguments about how error prone a given function is are moot in
this case. Tests can be written. And maintaining the number of functions
Juerg is talking about is error prone in itself (and would potentially be
harder to test). I also believe that asking everyone else to roll their own
"simple" compound index functions involving strings will result in its own
set of errors.

Simply put, putting the mdb_set_compare and mdb_set_dupsort functions in
lmdb.h is letting people shoot themselves in the foot. Allowing
functions to be reused more easily helps reduce that overall.