Wip/mongo #261

Merged
merged 24 commits into from
Jul 13, 2020

Conversation

alexej-jordan
Collaborator

basic support for polyDB

here is a small example:

julia> db = Polymake.Polydb.get_db()
Polymake.Polydb.Database(Database(Client(URI("mongodb://polymake:[email protected]/?authSource=admin&ssl=true")), "polydb"))

julia> collection = Polymake.Polydb.get_collection(db, "Polytopes.Lattice.SmoothReflexive")
Polymake.Polydb.Collection(Collection(Database(Client(URI("mongodb://polymake:[email protected]/?authSource=admin&ssl=true")), "polydb"), "Polytopes.Lattice.SmoothReflexive"))

julia> query1 = Dict("DIM"=>3, "N_FACETS"=>5)
Dict{String,Int64} with 2 entries:
  "N_FACETS" => 5
  "DIM"      => 3

julia> results1 = Polymake.Polydb.find(collection, query1)
Polymake.Polydb.Cursor(Mongoc.Cursor{Mongoc.Collection}(Collection(Database(Client(URI("mongodb://polymake:[email protected]/?authSource=admin&ssl=true")), "polydb"), "Polytopes.Lattice.SmoothReflexive"), Ptr{Nothing} @0x0000000002f29980))

julia> for doc in results1
       println(doc["F_VECTOR"])
       end
Any["6", "9", "5"]
Any["6", "9", "5"]
Any["6", "9", "5"]
Any["6", "9", "5"]

any ideas for easily resetting the iterator and how to use more complex BSON queries?

@benlorenz
Member

benlorenz commented Apr 23, 2020

@kalmarek it would be nice if we could support some advanced syntax for querying with other operators than =:

find(collection, DIM > 3 && N_VERTICES <= 12)

Can we do this via expressions or use some macro to parse this?

edit: the result would be some JSON-like dict { 'DIM': { '$gt': 3 }, 'N_VERTICES': { '$lte': 12 } }
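
For reference, a minimal sketch of that filter as a nested Julia Dict. MongoDB spells less-than-or-equal as `$lte`, and `$` has to be escaped inside Julia string literals. The helper name `cmp_filter` is hypothetical and not part of Polymake.Polydb; whether `find` accepts a nested dict as-is is not shown in this thread.

```julia
# Hypothetical helper: one comparison on one field, in MongoDB filter form.
cmp_filter(field::String, op::String, value) = Dict(field => Dict(op => value))

# DIM > 3 && N_VERTICES <= 12.
# `merge` is enough here because the two conditions use different fields.
query = merge(cmp_filter("DIM", "\$gt", 3), cmp_filter("N_VERTICES", "\$lte", 12))
```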

@kalmarek
Contributor

The standard way of filtering collections/finding in collections is through

findall(f, A)

  Return a vector I of the indices or keys of A where f(A[I]) returns true. If there are no such elements of A, return an
  empty array.

or

  filter(f, a::AbstractArray)

  Return a copy of a, removing elements for which f is false. The function f is passed one argument.

(the first argument is a predicate)

so this syntax could become

filter(PolymakeQuery(...), polymake_db)

or if you have longer function you could use do syntax

filtered_db = filter(polymake_db) do
    ( ... ) #the content of your PolymakeQuery
end

of course PolymakeQuery could be a macro expression building the predicate expression and then converting it to the json we'd run Polydb magic on;


a different option is to support the api of queryverse: https://github.com/queryverse/Query.jl

to be honest I have no experience with databases, so I don't feel competent enough to have an opinion

@benlorenz
Member

I don't think filter or findall is applicable here: the collection is not a julia collection but just a reference to a collection on the database server. We would need to transform the query into some JSON / BSON type (more or less a nested dict) and send this to the server via Mongoc.find.
(The collection might be way too large to iterate over locally with julia)

I need to read some more about Query.jl but the readme doesn't really look promising at the moment:

The package currently provides working implementations for in-memory data sources, but will eventually be able to translate queries into e.g. SQL.

@kalmarek
Contributor

@benlorenz why not? we could add new methods to findall/filter

Base.filter(query::Polydb.Query, polydb::Polymake.Polydb.Collection)
Base.findall(query::Polydb.Query, polydb::Polymake.Polydb.Collection)

to do what we want to; then the logic behind could be hidden in Query construction and the way we access Polydb.Collection. Am I missing something?
e.g.

@query DIM >= 3 && N_VERTICES <= 12

constructs the appropriate Polydb.Query object and filter above takes it, constructs the json and dispatches to the appropriate Polydb.find method.
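
A rough sketch of how such a macro could work, assuming the query is a plain `&&` / `||` combination of binary comparisons with literal right-hand sides. The names `MONGO_OPS`, `to_filter` and this `@query` are hypothetical illustrations, not the Polydb API, and the result is a plain nested Dict rather than a Polydb.Query object.

```julia
# Hypothetical sketch, not the Polydb API: map Julia comparison operators
# to MongoDB query operators.
const MONGO_OPS = Dict(
    Symbol("==") => "\$eq", Symbol("!=") => "\$ne",
    Symbol(">")  => "\$gt", Symbol(">=") => "\$gte",
    Symbol("<")  => "\$lt", Symbol("<=") => "\$lte",
)

# Recursively translate an expression such as `DIM >= 3 && N_VERTICES <= 12`
# into a nested Dict. Right-hand sides must be literals in this sketch.
function to_filter(ex::Expr)
    if ex.head == Symbol("&&")
        return Dict("\$and" => [to_filter(arg) for arg in ex.args])
    elseif ex.head == Symbol("||")
        return Dict("\$or" => [to_filter(arg) for arg in ex.args])
    elseif ex.head == :call && length(ex.args) == 3 && haskey(MONGO_OPS, ex.args[1])
        op, field, value = ex.args
        return Dict(string(field) => Dict(MONGO_OPS[op] => value))
    end
    error("unsupported query expression: $ex")
end

# The macro only quotes its argument; the translation itself runs at run time.
macro query(ex)
    return :(to_filter($(QuoteNode(ex))))
end
```

With these definitions, `@query DIM >= 3 && N_VERTICES <= 12` yields a dict with a single `"\$and"` key holding one sub-filter per comparison. Chained comparisons such as `3 <= DIM <= 5` are not handled and raise an error.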


as for Query.jl: I don't feel like implementing my own poor man's query language (the @query above) when a solution is already available nearby.

db = Polymake.Polydb.get_db()

result = db |> 
    @select("Polytopes.Lattice.SmoothReflexive") |>
    @filter(_.DIM == 3) |>
    @filter(_.N_FACETS == 5) |>
    @map(_.F_VECTOR) |>
    collect

looks clean and we don't have to reinvent the wheel. Of course if we can make it work ;)

@alexej-jordan
Collaborator Author

as far as i understand, it would be optimal to construct a single BSON document that already contains every condition we want, including e.g. < or the logical OR, and then use that to query the server. locally applying filters does not help us very much, since it would probably force us to iterate through a complete collection. passing a BSON document as an additional options argument to Mongoc.find is also possible, which is the next thing i am going to investigate, but i think, besides sorting or reducing memory usage by giving us an object containing only the information we need, this might not be too useful for us

@benlorenz
Member

Now it does make sense ... At the beginning I read your suggestions as we should use these functions but what you meant was probably that we should just use that syntax (which means that we still need to do most of the stuff ourselves but we have an API / Syntax to work along).

The syntax of Query.jl looks nice, we need to investigate how we can overload these macros to construct the corresponding BSON dictionary and dispatch it when something like collect is called.

@alexej-jordan
Collaborator Author

@benlorenz @kalmarek do we want the objects returned by our iterator to still be the BSON document, or should it already be parsed so we have a Polymake.BigObject? in the previous commit i used the first idea, in the commit i just pushed it is the latter.

@benlorenz
Member

I would prefer to hide the BSON from the user. We could/should keep some way to access it directly but default to convert. We might at some point store objects in the database that are not polymake objects (or not only polymake objects).
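
One way to get "convert by default, raw on request" is to make the element type a parameter of the cursor. The sketch below is only an illustration under that assumption: `ConvertingCursor`, `from_document` and `FVector` are hypothetical names, a plain vector of dicts stands in for the Mongoc cursor, and `FVector` stands in for Polymake.BigObject.

```julia
# Stand-in for a converted object; in Polydb this role is played by Polymake.BigObject.
struct FVector
    values::Vector{Int}
end

# Hypothetical conversion hook: how to turn one raw document into a T.
from_document(::Type{Dict}, doc) = doc                                      # raw access
from_document(::Type{FVector}, doc) = FVector(parse.(Int, doc["F_VECTOR"])) # converted

# Wraps any iterator of documents and converts each element to T on the fly.
struct ConvertingCursor{T,I}
    inner::I
end
ConvertingCursor{T}(inner::I) where {T,I} = ConvertingCursor{T,I}(inner)

function Base.iterate(c::ConvertingCursor{T}, state...) where {T}
    next = iterate(c.inner, state...)
    next === nothing && return nothing
    doc, newstate = next
    return from_document(T, doc), newstate
end
Base.IteratorSize(::Type{<:ConvertingCursor}) = Base.SizeUnknown()

docs = [Dict("F_VECTOR" => Any["6", "9", "5"])]   # mimics the documents printed earlier
objects = collect(ConvertingCursor{FVector}(docs))  # converted elements
raw     = collect(ConvertingCursor{Dict}(docs))     # untouched documents
```

Supporting a new stored type would then only need another `from_document` method, which fits the remark that the database might later hold objects that are not polymake objects.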

@kalmarek
Contributor

kalmarek commented May 7, 2020

we defer query language to #281; I'll try to have a look at the code today

@alexej-jordan alexej-jordan requested a review from kalmarek May 10, 2020 21:49
@kalmarek
Contributor

ok, I just tried this locally and i get:

julia> a, s = iterate(results1.mcursor)
ERROR: BSONError: domain=15, code=13053, message=No suitable servers found (`serverSelectionTryOnce` set): [TLS handshake failed: error:1416F086:SSL routines:tls_process_server_certificate:certificate verify failed calling ismaster on 'db.polymake.org:27017']

I guess it's a problem with julia not finding the correct certificates store? @benlorenz?

@benlorenz
Member

I guess it's a problem with julia not finding the correct certificates store? @benlorenz?

Correct, one workaround might be to set SSL_CERT_FILE to a path containing a CA bundle before starting julia (or SSL_CERT_DIR to a directory with all the certificate files). There might be a bundle in path/to/julia-$version/share/julia/cert.pem.
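
A small sketch for locating a usable bundle, to be run in a separate julia session. The first candidate is the path mentioned above; the two system paths are assumptions about common Linux layouts and may not exist on a given machine. The function name `find_ca_bundle` is made up for this example.

```julia
# Return the first existing CA bundle among some candidate paths, or `nothing`.
function find_ca_bundle()
    candidates = [
        joinpath(Sys.BINDIR, Base.DATAROOTDIR, "julia", "cert.pem"),  # bundled with julia
        "/etc/ssl/certs/ca-certificates.crt",  # assumption: Debian/Ubuntu layout
        "/etc/pki/tls/certs/ca-bundle.crt",    # assumption: Fedora/RHEL layout
    ]
    idx = findfirst(isfile, candidates)
    return idx === nothing ? nothing : candidates[idx]
end

bundle = find_ca_bundle()
# Print the line to put into the shell before starting julia.
bundle === nothing || println("export SSL_CERT_FILE=", bundle)
```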

I'm still looking for a better workaround.

@benlorenz
Member

Another workaround is to run ]up once, directly after launching julia, which loads and initializes the correct SSL libraries, before using Mongoc or Polymake.

@kalmarek
Contributor

ok, the second workaround didn't work for me

@benlorenz
Member

maybe the second one only works on gentoo when using the system libraries, not with the julia bundled ssl libraries.

…i string

should fix connection problems where the certificate cannot be validated
@benlorenz
Member

I added a fix for the certificate issue, please try without setting any environment variables.

It turns out one can set the CA file via the uri and we can use the one from MozillaCACerts_jll which has a variable pointing us to the correct path. This is in my opinion way better than hoping that joinpath(Sys.BINDIR,Base.DATAROOTDIR,"julia","cert.pem") exists.
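
Roughly, the idea looks like the sketch below. The exact option used in the commit is not shown in this thread: `tlsCAFile` is the standard MongoDB connection-string option, and older drivers know it as `sslCertificateAuthorityFile`, so the option name is a keyword argument here. The helper `uri_with_ca`, the credentials and the file path are placeholders; in the PR the path comes from MozillaCACerts_jll.

```julia
# Hypothetical helper: append a CA-file option to a MongoDB connection URI
# that already has a query part (i.e. contains a '?').
function uri_with_ca(base_uri::AbstractString, ca_file::AbstractString;
                     option::AbstractString = "tlsCAFile")
    occursin('?', base_uri) || error("expected a URI that already has a query part")
    return string(base_uri, '&', option, '=', ca_file)
end

# Placeholder credentials and path, for illustration only.
uri = uri_with_ca("mongodb://user:secret@db.polymake.org/?authSource=admin&ssl=true",
                  "/path/to/cacert.pem")
```

Paths containing characters that are special in URIs would need percent-encoding, which this sketch leaves out.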

Contributor

@kalmarek kalmarek left a comment

in general there is still too much entanglement between getting the actual data and transforming the data for the purpose of printing. get_[something] should return it in a "lossless" format, and only show functions should produce a nice-looking string out of the pieces you can get

@alexej-jordan
Collaborator Author

I added a fix for the certificate issue, please try without setting any environment variables.

tested it and it worked

@alexej-jordan
Collaborator Author

i just pushed a commit in which i completely rebuilt our meta-info workflow. i tried to make better use of arrays and strings when generating the long info string, and replaced the recursive algorithm with another one which roughly does the following:

  1. sort the collections and sections into a tree-like nesting of dicts and strings
  2. go through the tree in a depth-first-search-like way, generating an already sorted (due to 1) array of strings
  3. join and print that array

the way this algorithm works makes some already implemented helpers obsolete, so i removed them.
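
The three steps above can be sketched roughly as follows. This is not the code from src/polydb.jl: the function names are invented, leaves are empty dicts rather than strings to keep the sketch short, and all names other than Polytopes.Lattice.SmoothReflexive are made up for the example.

```julia
# Step 1: sort dotted names into a tree-like nesting of dicts.
function build_tree(names)
    tree = Dict{String,Any}()
    for name in names
        node = tree
        for part in split(name, '.')
            node = get!(node, String(part), Dict{String,Any}())
        end
    end
    return tree
end

# Step 2: depth-first walk over sorted keys, producing indented lines in order.
function tree_lines(tree, depth = 0)
    lines = String[]
    for key in sort(collect(keys(tree)))
        push!(lines, "  "^depth * key)
        append!(lines, tree_lines(tree[key], depth + 1))
    end
    return lines
end

# Step 3: join the lines into the final info string.
names = ["Polytopes.Lattice.SmoothReflexive",
         "Polytopes.Lattice.Reflexive",   # made-up name
         "Matroids.Small"]                # made-up name
info = join(tree_lines(build_tree(names)), "\n")
```

Sorting the keys at each level during the walk means the resulting array needs no further sorting before it is joined.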

i still need to add a few comments, but otherwise every suggestion is now implemented.

which leaves the following to be done:
a. querying (issue is already open)
b. info level: should not be too hard to implement now; the info level can be checked in the same line as the haskey checks, but we surely need to add some more of these to cover all necessary attributes
c. the jupyter notebook guiding through the usage. i'm already 95% done, but i need to adjust some syntax and need help with the following problem: (@benlorenz @kalmarek )

"Objects from the database know where they come from"
trying to access a._polyDB on a Polymake.BigObject a received via this module causes the following exception:

julia> a.polyDB
ERROR: Exception occured at Polymake side:
unknown property Polytope<Rational>::polyDB at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObjectType.pm line 430.
    Polymake::Core::BigObjectType::property(Polymake::Core::BigObjectType=ARRAY(0x8f4f0c8), "polyDB") called at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObjectType.pm line 710
    Polymake::Core::BigObjectType::encode_descending_path(Polymake::Core::BigObjectType=ARRAY(0x8f4f0c8), "polyDB") called at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObjectType.pm line 753
    Polymake::Core::BigObjectType::encode_read_request(Polymake::Core::BigObjectType=ARRAY(0x8f4f0c8), "polyDB") called at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObject.pm line 1551
    Polymake::Core::BigObject::give_pv called at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObject.pm line 1568
    Polymake::Core::BigObject::give(Polymake::polytope::Polytope__Rational=ARRAY(0xadf0410), "polyDB") called at -e line 0
    eval {...} called at -e line 0

accessing the "_polyDB" field on a BSON object received via this module works without any problems.

probably related, but i can not access a.DIM for a Polymake.BigObject a, probably because it actually is a function returning CONE_DIM - 1?

julia> a.DIM
ERROR: Exception occured at Polymake side:
unknown property Polytope<Rational>::DIM at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObjectType.pm line 430.
    Polymake::Core::BigObjectType::property(Polymake::Core::BigObjectType=ARRAY(0xaddd668), "DIM") called at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObjectType.pm line 710
    Polymake::Core::BigObjectType::encode_descending_path(Polymake::Core::BigObjectType=ARRAY(0xaddd668), "DIM") called at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObjectType.pm line 753
    Polymake::Core::BigObjectType::encode_read_request(Polymake::Core::BigObjectType=ARRAY(0xaddd668), "DIM") called at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObject.pm line 1551
    Polymake::Core::BigObject::give_pv called at /home/jordan/.julia/dev/Polymake/deps/usr/share/polymake/perllib/Polymake/Core/BigObject.pm line 1568
    Polymake::Core::BigObject::give(Polymake::polytope::Polytope__Rational=ARRAY(0xcc55f48), "DIM") called at -e line 0
    eval {...} called at -e line 0

Stacktrace:
 [1] give(::Polymake.BigObjectAllocated, ::String) at /home/jordan/.julia/dev/Polymake/src/perlobj.jl:45
 [2] getproperty(::Polymake.BigObjectAllocated, ::Symbol) at /home/jordan/.julia/dev/Polymake/src/perlobj.jl:59

again, doing this with BSON works perfectly fine.

@alexej-jordan alexej-jordan requested a review from kalmarek May 20, 2020 23:47
@benlorenz
Member

"Objects from the database know where they come from"
trying to access a._polyDB on a Polymake.BigObject a received via this module causes the following exception:

...
accessing the "_polyDB" field on a BSON object received via this module works without any problems.

_polyDB is stored as a normal field in the BSON but converted to an attachment for the polymake objects. On the polymake side there is a method get_attachment for BigObjects but we have not added this to the julia interface yet (and attach as well), I will try to do this tomorrow.

probably related, but i can not access a.DIM for a Polymake.BigObject a, probably because it actually is a function returning CONE_DIM - 1?

again, doing this with BSON works perfectly fine.

DIM is stored for convenience in the BSON to allow queries for DIM, but there is no property DIM as this would create a conflict since any Polytope with DIM=2 would also be a Cone with DIM=3. Thus we have CONE_DIM as a property and DIM as a method that works depending on the object.
In polymake both can be accessed via $a->DIM / $a->CONE_DIM but in julia the method is converted to a function (and lowercase):

julia> polytope.dim(c)
3

So both effects are as expected.

@benlorenz
Member

Attachment support is in #288, but we will probably need to add support for the type which seems to be a plain perl hash:

julia> Polymake.get_attachment(x,"_polyDB")
PropertyValue wrapping HASH
Error showing value of type Polymake.PropertyValueAllocated:
ERROR: invalid value for an input string property
Stacktrace:
 [1] to_string(::Polymake.PropertyValueAllocated) at /home/lorenz/.julia/packages/CxxWrap/LfTHV/src/CxxWrap.jl:597
...

In this case it can be converted to Map{String,String} which is also unmapped but at least can be displayed:

julia> attc = @convert_to Map{String,String} att
PropertyValue wrapping pm::Map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >>
{(collection SmoothReflexive) (creation_date 2019-08-02) (section Polytopes.Lattice) (uri http://polymake.org/polytopes/paffenholz/www/fano.html) (version 2.1)}

I will think about a good way to map and convert HASH, a nested dictionary with arbitrary and mixed types as values (but only primitive types like int or string as keys).

Contributor

@kalmarek kalmarek left a comment

Other than the simple remark about types/default value argument it seems ready!
Thanks!

@alexej-jordan alexej-jordan requested a review from kalmarek May 26, 2020 20:32
@alexej-jordan alexej-jordan mentioned this pull request Jun 9, 2020
@benlorenz benlorenz merged commit 3a9fbed into master Jul 13, 2020
@benlorenz benlorenz deleted the wip/mongo branch March 17, 2021 23:21