Title: | Geospatial Queries Using 'PyMongo' |
---|---|
Description: | Utilizes methods of the 'PyMongo' 'Python' library to initialize, insert and query 'GeoJson' data (see <https://github.com/mongodb/mongo-python-driver> for more information on 'PyMongo'). Furthermore, it allows the user to validate 'GeoJson' objects and to use the console for 'MongoDB' (bulk) commands. The 'reticulate' package provides the 'R' interface to 'Python' modules, classes and functions. |
Authors: | Lampros Mouselimis [aut, cre] |
Maintainer: | Lampros Mouselimis <[email protected]> |
License: | Apache License 2.0 |
Version: | 1.0.3 |
Built: | 2024-11-26 03:06:31 UTC |
Source: | https://github.com/mlampros/geomongo |
mongodb geospatial methods ( using PyMongo in R )
mongodb geospatial methods ( using PyMongo in R )
# init <- geomongo$new(host = 'localhost', port = 27017, # tz_aware = FALSE, connect = TRUE, ...)
# init <- geomongo$new(host = 'localhost', port = 27017, # tz_aware = FALSE, connect = TRUE, ...)
See the reference link for more details on the ellipsis (...) concerning the additional parameters of the MongoClient()
the geomongo$new method initializes the MongoClient
the getClient method returns a "pymongo.mongo_client.MongoClient" object
the read_mongo_bson method allows the user to read a file/string using the bson.json_util module, which loads MongoDB Extended JSON data ( SEE https://stackoverflow.com/questions/42089045/bson-errors-invaliddocument-key-oid-must-not-start-with-trying-to-insert )
the geoInsert method allows the user to import data to a mongo-db from a folder, file or list
the geoQuery method allows the user to perform geospatial queries using one of the find, aggregate or command methods
For spherical query operators to function properly, you must convert distances to radians, and convert from radians to the distances units used by your application.
To convert distance to radians: divide the distance by the radius of the sphere (e.g. the Earth) in the same units as the distance measurement. To convert radians to distance: multiply the radian measure by the radius of the sphere (e.g. the Earth) in the units system that you want to convert the distance to.
The equatorial radius of the Earth is approximately 3,963.2 miles or 6,378.1 kilometers.
If specifying latitude and longitude coordinates, list the longitude first and then latitude:
Valid longitude values are between -180 and 180, both inclusive. Valid latitude values are between -90 and 90 (both inclusive).
geomongo$new(host = 'localhost', port = 27017, tz_aware = FALSE, connect = TRUE, ...)
--------------
getClient()
--------------
read_mongo_bson(FILE = NULL, STR = NULL)
--------------
geoInsert(DATA = NULL, TYPE_DATA = NULL, COLLECTION = NULL, GEOMETRY_NAME = NULL, read_method = "geojsonR")
--------------
geoQuery(QUERY = NULL, METHOD = NULL, COLLECTION = NULL, DATABASE = NULL, GEOMETRY_NAME = NULL, TO_LIST = FALSE)
new()
geomongo$new( host = "localhost", port = 27017, tz_aware = FALSE, connect = TRUE, ... )
host
(optional) hostname or IP address or Unix domain socket path of a single mongod or mongos instance to connect to, or a mongodb URI, or a list of hostnames / mongodb URIs. See the reference link for more information.
port
(optional) port number on which to connect
tz_aware
(optional) if TRUE, datetime instances returned as values in a document by this MongoClient will be timezone aware (otherwise they will be naive)
connect
(optional) if TRUE (the default), immediately begin connecting to MongoDB in the background. Otherwise connect on the first operation
getClient()
geomongo$getClient()
read_mongo_bson()
geomongo$read_mongo_bson(FILE = NULL, STR = NULL)
FILE
a character string specifying a valid path to a file ( applies to read_mongo_bson method )
STR
a character string ( applies to read_mongo_bson method )
geoInsert()
geomongo$geoInsert( DATA = NULL, TYPE_DATA = NULL, COLLECTION = NULL, GEOMETRY_NAME = NULL, read_method = "geojsonR" )
DATA
a valid path to a file/folder or a list ( applies to geoInsert method )
TYPE_DATA
a character string. One of 'folder', 'file', 'dict_one' (takes as input a list or a character string) or 'dict_many' (takes as input a list or a character string vector) ( applies to geoInsert method )
COLLECTION
a pymongo.collection.Collection object ( applies to geoInsert and geoQuery methods )
GEOMETRY_NAME
a character string specifying the name of the geometry object, as it appears in the file/string ( applies to geoInsert and geoQuery methods )
read_method
a character string specifying the method to use to read the data. Either using the "geojsonR" (package) or the "mongo_bson" utility function ( applies to geoInsert method )
geoQuery()
geomongo$geoQuery( QUERY = NULL, METHOD = NULL, COLLECTION = NULL, DATABASE = NULL, GEOMETRY_NAME = NULL, TO_LIST = FALSE )
QUERY
a named list specifying the query to use in mongodb ( applies to geoQuery method )
METHOD
a character string specifying the method to use to perform geospatial queries in mongodb. One of "find", "aggregate" OR "command" ( applies to geoQuery method )
COLLECTION
a pymongo.collection.Collection object ( applies to geoInsert and geoQuery methods )
DATABASE
a "pymongo.database.Database" object ( applies to geoQuery method )
GEOMETRY_NAME
a character string specifying the name of the geometry object, as it appears in the file/string ( applies to geoInsert and geoQuery methods )
TO_LIST
either TRUE or FALSE. If TRUE then the output of the geoQuery method will be a list, otherwise a data.table (matrix) object ( applies to geoQuery method )
clone()
The objects of this class are cloneable with this method.
geomongo$clone(deep = FALSE)
deep
Whether to make a deep clone.
https://github.com/mongodb/mongo-python-driver, https://docs.mongodb.com/manual/tutorial/calculate-distances-using-spherical-geometry-with-2d-geospatial-indexes/
## Not run: library(GeoMongo) init = geomongo$new() getter_client = init$getClient() init_db = getter_client$get_database("example_db") init_col = init_db$get_collection("example_collection") #-------------------------- # geonear using 'aggregate' #-------------------------- query_geonear = list('$geoNear' = list(near = list(type = "Point", coordinates = c(-122.5, 37.1)), distanceField = "distance", maxDistance = 900 * 1609.34, distanceMultiplier = 1 / 1609.34, spherical = TRUE)) init$geoQuery(QUERY = query_geonear, METHOD = "aggregate", COLLECTION = init_col, DATABASE = init_db, GEOMETRY_NAME = "location", TO_LIST = FALSE) ## End(Not run)
## Not run: library(GeoMongo) init = geomongo$new() getter_client = init$getClient() init_db = getter_client$get_database("example_db") init_col = init_db$get_collection("example_collection") #-------------------------- # geonear using 'aggregate' #-------------------------- query_geonear = list('$geoNear' = list(near = list(type = "Point", coordinates = c(-122.5, 37.1)), distanceField = "distance", maxDistance = 900 * 1609.34, distanceMultiplier = 1 / 1609.34, spherical = TRUE)) init$geoQuery(QUERY = query_geonear, METHOD = "aggregate", COLLECTION = init_col, DATABASE = init_db, GEOMETRY_NAME = "location", TO_LIST = FALSE) ## End(Not run)
simple way to validate a json instance under a given schema
json_schema_validator(json_data = NULL, json_schema = NULL)
json_schema_validator(json_data = NULL, json_schema = NULL)
json_data |
a named list specifying the input data to validate against the json-schema |
json_schema |
a named list specifying the json-schema |
Define a json-schema that the input data should follow and then validate the input data against the schema. If the input data follows the schema then by running the function nothing will be returned, otherwise an error with Traceback will be printed in the R-session.
In case that type is at the same time also a property name in the json data, then do not include "type" = "string" in the json schema ( https://github.com/epoberezkin/ajv/issues/137 )
https://pypi.org/project/jsonschema/, https://python-jsonschema.readthedocs.io/en/latest/
try({ if (reticulate::py_available(initialize = FALSE)) { if (reticulate::py_module_available("jsonschema")) { library(GeoMongo) schema_dict = list("type" = "object", "properties" = list( "name" = list("type" = "string"), "location" = list("type" = "object", "properties" = list( "type" = list("enum" = c("Point", "Polygon")), "coordinates" = list("type" = "array") )))) data_dict = list("name" = "example location", "location" = list("type" = "Point", "coordinates" = c(-120.24, 39.21))) json_schema_validator(json_data = data_dict, json_schema = schema_dict) } } }, silent=TRUE)
try({ if (reticulate::py_available(initialize = FALSE)) { if (reticulate::py_module_available("jsonschema")) { library(GeoMongo) schema_dict = list("type" = "object", "properties" = list( "name" = list("type" = "string"), "location" = list("type" = "object", "properties" = list( "type" = list("enum" = c("Point", "Polygon")), "coordinates" = list("type" = "array") )))) data_dict = list("name" = "example location", "location" = list("type" = "Point", "coordinates" = c(-120.24, 39.21))) json_schema_validator(json_data = data_dict, json_schema = schema_dict) } } }, silent=TRUE)
MongoDB (bulk) commands
mongodb_console(Argument = NULL, ...)
mongodb_console(Argument = NULL, ...)
Argument |
a character string specifying the mongodb shell command to run from within an R-session |
... |
the ellipsis (...) parameter allows a unix-user (windows-user) to give additional parameters to the base-R system() (shell()) function which is run in background. |
MongoDB shell commands are important for instance if someone has to import/export bulk data to a mongo database. This R function utilizes the system base function to run the mongodb shell command from within an R-session. See the reference links for more details. The ellipsis (...) parameter could be used for instance to disallow messages be printed in the console (on unix by using ignore.stdout and ignore.stderr).
https://docs.mongodb.com/database-tools/mongoimport/, https://docs.mongodb.com/database-tools/mongoexport/
## Not run: library(GeoMongo) ARGs = "mongoimport -d DB -c COLLECTION --type json --file /MY_DATA.json" mongodb_console(Argument = ARGs) ## End(Not run)
## Not run: library(GeoMongo) ARGs = "mongoimport -d DB -c COLLECTION --type json --file /MY_DATA.json" mongodb_console(Argument = ARGs) ## End(Not run)