Lazy browse records #13
Hi Egor,

Yes, browse records can be a bit slow sometimes (some improvements could be done, see below). It is very difficult to keep a cache of records in OERPLib: we can't rely on any cached data if some of it has been updated in the meantime on the server, and we can't check whether the cache is outdated on every access (that would cost a lot of RPC queries, so there is no gain in doing it). So I chose to fetch the data each time we need it:

id(oerp.user.company_id) == id(oerp.user.company_id)
False

If you want to keep a record, you have to keep a reference to it:

company = oerp.user.company_id

Having a cache at the field level is difficult too, as some field values depend on other fields of the record (in the spirit of @api.depends in the new Odoo API), and, as said, having a cache is difficult anyway. But note that not all fields of a record are actually fetched: this is the case for relational fields, for access-rights reasons.
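To illustrate the field-dependency point, here is a minimal server-side sketch of a computed field in the Odoo 8 "new API" style (the model and field names are made up): as soon as one of the fields it depends on changes on the server, any client-side cache of the computed field is stale.

from openerp import api, fields, models


class ExampleOrder(models.Model):
    _name = 'x.example.order'

    amount = fields.Float()
    qty = fields.Integer()
    total = fields.Float(compute='_compute_total')

    @api.depends('amount', 'qty')
    def _compute_total(self):
        for record in self:
            # Any change to 'amount' or 'qty' on the server changes 'total',
            # so a value cached by the client becomes outdated.
            record.total = record.amount * record.qty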
Regards,
Sébastien
|
import time
import oerplib

# Connect and pick a model to benchmark (credentials redacted)
oerp = oerplib.OERP(server='localhost', database='production_local')
oerp.login(user='***', passwd='***')
model = oerp.get('mrp.production')
ids = model.search([])
print 'Length = {} records'.format(len(ids))

# 1) browse(): iterate the records as browse records and touch one field
t = time.time()
for obj in model.browse(ids):
    obj.name
print 'Browse records = {} s'.format(time.time() - t)

# 2) read() in a loop: one RPC call per record, one field
t = time.time()
for id_ in ids:
    model.read(id_, ['name'])
print 'Read method (loop) = {} s'.format(time.time() - t)

# 3) read() in batch: one RPC call for all the ids, one field
t = time.time()
model.read(ids, ['name'])
print 'Read method (batch) = {} s'.format(time.time() - t)

And here is an output:
|
Thanks for your feedback. But there is no way to fetch only one field with the browse method. Here is a more detailed benchmark:

import time

task_model = oerp.get('project.task')
task_ids = task_model.search([])
print 'Length = {} records'.format(len(task_ids))

# browse(): one record at a time
t = time.time()
for task in task_model.browse(task_ids):
    task.name
print "Browse records (loop) = {} s".format(time.time() - t)
print "Browse records (batch) = TO IMPLEMENT :)"

# read() in a loop, all fields
t = time.time()
for id_ in task_ids:
    task_model.read([id_])
print "Read method (loop) = {} s".format(time.time() - t)

# read() in a loop, 1 field
t = time.time()
for id_ in task_ids:
    task_model.read([id_], ['name'])
print "Read method (loop) [1 field] = {} s".format(time.time() - t)

# read() in a loop, 3 fields
t = time.time()
for id_ in task_ids:
    task_model.read([id_], ['id', 'name', 'project_id'])
print "Read method (loop) [3 fields] = {} s".format(time.time() - t)

# read() in a loop, 3 fields + 1 relational/computed field
t = time.time()
for id_ in task_ids:
    task_model.read([id_], ['id', 'name', 'project_id', 'message_follower_ids'])
print "Read method (loop) [3 fields + 1 comp.] = {} s".format(time.time() - t)

# read() in batch, all fields
t = time.time()
task_model.read(task_ids)
print "Read method (batch) = {} s".format(time.time() - t)

# read() in batch, 1 field
t = time.time()
task_model.read(task_ids, ['name'])
print "Read method (batch) [1 field] = {} s".format(time.time() - t)

# read() in batch, 3 fields
t = time.time()
task_model.read(task_ids, ['id', 'name', 'project_id'])
print "Read method (batch) [3 fields] = {} s".format(time.time() - t)

# read() in batch, 3 fields + 1 relational/computed field
t = time.time()
task_model.read(task_ids, ['id', 'name', 'project_id', 'message_follower_ids'])
print "Read method (batch) [3 fields + 1 comp.] = {} s".format(time.time() - t)

The output (test done on a remote server, not local):

Length = 182 records
Browse records (loop) = 55.3023481369 s
Browse records (batch) = TO IMPLEMENT :)
Read method (loop) = 57.8453090191 s
Read method (loop) [1 field] = 48.6828379631 s
Read method (loop) [3 fields] = 49.1296329498 s
Read method (loop) [3 fields + 1 comp.] = 53.3451161385 s
Read method (batch) = 1.34297895432 s
Read method (batch) [1 field] = 0.35977101326 s
Read method (batch) [3 fields] = 0.394207954407 s
Read method (batch) [3 fields + 1 comp.] = 0.496376037598 s

We can see that reading or browsing all the fields is slower than reading a restricted list of fields, due to:
What could be done to improve OERPLib:
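For instance, a batched browse could prefetch the requested fields for all the ids with a single read() call instead of one RPC query per record. Below is a minimal client-side sketch of that idea; the browse_batch helper is hypothetical (it is not part of OERPLib) and it assumes read() returns one dictionary per record including its id, as in the benchmarks above.

def browse_batch(model, ids, fields):
    # One RPC query for all the records, instead of one query per record
    rows = model.read(ids, fields)
    # Index the rows by id so a record can be looked up like a browse record
    return dict((row['id'], row) for row in rows)

tasks = browse_batch(task_model, task_ids, ['name', 'project_id'])
for task_id in task_ids:
    tasks[task_id]['name']

A real implementation inside OERPLib could wrap these dictionaries in browse records instead of returning plain dicts.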
EDIT: the same test performed locally
|
Hi Sébastien,

Here is a quick and dirty cacheable object proxy I created, which reads objects and stores them in a cache. I assumed that the data does not change on the server while the cache is in use (refresh() can be called to re-read a record).

_cache = {}
_models = {}


class ObjectProxy(object):
    """
    Cacheable read-only object proxy
    """
    def __new__(cls, oerp, model_name, obj_id, *args, **kwargs):
        # Initialize both object and model caches
        if oerp not in _models:
            _models[oerp] = {}
        if model_name not in _models[oerp]:
            _models[oerp][model_name] = oerp.get(model_name)
        if oerp not in _cache:
            _cache[oerp] = {}
        if model_name not in _cache[oerp]:
            _cache[oerp][model_name] = {}
        if obj_id not in _cache[oerp][model_name]:
            # object.__new__() takes no extra arguments; __init__ receives them
            _cache[oerp][model_name][obj_id] = object.__new__(cls)
        return _cache[oerp][model_name][obj_id]
    def __init__(self, oerp, model_name, obj_id):
        # __init__ runs on every ObjectProxy(...) call, even when __new__
        # returned an already-cached instance, so skip re-initialization to
        # avoid wiping the field cache of an existing proxy.
        if '_cache' in self.__dict__:
            return
        self.oerp = oerp
        self.model = _models[oerp][model_name]
        self._cache = {}
        self.id = obj_id
    def refresh(self):
        # To prevent recursion we copy the related fields first, clear the
        # cache and only then refresh the related fields
        rel_fields = []
        for field in self._cache.values():
            if isinstance(field, ObjectProxy):
                rel_fields.append(field)
            elif isinstance(field, (list, tuple)):
                rel_fields += field
        self._cache.clear()
        for field in rel_fields:
            field.refresh()
    def __getattr__(self, item):
        if item not in self._cache:
            # Hacky way to get the column definition (we don't want to re-read
            # the fields from the server, as it would be redundant)
            column = self.model._browse_class.__osv__['columns'][item]
            val = self.model.read(self.id, [item])[item]
            if hasattr(column, 'relation') and val:
                if column.type in ['one2many', 'many2many']:
                    # x2many field: a list of ids becomes a list of proxies
                    self._cache[item] = [ObjectProxy(self.oerp, column.relation, item_id)
                                         for item_id in val]
                else:
                    # many2one field: an (id, name) pair becomes a single proxy
                    item_id, name = val
                    self._cache[item] = ObjectProxy(self.oerp, column.relation, item_id)
            else:
                self._cache[item] = val
        return self._cache[item]
    def __eq__(self, other):
        return isinstance(other, ObjectProxy) and other.id == self.id and other.model._name == self.model._name

    def __ne__(self, other):
        return not isinstance(other, ObjectProxy) or other.id != self.id or other.model._name != self.model._name

    def __repr__(self):
        return 'Proxy [{},{}]'.format(self.model._name, self.id)

    @staticmethod
    def clear_cache():
        _cache.clear()

    @staticmethod
    def create_proxy(oerp, model_name, obj_ids):
        if isinstance(obj_ids, (list, tuple)):
            return [ObjectProxy(oerp, model_name, obj_id) for obj_id in obj_ids]
        return ObjectProxy(oerp, model_name, obj_ids)

And here is a usage example:

import oerplib
oerp = oerplib.OERP(server='localhost', database='production_local')
oerp.login(user='***', passwd='***')
proxies = ObjectProxy.create_proxy(oerp, 'mrp.production', oerp.search('mrp.production', []))
print len(proxies)
proxy = proxies[-1]
proxy2 = proxies[-2]
print proxy.workcenter_lines[0].name
print proxy2.product_id.default_code
proxy.refresh()

Output:
|
Hello Sébastien,
I noticed that upon initialization of the BrowseRecord class by calling
oerp.browse('model', id)
OERPLib populates the newly created browse record with all the fields from the server. This becomes slow for large models (especially when you only need to read one field), and therefore I never use them and instead only call oerp.read() to save time.
What I like about browse records is that they cache data, i.e. by loading a record once you can access it many times and it will not re-read the data from the server.
I was wondering if there is a way to speed them up. One solution would be to implement a lazily initialized BrowseRecord: upon creation the record would not do anything, but when a field is accessed it would check its cache first and, if the field is not there, fetch it from the server. This should improve the speed considerably.
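For illustration, a rough sketch of the idea (the class below is hypothetical, not OERPLib's internals; the read() call is the same one used elsewhere in this thread):

class LazyRecord(object):
    """Browse-record-like object that fetches a field only on first access."""
    def __init__(self, model, record_id):
        self._model = model        # an OERPLib model proxy, e.g. oerp.get('project.task')
        self._id = record_id
        self._values = {}          # field cache, filled lazily

    def __getattr__(self, field):
        if field not in self._values:
            # One RPC call for the missing field only; the value is then cached
            self._values[field] = self._model.read(self._id, [field])[field]
        return self._values[field]

task = LazyRecord(oerp.get('project.task'), task_ids[0])
task.name    # first access: one RPC read of 'name'
task.name    # second access: served from the cache, no RPC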
What do you think?
Egor