Object storage in omega|ml
When large files are stored in the MongoDB backend, they are broken up into chunks. Thus, when a 118 MB file is stored using om.datasets.put and subsequently retrieved using om.datasets.get, a GridFSProxy object is returned.
What is the omega|ml syntax for returning the whole file?
Comments
For clarity, let's separate omega|ml things from Python things:
# omegaml
# -- store a local pickle file as a file in omegaml
om.datasets.put(file_out, 'myfile.pickle')
# -- get the file back
om.datasets.get('myfile.pickle')
=> returns a file-like object (in this case specifically, it has a .read() method and it is a binary file)
# we know that because
> om.datasets.help('myfile.pickle')
...
 |  get(self, name, local=None, mode='wb', open_kwargs=None, **kwargs)
 |      get a stored file as a file handler with binary contents or a local file
With this, we're in Python land, specifically
Combining the two, we can write, e.g.
import pickle
# pickle can load directly from a file object and return a Python object
file_in = om.datasets.get('myfile.pickle')
pickle.load(file_in)
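If you want the whole file as bytes rather than the unpickled object, the file-like object's .read() method should give you that. Here is a minimal sketch that runs without an omega|ml instance: io.BytesIO is only a stand-in for what om.datasets.get returns, on the assumption that the real object behaves like an ordinary binary file. The names original, raw_bytes, restored_from_bytes and restored_from_file are made up for illustration.

```python
import io
import pickle

# Stand-in for the object returned by om.datasets.get('myfile.pickle').
# Assumption: the real object is a binary file-like object with .read(),
# as the help text suggests; io.BytesIO only mimics that here.
original = {"rows": [1, 2, 3], "label": "demo"}
file_in = io.BytesIO(pickle.dumps(original))

# Option 1: read the whole file into memory as bytes, then unpickle
raw_bytes = file_in.read()
restored_from_bytes = pickle.loads(raw_bytes)

# Option 2: let pickle consume the file object directly.
# read() moved the position to the end, so rewind first. If the real
# object does not support seek(), call om.datasets.get again instead.
file_in.seek(0)
restored_from_file = pickle.load(file_in)

print(restored_from_bytes == original, restored_from_file == original)
```

Note that .read() pulls the entire content into memory at once, which is worth keeping in mind for files in the 100 MB range; passing the file object straight to pickle.load avoids holding a separate copy of the raw bytes.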