Object storage in omega|ml

When large files are stored in the MongoDB backend, they are broken up into chunks. As a result, when a 118 MB file is stored using om.datasets.put and subsequently retrieved using om.datasets.get, a GridFSProxy object is returned.

What is the omega|ml syntax for returning the whole file?

Comments

  • edited October 2020

    For clarity, let's separate omega|ml things from Python things:

    # omegaml 
    # -- store a local pickle file as a file in omegaml
    om.datasets.put(file_out, 'myfile.pickle')
    
    # -- get the file back
    om.datasets.get('myfile.pickle')
    # => returns a file-like object (here specifically a binary file with a .read() method)
    
    # we know that because
    > om.datasets.help('myfile.pickle')
    ...
     |  get(self, name, local=None, mode='wb', open_kwargs=None, **kwargs)
     |      get a stored file as a file handler with binary contents or a local file
    

    With this, we're in Python land, specifically

    # Python 
    # for file-like objects, in particular binary files:
    file.read()  # returns a byte string (bytes)
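
    To get the whole file, calling read() with no argument on the returned object should give the full contents as one bytes value. For large files, copying in chunks to a local file avoids holding everything in memory. A minimal sketch, assuming the object returned by om.datasets.get behaves like an ordinary binary file object; io.BytesIO stands in for it here so the example runs without omega|ml, and the payload and local file name are hypothetical:

```python
import io
import os
import shutil
import tempfile

# Stand-in for the file-like object returned by om.datasets.get (assumption:
# it behaves like any binary file object with a .read() method).
payload = b"x" * (1024 * 1024)  # 1 MiB of dummy data
file_in = io.BytesIO(payload)

# Option 1: read the whole file into memory as a single bytes object.
data = file_in.read()

# Option 2: stream to a local file in chunks instead of loading it all at once.
file_in.seek(0)  # rewind the stand-in, since Option 1 already consumed it
local_path = os.path.join(tempfile.mkdtemp(), "myfile.copy")  # hypothetical path
with open(local_path, "wb") as file_out:
    shutil.copyfileobj(file_in, file_out, length=64 * 1024)
```

    With omega|ml, file_in would instead come from om.datasets.get('myfile.pickle').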
    

    Combining the two, we can write, e.g.

    import pickle
    # pickle can load directly from a file object and return a Python object
    file_in = om.datasets.get('myfile.pickle')
    pickle.load(file_in)
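
    The same pattern can be tried end to end without an omega|ml installation. In this sketch io.BytesIO is a stand-in for the stored file (an assumption, not the omega|ml API), and the pickled object is a made-up example. It shows that pickle.load on the file object and pickle.loads on the bytes from read() give the same result:

```python
import io
import pickle

obj = {"name": "example", "values": [1, 2, 3]}  # hypothetical object to store

# Stand-in for om.datasets.put followed by om.datasets.get:
# an in-memory binary file holding the pickled object.
stored = io.BytesIO()
pickle.dump(obj, stored)
stored.seek(0)  # rewind so reading starts at the beginning

# Load directly from the file-like object.
restored = pickle.load(stored)

# Equivalent: read all bytes first, then unpickle the byte string.
stored.seek(0)
restored_from_bytes = pickle.loads(stored.read())
```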
    

