soprano.collection.collection
Definition of the Collection class.
It handles multiple ASE Atoms objects and, in this sense, mirrors the structure of the Atoms object itself.
Classes
AtomsCollection: AtomsCollection object.
- class soprano.collection.collection.AtomsCollection(structures=[], info={}, cell_reduce=False, progress=False, suppress_ase_warnings=True)
Bases: object
AtomsCollection object.
An AtomsCollection represents a group of ASE Atoms objects. It handles them together, can perform mass operations on them, and stores arrays of information related to them.
Initialize the AtomsCollection
Args:
    structures (list[str] or list[ase.Atoms]): list of file names or Atoms that will form the collection
    info (dict): dictionary of general information to attach to this collection
    cell_reduce (bool): if True, perform a Niggli cell reduction on all loaded structures
    progress (bool): visualize a progress bar for the loading process
    suppress_ase_warnings (bool): suppress annoying ASE warnings when loading files (default is True)
- static check_tree(path)
Checks if a path is a valid ‘tree’ format for a collection. This is any folder that satisfies the following conditions:
contains a .collection file storing metadata
contains a series of folders matching the list stored in the .collection file, and nothing else
This function will return 0 if both conditions are satisfied, 1 if only the first is, 2 if no .collection file is found, and -1 if the folder itself doesn’t exist.
Args:
    path (str): path to check against the collection pattern
Returns:
    result (int): 0, 1, 2 or -1 depending on the outcome of the checks
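The return codes described above can be illustrated with a small standard-library sketch. This is not Soprano's implementation: `tree_status` is a hypothetical helper, and the layout of the pickled `.collection` file (a dict with a `"dirlist"` key) is an assumption made only for this illustration.

```python
import os
import pickle
import tempfile


def tree_status(path):
    """Return 0, 1, 2 or -1 following the rules described for check_tree."""
    if not os.path.isdir(path):
        return -1  # the folder itself does not exist
    meta_path = os.path.join(path, ".collection")
    if not os.path.isfile(meta_path):
        return 2  # no metadata file found
    with open(meta_path, "rb") as f:
        listed = pickle.load(f)["dirlist"]  # ASSUMED key name, for illustration
    entries = [e for e in os.listdir(path) if e != ".collection"]
    only_dirs = all(os.path.isdir(os.path.join(path, e)) for e in entries)
    # 0: folders match the stored list and nothing else is present; 1 otherwise
    return 0 if only_dirs and sorted(entries) == sorted(listed) else 1


codes = []
with tempfile.TemporaryDirectory() as root:
    codes.append(tree_status(os.path.join(root, "missing")))  # folder absent
    codes.append(tree_status(root))                           # no .collection yet
    with open(os.path.join(root, ".collection"), "wb") as f:
        pickle.dump({"dirlist": ["s1", "s2"]}, f)
    os.mkdir(os.path.join(root, "s1"))
    codes.append(tree_status(root))                           # "s2" still missing
    os.mkdir(os.path.join(root, "s2"))
    codes.append(tree_status(root))                           # both conditions hold
```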
- chunkify(chunk_size=None, chunk_n=None)
Split this collection into multiple collections based on either size or number of chunks.
Args:
    chunk_size (Optional[int]): maximum size of a generated chunk
    chunk_n (Optional[int]): number of chunks to generate
Returns:
    chunks (list[AtomsCollection]): a list of the generated chunks
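To make the two splitting modes concrete, here is a sketch that operates on a plain Python list rather than an AtomsCollection. The function name `chunkify_sketch`, the requirement that exactly one argument be given, and the way a remainder is spread over the first chunks are all assumptions; the documentation above does not specify them.

```python
def chunkify_sketch(items, chunk_size=None, chunk_n=None):
    """Split a list either by maximum chunk size or by number of chunks."""
    if (chunk_size is None) == (chunk_n is None):
        # ASSUMPTION: exactly one of the two arguments must be provided
        raise ValueError("specify exactly one of chunk_size or chunk_n")
    if chunk_size is not None:
        # Every chunk holds at most chunk_size items; the last may be shorter
        return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]
    # ASSUMPTION: with chunk_n, leftover items go to the first chunks
    base, extra = divmod(len(items), chunk_n)
    chunks, start = [], 0
    for i in range(chunk_n):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


by_size = chunkify_sketch(list(range(7)), chunk_size=3)
by_count = chunkify_sketch(list(range(7)), chunk_n=3)
```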
- classify(classes)
Return a dictionary of collections based on the names of assigned classes.
Args:
    classes (np.ndarray): array of the class to which each structure belongs. For example [1, 2, 1] will put the first and third structures in class 1 and the other in class 2. The classes can be any hashable types, like int or str.
Returns:
    classified (dict): a dictionary using class names as keys and sliced collections as values
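The grouping in the [1, 2, 1] example can be sketched on plain lists as follows. `classify_sketch` is a hypothetical stand-in: the real method returns sliced AtomsCollection objects rather than lists, and the strings used here merely stand in for structures.

```python
from collections import defaultdict


def classify_sketch(items, classes):
    """Group items by the class label found at the same position."""
    if len(items) != len(classes):
        raise ValueError("one class label is needed per item")
    groups = defaultdict(list)
    for item, label in zip(items, classes):
        groups[label].append(item)  # labels only need to be hashable
    return dict(groups)


# First and third items land in class 1, the second in class 2
classified = classify_sketch(["s0", "s1", "s2"], [1, 2, 1])
```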
- filter(filter_func)
Return a collection composed only of the elements for which a given filter function returns True.
Args:
    filter_func (function<Atoms> => bool): filter function. Should take an Atoms object and return a boolean
Returns:
    filtered (AtomsCollection): the filtered version of the collection
- get_array(name, copy=True)
Get a copy of an array of given name (or a reference if copy=False).
Args:
    name (str): name of the array to retrieve.
    copy (bool): if the array should be copied or a reference should be returned instead.
Returns:
    array (np.ndarray): the requested array
- static load_tree(path, load_format, opt_args={}, safety_check=3, tolerant=False, suppress_ase_warnings=True)
Load a collection’s structures from a series of folders, named like the structures, inside a given parent folder, as created by save_tree. The files can be loaded from a format of choice, or a function can be passed that will load them in a custom way.
Args:
    path (str): folder path from which the collection should be loaded.
    load_format (str or function): format from which the structures should be loaded. If a string, it will be used as a file extension. If a function, it must take as arguments the load path (a string) and any additional arguments passed as opt_args, and return the loaded structure as an ase.Atoms object.
    opt_args (dict): dictionary of additional arguments to pass to either ase.io.read (if load_format is a string) or to the load_format function.
    safety_check (int): how much care should be taken to verify the folder that is being loaded. Can be a number from 0 to 3. Here’s the meaning of the codes:
        3 (default): only load a folder if it fully passes the check_tree control;
        2: load any folder that has a valid .collection file, but only the listed subfolders;
        1: load any folder that has a valid .collection file, all subfolders. Array data will be discarded;
        0: no checks, try to load from all subfolders.
    tolerant (bool): if set to True, proceeds to load the structures into an AtomsCollection, even if some of the structures could not be read.
Returns:
    coll (AtomsCollection): loaded collection
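The safety_check levels above can be read as a rule for choosing which subfolders to load. The sketch below encodes that reading in a hypothetical helper, `folders_to_load`; the argument names and the use of OSError are assumptions for illustration, and `status` is taken to be the code returned by check_tree.

```python
def folders_to_load(safety_check, status, listed, present):
    """Pick the subfolders to load, or raise if the folder is rejected.

    status  : check_tree code for the folder (0, 1, 2 or -1)
    listed  : folder names stored in the .collection file
    present : folder names actually found on disk
    """
    if safety_check not in (0, 1, 2, 3):
        raise ValueError("safety_check must be between 0 and 3")
    if safety_check == 0:
        return list(present)  # no checks: try every subfolder
    if safety_check == 3 and status != 0:
        raise OSError("folder does not fully pass the tree check")
    if status not in (0, 1):
        raise OSError("no valid .collection file found")
    # Levels 3 and 2 load only the listed folders; level 1 loads all of them
    # (per the description above, array data is discarded at level 1)
    return list(listed) if safety_check >= 2 else list(present)


listed_only = folders_to_load(2, 1, ["a"], ["a", "x"])
everything = folders_to_load(1, 1, ["a"], ["a", "x"])
try:
    folders_to_load(3, 1, ["a"], ["a", "x"])
    strict_rejected = False
except OSError:
    strict_rejected = True
```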
- run_calculators(properties=None, system_changes=None)
Run all previously set ASE calculators.
Args:
    properties (list[str]): list of properties to calculate (depends on type of Calculator used)
    system_changes (list[str]): list of changes to the structure since the last calculation. Can be any combination of these six: ‘positions’, ‘numbers’, ‘cell’, ‘pbc’, ‘initial_charges’ and ‘initial_magmoms’.
- save_tree(path, save_format, name_root='structure', opt_args={}, safety_check=3, suppress_ase_warnings=True)
Save the collection’s structures as a series of folders, named like the structures, inside a given parent folder (that will be created if not present). Arrays and info are stored in a pickled .collection file which works as metadata for the whole directory tree. The files can be saved in a format of choice, or a function can be passed that will save them in a custom way. Only one collection can be saved per folder.
Args:
    path (str): folder path in which the collection should be saved.
    save_format (str or function): format in which the structures should be saved. If a string, it will be used as a file extension. If a function, it must take as arguments the structure (an ase.Atoms object), the save path (a string), and any additional arguments passed as opt_args, and take care of saving the required files.
    name_root (str): name prefix to be used for structures when a name is not available in their info dictionary
    opt_args (dict): dictionary of additional arguments to pass to either ase.io.write (if save_format is a string) or to the save_format function.
    safety_check (int): how much care should be taken not to overwrite potentially important data in path. Can be a number from 0 to 3. Here’s the meaning of the codes:
        3 (default): always ask before overwriting an existing folder that passes the check_tree control, raise an exception otherwise;
        2: overwrite any folder that fully passes the check_tree control, raise an exception otherwise;
        1: overwrite any folder that fully passes the check_tree control, ask for user input otherwise;
        0 (DANGER - use at your own risk!): no checks, always overwrite path.
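The four overwrite policies above amount to a small decision table. The following sketch restates it as code for an already existing folder; `overwrite_action` is a hypothetical helper, and treating "passes the check_tree control" as a check_tree result of 0 is an assumption.

```python
def overwrite_action(safety_check, status):
    """Return 'overwrite', 'ask' or 'raise' for an existing target folder.

    status is the check_tree code of that folder; 0 is taken to mean
    that it fully passes the check (an assumption of this sketch).
    """
    if safety_check == 0:
        return "overwrite"  # no checks at all
    # (action if the folder passes, action if it does not)
    table = {
        3: ("ask", "raise"),
        2: ("overwrite", "raise"),
        1: ("overwrite", "ask"),
    }
    on_pass, on_fail = table[safety_check]
    return on_pass if status == 0 else on_fail


default_on_valid_tree = overwrite_action(3, 0)
default_on_other_folder = overwrite_action(3, 1)
```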
- set_array(name, a, dtype=None, shape=None, args={})
Add or modify an array of data related to the Atoms objects in this collection.
Args:
    name (str): name of the array to operate on.
    a (np.ndarray or function<Atoms, **kwargs> => Any): the data to assign to the array (must be same length as the collection) or a function that takes an Atoms object as the first argument and returns a value. This will be mapped over the structures to create the array.
    dtype (type): type to cast the values of the array to.
    shape (tuple[int]): shape of each entry of the array. Will be checked if provided.
    args (dict): named arguments to pass to the function provided as a. Will be ignored if an array is passed instead.
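The two accepted forms of `a` (ready-made data, or a function mapped over the structures) can be sketched as below. `build_array` and `count_symbol` are hypothetical helpers, lists of element symbols stand in for Atoms objects, and the dtype and shape handling is left out.

```python
def build_array(structures, a, args=None):
    """Build one value per structure from data or from a mapped function."""
    args = args or {}
    if callable(a):
        # The function receives each structure plus the named arguments
        return [a(s, **args) for s in structures]
    values = list(a)  # args is ignored when data is passed directly
    if len(values) != len(structures):
        raise ValueError("data must be the same length as the collection")
    return values


def count_symbol(structure, symbol):
    """Count how many times an element symbol appears in a structure."""
    return structure.count(symbol)


# Stand-ins for Atoms objects: plain lists of element symbols
structs = [["H", "H", "O"], ["C", "H", "H", "H", "H"]]

sizes = build_array(structs, len)
hydrogens = build_array(structs, count_symbol, args={"symbol": "H"})
energies = build_array(structs, [0.1, 0.2])
```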
- set_calculators(calctype, labels=None, params={})
Set an ASE calculator on each structure in the collection, and set said calculator’s parameters.
Args:
    calctype (ASE Calculator type): the type of calculator to instantiate.
    labels (Optional[list[str]]): names to use for the calculators’ files. If not present, randomly generated names are used.
    params (Optional[dict]): parameters of the calculator to set.
- class soprano.collection.collection._AllCaller(all_list, all_class=None)
Bases: object
_AllCaller class.
A meta-object that serves the purpose of calling a function on all members of a list in a natural way.
Initialize the AllCaller with an ‘all’ list
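The idea of calling a function on all members of a list "in a natural way" can be illustrated with attribute forwarding. This is only a sketch of the general pattern, not the actual _AllCaller code: `AllCallerSketch` is a hypothetical class, and it ignores the all_class argument.

```python
class AllCallerSketch:
    """Forward any method call to every member of a list."""

    def __init__(self, all_list):
        self._items = list(all_list)

    def __getattr__(self, name):
        # Only reached for attributes not found on the wrapper itself
        def call_on_all(*args, **kwargs):
            return [getattr(item, name)(*args, **kwargs) for item in self._items]
        return call_on_all


words = AllCallerSketch(["abc", "de"])
upper = words.upper()            # calls str.upper on every member
starts = words.startswith("a")   # arguments are passed through unchanged
```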