SciDataContainer API
Container classes
- class scidatacontainer.Container(items: dict | None = None, file: str | None = None, uuid: str | None = None, config: dict | None = None, compression: int = 8, compresslevel: int = -1, ignore_items: list[str] = [], **kwargs)
Bases: AbstractContainer
Scientific data container.
- decode(fp: RawIOBase | BufferedIOBase, ignore_items: list[str] = [], validate: bool = True, strict: bool = True)
Take ZIP package as file object. Read items from the package and store them in this object.
- Parameters:
fp – File object to read from.
ignore_items – List of file paths that are not read into memory.
validate – If true, validate the content.
strict – If true, validate the hash, too.
- encode() -> Iterator[bytes]
Encode container as ZIP package.
- Yields:
bytes – next chunk of the generated zip file.
- freeze()
Calculate the hash value of this container and make it static. Once this method has been called, the container can no longer be modified.
- hash()
Calculate and save the hash value of this container.
- items()
Return the item objects of this container as (key, value) tuples, like the items of a dictionary.
- keys() -> List[str]
Return a sorted list of the full paths of all items.
- Returns:
List of paths of Container items.
- Return type:
List[str]
- release()
Make this container mutable. If it was immutable, this method will create a new UUID and initialize the attributes created, storageTime and modelVersion in the item “content.json”. It will also delete an existing hash and make it a new container.
- upload(data: bytes | None = None, server: str | None = None, key: str | None = None)
Create a ZIP archive of the DataContainer and upload it to a server.
If data is passed to the function, that data will be uploaded. Otherwise the byte representation of the class instance will be uploaded, which is what you typically want.
- Parameters:
data – If given, data to upload.
server – URL of the server.
key – API Key from the server to identify yourself.
- validate_content()
Make sure that the item “content.json” exists and contains all required attributes.
- validate_meta()
Make sure that the item “meta.json” exists and contains all required attributes.
- values() -> List
Return a list of all item objects.
- Returns:
List of item objects of the Container.
- Return type:
List
- write(fn: str, data: bytes | None = None)
Write the container to a ZIP package file.
If data is passed to the function, data will be written to the file. Otherwise the byte representation of the class instance will be written to the file, which is what you typically want.
- Parameters:
fn – Filename of export file.
data – If given, data to write to the file.
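The ZIP package that write() produces and decode() consumes can be illustrated with the standard library alone. The following is a minimal sketch, not the library's implementation: the helpers pack_items and unpack_items are hypothetical names, and the item contents are placeholders. The default compression=8 in the constructor signature has the same numeric value as zipfile.ZIP_DEFLATED, which suggests (as an assumption) that the parameter is handed on to zipfile.

```python
import io
import json
import zipfile


def pack_items(items: dict, compression: int = zipfile.ZIP_DEFLATED) -> bytes:
    """Hypothetical helper: store each item as one member of a ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=compression) as package:
        for path in sorted(items):
            package.writestr(path, items[path])
    return buffer.getvalue()


def unpack_items(fp, ignore_items=()) -> dict:
    """Hypothetical helper: read all members except the ignored paths."""
    with zipfile.ZipFile(fp) as package:
        return {
            name: package.read(name)
            for name in package.namelist()
            if name not in ignore_items
        }


# Placeholder items; real containers carry more attributes in these files.
raw_items = {
    "content.json": json.dumps({"note": "placeholder"}).encode("utf8"),
    "meta.json": json.dumps({"note": "placeholder"}).encode("utf8"),
    "data/values.txt": b"1 2 3\n",
}
package_bytes = pack_items(raw_items)
restored = unpack_items(io.BytesIO(package_bytes), ignore_items=["data/values.txt"])
```

Skipping a path through ignore_items keeps a large item out of memory while the remaining items are still read.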
- class scidatacontainer.AbstractContainer(items: dict | None = None, file: str | None = None, uuid: str | None = None, config: dict | None = None, compression: int = 8, compresslevel: int = -1, ignore_items: list[str] = [], **kwargs)
Bases: ABC
Scientific data container with minimal file support.
- The following file types are supported:
.json <-> dict
.txt <-> str
.bin <-> bytes
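The suffix mapping listed above can be pictured as a pair of conversion functions. This is only a sketch under stated assumptions: to_python and to_bytes are hypothetical helpers, and UTF-8 as the text encoding is an assumption that this page does not confirm.

```python
import json


def to_python(path: str, raw: bytes):
    """Hypothetical helper: convert file bytes to the matching Python type."""
    if path.endswith(".json"):
        return json.loads(raw.decode("utf8"))  # dict
    if path.endswith(".txt"):
        return raw.decode("utf8")  # str
    if path.endswith(".bin"):
        return raw  # bytes
    raise ValueError(f"unsupported suffix: {path}")


def to_bytes(path: str, value) -> bytes:
    """Hypothetical helper: convert a Python object to its file bytes."""
    if path.endswith(".json"):
        return json.dumps(value, indent=4).encode("utf8")
    if path.endswith(".txt"):
        return value.encode("utf8")
    if path.endswith(".bin"):
        return bytes(value)
    raise ValueError(f"unsupported suffix: {path}")


settings = to_python("meas/settings.json", to_bytes("meas/settings.json", {"gain": 2}))
note = to_python("meas/note.txt", to_bytes("meas/note.txt", "first run"))
blob = to_python("meas/raw.bin", to_bytes("meas/raw.bin", b"\x00\x01"))
```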
- decode(fp: RawIOBase | BufferedIOBase, ignore_items: list[str] = [], validate: bool = True, strict: bool = True)
Take ZIP package as file object. Read items from the package and store them in this object.
- Parameters:
fp – File object to read from.
ignore_items – List of file paths that are not read into memory.
validate – If true, validate the content.
strict – If true, validate the hash, too.
- encode() -> Iterator[bytes]
Encode container as ZIP package.
- Yields:
bytes – next chunk of the generated zip file.
- freeze()
Calculate the hash value of this container and make it static. Once this method has been called, the container can no longer be modified.
- hash()
Calculate and save the hash value of this container.
- items()
Return the item objects of this container as (key, value) tuples, like the items of a dictionary.
- keys() -> List[str]
Return a sorted list of the full paths of all items.
- Returns:
List of paths of Container items.
- Return type:
List[str]
- release()
Make this container mutable. If it was immutable, this method will create a new UUID and initialize the attributes created, storageTime and modelVersion in the item “content.json”. It will also delete an existing hash and make it a new container.
- upload(data: bytes | None = None, server: str | None = None, key: str | None = None)
Create a ZIP archive of the DataContainer and upload it to a server.
If data is passed to the function, that data will be uploaded. Otherwise the byte representation of the class instance will be uploaded, which is what you typically want.
- Parameters:
data – If given, data to upload.
server – URL of the server.
key – API Key from the server to identify yourself.
- validate_content()
Make sure that the item “content.json” exists and contains all required attributes.
- validate_meta()
Make sure that the item “meta.json” exists and contains all required attributes.
- values() -> List
Return a list of all item objects.
- Returns:
List of item objects of the Container.
- Return type:
List
- write(fn: str, data: bytes | None = None)
Write the container to a ZIP package file.
If data is passed to the function, data will be written to the file. Otherwise the byte representation of the class instance will be written to the file, which is what you typically want.
- Parameters:
fn – Filename of export file.
data – If given, data to write to the file.
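The life cycle described by hash(), freeze() and release() can be mimicked with a small stand-in class. This is a sketch, not the library's code: FreezableBox is a hypothetical class, and the way the digest is built from paths and contents is an assumption made for illustration only.

```python
import hashlib
import uuid


class FreezableBox:
    """Hypothetical stand-in that mimics the documented freeze/release cycle."""

    def __init__(self):
        self._items = {}
        self.uuid = str(uuid.uuid4())
        self.hash_value = None
        self._frozen = False

    def __setitem__(self, path: str, data: bytes):
        if self._frozen:
            raise RuntimeError("container is immutable")
        self._items[path] = data

    def hash(self):
        # Assumption: digest over sorted paths and their contents.
        digest = hashlib.sha256()
        for path in sorted(self._items):
            digest.update(path.encode("utf8"))
            digest.update(self._items[path])
        self.hash_value = digest.hexdigest()

    def freeze(self):
        self.hash()
        self._frozen = True

    def release(self):
        if self._frozen:
            self.uuid = str(uuid.uuid4())  # a released container is a new one
        self.hash_value = None
        self._frozen = False


box = FreezableBox()
box["data/a.txt"] = b"hello"
box.freeze()
frozen_uuid, frozen_hash = box.uuid, box.hash_value

rejected = False
try:
    box["data/b.txt"] = b"world"  # modification after freeze() is refused
except RuntimeError:
    rejected = True

box.release()
box["data/b.txt"] = b"world"  # allowed again after release()
```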
File type support
- scidatacontainer.register(suffix: str, fclass: Type[AbstractFile], pclass: Type[object] = None)
Register a suffix to a conversion class.
If fclass is given as a string, it is interpreted as an already known suffix, and the conversion class of that suffix is also registered for the new suffix.
- Parameters:
suffix – file suffix to identify this file type.
fclass – Conversion class derived from AbstractFile.
pclass – Python class that represents this object type.
Built-in conversion classes
- class scidatacontainer.filebase.AbstractFile(data)
Base class for converting datatypes to their file representation.
- abstractmethod decode(data: bytes)
Decode the Container content from bytes. This is an abstract method and it needs to be overwritten by the inheriting class.
- abstractmethod encode() -> bytes
Encode the Container content to bytes. This is an abstract method and it needs to be overwritten by inheriting class.
- Returns:
Byte string representation of the object.
- Return type:
bytes
- hash() -> str
Return hex digest of SHA256 hash.
- Returns:
Hex digest of this object as string.
- Return type:
str
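A custom conversion class overrides encode() and decode(). The sketch below uses a stand-in base class so that it runs without the package installed. FileBase and IntListFile are hypothetical names, and two behaviours are assumptions rather than documented facts: that the constructor decodes bytes input and stores anything else as-is, and that hash() digests the encoded bytes.

```python
import hashlib
from abc import ABC, abstractmethod


class FileBase(ABC):
    """Stand-in for AbstractFile; its behaviour here is assumed."""

    def __init__(self, data):
        # Assumption: bytes are decoded, other objects are stored directly.
        if isinstance(data, bytes):
            self.decode(data)
        else:
            self.data = data

    @abstractmethod
    def encode(self) -> bytes:
        """Return the file representation of self.data."""

    @abstractmethod
    def decode(self, data: bytes):
        """Fill self.data from the file representation."""

    def hash(self) -> str:
        # Assumption: SHA256 over the encoded bytes.
        return hashlib.sha256(self.encode()).hexdigest()


class IntListFile(FileBase):
    """Hypothetical conversion class: one integer per text line."""

    def encode(self) -> bytes:
        return "".join(f"{value}\n" for value in self.data).encode("utf8")

    def decode(self, data: bytes):
        self.data = [int(line) for line in data.decode("utf8").splitlines()]


from_list = IntListFile([1, 2, 3])
from_bytes = IntListFile(b"1\n2\n3\n")
```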
- class scidatacontainer.filebase.BinaryFile(data)
Bases: AbstractFile
Data conversion class for a binary file.
- decode(data: bytes)
Store bytes in this class.
- encode() -> bytes
Return byte string stored in this class.
- Returns:
Byte string representation of the object.
- Return type:
bytes
- hash() -> str
Return hex digest of SHA256 hash.
- Returns:
Hex digest of this object as string.
- Return type:
str
- class scidatacontainer.filebase.TextFile(data)
Bases: AbstractFile
Data conversion class for a text file.
- decode(data: bytes)
Decode text from given bytes string.
- encode() -> bytes
Encode text to bytes string.
- Returns:
Byte string representation of the object.
- Return type:
bytes
- hash() -> str
Return hex digest of SHA256 hash.
- Returns:
Hex digest of this object as string.
- Return type:
str
- class scidatacontainer.filebase.JsonFile(data)
Bases: AbstractFile
Data conversion class for a JSON file represented as Python dictionary.
- decode(data: bytes)
Decode dictionary from given bytes string.
- encode() -> bytes
Convert dictionary to pretty string representation with indentation and return it as bytes string.
- Returns:
Byte string representation of the object.
- Return type:
bytes
- hash() -> str
Return hex digest of the SHA256 hash calculated from the sorted compact representation. This should result in the same hash for semantically equal data dictionaries.
- Returns:
Hex digest of this object as string.
- Return type:
str
- sortit(data: dict | list | tuple) -> str
Return compact string representation with keys of all sub-dictionaries sorted.
- Parameters:
data – Dictionary, list or tuple to convert to string.
- Returns:
String representation of data
- Return type:
str
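The idea behind hashing a sorted compact representation can be shown with json and hashlib from the standard library. This is a sketch of the principle, not the output format of sortit(): compact_sorted and json_hash are hypothetical helpers, and the exact string the library produces may differ.

```python
import hashlib
import json


def compact_sorted(data) -> str:
    """Hypothetical helper: compact JSON text with all dict keys sorted."""
    return json.dumps(data, sort_keys=True, separators=(",", ":"))


def json_hash(data) -> str:
    """Hypothetical helper: SHA256 hex digest of the sorted compact text."""
    return hashlib.sha256(compact_sorted(data).encode("utf8")).hexdigest()


# Same content, different key order at both nesting levels.
first = {"b": [1, {"y": 2, "x": 1}], "a": "text"}
second = {"a": "text", "b": [1, {"x": 1, "y": 2}]}

same_hash = json_hash(first) == json_hash(second)
# The indented form keeps insertion order, so it differs between the two.
same_pretty = json.dumps(first, indent=4) == json.dumps(second, indent=4)
```

Hashing the pretty-printed bytes would make the digest depend on key order, which is why a canonical form is used for the hash while the indented form is kept for the file.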
Convenience functions
- scidatacontainer.timestamp() -> str
Return the current ISO 8601 compatible timestamp as string.
- Returns:
timestamp as string
- Return type:
str
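An ISO 8601 timestamp string can be produced with the datetime module. The sketch below is an illustration only: utc_timestamp is a hypothetical helper, and the choice of UTC with second resolution is an assumption, so the string returned by timestamp() may be formatted differently.

```python
from datetime import datetime, timezone


def utc_timestamp() -> str:
    """Hypothetical helper: current time in UTC as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


stamp = utc_timestamp()
# An ISO 8601 string can be parsed back into an aware datetime object.
parsed = datetime.fromisoformat(stamp)
```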
- scidatacontainer.config.load_config(config_path: str = None, **kwargs) -> dict
Get author identity and server configuration.
This function uses kwargs, the scidata config file and environment variables as sources for each parameter, with earlier sources in this list overriding later ones.
Users may use the result of this function to inject the author identity when building a new container:
    config = load_config(author="John Doe", email="john@doe.com")
    dc = Container(config=config)
- Parameters:
config_path – Path of the config file. If this is None, the default file will be used. This filename is only required for testing.
kwargs – Parameter values as keyword arguments.
- Returns:
A dictionary containing information strings with keys “author”, “email”, “orcid”, “organization”, “server”, “key”.
- Return type:
dict
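The precedence of the three sources can be sketched as a lookup that takes the first non-empty value per key. The function resolve_config and the EXAMPLE_ prefix for environment variable names are hypothetical and used only for illustration. The actual variable names and file format are not given on this page.

```python
KEYS = ("author", "email", "orcid", "organization", "server", "key")
ENV_PREFIX = "EXAMPLE_"  # hypothetical naming scheme for this sketch


def resolve_config(file_values: dict, env: dict, **kwargs) -> dict:
    """Hypothetical helper: kwargs beat the config file, which beats env."""
    config = {}
    for name in KEYS:
        candidates = (
            kwargs.get(name),
            file_values.get(name),
            env.get(ENV_PREFIX + name.upper()),
        )
        # Take the first non-empty value, or an empty string if none is set.
        config[name] = next((value for value in candidates if value), "")
    return config


config = resolve_config(
    file_values={"author": "File Author", "server": "https://example.org"},
    env={"EXAMPLE_AUTHOR": "Env Author", "EXAMPLE_EMAIL": "env@example.org"},
    author="John Doe",
)
```

In this run the keyword argument wins for author, the file supplies server, and the environment is only consulted for email, which neither of the other sources defines.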