MetaDatabaseWriter¶

class MetaDatabaseWriter(settings: Optional[dict] = None)

Bases: BaseDataPlugin

What you get by inheriting from MetaDatabaseWriter¶

MetaDatabaseWriter is the base class for writing the metadata corresponding to events fitted by a MetaEventFitter subclass instance, and is the last step of most poriscope analysis workflows before post-processing. MetaDatabaseWriter depends on, and is linked at instantiation to, a MetaEventFitter subclass instance that serves as its source of nanopore data. Creating and using one of these plugins therefore requires that you first instantiate an eventfitter.

Poriscope already ships with a subclass of MetaDatabaseWriter that writes data to an sqlite3 database. While additional subclasses can write to almost any format you desire, we strongly encourage standardization around this format, so think twice before creating additional subclasses of this base class. Writing a MetaDatabaseWriter subclass alone is not sufficient: you will also need a paired MetaDatabaseLoader subclass to read back the data you write in any other format and use it for downstream analysis.

Warning

We strongly encourage standardization on the SQLiteDBWriter subclass, so please think carefully before creating other formats. If you do, your database must be queryable with standard SQL and must be able to implement everything required by MetaDatabaseLoader in order to be compatible with the poriscope loaders and workflows. You will also need to create an associated database loader.

Public Methods¶

Abstract Methods¶

These methods must be implemented by subclasses.

abstractmethod MetaDatabaseWriter.close_resources(channel: int | None = None) → None¶

Parameters: channel (Optional[int]) – channel ID

Purpose: Clean up any open file handles or memory.

This is called during app exit or plugin deletion to ensure proper cleanup of resources that could otherwise leak. For writers, it is also called with a specific channel identifier at the end of any batch write operation (a call to write_events()), and so can be used to ensure atomic write operations where possible. If no channel is specified, clean up all channels; otherwise, limit cleanup to the specified channel. Files should be flushed and closed here if that is not already done in your writing step. If no such operation is needed, it suffices to pass.
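As a rough illustration of the per-channel versus all-channel behavior described above, the sketch below keeps one sqlite3 connection per channel in a dict. The ConnectionPool class and its open() helper are hypothetical stand-ins written for this example and are not part of poriscope; a real subclass would implement close_resources() directly.

```python
import sqlite3
from typing import Dict, Optional


class ConnectionPool:
    """Hypothetical stand-in for the per-channel handles a writer might hold."""

    def __init__(self) -> None:
        self.connections: Dict[int, sqlite3.Connection] = {}

    def open(self, channel: int) -> sqlite3.Connection:
        # An in-memory database keeps the sketch self-contained
        conn = sqlite3.connect(":memory:")
        self.connections[channel] = conn
        return conn

    def close_resources(self, channel: Optional[int] = None) -> None:
        # No channel given: close everything; otherwise only the one requested
        targets = list(self.connections) if channel is None else [channel]
        for ch in targets:
            conn = self.connections.pop(ch, None)
            if conn is not None:
                conn.commit()  # flush pending writes before closing
                conn.close()


pool = ConnectionPool()
pool.open(0)
pool.open(1)
pool.close_resources(0)  # e.g. end of a batch write on channel 0
remaining = sorted(pool.connections)
pool.close_resources()  # e.g. app exit: close whatever is left
```

Popping the handle before closing it makes repeated calls harmless, which matters because the method can be called both after a batch write and again at exit.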

abstractmethod MetaDatabaseWriter.reset_channel(channel: int | None = None) → None¶

Parameters: channel (Optional[int]) – channel ID

Purpose: Reset the state of a specific channel for a new operation or run.

This is called any time an operation on a channel needs to be cleaned up or reset for a new run. If channel is not None, handle only that channel; otherwise, reset all of them. Most database writers create permanent state changes in the form of data written to the output file, which should be deleted or otherwise prepared for subsequent overwrite when this function is called.
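One possible way to undo written state is to delete the rows belonging to the channel being reset. The sketch below assumes a single sqlite3 database with an events table keyed by a channel_id column; the table and column names are invented for this example and are not taken from SQLiteDBWriter.

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, channel_id INTEGER)")
conn.executemany("INSERT INTO events (channel_id) VALUES (?)", [(0,), (0,), (1,)])


def reset_channel(conn: sqlite3.Connection, channel: Optional[int] = None) -> None:
    # The connection context manager commits on success and rolls back on error
    with conn:
        if channel is None:
            conn.execute("DELETE FROM events")  # reset every channel
        else:
            conn.execute("DELETE FROM events WHERE channel_id = ?", (channel,))


reset_channel(conn, 0)
rows_after_single = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
reset_channel(conn)
rows_after_all = conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```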

Concrete Methods¶

MetaDatabaseWriter.force_serial_channel_operations() → bool¶

Returns: True if only one channel can run at a time, False otherwise
Return type: bool

Purpose: Indicate whether operations on different channels must be serialized (not run in parallel).

By default, writer plugins are assumed not to be threadsafe and will run in serial mode when called from the poriscope GUI. If you want to change this, you must also ensure that the parent eventfitter object is threadsafe for pulling data from it. You can play it safe by returning the result of self.eventfitter.force_serial_channel_operations().
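The "play it safe" option can be sketched as an override that simply defers to the eventfitter. FakeFitter and ParallelCapableWriter below are hypothetical stand-ins so that the example runs without poriscope installed; only the self.eventfitter attribute and the method name come from the text above.

```python
class FakeFitter:
    """Hypothetical stand-in for a MetaEventFitter subclass instance."""

    def __init__(self, serial: bool) -> None:
        self._serial = serial

    def force_serial_channel_operations(self) -> bool:
        return self._serial


class ParallelCapableWriter:
    """Hypothetical writer whose own I/O is assumed to be threadsafe."""

    def __init__(self, eventfitter: FakeFitter) -> None:
        self.eventfitter = eventfitter

    def force_serial_channel_operations(self) -> bool:
        # The writer itself could run in parallel, so the data source decides
        return self.eventfitter.force_serial_channel_operations()


serial_result = ParallelCapableWriter(FakeFitter(True)).force_serial_channel_operations()
parallel_result = ParallelCapableWriter(FakeFitter(False)).force_serial_channel_operations()
```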

MetaDatabaseWriter.get_empty_settings(globally_available_plugins: Dict[str, List[str]] | None = None, standalone=False) → Dict[str, Dict[str, Any]]¶

Parameters:

globally_available_plugins (Optional[Dict[str, List[str]]]) – a dict containing all data plugins that exist to date, keyed by metaclass. Must include “MetaEventFitter” as a key, with explicitly set Type MetaEventFitter.
standalone (bool) – False if this is called as part of a GUI, True otherwise. Default False

Returns:

the dict that must be filled in to initialize the writer

Return type:

Dict[str, Dict[str, Any]]

Purpose: Provide the settings details users need in order to instantiate your MetaDatabaseWriter subclass.

Get a dict populated with the keys needed to initialize the writer if they are not set yet. This dict must have the following structure. Min, Max, and Options can be skipped or explicitly set to None if they are not used. Value and Type are required. All values provided must be consistent with Type.

settings = {'Parameter 1': {'Type': <int, float, str, bool>,
                            'Value': <value> or None,
                            'Options': [<option_1>, <option_2>, ... ] or None,
                            'Min': <min_value> or None,
                            'Max': <max_value> or None
                            },
            ...
            }

This function must return a dictionary of the settings required to initialize the writer, in the format specified above. Values in this dictionary can be accessed downstream through self.settings. The structure is a nested dictionary that supplies both values and a variety of information about those values, which poriscope uses to perform sanity and consistency checking at instantiation.

This function is technically not abstract in MetaDatabaseWriter, which already has an implementation that ensures that settings will have the required MetaEventFitter key and Output File key available to users. In most cases, however, you will need to override it to add any other settings required by your subclass. If you need additional settings, which you almost certainly do, you MUST call super().get_empty_settings(globally_available_plugins, standalone) before any additional code that you add. For example, your implementation could look like this:
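To make the consistency rules concrete, the sketch below checks a settings dict of the shape shown above against its own Type, Options, Min and Max entries. It is an independent illustration only: check_settings() is a hypothetical helper and does not reproduce the checks that poriscope itself performs at instantiation.

```python
from typing import Any, Dict, List


def check_settings(settings: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return a list of human-readable problems; empty if none were found."""
    problems: List[str] = []
    for name, spec in settings.items():
        expected = spec.get("Type")
        value = spec.get("Value")
        if expected is None:
            problems.append(f"{name}: Type is required")
            continue
        if value is None:
            continue  # not filled in yet, nothing to compare
        if not isinstance(value, expected):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
            continue
        options = spec.get("Options")
        if options is not None and value not in options:
            problems.append(f"{name}: {value!r} is not one of {options}")
        low, high = spec.get("Min"), spec.get("Max")
        if low is not None and value < low:
            problems.append(f"{name}: {value} is below Min {low}")
        if high is not None and value > high:
            problems.append(f"{name}: {value} is above Max {high}")
    return problems


example = {
    "Voltage": {"Type": float, "Value": 200.0, "Min": None, "Max": None},
    "Membrane Thickness": {"Type": float, "Value": -1.0, "Min": 0},
    "Experiment Name": {"Type": str, "Value": 5},
}
problems = check_settings(example)
```

Here the negative membrane thickness and the integer experiment name are both reported, while the voltage entry passes because its Min and Max are explicitly None.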

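def get_empty_settings(self, globally_available_plugins=None, standalone=False):
    # Start from the base class settings, which already hold the required keys
    settings = super().get_empty_settings(globally_available_plugins, standalone)
    # Restrict the accepted output file types
    settings["Output File"]["Options"] = [
        "SQLite3 Files (*.sqlite3)",
        "Database Files (*.db)",
        "SQLite Files (*.sqlite)",
    ]
    # Additional settings needed by this subclass
    settings["Experiment Name"] = {"Type": str}
    settings["Voltage"] = {"Type": float, "Units": "mV"}
    settings["Membrane Thickness"] = {"Type": float, "Units": "nm", "Min": 0}
    settings["Conductivity"] = {"Type": float, "Units": "S/m", "Min": 0}
    return settings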

which will ensure that you have the 4 keys specified above, as well as two additional keys, MetaEventFitter and Output File. By default, Output File accepts any file type, which is why the example above sets its Options key.

MetaDatabaseWriter.report_channel_status(channel: int | None = None, init=False) → str¶

Return a string detailing any pertinent information about the status of analysis conducted on a given channel

Parameters:

channel (Optional[int]) – channel ID
init (bool) – is the function being called as part of plugin initialization? Default False

Returns:

the status of the channel as a string

Return type:

str

MetaDatabaseWriter.write_events(channel: int) → Generator[float, bool | None, None]¶

Create a generator that will loop through the events in self.eventfitter for the given channel and call self._write_event() to commit each one to file.

Parameters: channel (int) – the index of the channel to commit
Returns: the progress of the generator, normalized to [0,1]
Return type: Generator[float, Optional[bool], None]
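A generator with this type signature yields progress values and can receive a bool through send(). The sketch below shows one way a caller might drive such a generator. fake_write_events() is a hypothetical stand-in for write_events(), and treating a sent True as an abort request is an assumption made for this example rather than documented behavior.

```python
from typing import Generator, List, Optional


def fake_write_events(n_events: int) -> Generator[float, Optional[bool], None]:
    """Hypothetical stand-in that yields progress after each 'written' event."""
    for i in range(n_events):
        abort = yield (i + 1) / n_events
        if abort:  # assumption: a sent True asks the generator to stop
            return


def drive(gen: Generator[float, Optional[bool], None],
          abort_after: Optional[int] = None) -> List[float]:
    """Run the generator to completion, optionally aborting after N updates."""
    progress: List[float] = []
    try:
        value = next(gen)
        while True:
            progress.append(value)
            abort = abort_after is not None and len(progress) >= abort_after
            value = gen.send(abort)
    except StopIteration:
        pass
    return progress


full = drive(fake_write_events(4))
partial = drive(fake_write_events(4), abort_after=2)
```

A GUI would typically use each yielded value to update a progress bar between calls.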

Private Methods¶

Abstract Methods¶

These methods must be implemented by subclasses.

abstractmethod MetaDatabaseWriter._init() → None¶

Purpose: Perform generic class construction operations.

This is called immediately at the start of instance creation and is used to do whatever is required to set up your writer. Note that no app settings are available when this is called, so this function should be used only for generic class construction operations. Most writers simply pass this function.

abstractmethod MetaDatabaseWriter._initialize_database(channel: int | None = None) → None¶

Parameters: channel (Optional[int]) – int indicating which channel to initialize

Purpose: Initialize your database for writing.

In this function, do whatever you need to do in order to prepare your database for writing data to it. This is called at the start of a batch write operation, with an optional channel argument. In the case of a single database file, you can ignore channel and simply create the file and database schema. In the case of a single file per channel, you might open a file handle associated with each channel and write any top-level metadata required. We strongly encourage atomic operations, so that file handles are closed in the same function in which they are opened wherever possible, to avoid trailing file handles in the event of an unrecoverable exception.
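For the single-database-file case, initialization can be as small as opening the file, creating the schema, and closing the handle again in the same function. The sketch below uses sqlite3 with a deliberately minimal, hypothetical schema; the table and column names are invented for this example and do not describe the layout used by SQLiteDBWriter.

```python
import os
import sqlite3
import tempfile
from contextlib import closing

# Hypothetical minimal schema: channels and events both point back to experiments
SCHEMA = """
CREATE TABLE IF NOT EXISTS experiments (
    id INTEGER PRIMARY KEY,
    name TEXT
);
CREATE TABLE IF NOT EXISTS channels (
    id INTEGER PRIMARY KEY,
    experiment_id INTEGER REFERENCES experiments(id),
    channel_id INTEGER,
    samplerate REAL
);
CREATE TABLE IF NOT EXISTS events (
    id INTEGER PRIMARY KEY,
    experiment_id INTEGER REFERENCES experiments(id),
    channel_id INTEGER,
    event_id INTEGER
);
"""


def initialize_database(path: str) -> None:
    # closing() guarantees the handle is released even if schema creation fails
    with closing(sqlite3.connect(path)) as conn:
        conn.executescript(SCHEMA)
        conn.commit()


path = os.path.join(tempfile.mkdtemp(), "example.sqlite3")
initialize_database(path)
initialize_database(path)  # safe to call again thanks to IF NOT EXISTS

with closing(sqlite3.connect(path)) as conn:
    tables = sorted(
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    )
```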

abstractmethod MetaDatabaseWriter._validate_settings(settings: dict) → None¶

Validate that the settings dict contains the correct information for use by the subclass.

Parameters: settings (dict) – Settings for the writer.
Raises: ValueError – If the settings dict does not contain the correct information.
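A minimal sketch of such a check is shown below. It assumes the settings dict has the nested Type/Value structure returned by get_empty_settings(), and the required keys are example names only; both are assumptions made for illustration.

```python
def validate_settings(settings: dict) -> None:
    # Example keys only; a real subclass would list its own required settings
    required = ("Output File", "Experiment Name")
    missing = [
        key for key in required
        if settings.get(key, {}).get("Value") in (None, "")
    ]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")


good = {
    "Output File": {"Type": str, "Value": "out.sqlite3"},
    "Experiment Name": {"Type": str, "Value": "demo"},
}
validate_settings(good)  # passes silently

try:
    validate_settings({"Output File": {"Type": str, "Value": None}})
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```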

abstractmethod MetaDatabaseWriter._write_channel_metadata(channel: int) → None¶

Parameters: channel (int) – int indicating which channel to write metadata for

Purpose: Write any information you need to save about the channel.

Given a channel, write any channel level information (for example, as provided by the user in the settings dict, or the associated samplerate) to the database files you created in _initialize_database(). The channels table in your database should share a key with experiments, or have some other way of cross-referencing channels to experiments, in cases where experiments can have many channels.
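The cross-referencing requirement can be met with a foreign key from channels to experiments. The sketch below uses sqlite3 with hypothetical table and column names and example values; it only illustrates the relationship and is not the schema used by SQLiteDBWriter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE experiments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE channels (
    id INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiments(id),
    channel_id INTEGER NOT NULL,
    samplerate REAL,
    voltage REAL
);
""")


def write_channel_metadata(conn, experiment_id, channel, samplerate, voltage):
    # One row per channel, linked to its parent experiment
    with conn:
        conn.execute(
            "INSERT INTO channels (experiment_id, channel_id, samplerate, voltage) "
            "VALUES (?, ?, ?, ?)",
            (experiment_id, channel, samplerate, voltage),
        )


with conn:
    experiment_id = conn.execute(
        "INSERT INTO experiments (name) VALUES (?)", ("demo",)
    ).lastrowid

for ch in (0, 1):
    write_channel_metadata(conn, experiment_id, ch, samplerate=250000.0, voltage=200.0)

rows = conn.execute(
    "SELECT e.name, c.channel_id FROM channels c "
    "JOIN experiments e ON e.id = c.experiment_id ORDER BY c.channel_id"
).fetchall()
```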

abstractmethod MetaDatabaseWriter._write_event(channel: int, event_metadata: Dict[str, int | float | str | bool], sublevel_metadata: Dict[str, List[int | float | str | bool]], event_data: ndarray[tuple[int, ...], dtype[float64]], raw_data: ndarray[tuple[int, ...], dtype[float64]], fit_data: ndarray[tuple[int, ...], dtype[float64]], abort: bool | None = False, last_call: bool | None = False) → bool¶

Parameters:

channel (int) – identifier for the channel to write events from
event_metadata (Dict[str, Union[int, float, str, bool]]) – a dict of metadata associated with the event
sublevel_metadata (Dict[str, List[Union[int, float, str, bool]]]) – a dict of lists of metadata associated with sublevels within the event. You can assume they all have the same length.
event_data (npt.NDArray[np.float64]) – the raw data for the event (not filtered)
raw_data (npt.NDArray[np.float64]) – A numpy array of raw event data to be stored as binary in the database.
fit_data (npt.NDArray[np.float64]) – A numpy array of fitted event data to be stored as binary in the database.
abort (Optional[bool]) – True if an abort request was issued in the caller, perform cleanup as needed
last_call (Optional[bool]) – True if this is the last time the function will be called, commit to file and clean up as needed

Returns:

True on successful write, False on failure or ignore

Return type:

bool

Purpose: Write a single event worth of data and metadata to the database.

Given all of the event information above, write whatever subset you want to save to the database for both event metadata and sublevel metadata for each event. We strongly encourage atomic operations, but given event volume, you might consider committing or flushing events only every few hundred events, or opening a file handle for writing the first time this is called and using the open handle for subsequent writes. Ensure that the events table has a reference to the channels and experiments tables, and that the sublevels table has a way to reference both of those and the events table for the parent event.
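The batching idea can be sketched as follows: rows are inserted on every call but committed only every batch_size events or on the final call. BatchedEventWriter, its schema, and the metadata keys duration and blockage are hypothetical, and rolling back uncommitted rows on abort is one possible policy rather than required behavior. Array data is left out to keep the example free of third-party dependencies.

```python
import sqlite3
from typing import Dict, List, Union

Scalar = Union[int, float, str, bool]


class BatchedEventWriter:
    """Hypothetical helper, not part of poriscope."""

    def __init__(self, conn: sqlite3.Connection, batch_size: int = 200) -> None:
        self.conn = conn
        self.batch_size = batch_size
        self.pending = 0  # events inserted but not yet committed
        conn.execute(
            "CREATE TABLE events "
            "(id INTEGER PRIMARY KEY, channel_id INTEGER, duration REAL)"
        )
        conn.execute(
            "CREATE TABLE sublevels (id INTEGER PRIMARY KEY, "
            "event_id INTEGER REFERENCES events(id), channel_id INTEGER, blockage REAL)"
        )
        conn.commit()

    def write_event(self, channel: int,
                    event_metadata: Dict[str, Scalar],
                    sublevel_metadata: Dict[str, List[Scalar]],
                    abort: bool = False, last_call: bool = False) -> bool:
        if abort:
            self.conn.rollback()  # discard anything not yet committed
            self.pending = 0
            return False
        cur = self.conn.execute(
            "INSERT INTO events (channel_id, duration) VALUES (?, ?)",
            (channel, event_metadata["duration"]),
        )
        event_id = cur.lastrowid
        # Each sublevel row references its parent event and channel
        self.conn.executemany(
            "INSERT INTO sublevels (event_id, channel_id, blockage) VALUES (?, ?, ?)",
            [(event_id, channel, b) for b in sublevel_metadata["blockage"]],
        )
        self.pending += 1
        if last_call or self.pending >= self.batch_size:
            self.conn.commit()
            self.pending = 0
        return True


writer = BatchedEventWriter(sqlite3.connect(":memory:"), batch_size=2)
for i in range(3):
    writer.write_event(0, {"duration": float(i)}, {"blockage": [0.1, 0.2]},
                       last_call=(i == 2))
n_events = writer.conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
n_sublevels = writer.conn.execute("SELECT COUNT(*) FROM sublevels").fetchone()[0]
```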

abstractmethod MetaDatabaseWriter._write_experiment_metadata(channel: int | None = None) → None¶

Parameters: channel (Optional[int]) – int indicating which channel the experiment metadata is written for

Purpose: Write any information you need to save about the experiment itself.

Given an optional channel argument, write any experiment level information (for example, as provided by the user in the settings dict) to the database files you created in _initialize_database().

Concrete Methods¶

MetaDatabaseWriter.__init__(settings: dict | None = None)¶: Initialize and set up output environment, save metadata for subclasses.

MetaDatabaseWriter._finalize_initialization() → None¶: If additional initialization operations are required beyond the defaults provided in BaseDataPlugin or MetaDatabaseWriter that must occur after settings have been applied to the writer instance, you can override this function to add those operations, subject to the caveat below.

Warning

This function implements core functionality required for broader plugin integration into Poriscope. If you do need to override it, you MUST call super()._finalize_initialization() before any additional code that you add, and take care to understand the implementation of both apply_settings() and _finalize_initialization() before doing so to ensure that you are not conflicting with those functions.
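The required call order can be illustrated with a toy class hierarchy. BaseWriter and MyWriter below are hypothetical stand-ins used only to show super()._finalize_initialization() running before subclass code; they do not reflect what the real base implementation does.

```python
from typing import List


class BaseWriter:
    """Hypothetical stand-in for the base class implementation."""

    def __init__(self) -> None:
        self.steps: List[str] = []

    def _finalize_initialization(self) -> None:
        self.steps.append("core setup")


class MyWriter(BaseWriter):
    def _finalize_initialization(self) -> None:
        # Always run the base implementation first
        super()._finalize_initialization()
        # Then add subclass-specific work that depends on applied settings
        self.steps.append("subclass setup")


writer = MyWriter()
writer._finalize_initialization()
steps = writer.steps
```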

MetaDatabaseWriter._validate_param_types(settings: dict) → None¶

Validate that the settings dict contains correct data types.

Parameters: settings (dict) – A dict specifying the parameters of the writer to be created. Required keys depend on subclass.
Raises: TypeError – If the settings parameters are of the wrong type.