PeakFinder¶

class PeakFinder(settings: Optional[dict] = None)

Abstract base class to analyze and flag the start and end times of regions of interest in a timeseries for further analysis.

Public Methods¶

PeakFinder.close_resources(channel=None)¶: Perform any actions necessary to gracefully close resources before app exit

PeakFinder.construct_fitted_event(channel, index)¶

Construct an array of data corresponding to the peaks for the specified event

Parameters:

channel (int) – analyze only events from this channel
index (int) – the index of the target event

Returns:

numpy array of peaked data for the event, or None

Return type:

Optional[npt.NDArray[np.float64]]

Raises:

RuntimeError – if peakfinding is not complete yet

PeakFinder.enumerate_peaks(sublevel_starts, num_states)¶

Assign unique peak IDs to sublevels labeled as ‘peak’.

Parameters:

sublevel_starts (list[dict]) – List of dictionaries describing sublevels, each with a ‘type’ key.
num_states (int) – Total number of sublevels to process.

Returns:

List of peak IDs or None for non-peak sublevels.

Return type:

list[Optional[int]]
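
The ID assignment described above can be sketched as a standalone helper. This is only an illustration: the name enumerate_peaks_sketch is hypothetical, and it assumes that IDs are consecutive integers starting at 0 and that only the first num_states entries are inspected. Neither detail is confirmed by this reference.

```python
from typing import Optional


def enumerate_peaks_sketch(
    sublevel_starts: list[dict], num_states: int
) -> list[Optional[int]]:
    """Give each 'peak' sublevel a running ID; other sublevels get None."""
    peak_ids: list[Optional[int]] = []
    next_id = 0  # assumption: IDs start at 0 and increase by 1
    for sublevel in sublevel_starts[:num_states]:
        if sublevel.get("type") == "peak":
            peak_ids.append(next_id)
            next_id += 1
        else:
            peak_ids.append(None)
    return peak_ids


sublevels = [
    {"type": "baseline"},
    {"type": "peak"},
    {"type": "level"},
    {"type": "peak"},
]
print(enumerate_peaks_sketch(sublevels, len(sublevels)))  # [None, 0, None, 1]
```

The output list has one entry per inspected sublevel, so it can be stored alongside other per-sublevel metadata.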

PeakFinder.filter_peaks(peaks, properties, unfolded_level, baseline_std, baseline, samplerate)¶

Filter peaks based on their level and proximity, classifying potential bundles or barcode features.

- Type 1: Peaks on the same DNA carrier level (both bases around unfolded_level).
- Type 2: Peaks higher than the carrier level (both bases above unfolded_level).
- Type 3: Clusters (bundles) of close peaks with the same type (1 or 2).
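
One way to picture the three peak types is the rough sketch below. It is not the plugin's implementation. The helper names, the n_std tolerance, and the max_gap_s closeness criterion are all hypothetical, and the sketch assumes that levels are expressed so that a larger value means "higher than the carrier level".

```python
from typing import Optional


def classify_peak(
    left_base: float,
    right_base: float,
    unfolded_level: float,
    baseline_std: float,
    n_std: float = 3.0,  # hypothetical tolerance, in units of baseline_std
) -> Optional[int]:
    """Return 1, 2, or None for a single peak based on its two base values."""
    tol = n_std * baseline_std
    if abs(left_base - unfolded_level) <= tol and abs(right_base - unfolded_level) <= tol:
        return 1  # both bases sit on the carrier level
    if left_base > unfolded_level + tol and right_base > unfolded_level + tol:
        return 2  # both bases sit above the carrier level
    return None  # mixed bases are left unclassified in this sketch


def find_bundles(
    peak_indices: list[int],
    peak_types: list[Optional[int]],
    samplerate: float,
    max_gap_s: float,  # hypothetical closeness criterion, in seconds
) -> list[list[int]]:
    """Group consecutive, closely spaced peaks of the same type (Type 3)."""
    max_gap = max_gap_s * samplerate  # convert the allowed gap to samples
    bundles: list[list[int]] = []
    current: list[int] = []
    for i, (idx, kind) in enumerate(zip(peak_indices, peak_types)):
        if (
            current
            and kind is not None
            and kind == peak_types[current[-1]]
            and idx - peak_indices[current[-1]] <= max_gap
        ):
            current.append(i)  # same type and close enough: extend the bundle
        else:
            if len(current) > 1:
                bundles.append(current)
            current = [i] if kind is not None else []
    if len(current) > 1:
        bundles.append(current)
    return bundles
```

Each bundle is a list of positions into peak_indices, so the caller can map a cluster back to the individual peaks it contains.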

PeakFinder.find_unfolded_blockage_level(data, max_unfolded, baseline_mean, baseline_std)¶

Estimate the level of unfolded blockage based on data distribution.

Parameters:

data (numpy.ndarray) – Array of current values or similar signal to analyze.
max_unfolded (float) – Maximum allowed distance from the baseline to consider as unfolded.
baseline_mean (float) – Mean value of the baseline level.
baseline_std (float) – Standard deviation of the baseline level.

Returns:

Estimated unfolded blockage level.

Return type:

float
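
The reference does not say how the estimate is computed. As a hedged illustration of "based on data distribution", the sketch below takes the most populated histogram bin among samples that are clearly off the baseline but no further than max_unfolded from it. The function name, the n_std cutoff, and bin_width are assumptions made for this example only.

```python
from collections import Counter
from typing import Sequence


def estimate_unfolded_level(
    data: Sequence[float],
    max_unfolded: float,
    baseline_mean: float,
    baseline_std: float,
    n_std: float = 3.0,      # hypothetical noise cutoff
    bin_width: float = 0.5,  # hypothetical histogram resolution, same units as data
) -> float:
    """Return the centre of the most populated bin inside the unfolded window."""
    # Keep samples that are off the baseline but within max_unfolded of it
    candidates = [
        x for x in data
        if n_std * baseline_std < abs(x - baseline_mean) <= max_unfolded
    ]
    if not candidates:
        raise ValueError("no samples fall in the unfolded blockage window")
    # Histogram the candidates by rounding each one to its nearest bin centre
    counts = Counter(round(x / bin_width) for x in candidates)
    most_common_bin, _ = counts.most_common(1)[0]
    return most_common_bin * bin_width


trace = [100.0] * 5 + [80.0] * 10 + [60.0] * 3 + [100.2]
level = estimate_unfolded_level(trace, max_unfolded=30.0, baseline_mean=100.0, baseline_std=1.0)
```

In this toy trace the samples at 60.0 lie more than max_unfolded from the baseline and are ignored, so the estimate settles on the 80.0 level.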

PeakFinder.get_empty_settings(globally_available_plugins=None, standalone=False)¶

Parameters:

globally_available_plugins (Optional[ Mapping[str, List[str]]]) – a dict containing all data plugins that exist to date, keyed by metaclass. Must include “MetaReader” as a key, with explicitly set Type MetaReader.
standalone (bool) – False if this is called as part of a GUI, True otherwise. Default False

Returns:

the dict that must be filled in to initialize the filter

Return type:

Mapping[str, Mapping[str, Union[int, float, str, list[Union[int, float, str, None]], None]]]

Purpose: Provide a list of settings details to users to assist in instantiating an instance of your MetaEventFinder subclass.

Get a dict populated with keys needed to initialize the filter if they are not set yet. This dict must have the following structure, but Min, Max, and Options can be skipped or explicitly set to None if they are not used. Value and Type are required. All values provided must be consistent with Type.

Your Eventfinder MUST include at least the “MetaReader” key, which can be ensured by calling super().get_empty_settings(globally_available_plugins, standalone) before adding any additional settings keys.

This function must return a dictionary of the settings required to initialize the filter, in the specified format. Values in this dictionary can be accessed downstream through the self.settings class variable. The structure is a nested dictionary that supplies both the values and a variety of information about them, which poriscope uses for sanity and consistency checking at instantiation.

Although this function is technically not abstract in MetaEventFinder, whose implementation already ensures that the required MetaReader key is available to users, in most cases you will need to override it to add the other settings your subclass requires. If you need additional settings, which you almost certainly do, you MUST call super().get_empty_settings(globally_available_plugins, standalone) before any code that you add. For example, your implementation could look like this:

settings = super().get_empty_settings(globally_available_plugins, standalone)
settings["Threshold"] = {
    "Type": float,
    "Value": None,
    "Min": 0.0,
    "Units": "pA",
}
settings["Min Duration"] = {
    "Type": float,
    "Value": 0.0,
    "Min": 0.0,
    "Units": "us",
}
settings["Max Duration"] = {
    "Type": float,
    "Value": 1000000.0,
    "Min": 0.0,
    "Units": "us",
}
settings["Min Separation"] = {
    "Type": float,
    "Value": 0.0,
    "Min": 0.0,
    "Units": "us",
}
return settings

which ensures that you have the four keys specified above, as well as an additional key, "MetaReader", as required by eventfinders. For categorical settings, you can also supply the “Options” key in the second-level dictionaries.
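
To make the "consistent with Type" rule and the “Options” key more concrete, here is a small sketch of a categorical entry together with a checker that mimics the kind of consistency test described above. The checker is not poriscope's own validation code. The "Peak Polarity" setting and the function name check_settings_entry are hypothetical, and treating a Value of None as "not filled in yet" is an assumption.

```python
from typing import Any, Mapping


def check_settings_entry(name: str, entry: Mapping[str, Any]) -> None:
    """Raise ValueError if a settings entry breaks the rules described above."""
    if "Type" not in entry or "Value" not in entry:
        raise ValueError(f"{name}: 'Type' and 'Value' are required")
    value, expected = entry["Value"], entry["Type"]
    if value is None:
        return  # assumption: None means the user has not filled it in yet
    if not isinstance(value, expected):
        raise ValueError(f"{name}: {value!r} is not of type {expected.__name__}")
    low, high = entry.get("Min"), entry.get("Max")
    if low is not None and value < low:
        raise ValueError(f"{name}: {value!r} is below Min {low!r}")
    if high is not None and value > high:
        raise ValueError(f"{name}: {value!r} is above Max {high!r}")
    options = entry.get("Options")
    if options is not None and value not in options:
        raise ValueError(f"{name}: {value!r} is not one of {options!r}")


settings = {
    # Categorical setting: "Options" lists the allowed values
    "Peak Polarity": {
        "Type": str,
        "Value": "negative",
        "Options": ["negative", "positive"],
    },
    # Numeric setting: Min, Max and Options may be omitted or set to None
    "Threshold": {"Type": float, "Value": None, "Min": 0.0, "Units": "pA"},
}

for key, entry in settings.items():
    check_settings_entry(key, entry)
```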

Get a list of horizontal and vertical lines and associated labels to overlay on the graph generated by construct_fitted_event()

Parameters:

channel (int) – analyze only events from this channel
index (int) – the index of the target event

Returns:

a list of x locations at which to plot vertical lines, a list of y locations at which to plot horizontal lines, labels for the vertical lines, and labels for the horizontal lines. Must be lists of equal length, or None

Return type:

Tuple[Optional[List[float]], Optional[List[float]], Optional[List[str]], Optional[List[str]]]

Raises:

RuntimeError – if fitting is not complete yet
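
The four-part return value described above can be assembled as in the sketch below. The helper name build_plot_features and its inputs are hypothetical. The sketch also assumes that "equal length" means each label list matches the length of its corresponding list of line positions.

```python
from typing import List, Optional, Tuple

PlotFeatures = Tuple[
    Optional[List[float]],  # x locations of vertical lines
    Optional[List[float]],  # y locations of horizontal lines
    Optional[List[str]],    # labels for the vertical lines
    Optional[List[str]],    # labels for the horizontal lines
]


def build_plot_features(
    peak_times: List[float], baseline: float, unfolded_level: float
) -> PlotFeatures:
    """Mark each peak with a vertical line and each level with a horizontal line."""
    vertical = list(peak_times)
    vertical_labels = [f"peak {i}" for i in range(len(vertical))]
    horizontal = [baseline, unfolded_level]
    horizontal_labels = ["baseline", "unfolded level"]
    return vertical, horizontal, vertical_labels, horizontal_labels


v_lines, h_lines, v_labels, h_labels = build_plot_features([10.0, 25.0], 100.0, 80.0)
```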

Private Methods¶

PeakFinder._define_event_metadata_types()¶

Build a dict of metadata along with associated datatypes for use by the database writer downstream. Keys must match columns defined in _populate_event_metadata(). All of this metadata must be populated during fitting. Options for dtypes are int, float, str, bool.

Returns: a dict of metadata keys and associated base dtypes
Return type: Mapping[str, Union[int, float, str, bool]]

PeakFinder._define_event_metadata_units()¶

Returns: a dict of metadata keys and associated units
Return type: Mapping[str, Union[int, float, str, bool]]

PeakFinder._define_sublevel_metadata_types()¶

Build a dict of sublevel metadata along with associated datatypes for use by the database writer downstream. Keys must match columns defined in _populate_sublevel_metadata(). All of this metadata must be populated during fitting. Options for dtypes are int, float, str, bool. Note that this is the type of the entries in the associated list; it should not include the list itself.

Returns: a dict of metadata keys and associated base dtypes
Return type: Mapping[str, Union[int, float, str, bool]]

PeakFinder._define_sublevel_metadata_units()¶

Build a dict of sublevel metadata units, or None if unitless. Keys must match columns defined in _populate_sublevel_metadata(). All of this metadata must be populated during fitting. Each unit describes the entries of the associated list, not the list itself.

Returns: a dict of metadata keys and associated units
Return type: Mapping[str, Optional[str]]
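
The type and unit definitions are easiest to keep in sync when they are written side by side. The sketch below shows one hypothetical pair for event metadata. The column names are invented for illustration, and using Optional[str] for the units mirrors the sublevel variant documented above rather than anything stated for the event variant.

```python
from typing import Dict, Optional


def define_event_metadata_types() -> Dict[str, type]:
    # Hypothetical columns; dtypes are limited to int, float, str, bool
    return {"duration": float, "num_peaks": int, "max_blockage": float}


def define_event_metadata_units() -> Dict[str, Optional[str]]:
    # Same keys as the types dict; None marks a unitless column
    return {"duration": "us", "num_peaks": None, "max_blockage": "pA"}


# Both dicts should describe exactly the same set of columns
assert define_event_metadata_types().keys() == define_event_metadata_units().keys()
```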

PeakFinder._init() → None¶: called at the start of base class initialization

PeakFinder._locate_sublevel_transitions(data, samplerate, padding_before, padding_after, baseline_mean, baseline_std)¶

Get a list of indices corresponding to the starting point of all sublevels within an event. The list will be prepended with 0 if 0 is not the first entry. The plugin must gracefully handle the case where any of the arguments except data are None, as not all event loaders are guaranteed to return these values. Raising an exception is an acceptable way to handle this.

Parameters:

data (npt.NDArray[np.float64]) – an array of data from which to extract the locations of sublevel transitions
samplerate (float) – the sampling rate
padding_before (Optional[int]) – the number of data points before the estimated start of the event in the chunk
padding_after (Optional[int]) – the number of data points after the estimated end of the event in the chunk
baseline_mean (Optional[float]) – the local mean value of the baseline current
baseline_std (Optional[float]) – the local standard deviation of the baseline current

Returns:

a list of integers corresponding to sublevel transitions

Return type:

List[int]

Raises:

ValueError – if the event is rejected. Note that ValueError will skip and reject the event but will not stop processing of the rest of the dataset
AttributeError – if the fitting method cannot operate without provision of specific padding and baseline metadata and cannot rescue itself. This will cause a stop to processing of the dataset.
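
The contract above (handle None arguments, reject events with ValueError, stop with AttributeError) can be illustrated with a deliberately simple threshold detector. This is a sketch, not the plugin's algorithm. The function name and the n_std threshold are assumptions, and the unused arguments are kept only to mirror the documented signature.

```python
from typing import List, Optional, Sequence


def locate_transitions_sketch(
    data: Sequence[float],
    samplerate: float,
    padding_before: Optional[int],
    padding_after: Optional[int],
    baseline_mean: Optional[float],
    baseline_std: Optional[float],
    n_std: float = 5.0,  # hypothetical detection threshold
) -> List[int]:
    """Return the indices where the signal switches between open and blocked."""
    if baseline_mean is None or baseline_std is None:
        # Cannot recover without baseline statistics: stops dataset processing
        raise AttributeError("baseline_mean and baseline_std are required")
    threshold = n_std * baseline_std
    # Mark each sample as blocked (True) or open (False)
    blocked = [abs(x - baseline_mean) > threshold for x in data]
    # A transition is any index whose state differs from the previous sample
    transitions = [i for i in range(1, len(blocked)) if blocked[i] != blocked[i - 1]]
    if not transitions:
        # Rejects this event only; the rest of the dataset is still processed
        raise ValueError("no sublevel transitions found; event rejected")
    if transitions[0] != 0:
        transitions.insert(0, 0)  # the first sublevel starts at index 0
    return transitions


trace = [100.0] * 3 + [80.0] * 4 + [100.0] * 3
starts = locate_transitions_sketch(trace, 1e6, None, None, 100.0, 1.0)
```

For the toy trace, the result marks the start of the leading baseline, the blockage, and the trailing baseline.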

PeakFinder._populate_event_metadata(data, samplerate, baseline_mean, baseline_std, sublevel_metadata)¶

Assemble a list of metadata to save in the event database later. Note that keys ‘start_time_s’ and ‘index’ are already handled in the base class and should not be touched here.

Parameters:

data (npt.NDArray[np.float64]) – an array of data from which to extract the locations of sublevel transitions
samplerate (float) – the sampling rate
baseline_mean (Optional[float]) – the local mean value of the baseline current
baseline_std (Optional[float]) – the local standard deviation of the baseline current
sublevel_metadata (Mapping[str, List[Numeric]]) – the dict of sublevel metadata built by self._populate_sublevel_metadata()

Returns:

a dict of event metadata values

Return type:

Mapping[str, float]

PeakFinder._populate_sublevel_metadata(data, samplerate, baseline_mean, baseline_std, sublevel_starts)¶

Build a dict of lists of sublevel metadata with whatever arbitrary keys you want to consider in your event fitter. Every list must have exactly the same length as the sublevel_starts list. Note that ‘index’ is already handled in the base class

Parameters:

data (npt.NDArray[np.float64]) – an array of data from which to extract the locations of sublevel transitions
samplerate (float) – the sampling rate
baseline_mean (Optional[float]) – the local mean value of the baseline current
baseline_std (Optional[float]) – the local standard deviation of the baseline current
sublevel_starts (List[Dict[str, Any]]) – the list of sublevel start indices located in self._locate_sublevel_transitions()

Returns:

a dict of lists of sublevel metadata values, one list entry per sublevel for each piece of metadata

Return type:

Mapping[str, npt.NDArray[Numeric]]
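
The requirement that every list has the same length as sublevel_starts is shown in the sketch below. It assumes each sublevel dict carries a hypothetical "start" key (a sample index) in addition to the ‘type’ key mentioned for enumerate_peaks(). The metadata keys and the function name are likewise invented for illustration, and plain Python lists stand in for numpy arrays.

```python
from statistics import fmean
from typing import Any, Dict, List, Sequence


def sublevel_metadata_sketch(
    data: Sequence[float],
    samplerate: float,
    sublevel_starts: List[Dict[str, Any]],
) -> Dict[str, list]:
    """Build one metadata list per key, with one entry per sublevel."""
    starts = [s["start"] for s in sublevel_starts]  # "start" is a hypothetical key
    ends = starts[1:] + [len(data)]  # each sublevel ends where the next begins
    metadata = {
        "sublevel_current": [fmean(data[a:b]) for a, b in zip(starts, ends)],
        # Duration in microseconds, from the sample count and the sampling rate
        "sublevel_duration": [(b - a) * 1e6 / samplerate for a, b in zip(starts, ends)],
        "sublevel_type": [s["type"] for s in sublevel_starts],
    }
    # Every list must have exactly one entry per sublevel
    assert all(len(v) == len(sublevel_starts) for v in metadata.values())
    return metadata


trace = [100.0] * 3 + [80.0] * 4 + [100.0] * 3
sublevels = [
    {"start": 0, "type": "baseline"},
    {"start": 3, "type": "peak"},
    {"start": 7, "type": "baseline"},
]
meta = sublevel_metadata_sketch(trace, 1e6, sublevels)
```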

PeakFinder._post_process_events(channel: int) → None¶

Parameters: channel (int) – the index of the channel to postprocess

PeakFinder._pre_process_events(channel: int) → None¶

Parameters: channel (int) – the channel to preprocess

PeakFinder._validate_settings(settings: dict) → None¶

Validate that the settings dict contains the correct information for use by the subclass.

Parameters: settings (dict) – Parameters for event detection.
Raises: ValueError – If the settings dict does not contain the correct information.