In the NTD Nanoscope experiments [2], the molecular dynamics of a (single) captured non‐translocating transducer molecule provide a unique stochastic reference signal with stable statistics on the observed, single‐molecule blockaded channel current, somewhat analogous to a carrier signal in standard electrical engineering signal analysis. Discernible changes in blockade statistics, coupled with SSA signal processing protocols, enable a highly detailed characterization of the interactions of the transducer molecule with binding targets (cognates) in the surrounding (extra‐channel) environment.
The transducer molecule is engineered to generate distinct channel blockade signals depending on its interaction with target molecules [2]. Statistical models are trained for each binding mode (bound and unbound, for example) by exposing the transducer molecule to zero or high (excess) concentrations of the target molecule. The transducer molecule is engineered so that these different binding states generate distinct signals with high resolution. Once the signals are characterized, the information can be used in a real‐time setting to determine whether trace amounts of the target are present in a sample, through a serial process of high‐frequency sampling and pattern recognition.
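The train-then-detect idea above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each binding state is summarized by a single Gaussian over normalized blockade current; the levels (0.40 and 0.55), the noise width, and the function names are hypothetical, and the statistical models actually used in the Nanoscope work are HMM-based rather than this simple stand-in.

```python
import math
import random
from statistics import fmean, pstdev

def fit_gaussian(samples):
    """Fit a one-dimensional Gaussian, returned as (mean, std)."""
    return fmean(samples), pstdev(samples)

def log_likelihood(samples, model):
    """Sum of Gaussian log-densities of the samples under model=(mean, std)."""
    mu, sigma = model
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (x - mu) ** 2 / (2 * sigma ** 2) for x in samples)

def classify(samples, unbound_model, bound_model):
    """Label a blockade segment by whichever trained model explains it better."""
    llr = (log_likelihood(samples, bound_model)
           - log_likelihood(samples, unbound_model))
    return ("bound" if llr > 0 else "unbound"), llr

rng = random.Random(0)
# Hypothetical calibration runs: zero target vs. excess target concentration.
unbound_train = [rng.gauss(0.40, 0.05) for _ in range(2000)]
bound_train = [rng.gauss(0.55, 0.05) for _ in range(2000)]
unbound_model = fit_gaussian(unbound_train)
bound_model = fit_gaussian(bound_train)

# A new 200-sample segment, simulated here from the bound state.
segment = [rng.gauss(0.55, 0.05) for _ in range(200)]
label, llr = classify(segment, unbound_model, bound_model)
print(label)
```

In a serial sampling setting, the same `classify` call would be applied to each newly acquired blockade segment in turn.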
Thus, in Nanoscope applications of the SSA Protocol, due to the molecular dynamics of the captured transducer molecule, a unique reference signal with strongly stationary (or weakly, or approximately stationary) signal statistics is engineered to be generated during transducer blockade, analogous to a carrier signal in standard electrical engineering signal analysis. In these applications a signal is deemed “strongly” stationary if the EM/EVA projection (HMM method from Chapter 6) on the entire dataset of interest produces a discrete set of separable (non‐fuzzy domain) states. A signal is deemed “weakly” stationary if the EM/EVA projection can only produce a discrete set of states on subsegments (windowed sections) of the data sequence, but where state‐tracking is possible across windows (i.e. the non‐stationarity is sufficiently slow to track states – similar to the adiabatic criterion in statistical mechanics). A signal is approximately stationary, in a general sense, if it is sufficiently stationary to still benefit, to some extent, from the HMM‐based signal processing tools (that assume stationarity).
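The distinction between these cases can be sketched with a crude proxy. The code below does not perform the EM/EVA projection; as a labeled simplification, it tracks only the mean level in non-overlapping windows and asks whether that level stays fixed over the whole record ("strong"), moves slowly enough to follow from one window to the next ("weak"), or jumps too far to track. The window width, tolerance, and simulated signals are all hypothetical.

```python
import random
from statistics import fmean

def window_means(signal, width):
    """Mean level in each non-overlapping window of the given width."""
    return [fmean(signal[i:i + width])
            for i in range(0, len(signal) - width + 1, width)]

def stationarity_class(signal, width, tol):
    """Crude proxy: compare total drift and window-to-window drift with tol."""
    means = window_means(signal, width)
    spread = max(means) - min(means)
    max_step = max((abs(b - a) for a, b in zip(means, means[1:])),
                   default=0.0)
    if spread <= tol:
        return "strong"        # one level describes the entire record
    if max_step <= tol:
        return "weak"          # level drifts, but slowly enough to track
    return "untrackable"       # level changes too fast between windows

rng = random.Random(1)
n, width, tol = 5000, 500, 0.02
steady = [rng.gauss(0.5, 0.02) for _ in range(n)]
drifting = [rng.gauss(0.5 + 0.1 * t / n, 0.02) for t in range(n)]
jumping = [rng.gauss(0.5 if t < n // 2 else 0.7, 0.02) for t in range(n)]

for name, sig in (("steady", steady), ("drifting", drifting),
                  ("jumping", jumping)):
    print(name, stationarity_class(sig, width, tol))
```

The "weak" branch plays the role of the adiabatic criterion mentioned above: the change per window is small relative to the tolerance even though the cumulative change is not.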
The adaptive SSA ML algorithms for real‐time analysis of the stochastic signal generated by the transducer molecule can easily offer a “lock and key” level of signal discrimination. The heart of the signal processing algorithm is a generalized Hidden Markov Model (gHMM)‐based feature extraction method, implemented on a distributed processing platform for real‐time operation. For real‐time processing, the gHMM is used for feature extraction on stochastic sequential data, while classification and clustering analysis are implemented using an SVM. In addition, the design of the ML‐based algorithms allows for scaling to large datasets, via real‐time distributed processing, and the algorithms are adaptable to analysis of any stochastic sequential dataset. The ML software has also been integrated into the NTD Nanoscope [2] for “real‐time” pattern‐recognition informed (PRI) feedback [1–3] (see Chapter 14 for results). The methods used to implement the PRI feedback include distributed HMM and SVM implementations, which enable the processing speedup that is needed.
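The feature extraction stage can be sketched as follows. This is a minimal illustration using a standard (not generalized) HMM with Gaussian emissions and Viterbi decoding; the blockade levels, noise width, `stay_prob` parameter, and the choice of features (dwell fractions and transition rate) are assumptions made for the example, not the feature set used in the Nanoscope software.

```python
import math
import random

def gauss_logpdf(x, mu, sigma):
    """Log-density of a Gaussian with mean mu and std sigma."""
    return (-0.5 * math.log(2 * math.pi * sigma * sigma)
            - (x - mu) ** 2 / (2 * sigma * sigma))

def viterbi(obs, means, sigma, stay_prob):
    """Most likely state path for an HMM with Gaussian emissions (n >= 2)."""
    n = len(means)
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n - 1))
    score = [math.log(1.0 / n) + gauss_logpdf(obs[0], m, sigma) for m in means]
    back = []
    for x in obs[1:]:
        new_score, ptr = [], []
        for j in range(n):
            # Best predecessor state i for arriving in state j.
            cand = [score[i] + (log_stay if i == j else log_move)
                    for i in range(n)]
            best_i = max(range(n), key=cand.__getitem__)
            new_score.append(cand[best_i] + gauss_logpdf(x, means[j], sigma))
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n), key=score.__getitem__)
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def extract_features(path, n_states):
    """Dwell fraction per state, followed by the transition rate."""
    total = len(path)
    dwell = [path.count(s) / total for s in range(n_states)]
    transitions = sum(1 for a, b in zip(path, path[1:]) if a != b)
    return dwell + [transitions / (total - 1)]

rng = random.Random(2)
levels = (0.3, 0.6)  # hypothetical normalized blockade levels
true_path = ([0] * 100 + [1] * 100) * 3
signal = [rng.gauss(levels[s], 0.02) for s in true_path]
path = viterbi(signal, levels, sigma=0.02, stay_prob=0.99)
features = extract_features(path, n_states=2)
print(features)
```

A fixed-length vector such as `features` is the kind of input that would then be handed to the SVM for classification or clustering.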
1.9.2 Nanoscope Cheminformatics – A Case Study for Device “Smartening”
The Nanoscope example can also be considered as a case study for device “smartening,” whereby device state is tracked in terms of easily measured device characteristics, such as the ambient device “noise.” A familiar example of this would be the sound of your car engine. In essence, you could eventually have an artificial intelligence (AI) listening to the sound of your engine to similarly track state and issue warnings like an expert mechanic with that car, without the need for sensors, or to supplement sensors (reducing expense, providing secondary fail‐safe). Such an AI might even offer predictive fault detection.
1.10 Deep Learning using Neural Nets
ML provides a solution to the “Big Data” problem, whereby a vast amount of data is distilled down to its information essence. The ML solution sought is usually required to perform some task on the raw data, such as classification (of images) or translation of text from one language to another. In doing so, ML solutions are strongly favored where a clear elucidation of the features used in the classification is also revealed. This then allows a more standard engineering design cycle to be accessed, where the stronger features thereby identified may play a stronger role, or guide the refinement of related strong features, to arrive at an improved classifier. This is what is accomplished with the previously mentioned SSA Protocol.
So, given the flexibility of the SSA Protocol to “latch on” to a signal that has a reasonable set of features, you might ask what is left? (Note that all communication protocols, both natural (genomic) and man‐made, have a “reasonable” set of features.) The answer is simply when the number of features is “unreasonable” (with enumeration not even known, typically). So instead of 100 features, or maybe 1000, we now have a situation with 100 000 to 100s of millions of features (such as with sentence translation or complex image classification). Obviously Big Data is necessary to learn with such a huge number of features present, so we are truly in the realm of Big Data to even begin with such problems, but now have the Big Features issue (e.g. Big Data with Big Features, or BDwBF). What must occur in such problems is a means to wrangle the almost intractably large feature set down to a much smaller feature set, e.g. an initial layer of processing is needed just to compress the feature data. In essence, we need a form of compressive feature extraction at the outset in order to not overwhelm the acquisition process. An example from the biology of the human eye is the layer of local neural processing at the retina before the nerve impulses even travel on to the brain for further layers of neural processing.
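To make the notion of an initial compressive layer concrete, the sketch below maps a 1000-dimensional feature vector to 64 dimensions. It uses a fixed random linear projection, which is a swapped-in stand-in chosen only because it is short and self-contained; it is not a learned layer and not the method developed in this text. The dimensions, seeds, and function names are hypothetical.

```python
import math
import random

def make_projection(n_in, n_out, seed=0):
    """Random Gaussian matrix (n_out rows, n_in columns), scaled by 1/sqrt(n_out)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_out)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n_in)]
            for _ in range(n_out)]

def compress(x, projection):
    """Apply the linear compression layer to one feature vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in projection]

def distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

rng = random.Random(3)
n_in, n_out = 1000, 64
projection = make_projection(n_in, n_out)
x = [rng.random() for _ in range(n_in)]
y = [rng.random() for _ in range(n_in)]
cx, cy = compress(x, projection), compress(y, projection)

# Ratio near 1 means the compressed vectors keep roughly the same separation.
ratio = distance(cx, cy) / distance(x, y)
print(len(cx), round(ratio, 2))
```

The point of the example is only that a much smaller representation can retain enough structure (here, pairwise distance) for later processing stages to work with; a trained first layer would select that structure from the data rather than at random.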
For translation we have a BDwBF problem. The feature set is so complex that the best approach is NN Deep Learning, where we assume no knowledge of the features but rediscover/capture those features in compressed feature groups that are identified in the NN learning process at the first layer of the NN architecture. This begins a process of tuning over NN architectures to arrive at a compressive feature acquisition with strong classification performance (or translation accuracy, in this example). This learning approach began seeing widespread application in 2006 and is now the core method for handling the Big Feature Set (BFS) problem. The BFS problem may or may not exist at the initial acquisition (“front‐end”) of your signal processing chain. NN Deep Learning to solve the BFS problem will be described in detail in Chapter 13, where examples using a Python/TensorFlow application to translation will be given. In the NN Deep Learning approach, the features are not implicitly resolvable, so improvements are initially brute force (even bigger data) since an engineering cycle refinement would involve the enormous parallel task of explicitly resolving the feature data to know what to refine.
1.11 Mathematical Specifics and Computational Implementations
Throughout the text an effort is made to provide mathematical specifics so that the theoretical underpinnings of the methods are clearly understood. This provides a strong exposition of the theory, but the motivation is not to do more theory; it is to proceed to a clearly defined computational implementation. This is where mathematical elegance meets implementation/computational practicality (and the latter wins). In this text, the focus is almost entirely on elegant methods that also have highly efficient computational implementations.
2