The Science of Individuality Measurement Algorithm (SIMA) – Overview

SIMA is a discipline

SIMA is one discipline. Statistics is a different discipline.

SIMA is for and about individuals – N of 1.

In contrast to SIMA, descriptive statistics describes groups of individuals. Inferential statistics makes inferences from samples of individuals to populations.

Figure 1 illustrates the relationship between SIMA and statistics. SIMA comes between “Collect Data” and “Aggregate.” SIMA is applied to each individual's data first; statistics then aggregates across individuals.

SIMA applies to Complex Adaptive Systems (CAS) of many types – the “Individual” in Figure 1. Often CAS are nested. The following list approximates a nested hierarchy with various examples of CAS.

  • Earth’s biosphere investigated as a whole
    (Big N of 1)
  • The world economy
  • One national economy
  • One person
  • One nervous system
  • One brain
  • One neuron
    (Little N of 1)

SIMA is a contribution to measurement science

SIMA is an extension of measurement science. In the context of directed network graphs with nodes and edges, SIMA software computes Interaction-over-Time (IoT) scores that quantify edges.

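IoT scores quantify the direction and amount of evidence for interactions-over-time. Positive scores quantify the amount of evidence that higher levels of activity in one node are associated with higher levels of activity in a second node. Negative scores quantify the amount of evidence that higher levels of activity in one node are associated with lower levels of activity in a second node. Positive IoT scores indicate excitatory interactions. Negative IoT scores indicate inhibitory interactions.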
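The sign convention for edges can be sketched in a few lines of Python. This is only an illustration of how signed scores on directed edges might be read, not SIMA software: the function names, the node names, and the score values are hypothetical placeholders, and the handling of a score of exactly 0 is an assumption.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]  # (source node, target node) in a directed graph


def classify_edge(score: float) -> str:
    """Label a directed edge by the sign of its IoT score."""
    if score > 0:
        return "excitatory"
    if score < 0:
        return "inhibitory"
    # Assumption: a score of exactly 0 is read as no evidence either way.
    return "none"


def label_edges(scores: Dict[Edge, float]) -> Dict[Edge, str]:
    """Map every scored edge to its sign-based label."""
    return {edge: classify_edge(score) for edge, score in scores.items()}


# Illustrative placeholder scores, not output from SIMA software.
example_scores: Dict[Edge, float] = {
    ("insulin", "glucose"): -2.1,
    ("node_a", "node_b"): 1.4,
    ("node_b", "node_a"): 0.0,
}
example_labels = label_edges(example_scores)
print(example_labels)
```

Because the graph is directed, the edge from node_a to node_b and the edge from node_b to node_a are scored separately and can carry different labels.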

Every IoT score is mathematically standardized. Every IoT score is one score from a distribution of potential scores that has mean = 0 and standard deviation = 1 unless 0 is the only potential score.

Distributions of potential scores are defined by the input data in combination with an operationally defined scoring protocol.
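The standardization property can be illustrated with a minimal Python sketch. It assumes that a distribution of potential raw scores and their probabilities is already available; how SIMA derives that distribution from the input data and scoring protocol is not described here, so the function name `standardize` and the example distribution are hypothetical.

```python
import math
from typing import Sequence


def standardize(raw: float, potential: Sequence[float], probs: Sequence[float]) -> float:
    """Rescale a raw score against its distribution of potential scores.

    The rescaled distribution has mean 0 and standard deviation 1,
    unless only one score is possible, in which case the result is 0.
    """
    if not potential or len(potential) != len(probs):
        raise ValueError("potential scores and probabilities must align")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    mean = sum(p * x for x, p in zip(potential, probs))
    variance = sum(p * (x - mean) ** 2 for x, p in zip(potential, probs))
    sd = math.sqrt(variance)
    if math.isclose(sd, 0.0, abs_tol=1e-12):
        # Degenerate case: only one score is possible, so report 0.
        return 0.0
    return (raw - mean) / sd


# Hypothetical distribution of potential raw scores (illustration only).
potential_scores = [0.0, 1.0, 2.0]
probabilities = [0.25, 0.5, 0.25]
standardized = [standardize(x, potential_scores, probabilities) for x in potential_scores]
print(standardized)
```

Standardizing in this way is what would make scores comparable across variables measured in different units.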

Use IoT scores to describe and help predict how individual CAS “work” over time.

Figure 2 illustrates the tripartite definition of how CAS “work” over time as operationally defined with SIMA. “Work” comprises function, response, and agency. Figure 2 illustrates “work” for a person in the context of health and health care.

In addition, use sets of IoT scores to describe coordinated action as an emergent property of CAS.

Figure 3 introduces IoT scores as an extension of the International System of Units (SI). SI comprises seven base units. Figure 3 shows a few of the many derived units computed from base units. In turn, SIMA computes values of a new category of derived units computed from multivariate time series for other measures. IoT scores are expressed in a standardized unit of measure called a Bagne.

Use SIMA to diagnose

CAS function, as introduced in Figure 2 in the previous section on measurement science, can be more or less ordered. Disorder often threatens the well-being and indeed the continued existence of a CAS such as a human. Disorder often is a life-or-death issue.

The International Classification of Diseases (ICD) and the Diagnostic and Statistical Manual of Mental Disorders (DSM) include many early attempts to classify functional disorders such as Type 2 diabetes mellitus and major depressive disorder. 

Such diagnoses often are based on data collected at one or just a few times such as clinic visits. Contrast this with how data are gathered in modern intensive care units. Now it is possible to collect upwards of 100,000 data points per second.

Figure 4 illustrates how we can capitalize, both in terms of health and wealth, on such multivariate time series data. Each node represents one time series variable. The size of each node represents an average level of activity over time for an individual CAS. The width of each arrow represents level of functional and effective excitatory or inhibitory pairwise connectivity as measured by SIMA. SIMA uses Boolean independent and dependent events to extend beyond pairwise connectivity.

Notice how node size and arrow width are largely independent in Figure 4. Type 2 diabetes illustrates this important point. A healthy person with normal levels of insulin and glucose could be illustrated with two modestly sized nodes and a wide red arrow for the inhibitory effect of insulin on glucose as quantified by SIMA. Developing Type 2 diabetes and insulin resistance could be illustrated with larger nodes and a narrower arrow. Insulin-sensitizing drugs would tend to increase arrow width and decrease node size. This illustrates a mechanism of treatment effect.

Figure 4 suggests how SIMA users can extend diagnostic taxonomies beyond signs and symptoms of disorder to measures of order and disorder per se.

An individual CAS exhibits emergent system properties. SIMA quantifies coordinated action as an emergent property with an extended set of arrows as suggested by Figure 4. Quantifying emergent system properties appears to be especially important for understanding and treating neuropsychiatric disorders.

Consider dividing most omic sciences into two main groups. Genomics is becoming a new gold standard for identifying individuals such as humans and cancers. Use genomics to identify the individual human CAS as in Figure 4. In contrast, action omics such as transcriptomics, proteomics, and metabolomics address nodes that can vary and fluctuate in level over time.  Figure 4 illustrates action nodes. Consider SIMA as a tool for interactomics at biological and psychological levels of investigation.

Molecular medicine focuses on nodes. The next step for precision, personalized, and P4 medicine might well be to extend this focus beyond nodes to edges. Figure 4 represents edges as arrows quantified by SIMA.

SIMA is a tool that offers opportunities to develop new taxonomies of ordered and disordered function in humans and other types of CAS.

 

Use SIMA to evaluate

Benefit and Harm Scores are a variation of IoT scores for evaluative investigations such as clinical trials. Reverse the signs of IoT scores where necessary so that all positive scores indicate toward (beneficial) effects and all negative scores indicate untoward (harmful) effects. To illustrate, both lower levels of “bad” cholesterol and higher levels of “good” cholesterol can be identified as being toward and beneficial.
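The sign reversal can be sketched as follows. This is a minimal illustration under stated assumptions, not the SIMA implementation: the function name `benefit_harm_score`, the `higher_is_beneficial` flag, and the score values are all hypothetical.

```python
def benefit_harm_score(iot_score: float, higher_is_beneficial: bool) -> float:
    """Orient an IoT score so that positive means benefit and negative means harm.

    iot_score: score for a treatment-to-response edge (positive means higher
        treatment levels go with higher response levels).
    higher_is_beneficial: True when higher levels of the response variable
        are desirable, False when lower levels are desirable.
    """
    return iot_score if higher_is_beneficial else -iot_score


# Illustrative placeholder scores, not results from any trial.
# The treatment lowers "bad" cholesterol: negative IoT score, lower is better.
bad_cholesterol = benefit_harm_score(-1.8, higher_is_beneficial=False)
# The treatment raises "good" cholesterol: positive IoT score, higher is better.
good_cholesterol = benefit_harm_score(1.2, higher_is_beneficial=True)
print(bad_cholesterol, good_cholesterol)
```

After reorientation both effects carry positive signs, so they can be read on one common benefit-and-harm scale.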

Figure 5 introduces how SIMA is a common metric that helps users quantify and balance many beneficial and harmful effects scientifically. Use Benefit and Harm Scores to help reduce the dimensionality of treatment evaluation problems from many to one.

Apply SIMA to multivariate time series

Apply SIMA to multivariate time series broadly defined as being two or more repeated measurements of two or more variables. At least one time series must operate as an independent or predictor variable and one time series as a dependent or predicted variable. Increase power by using more repeated measurements.
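The minimum data requirements stated above can be checked with a short Python sketch. The function name `check_design` and the variable names are hypothetical, and the requirement that all variables share the same number of repeated measurements is an added assumption, not something stated in the text.

```python
from typing import Dict, List, Sequence


def check_design(
    series: Dict[str, Sequence[float]],
    independent: Sequence[str],
    dependent: Sequence[str],
) -> List[str]:
    """Return a list of problems; an empty list means the minimum is met."""
    problems: List[str] = []
    if len(series) < 2:
        problems.append("need two or more variables")
    lengths = {len(values) for values in series.values()}
    if any(n < 2 for n in lengths):
        problems.append("need two or more repeated measurements per variable")
    if len(lengths) > 1:
        # Assumption: variables are measured on the same occasions.
        problems.append("variables differ in number of measurements")
    if not independent:
        problems.append("need at least one independent variable")
    if not dependent:
        problems.append("need at least one dependent variable")
    unknown = [name for name in list(independent) + list(dependent) if name not in series]
    if unknown:
        problems.append("unknown variables: " + ", ".join(unknown))
    return problems


# One drug, three response variables, 16 repeated measurements, one patient.
# All values are zero placeholders, not real data.
n_times = 16
data = {name: [0.0] * n_times for name in ["drug", "response_1", "response_2", "response_3"]}
problems = check_design(data, independent=["drug"], dependent=["response_1", "response_2", "response_3"])
print(problems)
```

A check like this only confirms that the data have the required shape; it says nothing about whether there are enough repeated measurements for adequate power.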

Figure 6 illustrates multivariate time series for drug evaluation with one type of drug, three response variables, 16 repeated measurements, and one patient.

Figure 7 illustrates multivariate time series with 143 repeated measurements for two hormones.

Figure 8 illustrates multivariate time series for seven brain regions of interest (ROIs) and 180 repeated measurements.

SIMA integrates

SIMA helps enable systems science that is integrative and holistic yet detailed. Figure 9, from the cited source, illustrates integration at the biological level of investigation for living systems.

SIMA helps build on measures and technologies developed as part of reductionist science wherever it is possible to collect multivariate time series for transcripts, proteins, lipids, carbohydrates, metabolites, electrophysiological variables, and brain activity levels. For example, apply SIMA to multivariate time series of action variables for proteins or brain activity levels to quantify pathways and networks as indicated by Figure 9, starting at the level of each individual.

Figure 9 focuses on the biological level of analysis. Apply essentially the same approach to help integrate across various levels of investigation such as biological, psychological and social; across various levels of temporal resolution as for brain-behavior relationships; and for different types of Complex Adaptive Systems such as those identified on the first home page figure. 

Figure 10, cited from the Journal of Medical Internet Research, helps illustrate how SIMA helps advance systems science. More specifically, SIMA quantifies the lines in Figure 10 that represent "Interrelationships/Dynamics." 

Big Data

Data can be big in at least three different ways, as illustrated in Figure 11 with subjects, times, and action variables. Action variables, unlike the genetic characteristics used to help identify species, individual people, and tumors, can vary and fluctuate in level over time.

 

Figure 12 represents the capabilities of statistics as a discipline to process big data in the context of the Randomized Controlled Trial (RCT) designs currently used for the evidentiary foundations of drug development, drug regulation, and evidence-based medicine. Many such trials test one primary hypothesis defined on one primary response variable using change scores computed from one baseline measurement and one endpoint measurement for each subject. Figure 12 illustrates the amount of data used in classical RCT designs. Such designs can be described as "data poor," at least with respect to the amount of data used to test primary hypotheses. In fact, most RCTs collect more data (response variables and repeated measurements) that should be processed more adequately with help from SIMA to gain new and valuable insights.

Classical or current first generation RCT designs often require many subjects, which is expensive. In addition, classical RCT designs use group averages in a manner that irretrievably confounds individuality with treatment effects. This convention impedes disciplines such as personalized or precision medicine.

Figure 13 uses volume to portray the advantages of using SIMA together with statistics as illustrated by the spiral figure on SIMA Overview. The RCT designs portrayed in Figure 13 are "data rich" compared to classical RCT designs.

"Data rich" Precision RCT (PRCT) designs use more repeated measurements to increase power and Overall Benefit and Harm Scores to provide more comprehensive and integrated evaluations of safety AND efficacy (effectiveness). The Scoring Details & Capabilities page includes additional advantages of using PRCT designs. 

SIMA from different perspectives

Earlier material has presented SIMA from a measurement perspective.

Consider SIMA from additional perspectives. Consider SIMA as a tool for:

  1. Separating signals from noise in multivariate time series

  2. Pattern finding

  3. Empirical induction – Drawing generalized conclusions and helping to make predictions from data

  4. Artificial intelligence.