DataSpeaks, Inc. Issues a Call for Leadership
Advancing the Next Big Idea in Software
The
next really big new idea in software will help enable humanity
to develop practical scientific understanding of mechanisms by
which complex systems such as biological systems, patients, and
economies work, i.e., function internally, respond to their environments,
and act as agents on their environments. The same software also
will help enable us to describe how complex systems change and
adapt.
It appears
that our ability to understand complex adaptive systems has been,
until now, about where our understanding of infectious diseases
was before the discovery of germs.
We as a people have made enormous scientific and practical progress
in understanding relatively simple time invariant systems. But
our understanding of complex adaptive systems remains in its infancy.
The
discovery required to speed practical scientific understanding
of complex adaptive systems appears to have been made. This discovery
also has the potential to jumpstart economic productivity and
human welfare. This discovery is a computational method for using
time ordered data to measure temporal
contingencies between events as required to understand complex
adaptive systems. The events are defined on variables and sets
of variables. The computational method also has been described
as measuring interactions or longitudinal associations. The value
of the required software rests on the fundamental value of measurement
in science.
Contingencies matter.
Those who landed the Spirit and Opportunity rovers on Mars did
not violate any natural laws as they advanced human will to explore
the universe. They used the laws of nature by managing to account
for and control many contingencies. We may need to measure and
account for contingencies - what follows what - in order to understand
the nature of complex adaptive systems, including ourselves, and
to control our destinies (see responsible agency). Measurement of temporal
contingencies helps make them a new subject matter for scientific
investigations.
I discovered the required software technology primarily by
serendipity while working on a specific
problem. The technology is well described as temporal contingency analysis.
I founded DataSpeaks, Inc., a one-man startup with technical
expertise, prototype software, proof
of concept demonstrations, knowledge about how the technology can
address fundamental unmet
needs in huge markets, and an intellectual property portfolio.
I am calling for leadership to create a company to advance DataSpeaks’
software. DataSpeaks must overcome the awesome power of the statistical
establishment, apparently our primary competition, in matters where
it counts.
The
statistical method is great for
describing groups and making inferences from representative samples
to populations. The statistical method does not account for individuality
and time as required to measure, describe, elucidate, and visualize
mechanisms by which complex adaptive systems
work, change, and adapt. For lack of better
computational methods to understand mechanisms, we are getting
swamped in data about systems, not realizing the potential
of many great data collection and computer technologies, and often
suffering needlessly. In addition, we are hampered in developing
artificial systems that learn.
This is a Call for Leadership -
business, scientific, academic, and political leadership - as well as
leadership in ethics.
DataSpeaks’ software might be as consequential and valuable as
the Web browser. The software is basically simple, computationally intensive,
and often disruptive. Early adopters can achieve huge competitive advantages. Business
leadership needs to be strong and experienced enough to create and manage a
company that thrives without being Netscaped. This is a public call for
leadership in diverse
markets, a strategy that may help keep DataSpeaks from being Netscaped.
The new technology is covered by issued patents.
The new technology helps enable sciences of various types of
systems. DataSpeaks calls for scientists who investigate complex adaptive
systems to lead with the new software technology primarily through
demonstrations, discoveries, and publications.
Many great technologies have been invented outside academia.
Leading universities established departments of electrical engineering after
Edison. They started courses in aerodynamics after the Wright Brothers.
Similarly, this new technology calls for academic leaders to establish new departments,
grow the intellectual mass of the new methodology, educate tomorrow’s leaders,
and provide better counsel about how to process data. Life sciences centers
that lead with the new technology would have advantages in becoming the leading
life sciences centers in the world.
The new software technology can make clinical research more ethical. In
addition, this new technology appears to support a new scientific worldview in which
individuals can be held accountable as responsible agents. The new technology
also can be misused to threaten humanity. Therefore this also is a call for
leadership in ethics.
I also call for political leadership that can help deliver
the benefits of this new software technology to the people.
Please peruse DataSpeaks.com for more information. The
information is presented from a somewhat personal and historical perspective. I
hope that you can share some of my passion and excitement for this adventure.
Then consider responding to this Call for Leadership
by DataSpeaks, Inc. if
you are qualified, think big, and have the will and the resources required to help
advance DataSpeaks’ software as the next big idea in software.
DataSpeaks’ Software: What it Does, How it is New, and Why it is Valuable
Curtis A. Bagne, Ph.D.
Scientist, Inventor, and Founder of DataSpeaks, Inc.
1. Introduction
2. Why We Are Getting Swamped in Data
2.1. Hypothesis-Driven Science and Data-Driven Discovery Science
2.2. The Data Snapshots/Data Movie Analogy
2.3. Sources of Resistance
3. Getting Out of Data Swamps
3.1. Developing Data Movies
3.2. Benefits of Developing Data Movies
3.3. Data Integration Is Not Sufficient
4. Patents
5. Opportunities and Challenges of Being First
5.1. Primary Competition
5.1.1. The Statistical Establishment
5.1.1.1. Statisticians
5.1.1.2. Deference toward Statisticians
5.1.2. MQALA and the Statistical Method - Technical Differentiation
5.2. Stephen Wolfram and Forrest Gump
5.3. Departments of Empirical Induction
5.4. The Time is Right
6. Eight Selected Market Opportunities
6.1. Revitalizing the Pharmaceutical Industry
6.1.1. Revitalizing Drug Discovery
6.1.2. Re-engineering Clinical Research
6.1.3. Competing Visions for Clinical Research and Practice
6.1.3.1. The Old Vision for Clinical Research and Practice
6.1.3.2. A New Vision for Clinical Research and Practice
6.1.4. Opportunities and Challenges
6.2. Reforming Health Care
6.2.1. Health Care Providers
6.2.1.1. Clinicians
6.2.1.1.1. Diagnosis
6.2.1.1.2. Treatment Evaluation
6.2.1.1.3. Forces for Change
6.2.1.2. Health Care Administrators
6.2.2. Health Care Payers
6.2.2.1. Health Status Measures and Measurement of Benefit/Harm
6.2.2.2. Individualization, Treatment Guidelines, and Payment Policies
6.2.3. Patients, Potential Patients, and Lay Caregivers
6.2.4. Consumer Driven, Market Oriented Health Care Reform
6.3. Improving Public Health
6.4. Visualizing How Brains Work, Change, and Adapt
6.4.1. Visible Brain, Visible Human
6.5. Improving Prediction of Economies and Capital Markets
6.6. Modifying Behavior
6.7. Advancing Responsible Agency
6.7.1. Scientific Worldviews
6.7.2. Agency
6.7.3. Responsible Agency
6.7.4. Leadership
6.8. Reinvigorating Machine Learning and Artificial Intelligence
7. Acknowledgements
APPENDIX A: How to Develop Data Movies - A Primer on How DataSpeaks Interactions® Works
APPENDIX B: Three Proof-of-Concept Demonstrations
DataSpeaks’ Software: The Next Big Idea in Software
Curtis A. Bagne, Ph.D.
Scientist, Inventor, and Founder of DataSpeaks, Inc.
1. Introduction
DataSpeaks, Inc. offers a major new category of software that empowers users to make
discoveries, act more intelligently, and provide better services. In addition,
users of health
care, financial,
and other knowledge and information intensive services should demand that
service providers use DataSpeaks’ software because the software often enables patients
to receive better care and clients in areas such as financial services to
receive more intelligent service.
We are getting swamped in data about complex adaptive systems. DataSpeaks
can help overcome this problem by making data useful through empirical induction. We make data speak.®
We make data speak more effectively to human interests
and needs.
DataSpeaks’ software will help users understand how complex systems
(1) work, (2) change, and (3) adapt. We are, each of us individually, a system.
Each of us is made up of subsystems - a nervous system, a cardiovascular
system, an immune system, a metabolic system, etc. Many systems are nested.
Each of us is part of larger systems - entire populations, social
systems, economic and financial systems, ecosystems, etc. People create systems
for business and production.
Complex adaptive systems exist in environments. Both systems
and environments are assumed to have parts and attributes that can be measured
repeatedly. Furthermore, it is assumed that at least some measured variables
fluctuate in level over time in a more or less coordinated manner. Coordinated activity
helps define complex adaptive systems. This coordination can be described by
measuring interactions for individuals over time.
Complex
adaptive systems (1) function internally, (2) respond to their
environments including treatments, and (3) act as agents on their
environments. Together, these three types of mechanism - function,
response, and agency - will be said to describe how systems work.
Furthermore, mechanisms by which complex systems work can change
through processes such as development and aging. In addition, systems can adapt through mechanisms such as evolution and learning.
Systems can become disordered and respond to interventions such as medical treatments
as well as economic and health care policies. Systems work, change, and adapt
at different levels and types of system organization and understanding - e.g., physical, chemical, biological,
psychological, social, economic, and cultural. Systems such as capital markets
change as participants adapt and behave differently.
Emergence is adaptation that crosses
thresholds such as speciation to form new types of systems. Biological systems
emerged several billion years ago and continued to evolve. New types of systems
continue to emerge. Additional levels are being added to hierarchies of control
and coordination. Designed systems result from human agency.
Science is an advanced expression of human adaptation.
We come to understand complex systems scientifically as we
discover and describe mechanisms by which systems work and adapt as well as how
these mechanisms change over time. These mechanisms involve interactions and
temporal contingencies that describe coordination.
DataSpeaks’ software is new, unique, and valuable because it
appears to be the first software system that actually and effectively measures
interactions or temporal contingencies over time for individual systems. DataSpeaks’ software product is called
DataSpeaks Interactions®. The software applies to time ordered data
from measures and yields values of new measures. DataSpeaks Interactions®
elucidates mechanisms. It helps tell us how systems work. It also measures,
apparently for the first time, apparent benefit/harm as an interaction over
time between repeated measurements of treatments and repeated measurements of
health for the same individual.
The value of DataSpeaks
Interactions® rests on the fundamental value of measurement in
science. One essential
feature of DataSpeaks Interactions® is especially notable
because it both contrasts sharply with prevailing practice in processing
dimensional data and has major scientific import. When applied to dimensional
time ordered data, DataSpeaks Interactions® first defines
potentially large numbers of discrete independent and dependent events and
determines their presence or absence on most repeated measurement occasions.
Then it measures interactions, temporal contingencies, or longitudinal
associations between various types of independent events and various types of
dependent events. The resulting measures of interaction are new dimensional
variables.
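The two steps just described can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the DataSpeaks Interactions® algorithm: each variable gets a single hypothetical threshold to define its events, and the phi coefficient of a 2x2 table stands in for the actual measure of temporal contingency. The function names, the lag parameter, and the dose and health values are all invented for the example.

```python
from typing import List, Sequence


def to_events(series: Sequence[float], threshold: float) -> List[bool]:
    """Mark an event as present on each occasion where the level meets the threshold."""
    return [level >= threshold for level in series]


def contingency_score(independent: Sequence[bool],
                      dependent: Sequence[bool],
                      lag: int = 0) -> float:
    """Phi coefficient between independent events and dependent events `lag` occasions later.

    Assumption: phi is an illustrative stand-in, not the patented scoring procedure.
    """
    if lag > 0:
        independent, dependent = independent[:-lag], dependent[lag:]
    pairs = list(zip(independent, dependent))
    both = sum(1 for x, y in pairs if x and y)
    only_x = sum(1 for x, y in pairs if x and not y)
    only_y = sum(1 for x, y in pairs if not x and y)
    neither = sum(1 for x, y in pairs if not x and not y)
    denom = ((both + only_x) * (only_y + neither)
             * (both + only_y) * (only_x + neither)) ** 0.5
    return 0.0 if denom == 0 else (both * neither - only_x * only_y) / denom


# Hypothetical repeated measurements for one individual (one value per occasion).
dose = [0, 10, 10, 0, 0, 10, 0, 10]
health = [3, 3, 8, 8, 3, 3, 8, 3]

dose_events = to_events(dose, threshold=5)      # independent events
health_events = to_events(health, threshold=5)  # dependent events

lag0 = contingency_score(dose_events, health_events, lag=0)
lag1 = contingency_score(dose_events, health_events, lag=1)
print(f"same occasion: {lag0:.2f}, one occasion later: {lag1:.2f}")
```

In this made-up series the health event follows the dose event by exactly one occasion, so the lag-1 score is 1.0 while the same-occasion score is lower. The scores themselves are new dimensional values derived from discrete events, which is the point of the passage above.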
This method of going from dimensional variables to new
dimensional variables through discrete events appears to have major scientific
import. Scientists often seek natural laws such as E = mc², which
describe functional relationships involving measured quantities such as energy,
mass, and the speed of light. Many scientists use sets of functional
relationships to form mathematical models.
DataSpeaks Interactions® provides fundamentally
new measures of interaction that can be used to form mathematical models of the
mechanisms by which complex systems work and adapt as well as how these
mechanisms change.
The new measures of interaction provided by DataSpeaks
Interactions® essentially are measures of temporal contingency. Measuring
temporal contingencies makes them available for scientific investigation.
The new measures of temporal contingency appear to be a key for
a new method to advance scientific investigations from simple time invariant
systems to complex adaptive systems. The new measures of interaction will help
shape scientific
worldviews.
In addition, DataSpeaks Interactions® appears to
advance a union of apparent opposites (determinacy versus mere contingency) by
making something as apparently ephemeral and non-consequential as temporal contingencies
a subject matter for functional relationships, mathematical models and
scientific laws. Furthermore, as we shall see, temporal contingencies appear to
be productive in nature through mechanisms such as natural selection and learning -
mechanisms that help shape and characterize complex adaptive systems.
DataSpeaks Interactions® helps enable systems
science. Sciences of various systems
can empower and motivate us to act more intelligently. They can help us make better
predictions and decisions. Sciences of systems can help us discover new
products such as drugs,
develop such products more efficiently, improve services such as health care
and investment advice,
and help us design systems that meet human needs. DataSpeaks Interactions®
will improve economic productivity and enhance human welfare.
But first, in order to accomplish all of this, leaders need to
understand why we are getting swamped in data about systems and what we can do
to overcome this problem.
2. Why We Are Getting Swamped in Data
We are getting swamped in data about systems primarily
because of two interdependent problems. One problem is the data
problem. Much of the data that we currently have and continue to collect is
of limited value because the prevailing data collection design provides little
of the information about time and individuality that is needed to understand
complex adaptive systems. I will introduce this problem and its solution with
an analogy that illustrates
why data movies are better than data snapshots for understanding mechanisms.
The other problem is the software
problem. Current software does not measure the interactions that describe
the mechanisms by which individual complex systems work and adapt as well as
how these mechanisms change. Furthermore, current software does not measure
benefit/harm in evaluative investigations such as clinical trials as
illustrated in Appendix A. This is the fundamental
problem that is solved by DataSpeaks Interactions®.
I will introduce the problem of getting swamped in data in the
context of hypothesis-driven science versus data-driven discovery science for biological
systems. Similar problems exist in disciplines that deal with other types of
systems. Then I will present the analogy.
I also will anticipate and address reasons why some people will delay progress
and resist getting out of data swamps.
2.1. Hypothesis-Driven Science and Data-Driven Discovery Science
Science is evolving to advance objective understanding of
nature. Traditionally, many scientists have tested hypotheses involving a few
variables at a time that form small parts of more complex systems. Scientists were
admonished by statisticians not to
collect data except insofar as it could be used to test specific hypotheses. These
aspects of hypothesis-driven science, together with a lack of high throughput measurement
technologies and a lack of extensive data collection and storage infrastructures,
helped keep scientists out of data swamps. But these limitations also impeded
scientific progress.
Recently, scientific practice has been evolving toward discovery
science, which encourages scientists to collect data whether or not the data are
to be used for testing specific hypotheses. Discovery science has been said to
be data-driven instead of hypothesis-driven. Completion of the Human Genome Project (HGP) is a major
achievement of discovery science (http://www.nhgri.nih.gov/).
The HGP and related gene sequencing projects have produced and continue to produce vast
quantities of useful data.
Sequence data describe genes, which are relatively static or
timeless compared to the dynamic mechanisms of life. Sequence data have been
described figuratively with terms such as maps, parts lists, and genetic snapshots. DataSpeaks Interactions®
was not developed for such timeless data. But it can help make timeless
data, including gene data, useful.
“Genomes to Life” is a follow-up program to the HGP (http://doegenomestolife.org/). As the
name suggests, it is intended to address the dynamic mechanisms of living
systems. Using the terminology that I introduced above, the Genomes to Life Program can be described as elucidating
how biological systems work, change, and adapt. The Human Proteome Project
(HPP) of the Human Proteome Organization (HUPO) is another major follow-up
program, which seeks to identify both proteins and their mechanisms (http://www.hupo.org/).
The data processing methods and software for
hypothesis-driven science, which still prevail, are not well suited for
discovery science. As a result, discovery science is heavy on data and short on
discovery. This problem is illustrated by the low and declining productivity of
the pharmaceutical industry,
even after the HGP.
Hypothesis-driven science has had a glorious and productive
history largely because scientists did most of the intellectual heavy lifting. Scientists form
theories and hypotheses in their heads. This can involve thought experiments. Then
the key data processing step is to reject null hypotheses. But methods for rejecting
null hypotheses are not sufficient to make scientific discoveries or to convert
data into scientific information and knowledge - to make data speak through empirical induction.
Hypothesis-driven science works best for relatively simple
time invariant systems such as planetary systems that have been well described
with few variables. Part of the reason why discovery science has not been more
productive for entire biological systems is that biological systems involve
data about more variables than scientists can process well in their heads.
The data processing methods and software used by hypothesis-driven
science have additional limitations for discovery science. Many
hypotheses attempt to address mechanisms. But for lack of software to measure
mechanisms over time for individuals, most hypotheses are about the levels of
variables at particular times. This problem involving levels is discussed more
fully elsewhere in the context of drug discovery.
It is difficult to understand mechanisms scientifically
without measuring mechanisms. Demonstration 1 measures mechanisms in the context
of reproductive endocrinology. Similarly, it often is difficult to test hypotheses
about the safety and efficacy of treatments without measuring the benefit/harm
of treatments. Appendix A illustrates the measurement
of benefit/harm. Failure to measure mechanisms and failure to measure
benefit/harm are standard operating procedures in academe and industry.
Another limitation involves individuality.
Decoding the human genome was a great achievement. But describing it as “the
human genome” is an oversimplification because there are about as many human
genomes as there are people who are not identical twins. Small differences in genomes can make
important differences. Humans and mice, for example, share the large majority of their genes.
About 1.8 million single nucleotide polymorphisms have been
identified by the SNP Consortium (http://snp.cshl.org/).
These SNPs can exist in many combinations to help account for human
differences. Many of these differences are relevant to human health, health
disorders and responses to treatments. Statistical
methods of data processing that are based on measures of group variability
and central tendency tend to obscure, rather than elucidate, individuality. It
often is valuable to distinguish individual differences from measurement
errors.
Humans appear to be more complex than mice or corn not
because humans have more genes but because the products of gene expression and
other substances are involved in more complex and higher orders of control and
coordination, including the control of gene expression. This suggests that it
will be valuable to measure the interactions that describe emergent mechanisms and coordination in addition to decoding genomes and measuring
the levels of gene expression.
Personalized medicine is one of the great promises of the
modern era in biology (http://www.ornl.gov/sci/techresources/Human_Genome/medicine/medicine.shtml).
A major reason why this promise remains elusive is that key data processing methods
of hypothesis-driven science continue to dominate but are largely incompatible
with personalized medicine because conventional methods do not measure
interactions or adequately account for time and individuality. Individuals
often are not well represented by group averages.
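A small arithmetic sketch shows how a group average can hide individual responses. The individuals and effect values below are hypothetical and chosen only to make the point; they are not data from any investigation.

```python
from statistics import mean

# Hypothetical change in a health score for each individual
# (level on treatment minus level off treatment; positive means benefit).
individual_effects = {"A": 4.0, "B": -4.0, "C": 3.0, "D": -3.0}

group_average = mean(list(individual_effects.values()))
benefited = [person for person, effect in individual_effects.items() if effect > 0]
harmed = [person for person, effect in individual_effects.items() if effect < 0]

print(f"group average effect: {group_average}")
print(f"benefited: {benefited}, harmed: {harmed}")
```

The group average is zero even though every individual in this invented sample responds clearly, half with benefit and half with harm.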
Hypothesis testing will remain an important part of science.
But scientists should demand more from software to help them understand complex
adaptive systems as required for making discoveries and advancing their careers.
2.2. The Data Snapshots/Data Movie Analogy
I will introduce the solution to the data problem with an analogy that begins
by distinguishing “data snapshots” from “data movies.” Start by thinking of
data as recorded experience.
Consider the business and financial pages of your favorite
newspaper. The information consists primarily of prices for stocks and mutual
funds as well as the levels of various indices, economic time series, and
measures of business performance together with changes from previous days or other
periods.
The quantitative information in the business pages for a
particular day illustrates a “data snapshot.” The corresponding “data movie” consists of the same information for a series of
days over a period of time, preferably a prolonged period with many repeated
measurements. Each variable in a data movie is a time series variable.
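As a rough sketch of the distinction, a data snapshot can be represented as one record of variable levels, and a data movie as an ordered list of such records. The variable names and values below are invented for illustration, and the helper function is hypothetical.

```python
from typing import Dict, List

Snapshot = Dict[str, float]  # variable name -> level on one occasion


def time_series(movie: List[Snapshot], variable: str) -> List[float]:
    """Extract one variable from a data movie as a time series."""
    return [frame[variable] for frame in movie]


# A hypothetical data movie: one snapshot per day, kept in temporal order.
movie: List[Snapshot] = [
    {"index_level": 100.0, "stock_price": 20.0},
    {"index_level": 101.5, "stock_price": 20.4},
    {"index_level": 99.8, "stock_price": 19.9},
]

snapshot = movie[0]  # the business pages for a single day
prices = time_series(movie, "stock_price")
print(snapshot)
print(prices)
```

The snapshot holds one level per variable, while the movie yields a time series for each variable, which is what makes temporal order available for analysis.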
Data movies provide more information about time and
individuality than data snapshots. Scientists often acknowledge that data
movies would be superior for certain problems. However, most scientists
continue to collect data snapshots. In addition, clinical trials that collect
repeated measurements data often are analyzed as if the data were collections of
data snapshots. Leaders in science, technology, business, industry, and
academia need to understand why they are not experiencing the advantages of
data movies.
Please exercise your imagination to appreciate the “information advantage” of data movies. Imagine
that you don’t know anything about American football and that you are being
challenged to discover the rules of the game from recorded experience without a
rule book or instruction - much as investigators such as biologists are being
challenged to use data to discover the mechanisms by which complex systems work
and adapt and how these mechanisms change.
You have a choice - investigators often really do - as you
seek to discover the rules. Assume that one frame in a movie is the same as a
snapshot and that 100,000 frames are adequate for a movie of one entire game.
Your first option is to choose a collection of 100,000 snapshots, one snapshot from
each of 100,000 individual games. Each snapshot could be of a play of any type,
a huddle, a timeout or a half-time show. The snapshots are of 100,000 games and
there is no information about temporal order. This first option is called the “extensive design” because it involves the collection of
data from each of many individuals, usually only a little data from each.
Your second option is to choose a movie of one entire game.
This second option is called the “intensive design”
because it involves the collection of a lot of data from only one individual.
Assume that the collection of snapshots and the movie for
one entire game occupy the same amount of space in computer memory or on a CD.
In this regard, both options appear to have the same amount of information.
Which option would you choose? Why?
I suggest that the movie would be a better choice because it
includes information about temporal contingencies in the game. For example, the
team with the ball has four consecutive chances to score or advance the ball at
least 10 yards or else they give up the ball where they are on the field. Information
about temporal contingencies is valuable for discovering how nature works, changes,
and adapts as well as for evaluating the temporal criterion of causal and other
predictive interactions.
Snapshots are timeless. Timeless data are great for showing
structures of systems in space. Movies are superior because they provide the
additional information required to understand dynamic mechanisms.
Statistical methods are
great for investigating measures of team and player performance such as average
yards rushing, average yards passing and average number of sacks. But statistical
methods by themselves are of limited value for understanding the rules of
football or the dynamic mechanisms of individual complex adaptive systems.
To help further appreciate the information advantage of data in temporal
order, imagine trying to understand football if the frames in the game movie
were shown in random order. The “additional information” advantage of data
being in temporal order is an underutilized world of evidence to gain
scientific understanding. People rely on temporal order to help make sense of
experience, but most of our data processing software works best for timeless
data.
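The random-order thought experiment can be tried on a toy series. The sketch below assumes a made-up "game" in which an event (say, one team having the ball) persists in runs of ten frames. Shuffling the frames leaves the average unchanged but removes most of the information about what follows what. The function name and the series are hypothetical.

```python
import random
from statistics import mean
from typing import List


def lag1_agreement(events: List[int]) -> float:
    """Fraction of occasions on which the event state equals the state one occasion earlier."""
    pairs = list(zip(events[:-1], events[1:]))
    return sum(1 for earlier, later in pairs if earlier == later) / len(pairs)


# Hypothetical movie: the event is present for 10 frames, absent for 10, and so on.
movie = ([1] * 10 + [0] * 10) * 50  # 1,000 frames in temporal order

shuffled = movie[:]
random.Random(0).shuffle(shuffled)  # same frames, random order

print("mean, ordered vs shuffled:", mean(movie), mean(shuffled))
print("lag-1 agreement, ordered:", round(lag1_agreement(movie), 3))
print("lag-1 agreement, shuffled:", round(lag1_agreement(shuffled), 3))
```

A summary such as the mean cannot distinguish the ordered movie from the shuffled one, whereas the lag-1 agreement is high for the ordered frames and close to chance for the shuffled frames.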
Data snapshots are not good for elucidating mechanisms or
for quantifying the benefit/harm of treatments. Data movies are apt to become
the gold standard for collecting data to help understand how complex systems
work, change, and adapt. However, there are sources of resistance.
2.3. Sources of Resistance
Given that intensive designs and data movies can provide an information advantage, why aren’t people using
them more often and effectively? Here are some reasons. Some are good;
others are bad.
First, here is a good reason for not collecting data movies.
In some cases, the objective is to investigate treatments that affect the
survival of systems such as patients. Here the treatment is considered to be
either present or absent for each individual and the objective is to
investigate time until a largely irreversible change in a system such as death.
In survival investigations, time is mostly irrelevant except for length. Most
of us care about survival. But survival investigations reveal little about
dynamic mechanisms or the benefit/harm of treatments as it becomes evident over
time for individuals.
In some cases, it is necessary to destroy systems in order
to collect data. For example, it might be necessary to sacrifice animals. When
this is necessary, data movies are not possible. However, many new
nondestructive and minimally invasive measurement technologies are being
developed that can monitor systems over time. For example, it is becoming
possible to image activities in individual cells without destroying the cells.
A major value of these technologies is that they allow collection of data
movies.
An often specious reason for not collecting data movies is
that they are considered to be more expensive than collections of data
snapshots. But, returning to our analogy,
would it be more expensive to obtain a movie of one entire game or to form a
collection of 100,000 data snapshots, one from each of 100,000 games? It can be
expensive to recruit, screen, and enroll large samples of individual subjects for
extensive designs. In addition, there are the opportunity costs of failing to
understand complex adaptive systems. Often it would be best to allocate scarce
data collection resources to some combination of intensive and extensive designs - moderate numbers of repeated
measurements for each individual in samples with moderate numbers of
individuals.
A nexus of reasons for resisting intensive designs and data
movies derives from habits learned while accommodating limitations of dominant methods
better suited for investigations of simple time invariant systems than for complex
adaptive systems. Such habits can be impediments to progress. Progress often
requires changing the status quo. Brilliant people are not immune to bad habits
and the status quo, perhaps especially when the habits and the status quo help
define professional identities and when the habits have been enormously
successful for limited but important classes of problems. Sources of resistance
based on the status quo may be some of the most difficult challenges to
overcome while advancing DataSpeaks Interactions®.
Science makes generalizations
and looks for patterns. It appears that methods best suited for time invariant
systems often have led investigators to favor generalizations across
individuals at the expense of making generalizations and looking for patterns
over time. A comprehensive understanding of nature, especially of complex
adaptive systems, appears to require both types of generalization. Methods for
generalizing across individuals and methods for generalizing over time are different
but complementary, not antithetical. For example,
DataSpeaks Interactions®, which generalizes and describes patterns
over time, often is complementary to software for statistical analyses.
Another important habit and source of resistance involves
experimental control. Randomization is important for achieving experimental
control and isolating the effects of particular variables. However, it also
appears that methods best suited for time invariant systems often have led
investigators to favor randomization of individuals to different treatment
groups at the expense of randomization of treatments to different periods of
time for the same individual. Yet Gordon H. Guyatt, a leader in
evidence-based medicine, has identified randomized N-of-1
clinical trials as the gold standard for evidence-based medicine under many
circumstances (see, for example, http://www.cche.net/usersguides/applying.asp).
As with generalization, both types of
randomization often can be complementary. A good
choice, enabled by DataSpeaks Interactions®, often would be to use
both types of randomization simultaneously in particular investigations. This
option is described as a “double randomization design” in Section 2.5 of Patent
6,317,700.
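One way to picture using both types of randomization together is the allocation sketch below. It is an assumption about what such a plan might look like, not the design specified in the patent: individuals are randomly allocated to treatment arms, and within each individual the order of active and placebo periods is randomized in pairs, as is common in N-of-1 trials. The function name, arm labels, and parameters are hypothetical.

```python
import random
from typing import Dict, List, Sequence


def double_randomize(subjects: Sequence[str],
                     arms: Sequence[str],
                     n_pairs: int,
                     seed: int = 0) -> Dict[str, List[str]]:
    """Return a period-by-period treatment schedule for each subject.

    Randomization 1: subjects are shuffled, then dealt to arms in rotation.
    Randomization 2: within each subject, every pair of periods holds the
    subject's arm and placebo in random order.
    """
    rng = random.Random(seed)
    order = list(subjects)
    rng.shuffle(order)
    plan: Dict[str, List[str]] = {}
    for position, subject in enumerate(order):
        arm = arms[position % len(arms)]  # balanced allocation across arms
        schedule: List[str] = []
        for _ in range(n_pairs):
            pair = [arm, "placebo"]
            rng.shuffle(pair)
            schedule.extend(pair)
        plan[subject] = schedule
    return plan


plan = double_randomize(["s1", "s2", "s3", "s4"], ["low dose", "high dose"], n_pairs=3)
for subject, schedule in sorted(plan.items()):
    print(subject, schedule)
```

Each subject's schedule supports comparisons over time within that individual, while the allocation to arms supports comparisons across individuals.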
I have had investigators express concern that results
obtained from intensive designs may not apply to populations. This is true. But
one way toward understanding both individuals and populations is to collect
data movies from samples of individuals that represent populations. This approach
also is an excellent way to investigate individual differences in mechanisms,
disorders, and responses to treatments. There often is value in understanding
individuals such as people, patients, populations, economies, capital markets,
and ecosystems whether or not they can be sampled across individuals.
Some investigators collect or advocate the collection of lots
of data snapshots because they think they
have no other choice. Call this the first case. Other investigators use the
intensive design and collect data movies because they have no other choice, the second case. How is it that some
investigators think they have no other choice but to collect data snapshots
while other investigators essentially have to collect data movies if they are
to collect data at all?
This exemplifies the first case. The editor of an online
journal, editorializing about “individuality and medicine,” recently said that
we have no other choice for the sake of preventive medicine but to invest in
what essentially are large collections of data snapshots, perhaps with a
survival component. His conviction that there is no other choice testifies to
the power of the statistical establishment to
limit options. This case characterizes much of “large scale biology.” Important
money is riding on extensive data collection
designs as if they were the best choice for personalizing medicine.
Large scale biology has value. But with tens of thousands of
genes and hundreds of thousands of proteins and other biologically active substances,
and at least hundreds of thousands of variations in both sets, and both sets
working in nearly unlimited numbers of combinations together in people with
different histories in different environments to cause both normal function and
disorder, it is an open question whether there are enough people in the world for extensive designs alone
to achieve their planned promise of understanding human health and disease. Furthermore,
understandings of existent systems are apt to become outdated because people
are agents creating new agents and our future.
Biological mechanisms are complex compared to the rules of
football as the latter was illustrated in our analogy. How much will it really help to go from
100 individuals (games or patients) to 100,000 individuals or perhaps 1,000,000
or more individuals if we don’t capture and use more information about temporal
contingencies? Current conceptions of large scale biology might not be efficient
strategies for understanding biological systems on the way to preventing health
disorders.
This exemplifies the second case. Those who investigate individuals
such as economies and capital markets are prone to collect data movies largely
because individuals are so inclusive and unique that sampling of individuals
has been largely precluded and often is considered irrelevant. If such
investigators are going to continue collecting data, the data almost necessarily must be
collected repeatedly. Investigators of unique systems have to accept
individuality and try to account for time by generalizing and seeking patterns
over time.
Collection of data movies is not sufficient for understanding
complex adaptive systems. After all, data swamps include lots of data about
economies and capital markets. Scientific understanding does not just flow from
data movies, especially when they have many variables.
This introduces the primary problem that hinders our
understanding of complex adaptive systems and keeps us in data swamps. This is
the software problem. Current software does not measure interactions, temporal
contingencies, or longitudinal associations. If economists and investors cannot
do a better job with data movies given the financial incentives for better
prediction, why should investigators such as biologists collect data movies?
This has been a good reason to avoid intensive data
collection designs. It has been a good reason not to collect data movies -
until now.
3. Getting Out of Data Swamps
Investigators can get out of data swamps by taking two major
steps. The first step is to collect more data movies, preferably under conditions
of experimental control. Fortunately, our capability to collect data movies is
growing rapidly. Microarrays allow collection of data on thousands of variables
at each time. Technologies such as functional brain imaging and Web-enabled
monitoring devices are increasing the collection of time series data by orders
of magnitude.
Data movies should be collected using time series experimental designs whenever
feasible. Such designs vary independent variables over time for individuals and
preferably randomize different levels of independent variables to different
periods of time for the same individual. Such randomization contrasts with
randomization of individuals to different treatment groups. Both forms of
randomization can be used together when it is feasible and desirable to use
samples of individuals to make inferences about populations.
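The two forms of randomization can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the double randomization design described in the patent: the function names, the treatment labels, the number of periods, and the way the two forms are combined are all hypothetical.

```python
import random


def randomize_periods(levels, periods_per_level, rng):
    """Randomize treatment levels to time periods for ONE individual.

    Each level appears periods_per_level times; only the order is random.
    """
    schedule = [level for level in levels for _ in range(periods_per_level)]
    rng.shuffle(schedule)
    return schedule


def randomize_individuals(individuals, groups, rng):
    """Randomize individuals to treatment groups of near-equal size."""
    shuffled = list(individuals)
    rng.shuffle(shuffled)
    return {person: groups[i % len(groups)] for i, person in enumerate(shuffled)}


rng = random.Random(2004)  # fixed seed so the illustration is reproducible

# Hypothetical labels: two drugs, each studied at three dose levels.
group_of = randomize_individuals(["P1", "P2", "P3", "P4"], ["drug A", "drug B"], rng)

# Both forms together (an assumed combination): every individual also gets a
# private random order of dose levels over 12 periods within the assigned group.
design = {
    person: (group, randomize_periods(["none", "low", "high"], 4, rng))
    for person, group in group_of.items()
}

for person in sorted(design):
    group, schedule = design[person]
    print(person, group, schedule)
```

Randomizing levels to periods within each individual is what makes each data movie an experiment in its own right; randomizing individuals to groups is what supports inferences from the sample to a population.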
Experimental control and randomization would help assure
that values of the measures of interaction obtained with DataSpeaks Interactions®
are valid measures of causal interactions. DataSpeaks Interactions® is
exceptionally well-suited to evaluate the temporal criterion of causal
interactions with or without experimental control and randomization.
Measures cannot be valid unless they are reliable. There
are two major aspects of reliability when considering measures obtained with
DataSpeaks Interactions®. In general, the reliability of the new
measures of interaction can be improved by collecting data from more repeated
measurements. This first aspect of reliability works in a manner somewhat
analogous to how large samples increase statistical power. More repeated
measurements help overcome unreliability of measurement in data that are
processed with DataSpeaks Interactions®.
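The first aspect of reliability can be illustrated with a small simulation. The sketch below does not implement DataSpeaks Interactions® or MQALA; it uses an ordinary Pearson correlation between dose and response as a stand-in measure, and the effect size, noise level, and numbers of measurements are arbitrary assumptions. It shows only the general point that a measure computed from more repeated measurements on one individual varies less from one simulated data movie to the next.

```python
import math
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation, used here only as a simple stand-in measure."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_individual(n, rng, effect=0.5, noise=1.0):
    """Simulate n repeated measurements for one individual.

    Dose is 0 or 1 on each occasion (balanced, in random order); the
    response is the dose effect plus random measurement error.
    """
    dose = [0.0] * (n // 2) + [1.0] * (n - n // 2)
    rng.shuffle(dose)
    response = [effect * d + rng.gauss(0.0, noise) for d in dose]
    return dose, response


def spread_of_estimates(n, replicates, rng):
    """Standard deviation of the stand-in measure across simulated series."""
    estimates = [pearson(*simulate_individual(n, rng)) for _ in range(replicates)]
    return statistics.stdev(estimates)


rng = random.Random(1)  # fixed seed so the illustration is reproducible
spread_20 = spread_of_estimates(20, 200, rng)
spread_200 = spread_of_estimates(200, 200, rng)
print(f"spread with 20 measurements:  {spread_20:.3f}")
print(f"spread with 200 measurements: {spread_200:.3f}")
```

The smaller spread for the longer series is the sense in which more repeated measurements help overcome unreliability of measurement.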
The second aspect involves the reliability of computation. Given
the input data and a scoring protocol, measures obtained with DataSpeaks
Interactions® are as reliable as computation.
Collecting data movies can be counterproductive without DataSpeaks
Interactions® because one data movie has more data than one data
snapshot with the same number of variables, and more unprocessed data only deepens a data swamp.
3.1. Developing Data Movies
The second major step for getting out of data swamps is to solve
the software problem by processing data movies with DataSpeaks
Interactions®. DataSpeaks Interactions® can be
viewed as a new category of software - computational measurement software - to
“develop” data movies, which is described more fully in the patents section. Development
consists of measuring patterns of interaction that describe how complex
adaptive systems work, change, and adapt. Development makes data movies useful.
DataSpeaks Interactions® is the heretofore
missing step for using computation to benefit from the “information advantage” of
data movies as compared to data snapshots. DataSpeaks Interactions®
enables more and better use of data movies. Fortunately, computing
infrastructure is beginning to have sufficient power to develop data movies of
complex adaptive systems.
Appendix A is a brief primer on how DataSpeaks
Interactions® works.
3.2. Benefits of Developing Data Movies
An old maxim is that if one wants to investigate something
scientifically, measure it. DataSpeaks Interactions® measures
the interactions that describe the mechanisms by which complex systems work and
adapt as well as how these mechanisms change. Thus, in a very real way,
DataSpeaks Interactions® helps enable systems science. DataSpeaks
Interactions® provides operational definitions of “interaction,” an
ill-defined but increasingly used concept in science. The value of DataSpeaks
Interactions® rests on one of the very foundations of science, which
is measurement.
Measurement also enables visualization. Values of measures
can be graphed and visualized in various ways. Visualization aids understanding.
By developing data movies, DataSpeaks Interactions®
makes data and interactions more visual. An old proverb is that a picture
is worth a thousand words. Similarly, a visual display can do more to aid
understanding than a computer memory full of data.
To the best of my knowledge, no one else has ever measured
and visualized interactions between variables over time for individual systems
as functions of the relevant analysis parameters. This statement is supported
by the whole long process that led to my patents.
Development of data movies with DataSpeaks Interactions®
appears to be a critical missing step to help understand many complex
adaptive systems. The 2003 Nobel Memorial Prize in Economic Sciences went to
Robert F. Engle and Clive W. J. Granger for their methods of analyzing economic
time series: Engle for time-varying volatility and Granger for common trends
(cointegration). This award speaks to the growing recognition of the importance
of methods for processing time series data. However, such methods remain in
their infancy, which is evidenced by the depth of controversy about what time
series data say about economic policies and investment decisions.
Visualization of economic and capital market data movies
still is limited primarily to showing trends. When different trends are shown
side by side, people are largely left to form subjective impressions about how different
variables interact in ways that might be predictive. The task of forming
reproducible subjective impressions, both within and across individual people, becomes
enormously difficult as the number of variables increases. Demonstration 2 illustrates the measurement of
interactions between economic time series.
To appreciate the difficulty of visualizing predictive
interactions, consider all the economic and financial variables that are
reported in your daily newspaper. Imagine an enormous chart showing trends for hundreds
of variables over a prolonged period of time. Imagine trying to understand this chart. No wonder different
investigators - and individual investigators at different times - often arrive
at different conclusions and make different predictions from the same data. They
often focus on different parts of the entire system, draw different conclusions
and make different predictions. And so people long for one-handed economists
and suffer the consequences of polarizing disputes about economic policies. DataSpeaks
Interactions® measures interactions by computation and yields
dimensional measures of direction and degree. These measures are potential
antidotes to excessively polarized rhetoric.
Mathematical modeling may be viewed as an alternative to
forming subjective impressions about predictive interactions. Mathematical
models can be extremely valuable. One major problem is that such models currently
are formed without measuring the interactions that describe mechanisms by which
complex systems work and adapt as well as how these mechanisms change. Thus the
models are subject to the limitations of data processing methods that do not
adequately account for time and individuality as well as the limitations of
subjective impressions of those who model. DataSpeaks Interactions® can
help inform the development of mathematical models, including models used in
computational biology and economics.
DataSpeaks Interactions® measures interactions by
computation. Computation is superior to subjective impressions because results
can be obtained with transparent and reproducible procedures that can be
expressed in protocols and applied to large complicated databases with many
variables. Although measurement of interactions for large data movies or sets of
data movies will remain a major task, DataSpeaks Interactions®
provides many new options for addressing such tasks systematically with the
power of computing. Many data processing strategy options can be automated to
seek predictive patterns with little human intervention. Application of
DataSpeaks Interactions® to data movies will help get us out of data
swamps.
3.3. Data Integration Is Not Sufficient
Biological systems have many parts and subsystems. We need
to understand the mechanisms by which all these parts and subsystems work,
change, and adapt together. This has been called “integrative biology.” A
recent Google search on this phrase yielded 35,000 hits.
This two-step strategy for getting out of data swamps -
collecting more data movies and developing them with DataSpeaks Interactions® -
supplements a prevailing trend for understanding systems, which is to integrate
data of many types from different sources.
Data integration is important for understanding systems. Data
integration does create new opportunities for scientific insight. But data
integration is not integrative biology. In order to get from data to knowledge
and understanding more efficiently, we need the right kind of data (more data
movies) and a means of processing the data (DataSpeaks Interactions®)
that is better because it measures and visualizes interactions to elucidate
mechanisms.
Data collection and communication infrastructures are being
developed that are fast, widely accessible, and have huge memories. Now we need
better software to process the data so that investigators do not have to do so
much of the intellectual
heavy lifting. DataSpeaks Interactions® has great
potential to capitalize on this opportunity.
Anticipate both that collections of data snapshots will be largely
outmoded by collections of data
movies and that those data movies will be processed primarily by
DataSpeaks Interactions®.
4. Patents
Two U.S. patents have been issued to me, Founder of
DataSpeaks, Inc. Foreign patents are pending. These patents claim features,
applications, and uses of the methodology embodied by DataSpeaks Interactions®.
Patent 6,317,700 - “Computational Method and System to Perform Empirical Induction” - was issued on
November 13, 2001. All 104 claims that were
sought were approved. My understanding is that this can be characterized as a
broad foundational patent.
I use the term “empirical induction” to describe procedures
for drawing generalized conclusions and making predictions from data. My
patents and this Web site are limited primarily to computational methods of
empirical induction - methods that can be implemented as computer algorithms
and performed by computers. Computer algorithms for empirical induction can be
used to draw generalized conclusions and make predictions from data that
records experience.
“Empirical induction” is intended to be a broad and growing
category that encompasses a number of different methods and algorithms.
Computational methods of empirical induction that have been embodied in various
ways by software include the statistical method, neural networks, genetic
algorithms, cellular automata, and chaos theory. These include artificial
systems, inspired by biology, whose inner workings and results can appear to be
as indeterminate as those of real systems, which they model.
Patent
6,317,700 claims a new computational method of empirical induction that is
called the Method for the Quantitative Analysis of Longitudinal
Associations or MQALA. “Longitudinal associations” is another way to describe
interactions or temporal contingencies, both causal and non-causal.
MQALA can be described as a suite of computational tools
specifically designed to measure, discover, analyze, synthesize, describe, and
visualize patterns of temporal contingency in data movies for individual complex adaptive
systems. Such patterns include mechanisms by which individual complex systems work, change, and adapt.
MQALA is a computational measurement method of empirical
induction. This exemplifies a computational measurement method: density equals
the mass of a substance per unit volume. Given two variables, mass and volume,
Archimedes essentially discovered or invented a mathematical method to compute
values of a new measure, density. Similarly, given data for two variables or
sets of variables, I have discovered or invented a mathematical method or
algorithm to compute values of new measures of interaction over time between variables
and sets of variables for individual systems.
The results of both of these measurement procedures are
descriptive rather than inferential. Measures of density describe a relatively
static property of materials. Measures of interaction describe dynamic aspects
of systems. Compared to the method or algorithm for measuring density, MQALA is
substantially more complex. But MQALA is a computational measurement method
nonetheless.
Measures often are useful. Measures of density have proven
useful for problems such as those involving buoyancy and for the classification
and identification of materials. For example, Archimedes developed the concept
of density while trying to determine if a crown was made of pure gold. I invented what came to be MQALA while trying to analyze
health diary data.
The concept of density and Archimedes’ measurement method
have proven to be useful in science and practical affairs. Similarly, I
anticipate that the concept of interaction as operationally defined by MQALA
and embodied by DataSpeaks Interactions® will withstand the test of
time.
Measures of interaction obtained with DataSpeaks
Interactions® are useful in that they are quantitative conclusions,
generalized over time ordered data, about systems. To illustrate, consider the N-of-1 clinical trial for the “drug for blood pressure” example
that is presented in Appendix A.
The second part of this example used data for drug dose and 20 health variables
collected daily over 100 days. Assume that the resulting overall benefit/harm score
was large and positive. Such a score quantifies the conclusion, generalized
over all the data, that the drug was beneficial.
Generalized conclusions of this type can be used to help make
predictions that can be acted on accordingly. For example, the patient
investigated, knowing the results of her N-of-1 clinical trial
were as just described, could take the drug on the 101st day with
some confidence that the drug would be beneficial. Data from the 101st
day could either strengthen or weaken this conclusion. Data could be collected
daily and the growing mass of data could be analyzed daily to monitor any change
in response.
As illustrated in Appendix A and demonstrated in Appendix B, DataSpeaks Interactions® can be
used together with the statistical method to
extend generalizations from the individuals
actually investigated to populations that samples of individuals represent.
U.S. Patent 6,516,288 - “Method and System to Construct Action Coordination Profiles” - was issued on February 4, 2003. All 111
claims that were sought were approved. This patent extends the claims of 6,317,700.
The concept of action coordination profiles was inspired by
exposure to motion capture technology and by the analogy of coordinated motion.
Basically, sets of action coordination profiles show how every variable or selected
set of variables in a data movie for a system interacts with every other
variable or selected set of variables for that system. Profiles could be used
to characterize different patterns of coordinated activity such as walk, trot,
canter, and gallop for horses. In addition, action coordination profiles can be
used to measure the amount and strength of evidence for coordinated motion. For
example, I would anticipate that a series of repeated golf drives by an expert
golfer would be more coordinated than the same series for a beginner.
Patent
6,516,288 also effectively extends the analogy of coordinated motion to
additional types of action and additional types of systems. As examples, the
actions can be physical, chemical, biological, behavioral, mental, or social. The
systems can be objects of investigation such as brains, organisms, patients,
economies, investment markets, populations, machines, and processes. In
addition, the concept of coordinated action can be extended to how two or more
individual systems interact.
Issued patents can help provide
a competitive advantage in business.
5. Opportunities and Challenges of Being First
This call for leadership involves an opportunity to be first
with a major new category of software that has great potential to improve
economic productivity and human welfare.
This opportunity calls for great leadership because it
involves great challenges. Expect rewards to be commensurate with the
opportunity and performance.
First, here is some of the
evidence that DataSpeaks is first in its category.
“Temporal contingency
analysis” may be the best way to describe what MQALA does as described in Appendix A and demonstrated
in Appendix B. MQALA is the methodology embodied
by DataSpeaks Interactions®. However, a recent Google search on the
phrase “temporal contingency analysis” yielded zero hits. This result is surprising
especially because temporal contingencies appear to be the primary way that
people and other organisms learn directly from experience - what follows what. Temporal contingencies seem to
capture the essence of what has been called the school of hard knocks. Temporal
contingencies also are the source of much scientific inspiration and invention.
“Computational measurement software”
may be the best way to describe the new category of software that DataSpeaks
Interactions® represents. However, a recent Google search on the
phrase “computational measurement software” also yielded zero hits. This result
is surprising because science and many practical applications of scientific
understanding are grounded on measurement. In addition, values of many measures
are derived by computation from other measures.
I followed many such descriptors for years before and after
Google existed and before and after filing for my patents. I find it amazing
that this opportunity still remains. Perhaps the scientific quest for the immutable
laws of nature is deterring us from investigating anything as ephemeral and
apparently inconsequential as the temporal contingencies that describe how complex
systems work, change, and adapt.
Einstein insisted that nature does not play with dice. But
perhaps dice are a part of nature, natural and man-made. Contingency and chance
might be important after all. E=mc² was instrumental in the
development of the atomic bomb. But we may need to recognize the importance of
contingencies if we are to understand the behavior of those who might use the
bomb.
Some business people fear that being first means that there
is no market. The potential market for DataSpeaks Interactions®
comprises all those who could benefit from better understanding of complex
adaptive systems and from products and services enabled by such understanding.
This definition of a market is too broad to help develop the
financials of a business plan. But it is helpful in dealing with the
competition and overcoming specific sources of resistance.
5.1. Primary Competition
Identifying and understanding the competition has proven to
be more difficult for me than discovering, inventing, and patenting MQALA. But identifying and understanding
the competition, provocative and disturbing as it might be for some, appears to
be crucial for progress and success.
Basically, the problem of identifying and understanding the
competition is simple, once it is recognized. Statisticians rule, at least
where it counts. But the statistical method is not well suited to account
for individuality and time as required to efficiently understand mechanisms. Overreliance
on the statistical method, perhaps more than anything else, is what is
making it difficult for people to benefit from better understanding of complex
adaptive systems such as people, patients, organisms, brains, populations,
societies, economies, ecosystems, and productive processes.
A deeper understanding of the problem and challenge will
make it easier to succeed. The competitive issues cut to the heart of
scientific methods for understanding nature, or more specifically, for understanding
complex adaptive systems. These issues call for greater study and dialog. Here
is some additional detail to help stimulate inquiry.
The statistical establishment is the primary competition for
DataSpeaks
Interactions®. Statisticians are the primary source of
resistance to MQALA. In addition, everyone who defers too much to statisticians
on issues related to drawing generalized conclusions and making predictions
from data contributes to the problem of understanding complex adaptive systems.
This is not as bad as it may seem. The “statistical method,” as I have chosen to use this
descriptor here in the context of complex adaptive systems, is the best method
of empirical induction for what it does well - describing groups and making
inferences from representative samples of individuals to populations.
Inferential statistics is one great way to overcome measurement error by using
data for multiple individuals. But the statistical
method is not a way to account for individuality and time or to describe
mechanisms as illustrated with the analogy. I know of nothing wrong with the statistical method for what it does well.
Unlike the statistical method,
MQALA is well suited to account for individuality and time as well as to describe
mechanisms. MQALA also is a great way to account for measurement error within
individuals. But MQALA has nothing to say about describing groups as
collections of separate individuals or about making inferences from samples of
individuals to populations. The two methods are entirely different. I am not
aware of any inherent conflict between these two methods. The two methods work
best for different types of data and for different types of problems.
Furthermore, MQALA and the statistical method often are complementary. MQALA provides measures of the mechanisms
by which individual systems work and adapt as well as how these mechanisms
change. MQALA also provides measures of benefit/harm. When obtained from two or
more individuals, values of these new measures can be analyzed statistically as
described in Appendix A
with an example that uses the randomized multiple N-of-1
clinical trial design. In addition, Appendix B demonstrates how values of a measure of
hormone interaction can be analyzed statistically.
Given all this, a fundamental part of our leadership challenge is to help
the world to distinguish between MQALA and the statistical method and to demonstrate optimal
uses for both methods. This challenge involves both establishment issues and
technical issues.
5.1.1. The Statistical Establishment
The statistical method is
well established. To appreciate this establishment, it helps to distinguish two
major components - statisticians themselves
and people who defer too much to statisticians.
Power is at stake.
5.1.1.1. Statisticians
Many statisticians are enshrined in departments of
statistics and biostatistics as well as informatics programs. Collectively, statisticians
have huge intellectual mass. The writings in some of their professional
journals are arcane. The mathematical formulations can make ordinary people
like me long for their native tongues. Perhaps the brightest minds can
elaborate the most complicated “solutions.”
Despite all this intellectual mass, I doubt if any
statistician can analyze a randomized multiple N-of-1
clinical trial as proposed in Appendix
A with anything as simple as a single-group t-test on the mean - that
is, without MQALA. This
clinical trial example is important because this design may be the most
productive and ethical experimental design to evaluate the benefit/harm of
treatments for the management or control of chronic disorders. This design
enables the new vision
of integrated clinical research and practice. In addition, Demonstration 1 of Appendix B demonstrates use
of the single-group t-test to analyze values of a measure of hormone interaction
from a sample of 6 ewes. And if MQALA makes statistical analyses so simple, how
many investigators really need statisticians for this and many other related tasks
involving mechanisms and benefit/harm - tasks that often are best addressed by developing data movies
with DataSpeaks
Interactions®?
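The single-group t-test on the mean mentioned above can be sketched as follows. The scores are made-up placeholders, not values from Appendix B, and `one_sample_t` is a hypothetical helper name. The sketch assumes that each individual contributes one summary score from his or her own data movie and that the null hypothesis is a population mean of zero.

```python
import math
import statistics


def one_sample_t(scores, null_mean=0.0):
    """Single-group t-test on the mean.

    Returns the t statistic and its degrees of freedom. Each score is
    assumed to summarize one individual's own time ordered data.
    """
    n = len(scores)
    mean = statistics.fmean(scores)
    standard_error = statistics.stdev(scores) / math.sqrt(n)
    return (mean - null_mean) / standard_error, n - 1


# Hypothetical scores for a sample of six individuals (placeholders only).
scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, df = one_sample_t(scores)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The t statistic would then be compared with a t distribution on the stated degrees of freedom in the usual way. The statistical step is this simple only because the time ordered data for each individual have already been reduced to a single score.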
Excellent data movies exist for capital markets. Do you see much evidence that
statisticians do much better than the rest of us with their investment
portfolios?
The demands and limitations of the statistical method itself raise additional
important power issues. This can be illustrated in the context of clinical
trials. The statistical method generally requires rather large samples of
patients and study designs that cannot optimize the care of the individual
patients who become “subjects.” Such trials often require large organizational
support, big budgets and, quite often, government regulation. This fosters big
establishments. The statistical method has done little to help empower individuals
with N-of-1 clinical trials, which have been identified as the gold standard
for evidence-based medicine under many circumstances (see, for example, http://www.cche.net/usersguides/applying.asp).
Statisticians largely have a lock on what gets published and
funded on matters involving empirical induction. They help make key
decisions about approving drugs. These decisions can ultimately benefit us or harm
us, kill us or save us. The power of statisticians may have served us well as
we started to move from anecdote to science in disciplines such as medicine.
Peer review by statisticians can continue to serve us well on issues of
describing groups, sampling populations, and making inferences about
populations as required to guide public policies. But statisticians should not
be allowed to block other methods of empirical induction such as methods that
can help explicate mechanisms of complex adaptive systems.
Statisticians have grown accustomed to their prerogatives. I
learned that at least one noted statistician resented my patents that put
information into the public domain without peer review. Patents primarily just
have to be innovative and useful. We should expect some statisticians to resent
the fact that I have published this document to the Web.
Although I am not a statistician, statistics has had a
formative influence on my life. I taught statistics to undergraduates as a
psychology graduate assistant. I worked with a prominent statistician at
Dartmouth on a project to evaluate the Surveillance, Epidemiology and End
Results (SEER) program of the National Cancer Institute. He threw my farewell
party and went on to cofound the journal Statistics
in Medicine. I took post-doctoral training in a mental health statistics
program at the School of Public Health at the University of North Carolina.
More recently, I helped teach quantitative methods to medical school residents.
I have known statisticians as colleagues and friends. All this made it more
difficult for me to recognize the statistical establishment as the primary
competition for DataSpeaks, Inc.
I have presented MQALA to a number of statisticians. They
see probabilities and some simple formulae that also are used in statistics.
Statisticians conclude that MQALA is not good statistics. The statisticians are
right, at least on one of these two counts. MQALA is good. But MQALA is not
good statistics. MQALA is not statistics at all.
I have been referred to statistics textbooks. One
pharmaceutical industry statistician ran extensive simulations with what came
to be MQALA. We gave two presentations and the abstracts were published. These
abstracts are cited in Patent
6,317,700. Although the simulations were supportive of the new
methodology, the presentations and abstracts did not elicit much response. The
statistician left industry and is pursuing other interests. I am grateful for
his efforts.
My experience is that statisticians generally are bright and
well intentioned people. I suspect that they resist MQALA primarily because it
falls outside the lens of their experience. Simple solutions
to major problems can elude leading minds for a long time. This has precedent.
Here is a classic example.
Surgical death rates in a Viennese
hospital apparently were over 50%. Ignaz Semmelweis and others advocated hand
washing, a practice of particular value for thought leaders who performed autopsies
and went from autopsy rooms to surgical and delivery rooms without washing
their hands. The simple hand washing solution was resisted, even ridiculed,
apparently because it did not make sense. This was before the discovery of
germs and the germ theory of disease. The value of hand washing was outside the
lens of experience for leaders in
power at the time.
Microscopes helped investigators see germs. DataSpeaks
Interactions® will help investigators see interactions in a way that
interactions never have been seen before. More than a century after the germ
theory of disease, investigators still are trying to understand the
interactions that describe germs - particularly how they act as agents to cause
disease and respond to agents that might cure disease. Measuring and seeing
interactions should speed progress in practical scientific understanding. But
it will take time for investigators to appreciate what they see even after they
see interactions measured.
Appendix B
includes three demonstrations that show how interactions are measured with
DataSpeaks Interactions®.
Statisticians have assumed the mantle of responsibility for
protecting large swaths of the scientific community from error. But they view
their responsibility through a narrow lens. As a result and despite providing
valuable services, statisticians have become a major impediment to progress.
5.1.1.2. Deference toward Statisticians
The second major component of the statistical establishment is
the excessive deference that most people show toward statisticians. The combination
of the obscurity of their arcane writings, daunting difficulty of the
mathematical formulations and power of statisticians over the lives of
scientists and other people seems to inspire awe and deference. Statisticians
can seem to avoid engaging their supplicants by maintaining a quiet
inscrutability.
Leaders in various fields have referred me to statisticians
almost as soon as they understand that my work has something to do
with analyzing data. At times I feel dismissed because my Ph.D. is not in
statistics. I’ve had successful people throw up their hands and say “you are way
above me” or “we are not in the same ballpark” almost as soon as my technology
reminds them of statistics. Such apparent manifestations of deference have
happened for a method, MQALA, which is basically as simple as defining discrete
events and using 2 x 2 contingency tables to measure interactions or temporal
contingencies (See Appendix A).
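The counting step behind a 2 x 2 contingency table can be sketched in a few lines. This is only an illustration under stated assumptions, not the MQALA scoring procedure described in Appendix A and the patents: the function name, the optional lag parameter, and the example series are all hypothetical.

```python
def contingency_table(x_events, y_events, lag=0):
    """Tally a 2 x 2 table for two equally long 0/1 event series.

    x_events[t] is paired with y_events[t + lag]. The lag parameter is a
    hypothetical convenience for "what follows what", not taken from the source.
    Returns (a, b, c, d):
      a = X present, Y present    b = X present, Y absent
      c = X absent,  Y present    d = X absent,  Y absent
    """
    if len(x_events) != len(y_events):
        raise ValueError("series must have the same length")
    if lag < 0:
        raise ValueError("lag must be zero or positive")
    a = b = c = d = 0
    for t in range(len(x_events) - lag):
        x, y = x_events[t], y_events[t + lag]
        if x and y:
            a += 1
        elif x:
            b += 1
        elif y:
            c += 1
        else:
            d += 1
    return a, b, c, d


# Hypothetical data for one individual over eight measurement occasions.
drug_event = [0, 1, 1, 0, 1, 0, 0, 1]      # 1 = dose at or above some level
relief_event = [0, 1, 1, 0, 1, 0, 1, 1]    # 1 = symptom relief observed

same_occasion = contingency_table(drug_event, relief_event)
next_occasion = contingency_table(drug_event, relief_event, lag=1)
print(same_occasion, next_occasion)
```

The point of the sketch is only that the raw material is a count of co-occurring events for one individual over time, which is a far simpler starting point than the deference described above would suggest.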
Some experts who work with data seem to have found this simplicity insulting, off-putting,
or below them. The ruling presumption seems to be that longstanding problems must
have difficult solutions. But simple solutions can be best, even if they
require changing professional habits.
Some people seem to assume that the best way to solve longstanding
and important problems involving the use of data must be to do more of what
statisticians have been doing for decades. This appears to be part of why some
life science centers are turning more to higher mathematics. Although higher
mathematics often is helpful, the more immediate and simple solution to large
and important classes of unsolved problems appears to involve the computation
of measures of interaction and temporal contingency before statistical analyses.
Appendix A describes
this simple solution in the context of measuring benefit/harm in clinical
trials. Appendix B
demonstrates this simple solution with an interaction between hormones and
connectivity in the brain. These appendices also illustrate how measurement of
interactions first can simplify mathematical and statistical treatment of data.
Excessive deference toward statisticians has been costly.
Great minds that invent microarrays, functional imaging machines, and Web
enabled monitoring devices often defer or delegate to statisticians for data
processing issues outside of describing groups, sampling individuals, and
making inferences from samples to populations. As a result, much of the value
of their discoveries and inventions has yet to be realized.
The statistical method
became established before MQALA was invented. My
impression is that either method could have been established first. If MQALA
had been first, statisticians might be trying to break into the MQALA
establishment.
The following might explain why the statistical method was
established first. People have been known to seek certainty and absolute truth.
Scientists have been known to seek the immutable laws of nature in accord with
deterministic worldviews. Statistics in this context might be viewed as an
attempt to separate mathematical truth from error. This might be the basis for
much of the lofty position afforded to statisticians. In contrast, MQALA just
measures temporal contingencies that describe the mechanisms by which complex
systems work, change, and adapt. MQALA also appears to
support a worldview
that can hold individuals accountable as responsible agents that deserve to be
rewarded, punished, and honored.
Perhaps some leaders are tired of deferring and
delegating their crown jewels to statisticians. Besides, deference to authority
is not very scientific.
5.1.2. MQALA and the Statistical Method - Technical Differentiation
Given apparent difficulties in distinguishing different
computational methods of empirical
induction and their varied uses, it may help to differentiate
methods. Although there are more than two methods that are relevant to
empirical induction, this section focuses on MQALA and its primary competition, the statistical method or statistics.
The differentiation between MQALA and the statistical method
is not so much a matter of strengths and weaknesses as a matter of different
methods for different types of problems and different types of data. I present these
comparisons and differences in bold broad strokes. These statements call for additional
investigation. Many of these statements make similar points in different ways.
Statistics is about groups and populations. MQALA is about
individual systems. Individual systems include populations investigated as
entire populations.
Statistics works best for time-invariant systems. MQALA can
be used to measure how interactions change as systems change over time.
Being largely timeless, statistics struggles to account for mechanisms
of internal function, response, agency, and
adaptation. By accounting for time, MQALA is well suited to account for the
mechanisms of function, response, agency, and adaptation as these were
distinguished above.
Statistical measures of correlation were designed for and
work best for cross-sectional data. Measures obtained from MQALA were designed
for and work best for multiple time series data.
Statistics works best for linear relationships. MQALA
appears to work for both linear and nonlinear relationships. MQALA informs
users about the form of relationships between measures of interaction and
various analysis parameters.
Statistics uses “interaction” to describe non-additive
relationships between independent variables in time invariant systems. MQALA uses
“interaction” to describe relationships between one or more independent variables
and one or more dependent variables in systems that can change over time. I
sometimes use “dynamic interaction” to distinguish the way MQALA uses
“interaction” from the way that statistics uses “interaction.”
Statistics works best with homogeneous groups - an important
reason why clones and inbred strains of organisms are highly prized from a
methodological perspective. In contrast, MQALA works well for individuals that may
be unique and can be investigated over time.
Statistics works best for hypothesis driven science. MQALA facilitates
data driven discovery science. MQALA also expands the scope of hypothesis
driven science by enabling statistical tests of measured dynamic interactions.
Statistics works best when there are few variables. Thus,
there is much emphasis on variable reduction by
investigators of complex systems. MQALA works well for many variables, although
demand for computational resources can increase rapidly. In addition, MQALA can
be used to help accomplish variable reduction. Appendix A describes an example that reduced one independent
variable, drug dose, and 20 dependent health variables to one variable that
quantified apparent benefit/harm.
Statistics works best for analyses involving systems’ parts.
MQALA works well for analyses of systems’ parts and syntheses of how many parts
work together to form more or less coordinated systems.
Statistics works best for cross-sectional data. Thus, for
example, clinical trials often are analyzed with pre- and post-treatment difference
scores in health variables. This practice effectively squeezes out time to form
a single timeless number for each patient. MQALA works for longitudinal data
with two or more variables, preferably time series data. MQALA was not intended
to work at all for cross-sectional data.
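A small sketch may make the "squeezing out time" point concrete. The two series below are invented for illustration, and the function name is hypothetical. Both patients receive the same pre/post difference score even though their courses over time differ completely.

```python
def difference_score(series):
    """Post-treatment value minus pre-treatment value (first and last occasions)."""
    return series[-1] - series[0]


# Hypothetical symptom severity on five occasions (lower is better).
patient_a = [8, 7, 6, 5, 4]   # steady improvement
patient_b = [8, 3, 9, 2, 4]   # erratic course

score_a = difference_score(patient_a)
score_b = difference_score(patient_b)
print(score_a, score_b)
```

Any analysis that starts from these two scores alone cannot distinguish the two patients, whereas the full series clearly can.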
Statistics works best with experimental designs for groups. MQALA
works with time
series experimental designs for individuals. Often it is productive
to combine both types of design.
Prevailing inferential statistical
methods work best for rejecting null hypotheses. MQALA measures the amount
and strength of evidence for positive and negative interactions over time
between independent and dependent variables. Although MQALA uses contingency
tables and hypergeometric probabilities to compute values of measures, the
measurement process by itself does not test hypotheses.
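For readers unfamiliar with hypergeometric probabilities, the following sketch computes the probability of one particular 2 x 2 table given its row and column totals. It is a standard textbook formula, shown here as an assumption about the kind of building block involved. How MQALA combines such probabilities into scores is described in Appendix A and is not reproduced here. The function name and the example counts are hypothetical.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of a 2 x 2 table with fixed margins.

    Rows are (a, b) and (c, d); the column total for the first column is a + c.
    """
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


# Hypothetical table: 4 occasions with both events, 0 with X only,
# 1 with Y only, 3 with neither.
p_observed = table_probability(4, 0, 1, 3)

# Probabilities over every table sharing the same margins sum to one.
row1, row2, col1 = 4, 4, 5
p_total = sum(
    table_probability(a, row1 - a, col1 - a, row2 - (col1 - a))
    for a in range(max(0, col1 - row2), min(row1, col1) + 1)
)
print(p_observed, p_total)
```

Computing a probability in this way produces a number attached to one table. Consistent with the paragraph above, nothing in the calculation is by itself a test of a hypothesis.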
Statistics works best when individuals are randomized to
different treatment groups and treatments remain fixed for each individual. In
contrast, with MQALA, levels of independent variables must change over time for
individuals in order to get nonzero values for measures of dynamic interaction
or temporal contingency. In other words, the statistical
method often works with categorical independent variables while the
independent variables for MQALA can be dimensional variables with two or more
levels that must change over time to get non-zero interaction scores.
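The need for independent variables to change over time follows from the arithmetic of any 2 x 2 table. The sketch below uses the cross-product difference (a times d minus b times c) as a stand-in measure of association. It is not the MQALA score, and the function name and data are hypothetical. When the independent event never changes, one row of the table is empty and the measure is necessarily zero.

```python
def cross_product_difference(x_events, y_events):
    """Return a*d - b*c for the 2 x 2 table of two 0/1 event series."""
    pairs = list(zip(x_events, y_events))
    a = sum(1 for x, y in pairs if x and y)
    b = sum(1 for x, y in pairs if x and not y)
    c = sum(1 for x, y in pairs if not x and y)
    d = sum(1 for x, y in pairs if not x and not y)
    return a * d - b * c


outcome = [0, 1, 1, 0, 1, 0]        # hypothetical dependent event
fixed_dose = [1, 1, 1, 1, 1, 1]     # treatment never changes for this individual
varying_dose = [0, 1, 1, 0, 1, 0]   # treatment changes over time

fixed_result = cross_product_difference(fixed_dose, outcome)
varying_result = cross_product_difference(varying_dose, outcome)
print(fixed_result, varying_result)
```

This is why a design that fixes one treatment per individual, however well suited to group comparisons, leaves nothing to measure within the individual.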
Both MQALA and statistics are most apt to yield valid
results with experimental data. Both methods can be used with non-experimental
data. MQALA does a better job in evaluating the temporal criterion of causal and
other predictive interactions in non-experimental data.
MQALA is a measurement system that yields scores and values
of measures. Statistics is a method and system for analyzing the values of
measures.
Statistics often works with dimensional variables as
dimensional variables. In contrast, MQALA must convert series of values for
dimensional variables into sets of series for discrete events that can be
either present or absent on most measurement occasions. This has been described
before and
in Appendix A.
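One simple way to picture this conversion is to apply several cut points to a dimensional series, producing one present/absent series per cut point. The sketch below assumes "at or above the cut point" as the event definition. That rule, the function name, and the dose values are illustrative assumptions rather than the options documented in Appendix A.

```python
def to_event_series(values, cut_points):
    """Map a dimensional series to a dict of 0/1 event series, one per cut point.

    An event is scored present (1) when the value is at or above the cut point.
    """
    return {cut: [1 if v >= cut else 0 for v in values] for cut in cut_points}


# Hypothetical daily drug dose in mg for one individual.
dose_mg = [0, 10, 20, 20, 10, 0]
events = to_event_series(dose_mg, cut_points=[10, 20])

for cut, series in events.items():
    print(f"dose >= {cut} mg:", series)
```

A single dimensional variable therefore becomes a set of discrete event series, each of which can be present or absent on any measurement occasion.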
Statistics often uses measures of central tendency and
variability that work best for groups. MQALA computes measures of longitudinal
association, interaction, or temporal contingency for individuals.
MQALA works well in data mining for interactions in time
series data with two or more variables when the data are about individuals. The
statistical method works best in mining essentially
timeless data about multiple individuals.
The statistical method can be described as being good for
developing data snapshots of groups as in a census. MQALA can be described as
being good for developing data
movies of individuals including entire populations.
In summary,
the statistical method as currently
used often seeks to separate truth from error in a timeless and
changeless world of immutable laws where individuality is not
important. This is not the world of complex adaptive systems.
MQALA accounts for individuality and time in a real world that
changes over time, where temporal contingencies help shape complex
adaptive systems, and where individuals can be held accountable
as responsible agents.
MQALA and the statistical method
are distinct and often complementary. The two methods
often should be integrated to help make statistics relevant when science also
needs to account for time and individuality in a world that includes complex
adaptive systems.
5.2. Stephen Wolfram and Forrest Gump
I will venture a few comments about another
potential competitor or alternative to MQALA in the space occupied by computational methods
relevant to understanding nature - namely cellular automata and the Principle
of Computational Equivalence as enunciated by Stephen Wolfram.
Stephen Wolfram - genius, entrepreneur, and primary creator
of Mathematica,
apparently the world’s leading scientific software system for technical
computing and symbolic programming - authored A
New Kind of Science. The title of this tome, the high status of
its author, and the phenomenon generated by publication of this book suggest some
recognition of need to advance scientific methodology.
Wolfram’s book includes nearly a thousand original pictures
that allow, as described on the book’s dust jacket, “scientists and
non-scientists alike to participate in what promises to be a major intellectual
revolution.” Many of these pictures show patterns generated in accord with
rules that determine whether or not particular cells should be black or white.
Similarly, MQALA deals with discrete events that are considered to be either present
or absent as described in Appendix
A. Perhaps there is some deep connection between MQALA and cellular
automata as the latter is enunciated by Wolfram. I invite such explorations.
Many of Wolfram’s pictures show patterns formed when the
consequences of rules or programs unfold over steps. Some of these patterns are
strikingly similar to patterns observed in systems found in nature. Some
patterns reveal randomness. Simple programs can produce complexity.
It appears as if these rules or programs are being presented
as alternatives to immutable laws of nature as sought by some more conventional
scientists. If so, the scientific quest according to Wolfram would appear to be
discovery of rules or programs that account for the unfolding of nature. It
appears as if such discoveries would occur primarily in the heads of
scientists.
In contrast, MQALA does not start with, depend on, or assume
that there are any rules, programs, theories or immutable laws of nature. MQALA
starts with time ordered data and asks the data to speak through software that
reveals patterns. MQALA uses computation to measure temporal contingencies to
facilitate understanding of complex adaptive systems. Perhaps randomness such
as that found by Wolfram helps make contingencies interesting because contingencies matter.
Perhaps the essence in the world of complex adaptive systems
is less like either rules or immutable laws of nature that merely unfold over
time in a determined and sometimes random way and more like Forrest Gump’s
sayings, “Shit happens” and “Life is like a box of chocolates: You never know what
you’re going to get.”
“Shit happens” seems to resonate with many people. A recent
Google search on this phrase yielded about 70,100 hits, heavily represented by
computer programmers characterizing different religions. DataSpeaks Interactions®
can be said to measure how shit happens with some regularity. This rather crude
description might be welcomed by those who are tired of deferring to the power
of the statistical
establishment. It can bring some fresh air to what some people may
see as the stuffy world of scientific formalisms.
Shit happened to me as I discovered MQALA step by step. Shit
happened when many other scientific discoveries resulted from serendipity.
Perhaps I deserve more credit for persistence and self-confidence in pursuing
MQALA than for discovering MQALA itself.
Perhaps it is time for us as a people to escape the fate of
determinism and the hopelessness of pure randomness by accepting the awesome responsibility
of measuring the temporal contingencies of nature and creating the future in
accord with this knowledge and a respect for fundamental human values.
Of the two methods, MQALA and cellular automata, MQALA appears
to be closer to data as it is being and can be collected now. In contrast to
being a new kind of science, MQALA can be described as back-to-basics science
where the basics are measurement, data, and experimental control. As such,
MQALA shows more promise for getting us out of data swamps.
The sections on scientific worldviews and responsible agency
provide more information related to Wolfram.
5.3. Departments of Empirical Induction
Different methods and algorithms for empirical induction work best
for different types of problems and with different types of data. It will be
increasingly important, given this growing diversity, to make relevant and useful
distinctions before acting.
One way to foster intelligent use of different methods for
drawing generalized conclusions and making predictions from data might be, as
suggested in Patent
6,317,700, to establish departments of empirical induction where
experts in MQALA, the
statistical method, neural networks, genetic algorithms, cellular automata,
chaos theory, etc. can thrash it out, foster new types of study designs,
develop intellectual mass, educate leaders, develop mathematical models and
theories of real systems, develop artificial systems, and advance science and
engineering. Such departments would have the potential to become new “centers
of calculation.”
Departments of empirical induction could replace departments
of statistics and biostatistics. The new name would be less prejudicial in
favor of a particular method that happened to be established first. The new
name would suggest openness to new possibilities. The new name would help users
recognize that they need to make relevant choices. The new departments would
help educate people who could keep up with and lead new and emerging ways of
doing science. Establishment of new departments of empirical induction calls
for strong leadership as described in the Call for Leadership and discussed in the section on leadership.
5.4. The Time is Right
The time is right for the advancement of MQALA as embodied in
DataSpeaks Interactions® software.
Interactions are widely discussed, even in our most
prestigious scientific journals. But they never seem to be effectively measured
over time for individuals. There is much surprisingly loose talk and writing about
protein-protein interactions, drug interactions, environmental interactions and
all sorts of interactions. A fundamental problem is that the concept of interaction
is not given a common, clear, specific, and objective scientific meaning, which
comes from concrete operational definitions and measurement procedures that can
be applied in many disciplines.
MQALA, as embodied in DataSpeaks Interactions®,
provides the required operational definitions. DataSpeaks Interactions®
actually measures interactions so that they can be investigated scientifically.
The same measures can be used to help solve many practical problems.
MQALA is a measurement system. Users can select many options
as they develop scoring protocols that are appropriate for specific
investigations and problems. No one scoring protocol is appropriate for all
data sets and problems. Appendix
A presents some of the scoring options. Appendix
B illustrates several specific scoring protocols.
The time is right in terms of the “omics” of biological
science - genomics, transcriptomics, proteomics, metabolomics, physiomics, etc.
DataSpeaks Interactions® can help add time, function, and
individuality to the omics by going beyond identification and characterization
of substances to describe how substances help form systems that work, change,
and adapt.
The time is right in terms of shifting paradigms of science
and growing ferment about scientific methods. I have already described an
apparent shift from hypothesis
driven science to data driven discovery science. I have also
commented about Wolfram’s
book, A New Kind of Science. MQALA
offers hope to bridge the fault lines that are developing among different
conceptions of science. MQALA can help build toward a new consensus about what
science is and does with respect to investigations of complex adaptive systems.
The time is right in that many leaders perceive great
opportunities to be where different disciplines such as biology, medicine, and
computer science converge. They are right. Leaders speak of collaboration, interdisciplinary studies, integration, and
convergence. MQALA and DataSpeaks Interactions® provide a common
methodology that can be applied to investigate how complex systems of many
types work, change, and adapt.
In addition, DataSpeaks Interactions® can be used
to investigate interactions between and among variables normally considered to
be subjects of different disciplines. One example is the neural control of behavior. As such DataSpeaks
Interactions® can foster interdisciplinary and collaborative investigations
and take advantage of these opportunities.
A common methodology is a great unifier. The market opportunity sections
indicate how DataSpeaks Interactions® can be applied to many
problems. As such, it can help unify many disciplines.
The time is right in terms of institutional responses to the ferment in
science and the feeling that science, despite all its breakthroughs, is not
living up to its potential to improve human welfare. As one example, the United
States National Institutes of Health recently announced the NIH Roadmap for Medical
Research (http://www.nih.gov/news/pr/sep2003/od-30.htm).
“With this theme, New Pathways to Discovery, the NIH Roadmap addresses the need
to understand complex biological systems. Future progress in medicine will
require quantitative knowledge about the many interconnected networks of
molecules that comprise cells and tissues, along with improved insights into
how these networks are regulated and interact with each other. Researchers
predict that more precise knowledge of the combination of molecular events that
lead to health or disease will help to revolutionize the practice of medicine
in the 21st century.”
“New Pathways to Discovery also sets out to build a better “toolbox” for
today’s biomedical researchers.” DataSpeaks Interactions® has the
potential to be the primary breakthrough in such a toolbox. Notably, the NIH
announcement does not seem to anticipate actually measuring the interactions
that help define biological systems and help describe how patients respond to
their environments, including treatments.
The Food and Drug Administration (FDA) of the United States also
is demonstrating institutional response to changing imperatives by issuing
draft guidelines for personalized medicine (http://www.fda.gov/bbs/topics/NEWS/2003/NEW00969.html).
These are intended to provide guidance on how “to individualize therapy by
predicting which individuals have a greater chance of benefit or risk -- thus
helping to maximize the effectiveness and safety of drugs. FDA believes that
pharmacogenomic testing can be smoothly integrated into drug development
processes.”
“This is FDA’s first step towards integration of this new
field into the process of demonstrating that new drugs are safe and effective…”
FDA is beginning to recognize the importance of individuality. DataSpeaks
Interactions® measures interactions, including the benefit/harm of
treatments, for individuals.
The Grand Challenges in Global Health initiative is another
response to changing imperatives (http://www.grandchallengesgh.org/ArDisplay.aspx?ID=29&SecID=302).
The Foundation for the National Institutes of Health (FNIH) and the Bill &
Melinda Gates Foundation sought to identify and fund “proposals for research on
these critical scientific and technological problems that, if solved, could
lead to important advances against diseases of the developing world.” Given
that one of the founders of Microsoft is primarily responsible for the impetus and
sponsorship of this Grand Challenge, it is both ironic and a rare opportunity
that the key to health improvement is better software, namely DataSpeaks
Interactions®.
My response
to the “Call for Ideas” of The Grand Challenges in Global Health initiative was
written with knowledge about how to meet this challenging idea.
The time is right when The New York Times recently identified
“Does Science Matter?” as number 1 in a list of 25 of the most provocative
questions facing science (http://www.nytimes.com/2003/11/11/science/11MATT.html).
There are serious challenges to science throughout much of the world. Perhaps
it is time for a methodology of science that accounts for the temporal
contingencies that describe and shape complex adaptive systems and appears to
support a worldview that can hold individuals accountable as responsible agents.
Some leaders question the importance of information
technology when they ask, “Does IT matter?” (http://itmatters.weblog.gartner.com/weblog/index.php?blogid=10).
At least one major pharmaceutical research and development facility is cutting
back on IT. Perhaps IT needs a new big idea to help regain relevance.
MQALA will help make science and IT matter because contingencies matter
as mentioned in the Call for Leadership
and described more fully in the responsible agency section.
The time is right in terms of political need. The United
States and various other countries appear to be highly polarized on many great issues
of our day including economic policy, environmental policy and the impact of
our foreign and military policies at home and abroad. Much of the vitriol that
may be tearing us apart can be traced to our failures to understand complex
adaptive systems. Science is not keeping up with demands for answers as stakes
increase. Political platforms end up taking positions on issues that no one
really understands. People get blamed rather than ignorance. Many leaders can
be held accountable for not doing more to advance scientific understanding.
New scientific methods that work to help people understand
systems and to create systems that work to achieve generally accepted goals could
help restore our trust in intelligence, rationality, and scientific evidence.
The time is right for additional
reasons that will be presented in the context of eight selected market opportunities.
6. Eight Selected Market Opportunities
Development of DataSpeaks, Inc. as a business appears to be
the most efficient, effective, and fastest way to deliver the benefits of DataSpeaks Interactions®
to people despite protestations I have experienced from academe.
DataSpeaks Interactions® may be a bigger market
opportunity than the Web browser. Creation and application of scientific
understanding with a computational method of empirical induction may be a bigger
opportunity than browsing in a world that still needs software that accounts
for mechanisms, individuality, and time. In addition, DataSpeaks Interactions®
will drive collection of data movies as well as adaptation and emergence in a
world with an expanding horizon of possibilities.
Here are eight selected and overlapping market opportunities
for DataSpeaks Interactions®.
- Revitalizing the Pharmaceutical Industry
- Reforming Health Care
- Improving Public Health
- Visualizing How Brains Work, Change, and Adapt
- Improving Prediction of Economies and Capital Markets
- Modifying Behavior
- Advancing Responsible Agency
- Reinvigorating Machine Learning and Artificial
Intelligence
All of these opportunities can be considered as different
facets of one large opportunity in the software market.
One aspect of our business development strategy could be to
create an avalanche effect through synergy among
different markets commensurate with the leadership and resources that can be
brought to bear on the advancement of DataSpeaks Interactions®.
In addition to these eight market opportunities, DataSpeaks
Interactions® has the potential to drive demand for computing
infrastructure and many data collection technologies. Companies with such
technologies should help advance DataSpeaks Interactions® to
increase demand for their own products and services. Internet2 (http://www.internet2.edu/) also could help
enable DataSpeaks Interactions®. DataSpeaks Interactions®
is demanding of computational resources.
DataSpeaks Interactions® could be developed as
Excel add-ins or Excel add-ons. Such versions probably would have limited
functionality but could help seed the market.
All eight specific market opportunities can improve economic
productivity and human welfare. Each opportunity has its pros and cons. Early
adopters in all markets could achieve huge competitive advantages. Of the eight
market opportunities, I recommend that Visualizing How Brains Work, Change, and Adapt
should be pursued first.
6.1. Revitalizing the Pharmaceutical Industry
DataSpeaks
Interactions® can help revitalize both drug discovery and
clinical research.
Productivity of the pharmaceutical industry as measured by
approvals of new chemical entities has been declining despite huge and rapidly
growing research and development budgets, development of amazing new data
collection technologies, combinatorial chemistry, and decoding of various
genomes. The decline of productivity in this huge industry is illustrated at
http://www.technologyreview.com/articles/hall1003.asp, which also links to my
response in the forum.
Patent
6,317,700 identifies over 20 ways that MQALA can help revitalize drug discovery and
development. The current emphasis on mergers, acquisitions, partnerships, organization
of work groups, design of collaborative work spaces, and blockbuster drugs might
not be sufficient for a long term revitalization of the pharmaceutical
industry. The pharmaceutical industry requires fundamental innovation and a
change in culture.
DataSpeaks Interactions® is an option that can
empower those who are affected by the pharmaceutical industry to revitalize the
industry and improve the cost-effectiveness of drugs and drug research.
6.1.1. Revitalizing Drug Discovery
DataSpeaks
Interactions® has the potential to revitalize drug
discovery by actually measuring the interactions that describe mechanisms of
health and many functional disorders as well as mechanisms of action.
Mechanisms of action include mechanisms of new and established treatment agents
as well as mechanisms of agents such as germs and allergens that can cause
health problems.
Demonstration
1 of Appendix B is
a quantitative description, obtained with MQALA, of the mechanism
by which two hormones interact. Claims in Patent
6,516,288 cover ways that MQALA can be used to describe mechanisms
in which hundreds or thousands of variables can interact in a more or less coordinated manner over time. Specific disorders of
coordination appear to be diagnostic of specific health disorders.
Actual measurement of interactions that describe mechanisms
of health and disorder is a fundamental change from current practice in
diagnosing many health disorders. The status quo in medical diagnosis largely
is restricted to measuring high or low levels of different variables such as
blood pressure, cell counts, mental depression, hormone levels, glucose levels,
and various lipid fractions.
Diagnosis by measurement of high or low
levels of various health variables has been helpful. But such measurement is
only a beginning because levels alone say so little about what specific
disorders are, how patients should be treated, what drugs and other agents do,
and what new drugs must do and not do in order to be safe and effective.
DataSpeaks Interactions® helps enable a new
strategy for medical diagnosis. The new strategy for medical diagnosis can be
described as moving beyond levels of action to measures of interaction that
describe mechanisms.
Some problems of diagnosing health disorders in terms of
high and low levels of health variables will be illustrated in the context of
hypertension. One aspect of the problem is that there probably are about as
many different types of hypertension as there are different mechanisms that can
produce high blood pressure. Apparently thousands of endogenous and exogenous
variables can affect blood pressure. Merely knowing that blood pressure is high
says little about what causes it to be high and what should be done about high
blood pressure. Diagnoses need to be more specific in order to identify genetic
predictors and individualize treatments as well as to target drug discovery and
development more effectively.
Another aspect of the problem of diagnosing by levels of
health variables can be illustrated with a simple case in which there are only
two interactants. Many combinations of levels of two interactants can yield a
more or less effective interaction much as many pairs of values (integers and
fractions) can yield a particular mathematical product such as 24. This
suggests that levels of particular individual interactants at particular times
say little about the functional integrity of coordinated biological systems. This
fundamental problem of diagnosis by levels becomes worse when there are many
interactants.
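The product analogy can be made concrete with a few lines of Python. This is only a sketch of the arithmetic, not of MQALA; the function name `pairs_with_product` and the list of levels are made up for illustration.

```python
from fractions import Fraction

def pairs_with_product(levels_a, target):
    """Return (a, b) pairs such that a * b == target, using exact fractions."""
    return [(Fraction(a), Fraction(target) / Fraction(a)) for a in levels_a]

# Hypothetical levels of the first interactant (integers and fractions).
levels_a = [1, 2, 3, 4, 6, 8, Fraction(1, 2), Fraction(48, 5)]
pairs = pairs_with_product(levels_a, 24)

for a, b in pairs:
    # Very different pairs of levels all give the same product.
    print(f"{a} x {b} = {a * b}")
```

Knowing only one member of a pair, or even both members at one moment, says nothing about whether the product is on target unless the two are considered together.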
I will illustrate some aspects of the failure to actually
measure interactions that describe mechanisms of health and disorder by
reference to aspects of my personal work experience. Many readers might be able
to relate to these experiences.
I once worked with a psychiatrist who essentially said that
my job was to produce statistically significant results - any results, including
significant correlations between values of variables that were measured repeatedly
for some subjects. Then he would explain the results in terms of interactions
between endogenous substances in nervous systems - interactions that could be
up or down regulated by treatments. Then we could publish the results together
to advance our careers. This could be described as a scheme of publication
questing through significance questing.
Failure to actually measure dynamic interactions between
endogenous substances and how these interactions may be affected by treatments
opens doors to pure speculation in science. The psychiatrist I worked with
became infuriated when I resisted his scheme. He diagnosed me as having “flat
affect.”
Similarly, dysregulation theories of various disorders were
popular, at least in the 1980s when I worked in psychiatry. I tried to convince
investigators to actually measure the interactions that could put their theories
to the test. Citations in Patent 6,317,700 document some of my unsuccessful efforts.
Statistical review could be expected to help reduce many
such abuses. Given the volume of published literature that manifests such
abuses and untested theories, my impression is that review has not been very
successful. Talk of interactions and their up and down regulation suggests pent
up demand for actually measuring interactions. But statisticians are not
prepared to help meet this demand or control these abuses because actual
measurement of interactions that describe biological mechanisms appears to be
outside the lens of their experience.
Abuses permitted by failure to measure interactions could be overcome by
establishing departments of empirical induction.
DataSpeaks Interactions® can make it easier to identify mechanisms of treatment by actually measuring ordered
or disordered mechanisms and how these mechanisms are affected by treatments. Demonstration 1 illustrates the measurement of
interactions in the context of reproductive endocrinology. In addition,
DataSpeaks Interactions® could be used to measure how interactions
change upon administration of drugs that might affect hormones or block or
sensitize hormone receptors.
Here is another example of measuring mechanisms of drug
action. The section about functional
brain image analysis and Demonstration 3
describe how DataSpeaks Interactions® can be used to measure
functional connectivity between and among brain regions. Disordered
interactions could be diagnostic of many functional brain disorders. In
addition, interactions could be measured before and after drug administration
to investigate how drugs may affect functional connectivity involving thousands
of brain regions simultaneously. This illustrates one potential high throughput
method for investigating drug effects, a method that appears to remain essentially
untested.
DataSpeaks Interactions® can make drug target
discovery and validation more efficient. It can help squeeze much of the
mystery and unpredictability out of drug discovery and development.
6.1.2. Re-engineering Clinical Research
Re-engineering the clinical research enterprise is an
important part of the NIH Roadmap (http://nihroadmap.nih.gov/clinicalresearch/index.asp).
“Clinical research is the linchpin of the nation’s biomedical research
enterprise.” The boundaries between NIH clinical research and pharmaceutical
industry clinical research appear to be blurring. Both need to be
re-engineered.
Efforts to re-engineer clinical research are laudable. However,
infrastructure improvements called for in the current NIH plan show little or no
evidence of recognizing how fundamental the re-engineering effort needs to be
in order to achieve substantial progress.
Clinical research still is rather primitive despite all the
trappings to the contrary. Progress is being made. But this progress in
evaluating treatment effects and in translating research results into clinical
practice is unnecessarily slow and expensive. Progress so far is just an inkling
of what we will be able to achieve after clinical investigators start measuring
the benefit/harm of treatments over time and across variables for individual
patients before any statistical tests. Appendix A, Patent 6,317,700, and one
of my reprints illustrate the actual measurement of benefit/harm.
A primary function of clinical trials is to evaluate the
safety and efficacy of treatments. This is being done without measuring the
benefit/harm of treatments as interactions between treatment and health
variables for individual patients. Statistical tests are being performed on
health variables, not benefit/harm scores. This critical distinction is not
often made.
People who are not experts in clinical research and who defer
to the experts often seem surprised to hear that clinical trials do not measure
and test the benefit and harm of treatments. In contrast, experts in clinical
research seem to find it difficult to believe that there is any alternative to
performing statistical tests on health variables broadly defined. These
unfortunate facts testify to the awesome power and influence of the statistical
establishment.
These points are true despite the fact that benefit/harm
scoring has ancient roots. Survival often depends on temporal contingencies. Animals
learn from temporal contingencies. People and other organisms have learned to avoid
encounters with poisonous plants and dangerous animals most directly from the temporal
contingencies of encountering them. People were learning how to take care of
themselves and each other long before the first randomized controlled group
clinical trial. Clinicians learn and gain insights from the temporal
contingencies of providing care to individual patients. Patients sometimes
disobey doctors’ orders because of the unpleasant contingencies of treatment. Such
learning is described briefly in the behavior modification section.
Benefit/harm scoring measures the health related temporal
contingencies of treatment. Evidence from temporal contingencies is distinct
from and often complementary to evidence from
group comparisons. But evidence based on temporal contingencies, which would
account for individuality and time, is barely used in most scientific
evaluations of treatments.
One formative influence on what came to be MQALA was my interest in
algorithms that used response to drug challenge,
de-challenge, and re-challenge to help evaluate evidence for adverse drug
reactions. This evidence illustrates temporal contingencies. Occasionally
clinicians are held accountable for harming patients or continuing expensive
ineffective treatments if clinicians do not take reasonable steps to monitor
how individual patients respond to treatments even if the treatments were used
in accord with treatment guidelines based on group clinical trials. This
suggests some primacy of evidence based on temporal contingencies over evidence
based on group comparisons. In addition, N-of-1 clinical
trials, which produce evidence essentially based on temporal contingencies,
have been identified as the gold standard for evidence based medicine.
Benefit/harm scoring with DataSpeaks Interactions® relies
on the same type of evidence as the old drug challenge, de-challenge, and
re-challenge algorithms. But it develops this evidence with many additional steps.
As examples of additional steps, MQALA evaluates the same type of evidence by
computation from data rather than by subjective impressions in people’s heads.
MQALA measures both harmful and beneficial effects. Treatment does not have to
be merely present or absent but can vary in dose over time. MQALA can account
for effects on tens or hundreds of health variables simultaneously. In
addition, MQALA can account for drug-drug interactions, temporal parameters
that affect interactions between treatment and health, and how evidence for
benefit/harm varies over time when, as examples, patients adapt to or are
sensitized to the effects of treatments.
So what do the discovery of MQALA and the invention of
DataSpeaks Interactions® mean in terms of re-engineering clinical
research? Here are some initial impressions together with some information
about why these changes would be valuable. I present this in the context of
using drugs, which can vary in dose over time, for the management or control of
chronic disorders. This is an important part of clinical research but not the
only part.
One implication of MQALA is that conventional parallel group
clinical trials generally will become unethical in addition to being needlessly
expensive, hugely inefficient, and largely inconclusive about major issues. The
new alternative would be randomized multiple N-of-1
clinical trials with one or more groups. A multiple N-of-1 clinical trial
consists of a coordinated set of N-of-1 clinical trials. DataSpeaks
Interactions® would be used to measure apparent benefit/harm over
time and across one or more health variables. The statistical
method would be used to test the null hypothesis of no benefit/harm in either
single group or multiple group designs.
Both single and multiple group multiple N-of-1
designs would randomize doses, which could include zero dose as placebo, to
time periods. Multiple group designs also could include randomization of
patients to different treatment groups. In general, different doses of the same
drug would be evaluated both in individual patients and single groups of
patients in a manner analogous to the way drugs can be titrated in clinical
practice.
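A minimal sketch of randomizing doses to time periods follows. It assumes a simple balanced scheme in which each dose is used about equally often within a patient; the dose values, the number of periods, and the function name `randomize_doses` are hypothetical, and real protocols may constrain the order of doses in ways this sketch does not.

```python
import random

def randomize_doses(doses, n_periods, seed=None):
    """Assign one dose to each period so every dose appears about equally often."""
    rng = random.Random(seed)
    # Repeat the dose list enough times to cover all periods, then trim.
    repeats = -(-n_periods // len(doses))  # ceiling division
    schedule = (list(doses) * repeats)[:n_periods]
    rng.shuffle(schedule)
    return schedule

# Hypothetical doses in mg; 0 stands for placebo.
doses = [0, 10, 20, 40]
schedules = {
    patient: randomize_doses(doses, n_periods=8, seed=patient)
    for patient in range(1, 4)
}
for patient, schedule in schedules.items():
    print(patient, schedule)
```

Each patient receives an individual schedule, so every patient contributes a within-patient comparison of doses, including placebo.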
DataSpeaks Interactions® can be applied to data
from many conventional clinical trials. This would be a good way to become
familiar with the new methodology and mine old trials for new understanding and
insight. However, the value of reanalyzing conventional clinical trial data is
limited because conventional clinical trials are not designed to provide
reliable and valid measures of treatment effect or benefit/harm for individual
patients. For example, conventional designs do not distinguish placebo responders
from true responders to investigational agents. This avoidable failure makes it
difficult to target drug development to patients most apt to benefit and away
from patients most apt to be harmed. For example, conventional clinical trial
designs make it difficult to capitalize on the success of the Human Genome
Project. Conventional trial designs make it difficult to find genetic
predictors of differential response because conventional trials do not provide
reliable, valid, and specific measures of benefit/harm for individual patients.
Conventional clinical trials that test health variables are
largely inconclusive with respect to major questions such as whether or not a drug
should be approved for marketing. Part of this problem derives from the fact
that most treatments affect more than one health variable and that tests on
many health variables in single trials create problems involving the management
of statistical significance levels. This limitation creates huge problems in
selecting health variables for inclusion in trials and selection of variables
for primary tests in particular trials that gather data on more than one health
variable. This limitation, together with a lack of computational procedures for
combining results from many clinical trials that test different health variables,
means that drug approval might be more a matter of social consensus than
scientific test. Since social consensus is subject to the effects of power and
influence, approval or disapproval of a drug with a given benefit/harm profile is
neither very scientific nor very predictable.
An alternative approach, based on MQALA, was proposed in Appendix A. Appendix A
illustrates how the results of a randomized multiple N-of-1
clinical trial with various doses and 20 health variables could be analyzed
with a single group t-test on mean overall benefit/harm scores, one such score
from each of the 20 individuals in a group. Similar use of the t-test is shown
in Demonstration 1 of Appendix B.
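The single group t-test itself is standard and can be sketched in a few lines. The scores below are made up for illustration and are not results from Appendix A; how MQALA computes an overall benefit/harm score is not shown here. The sketch assumes only that each of the 20 patients contributes one signed score, with positive values indicating benefit.

```python
import math
import statistics

def one_sample_t(scores, null_mean=0.0):
    """Return (t statistic, degrees of freedom) for H0: mean == null_mean."""
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample standard deviation (n - 1)
    t = (mean - null_mean) / (sd / math.sqrt(n))
    return t, n - 1

# Made-up overall benefit/harm scores, one per patient (positive = benefit).
scores = [1.2, 0.4, -0.3, 2.1, 0.9, 1.5, 0.0, 0.7, 1.8, -0.6,
          1.1, 0.5, 2.4, 0.3, 1.0, -0.2, 1.6, 0.8, 1.3, 0.6]

t, df = one_sample_t(scores)
T_CRIT_05 = 2.093  # two-sided 5% critical value of Student's t for 19 df
print(f"t = {t:.2f}, df = {df}, reject H0: {abs(t) > T_CRIT_05}")
```

Because each patient contributes a single overall score, one test covers all health variables and the problem of managing significance levels across many variables does not arise.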
Here is another facet of the re-engineering effort that
would be made possible with MQALA. MQALA would allow users to separate
determination of treatment effects from how patients and clinicians value these
effects. Thus, for example, clinical trial results could be made available on
an interactive Web page where patients with their clinicians could decide on
how they value the different effects of particular treatments. For example,
some drugs cause impotence. A particular patient could decide if this is a
beneficial or harmful effect and how important this effect is compared to
effects on other health variables. After a patient and clinician select a set
of weights and directions, the patient could click to see if the drug is apt to be
beneficial or harmful based on the patient’s own preferences and weights. Similarly,
two or more drugs could be compared. This also would help personalize or
individualize treatment.
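The separation of measured effects from how they are valued can be sketched as follows. The effect scores, variable names, and the weighted-average formula are assumptions made for illustration; they are not the scoring used by DataSpeaks Interactions®.

```python
def overall_score(effects, preferences):
    """Combine per-variable effect scores using a patient's directions and weights.

    effects: variable -> signed effect score (positive = the variable increases)
    preferences: variable -> (direction, weight); direction is +1 if an
        increase is valued as beneficial, -1 if it is valued as harmful.
    """
    total_weight = sum(weight for _, weight in preferences.values())
    weighted = sum(
        effects[var] * direction * weight
        for var, (direction, weight) in preferences.items()
    )
    return weighted / total_weight

# Made-up effect scores for one hypothetical drug.
effects = {"blood_pressure": -2.0, "fatigue": 1.0, "impotence": 1.5}

# Two hypothetical patients who value the same effects differently.
patient_a = {"blood_pressure": (-1, 3), "fatigue": (-1, 1), "impotence": (-1, 2)}
patient_b = {"blood_pressure": (-1, 3), "fatigue": (-1, 1), "impotence": (-1, 0)}

score_a = overall_score(effects, patient_a)
score_b = overall_score(effects, patient_b)
print(score_a, score_b)
```

The measured effects are identical for both patients. Only the directions and weights differ, so the same trial results can yield a different overall verdict for each person.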
Other facets of the re-engineering effort would involve
packaging or otherwise providing medications to facilitate randomized N-of-1 clinical trials, preferably with multiple doses, and
the collection of the required data movies, preferably over the Web.
Information about planned doses could be supplemented with data about actual
doses and levels of drug or drug metabolites in bodily fluids. Health
variables could include information about variables usually collected in
laboratories, symptom rating scales for variables often used to collect data
about safety and efficacy, computerized measures of physical and mental
performance, and quality of life measures. MQALA can be used to relate
treatment effects on different levels of drug effect hierarchies.
Clinical research was largely set asunder from clinical
practice decades ago with the advent of conventional randomized group clinical
trials. Many of the effects were salutary. But this separation also created
problems of separate budgets for research and practice and of translating the
results of clinical research into clinical practice. The solution is to
re-engineer clinical research so that it is integrated with clinical practice. MQALA
would help make this possible because best practices for clinical research
would be essentially identical to best practices for providing health care. But
this also calls for re-engineering health
care systems.
6.1.3. Competing Visions for Clinical Research and Practice
Discovery of a computational method to measure apparent
benefit/harm of treatments over time and across variables for individual
patients might be the primary innovation that needs to be considered while
re-engineering much of clinical research and practice. To illustrate, I will use
broad strokes, and statements made from a point of view, to paint the old and the
new competing visions of how clinical research can be conducted. Both visions
assume that we have promising drug candidates with potential value for the
management or control of chronic health disorders. We start with the first clinical
trials for men and women.
6.1.3.1. The Old Vision for Clinical Research and Practice
The old vision essentially requires us to start by making some
slightly educated guesses about doses and indications. Major decisions have to
be made when there is the least amount of clinical data. Generally, relatively
narrow groups of subjects are targeted with rather small sets of predetermined
doses. Early mistakes mean that the drug is targeted to patients that will not
benefit, to patients that will be harmed, and away from some patients that
could benefit. In addition, the doses could be wrong. Such mistakes have killed
many drug development programs. Since the old vision does not include
procedures to collect reliable and valid measures of benefit/harm for any
patient, it is difficult to improve targeting of drug development and patient
care. Similarly, since the old vision does not include procedures to find
optimal doses for individual patients, it is difficult to improve dosing.
Typical clinical trials collect data on more than one health
variable and for more than two time points. However, it is best to have one
primary statistical test per clinical trial. When the statistical test is
performed on a health variable, this means that data about health variables
other than the primary health variable is underutilized. Similarly, data on
more than two time points often largely goes to waste because change scores
generally are based on only two repeated measurements, a baseline and an
endpoint. Failure to measure benefit/harm fosters disputes and controversy
about selection of health variables and time points.
One type of controversy involves the type of health
variables for primary statistical tests. Many clinicians seem to favor
objective laboratory measures, which are quite specific and often favor
specialties that clinicians happen to practice. Patients, and often employers
and governments that often help pay for health services, may have difficulty
appreciating the import of specific laboratory measures. Many patients, some
clinicians, and some payers might favor more subjective health status measures
such as the SF-36 Health Survey that are more comprehensive and can be used to
help evaluate the relative value of different treatments, including treatments
for different disorders. In addition, prevailing analytic methods make it
difficult to evaluate relationships between the laboratory measures and health
status assessments.
The old vision often discourages systematic assessment of
treatment emergent events for several reasons. Systematic assessment together
with conventional methods of data analysis usually targets adverse events
rather than unanticipated beneficial effects. Systematic assessment can
increase reporting rates of adverse events, thereby making drugs look bad
compared to drugs evaluated with more haphazard voluntary reports. Haphazard
reports are subject to many uncontrolled factors that increase variability of
reporting and make it difficult to detect drug effects. Haphazard reporting has
been favored by some to avoid learning about adverse effects of drugs. This
puts patients at risk and makes financial losses mount when drug development
projects are terminated late and when drugs have to be withdrawn from the
market.
The old vision amasses information in small increments that
are difficult to integrate. Particular trials might help answer particular questions
such as whether or not a particular dose of a particular drug is better than
placebo with respect to a particular health variable over a particular period
of time defined by baseline and endpoint assessments. This specificity makes it
difficult to make broad decisions about approval of drugs for marketing and use
of drugs for particular patients. Broad decisions need to be based on more
comprehensive and realistic evaluations.
Since conventional clinical trials are not designed to
optimize treatment of subjects that participate, the trials often have to be
paid for with large research budgets that are separated from health care
budgets. In addition, the ethics of such trials often are challenged for good
reason. Furthermore, the old vision fosters large regulatory agencies that tend
to pass judgment on substantive rather than procedural issues and abridge the
rights of individual patients that could benefit from new drugs.
The old vision almost guarantees conflict and failure at
great expense, currently over $800,000,000 for approval of each new chemical
entity.
After approval, new drugs enter the less controlled world of
clinical practice where there is considerable chance that new drugs will be
recalled, that the responsible companies will be subject to liability, and that
the regulatory agencies that may be blamed will respond with fewer approvals
and more expensive drug approval requirements.
Approvals for marketing generally unleash expensive
marketing blitzes that are subject to abuse, do little to improve disease
management, and may fail to collect additional information from actual clinical
practice that can expand scientific knowledge about drugs and optimize
treatments for individual patients.
The old vision supports a number of large establishments and
leaves executives pleading to an angry, disillusioned, and suffering public for
patience, understanding, high drug prices, and protection of the status quo. Some
recognize that “The patient is waiting.” But the old vision - the standard operating
procedure - helps assure that patients will continue to wait.
The status of evaluating safety and efficacy without
measuring the benefit/harm of treatments for individual patients is similar to the
status of investigations of infectious diseases before the discovery of germs.
6.1.3.2. A New Vision for Clinical Research and Practice
This new vision would make clinical research more scientific, productive,
cost-effective, and ethical. The key to this vision is the actual
measurement of the benefit/harm of treatments as interactions
over time between measures of treatment and measures of health
for individual patients. This vision would integrate the best
clinical research with the best clinical practice. The measurement
of benefit/harm would be accomplished by DataSpeaks Interactions®.
The new vision would make clinical research both more standardized
and adaptable. Standardized trials can be more adaptable when the design
includes fixed contingencies based on objective measures of benefit/harm. It is fitting for the new vision to be more
adaptable than the old vision because the former is based on MQALA, a methodology for
investigating how complex systems work, change, and adapt.
MQALA helps enable N-of-1 clinical
trials. This includes coordinated sets of N-of-1 clinical trials or multiple
N-of-1 clinical trials. I predict that such clinical trials will become the
gold standard for much clinical research and practice. Appendix A includes aspects of a multiple N-of-1
clinical trial for high blood pressure.
N-of-1 clinical trials can be designed with adaptive dosing
and adaptive data collection. Multiple N-of-1 clinical trials can be designed
with adaptive patient selection. Adaptability would be based on contingencies
that are based in turn on measured results. All contingencies and measurement
procedures would be specified in research protocols that make the procedures
objective, transparent, and reproducible.
MQALA makes it easy to process data from N-of-1
clinical trials that include more than two doses for each individual.
Accordingly, doses can be randomized to periods using procedures that allow
generally escalating doses over successive periods of time. Benefit/harm would
be monitored over time by computation as a function of dose for individual
patients. Dose escalation could stop after it was determined that higher doses were
producing no additional benefit or if higher doses were shifting the balance
from increasing benefit to increasing harm.
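A stopping rule of this kind can be sketched as a fixed contingency on measured scores. The sketch assumes a single benefit/harm score is available for each dose tried so far, with positive values indicating net benefit; the numbers, the `min_gain` threshold, and the function name `escalate` are hypothetical.

```python
def escalate(doses, score_at_dose, min_gain=0.0):
    """Escalate through ascending doses; stop when the score stops improving.

    score_at_dose: callable returning the benefit/harm score measured at a
        dose (positive = net benefit). Hypothetical stand-in for a measured
        value.
    Returns the last dose that improved on its predecessor, with its score.
    """
    best_dose = doses[0]
    best_score = score_at_dose(best_dose)
    for dose in doses[1:]:
        score = score_at_dose(dose)
        if score - best_score <= min_gain:
            break  # no additional benefit, or balance shifting toward harm
        best_dose, best_score = dose, score
    return best_dose, best_score

# Made-up scores for one hypothetical patient, keyed by dose in mg.
measured = {0: 0.0, 10: 0.8, 20: 1.4, 40: 1.3, 80: 0.2}
result = escalate(sorted(measured), measured.get)
print(result)
```

Because the rule is written down in advance and depends only on measured scores, the trial can adapt to each patient while remaining objective and reproducible.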
Since benefit/harm scoring is equally sensitive to both
beneficial and harmful effects, evaluation of both safety and efficacy could
begin immediately after the second assessment of health and a change in
treatment, including the initiation of treatment, for each individual patient.
Investigators would have to specify whether higher levels on each health
variable are beneficial or adverse.
Multiple N-of-1 clinical trials with adaptive dosing would
make it easier for investigators to identify optimal doses for each individual
patient as well as distributions of optimal doses and average optimal doses for
entire samples of patients or any subset of patients.
Adaptable dosing calls for programmable dosing and data
collection devices that could be designed to help maintain blinding or masking
in clinical trials. Such devices should be Web enabled.
The new vision also would allow adaptive collection of data
for health variables such as laboratory, symptom, performance, and quality of
life or generic health status variables. One data collection strategy would be
to follow hits on more general queries with more specific queries. This
strategy is illustrated with the SAFTEE instrument (Systematic Assessment of
Treatment Emergent Events). More information about adaptive data collection can
be found at www.qualitymetric.com.
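The strategy of following hits on general queries with more specific queries can be sketched as a two-level question tree. The questions below are made up for illustration and are not items from the SAFTEE instrument; the function name `adaptive_queries` is hypothetical.

```python
def adaptive_queries(tree, answer):
    """Ask general questions; ask the specific follow-ups only after a hit.

    tree: general question -> list of more specific follow-up questions
    answer: callable taking a question and returning True for a hit
    Returns a dict of every question actually asked and its response.
    """
    responses = {}
    for general, follow_ups in tree.items():
        responses[general] = answer(general)
        if responses[general]:
            for specific in follow_ups:
                responses[specific] = answer(specific)
    return responses

# Made-up question tree and one hypothetical patient's hits.
tree = {
    "Any stomach or bowel problems?": ["Nausea?", "Constipation?"],
    "Any sleep problems?": ["Trouble falling asleep?", "Waking too early?"],
}
hits = {"Any sleep problems?", "Waking too early?"}
responses = adaptive_queries(tree, lambda question: question in hits)

for question, response in responses.items():
    print(f"{question} -> {response}")
```

Only four of the six questions are asked of this patient, which is how adaptive collection keeps systematic assessment comprehensive without making it burdensome.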
My interest in SAFTEE was another formative influence on
what came to be MQALA. I continue to hold that a major factor that has impeded
widespread adoption of more systematic, comprehensive, and adaptive data
collection about the effects of treatments has been the lack of a methodology
such as MQALA for processing the data. With MQALA, event rates, which tend to
be higher with systematic assessment, have little bearing on benefit/harm
scores except in extremes when events are either almost always present or almost
always absent. The new vision overcomes a major source of resistance to
systematic assessment for adverse events.
The new vision also calls for adaptive patient selection. This
becomes possible because DataSpeaks Interactions® can be applied to
data from multiple N-of-1 clinical trials to provide
reliable and valid measures of how individual patients respond to treatments. These
measures would make it possible to efficiently identify specific diagnostic and
genetic predictors of differential response to treatments. The fact that the
same gold standard methodology can be used for both clinical research and
clinical practice would help make adaptive patient selection feasible in terms
of patient numbers.
Adaptive patient selection would mean that clinical trials
could begin with wide varieties of subjects and that targeting could be
improved during the course of trials. This would increase the odds that new
drugs would be found to be safe and effective for at least some specific groups
of patients.
One implication of adaptive multiple N-of-1 clinical trial strategy
is that essentially the same clinical trial design could be used for clinical
trials for many different drugs for many different types of patients. Once
established, the new vision would reduce the need to redesign clinical trials.
In addition, it would be easier to combine results from different trials and
compare the cost-effectiveness of different treatments.
The new strategy is designed to overcome many of the
failures of the old vision.
6.1.4. Opportunities and Challenges
DataSpeaks Interactions® can revitalize the pharmaceutical industry
by actually measuring, discovering, analyzing, synthesizing, and visualizing
interactions that describe biological and treatment mechanisms as well as the
benefit/harm of treatments. This is a big market opportunity. However, the new technology is disruptive
because it calls for changes in the way health disorders are diagnosed and for
a new vision of clinical research.
The pharmaceutical industry was an early adopter of high
throughput data collection technologies. Perhaps no other major industry is so
thoroughly controlled by the statistical
establishment. Perhaps no other industry is so thoroughly mired in data
swamps in a way that so dramatically reduces its potential.
Organizations - pharmaceutical companies, contract research
companies, tools companies, universities, institutes - that become early
adopters of DataSpeaks Interactions® could have huge competitive
advantages even if pivotal trials continue to be conducted for some time with
outmoded methods.
Perhaps it is time for at least one major pharmaceutical
company to break with the pack and lead with data processing based on a new
method that accounts for time and individuality and that actually measures
biological mechanisms as well as the benefit/harm of treatments.
6.2. Reforming Health Care
Almost everyone recognizes that health care in the United
States needs reform. However, fundamental health care reform largely is
gridlocked because there are almost as many proposals for reform as there are
constituencies and points of view. Furthermore, some constituencies are
represented by mammoth organizations with great momentum. Gridlock suggests
that no one really knows what to do and that potential reformers have yet to recognize
a fundamental underlying problem that can be agreed upon now and addressed in
concert. Recognition of this problem and discovery of a technology that can
help solve the problem create a huge market opportunity.
Health care reform has
earned a bad name. It has become a political and ideological football. Large
bureaucracies would be counterproductive. Grand reorganizations are not
sufficient. Cash infusions are not apt to unleash competition in markets or
increase efficiency and accountability. Malpractice litigation does not seem to
be improving health care. Malpractice reform does not create systems that reduce
medical errors. Sensitivity, compassion, and good intentions are not sufficient
to overcome ignorance. Current practices compromise ethics.
In addition to these conventional aspects of health care
reform, health care systems should be expected to deliver new benefits made
possible by developments such as comprehensive genotyping, Web enabled health
monitoring devices, and home health care. To a large extent, failure to reform health
care means that many breakthroughs in life sciences and technology will not
help patients.
Health care reform is stalled for (1) lack of sufficient
scientific understanding about how health-related systems work, change, and
adapt and (2) lack, until now, of a scientific method to obtain the required
scientific understanding in an efficient manner. Perhaps many different health
care constituencies could rally around these points and start to achieve health
care reform step by step. Health care needs to become a system that manages
contingencies more effectively.
Given strong leadership, DataSpeaks Interactions® software
has the potential to help unleash health care reform. All other sections of
this Web site help support this claim. Perhaps more important than anything
else, fundamental health care reform will have to overcome the awesome power of
the statistical
establishment. Statistical methods rule where it counts in evidence
based medicine. But, for example, statistical methods are almost useless for
processing the time ordered data in the charts of individual patients. People
are left to process such data in their heads, which is overwhelming, odious, costly,
unscientific, and prone to error. People cannot process so much data in their
heads and keep up with the demands of the modern age.
This focus on a technology that can improve scientific
understanding of health related systems does not mean that health care reform
depends primarily on professional scientists and researchers. To the contrary,
access to MQALA, with its capability to account for time and individuality, will
empower most health care constituencies, including many patients and potential
patients, to improve their practice and behavior in accord with scientific
understanding.
It often is said that most treatment episodes are
experiments of sorts, especially in the context of chronic disorders. Diagnoses
are somewhat like hypotheses. Treatments are interventions. Clinicians diagnose
and treat and watch what happens. Depending on what they happen to see,
clinicians may change doses or treatments and try again. Currently this whole
process is haphazard and subjective, accounting for much avoidable suffering
and cost.
MQALA applies to time ordered data for individual systems. This
has major implications in terms of strategies for health care reform. It means
that individual clinicians, individual practices, individual hospitals, and
individual hospital systems can start pursuing health care reform one by one. Those
who reform first and fastest and best are apt to dominate their markets.
Technological solutions also are earning a bad name in health
care. After all, new diagnostic and treatment technologies have been known to
increase the cost of care. DataSpeaks Interactions® is a different
type of technology. It has the potential to increase the cost-effectiveness of
most other technologies. Savings from reductions in lost and damaged lives, treatments
for iatrogenic conditions, wasted treatments and procedures, costly professional
time, and legal liability together with productivity improvements that result
from increasing professional and patient empowerment have the potential to
offset the cost of new software, training, and infrastructure required to
support the software.
DataSpeaks Interactions® is not a final solution
for reforming health care any more than it is a drug or diagnostic system. But
it is a software tool that can be used to reform health care, just as it is a
tool that can be used to help discover and develop new drugs and diagnostic systems.
Adoption of this tool will be disruptive. Health care reform based on
scientific understanding and methods will require leadership and much hard
work. Most of the science has yet to be done.
DataSpeaks Interactions® can help keep health
care reform from being a one-time event. It can help health systems to become
learning systems so that they can improve continuously and adapt to new
circumstances including new treatments and new and newly resistant pathogens.
Despite being based on technology, fundamental health care
reform of the type envisioned here can make it easier to provide more personal
and humane health care in addition to better outcomes for whole populations of
people at prices people are willing to pay. In addition, it will make health
care a more satisfying experience for millions of dedicated employees and for
people who use health care services. Furthermore, DataSpeaks Interactions®
can shift some of the burden of health care back to people who will be
empowered to be more responsible
agents to maintain and improve their own health and that of their
loved ones.
The health related systems that we need to understand to
reform health care include biological systems. In this respect, the solution to
the growing crises in health care is similar to the solution for the growing
crises in the pharmaceutical
industry. Other health related systems that we need to understand
include populations,
economies, behavioral and
social systems, as well as brains
and artificial
intelligence systems that are touched upon in other sections of this
Web site.
Discussion of health care reform is a tall order. I will
offer a few initial ideas about meeting the challenge to help jump start the
process of reforming health care from a new scientific and technological perspective.
Health care will be considered to comprise several major
interconnected markets - providers, payers, as well as patients and potential
patients.
DataSpeaks Interactions® could make all the
difference in reforming health care. This makes it worthwhile to consider how
DataSpeaks Interactions® can be of value and how it can penetrate
the health care market step by step. Section 2.8.2 of Patent
6,317,700 includes ten reasons why MQALA would be valuable from a
practical perspective largely in the context of health care for the management
or control of chronic disorders.
6.2.1. Health Care Providers
Health care providers will be considered to include (1)
clinicians and all their support personnel including nurses and people who
provide diagnostic and therapeutic services and (2) administrators and managers
of health care practices, hospitals, and hospital systems.
6.2.1.1. Clinicians
DataSpeaks
Interactions® can help clinicians both to diagnose health
disorders and to evaluate treatments for health disorders. Doing so would help clinicians
advance their careers, medical science, and patient welfare.
6.2.1.1.1. Diagnosis
Many functional health disorders involve disordered interactions
between and among biologically active substances and other health related
variables. I described diagnosis by measurement of disordered interactions and
mechanisms in the context of drug discovery.
DataSpeaks
Interactions® appears to be the first software
package to actually and effectively measure potentially diagnostic interactions
using time ordered data for individual patients. This opens doors to many
opportunities. Here is a simple example.
Intensive care units often monitor both blood pressure and
pulse rate. I suspect that measures of interaction between these variables
might provide important diagnostic information about the status of
cardiovascular systems that is not revealed by the way these measures currently
are used. This opportunity needs to be investigated by cardiologists. The
beginning of this process is simple - measure the interactions in
essentially the same way for each of many patients, identify variations in these
measures across individuals and over time for individuals, and identify what
the variations mean in terms of diagnosis and treatment.
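The three steps above can be sketched in a few lines of Python. This is a minimal illustration only, not MQALA: it substitutes an ordinary Pearson correlation of successive changes as a stand-in interaction measure, and the function names, patient labels, and readings are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no measurable association
    return sxy / sqrt(sxx * syy)

def change_association(bp, pulse):
    """Stand-in interaction measure (an assumption, not MQALA):
    correlate successive changes in blood pressure with changes in pulse."""
    d_bp = [b - a for a, b in zip(bp, bp[1:])]
    d_pulse = [b - a for a, b in zip(pulse, pulse[1:])]
    return pearson(d_bp, d_pulse)

# Hypothetical time ordered readings (systolic pressure, pulse) for two patients.
patients = {
    "patient_a": ([120, 118, 115, 117, 121, 119], [70, 73, 78, 75, 69, 72]),
    "patient_b": ([120, 118, 115, 117, 121, 119], [70, 69, 67, 68, 71, 70]),
}

# Step 1: measure the interaction in the same way for every patient.
scores = {pid: change_association(bp, pulse) for pid, (bp, pulse) in patients.items()}

# Step 2: look at how the measure varies across individuals.
for pid, score in scores.items():
    print(pid, round(score, 2))
```

The same function could be applied to successive time windows for one patient to look at variation over time. What the variations mean for diagnosis and treatment is the part that remains to be investigated.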
Clinicians in many specialties will be able to find similar
opportunities. In this way, many clinicians would become familiar with MQALA and the value of measuring
interactions. The analogy
described some advantages of using data movies, rather than data snapshots, for
understanding dynamic processes such as health and disorder. The section on revitalizing the
pharmaceutical industry described the value of going beyond levels of variables to measuring interactions in medical
diagnosis.
Companies that sell health monitoring and functional imaging
devices could increase demand for their products by providing modules of
software based on MQALA to process the data.
6.2.1.1.2. Treatment Evaluation
The current development status of evaluating treatment effects
in clinical practice is akin to the status of diagnosis before the modern era
of laboratory tests and imaging procedures. The benefit/harm of treatments, as
it becomes evident over time for individual patients, largely is evaluated
subjectively, even when the required data are available in medical records.
Subjective evaluation of treatment effects for individual
patients often reminds me of the old practice of clinicians diagnosing diabetes
by tasting urine to see if it is sweet. Both practices are
subjective, odious, and often inadequate.
DataSpeaks faces a major challenge in trying to get people
to use DataSpeaks
Interactions® for measuring the benefit/harm of
treatments. This challenge is illustrated as follows. I was presenting a
lecture at a university bioinformatics seminar. Most of the lecture presented
results obtained by applying my prototype software to portions of a yeast cell
cycle control, time series gene expression microarray dataset that is
publicly available from Stanford.
Since
most people seem to have trouble understanding what it means to
measure interactions and I anticipated that the audience might
include clinicians, I used a clinical example. I showed a graph
with two time series variables - a measure of treatment and a
measure of health. The graph showed an association that suggested
substantial evidence for benefit. I explained that the summary
score for one measure of this association was 10.76. Since this
summary score was one score from a distribution of potential scores
with a mean of 0 and a standard deviation of 1, 10.76 could be interpreted
as providing substantial evidence for benefit. (The distribution
of potential scores was defined by the combination of the data
and the scoring protocol as described in Appendix
A.)
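For readers unfamiliar with standardized scores, the following Python sketch shows the general idea of expressing an observed association in standard deviation units of a reference distribution. It is an assumption-laden stand-in, not the scoring protocol of Appendix A: the reference distribution here is built by shuffling the health series, and the function names and data are hypothetical. It is not intended to reproduce the score of 10.76.

```python
import random
from statistics import mean, pstdev

def raw_association(treatment, health):
    """Sum of products of deviations from each series' mean (a simple stand-in)."""
    mt, mh = mean(treatment), mean(health)
    return sum((t - mt) * (h - mh) for t, h in zip(treatment, health))

def standardized_score(treatment, health, n_shuffles=2000, seed=0):
    """Express the observed association in SD units of a shuffled reference
    distribution (assumed here; the original protocol defines its own)."""
    rng = random.Random(seed)
    observed = raw_association(treatment, health)
    shuffled = list(health)
    reference = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        reference.append(raw_association(treatment, shuffled))
    spread = pstdev(reference)
    if spread == 0:
        return 0.0  # no variation in the reference scores
    return (observed - mean(reference)) / spread

# Hypothetical daily dose and health ratings for one patient.
dose = [0, 0, 10, 10, 20, 20, 0, 0, 20, 20, 10, 10]
health = [3, 4, 5, 6, 8, 9, 3, 4, 9, 8, 6, 5]

score = standardized_score(dose, health)
print(round(score, 2))
```

A score several standard deviations above 0 on such a scale indicates that the observed association is far larger than the reference distribution would typically produce, which is the sense in which a large positive score can be read as evidence for benefit.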
At this point a physician/researcher
asked essentially what good does it do to measure the interaction when one can
see that the data suggest benefit. I did not challenge him by asking what good
does it do to measure sugar levels in urine when you can taste that the urine
is sweet. Perhaps I should have. The lecture effectively closed the door on
what appeared to be a promising opportunity at a university where I had been a
faculty member years before.
New measures can be keys to scientific progress and improved
clinical practice. But when people have always gotten away without measuring something
as important as benefit/harm, people seem to forget that measurement is
important. Apparently I am the only one who really knows how to measure
benefit/harm over time and across variables for individual patients (see reprint
and patents).
DataSpeaks Interactions® is disruptive. DataSpeaks
calls for strong leadership to help advance a technology that is critical and
disruptive.
Health care needs DataSpeaks Interactions®. It is
nearly impossible to form reliable and precise subjective impressions about
treatment effects, especially when there are many repeated measurements of many
health variables obtained while treatments fluctuate in level over time.
Imagine trying to process subjectively all the data that are available for
individual patients in intensive care. Imagine trying to form reliable subjective
impressions about how treatments affect measures of functional connectivity
that can be obtained from functional
brain imaging. Measures of functional connectivity have the
potential to be objective measures for diagnosing functional brain disorders
and monitoring responses to treatment.
My impression is that most data relevant to benefit/harm are
simply ignored for lack of computational methods to process the data
and to display results. Objective measures of benefit/harm have real advantages
as do objective diagnostic measures.
Most people do not realize that apparent benefit/harm can be
measured by computation from the relevant time ordered data for individual
patients. I suspect that this reflects the power of the statistical establishment.
Measurement of interactions between variables for individuals appears to be
outside the lens of experience
for people better trained to describe groups and make inferences from samples
to populations.
Failure to measure benefit/harm for individual patients is
costly in terms of health and money. Adverse drug reactions can continue until
disaster. Unexpected benefits are not accounted for and beneficial treatments
are terminated. Costly treatments are continued when there is no overall
benefit and even when there is overall harm. Treatment is not individualized
and optimized scientifically. Opportunities are lost to educate caregivers and
patients as well as to capture experience that can improve treatment of other
patients.
6.2.1.1.3. Forces for Change
Forces are converging to begin objective measurement of interactions
so that diagnosis and treatment can be improved. The key force is the discovery
of a data processing method, MQALA,
for measuring interactions over time and across variables for individual
patients.
Another force is the completion of the Human Genome Project
and the emergence of genetic testing. This puts a premium on the identification
of specific disordered interactions and sets of disordered interactions that
are diagnostic of functional disorders. These measures of interaction can be
used to help identify genetic predictors of disorders. Similarly, reliable, valid,
and specific measures of benefit/harm can be used to help identify genetic
predictors of differential responses to treatments. Health care involves enough
patients and can collect sufficient data to help make identification of genetic
predictors feasible without major increases in research budgets. Costs would be
controlled when clinical research and clinical practice are integrated.
Another force to begin measurement of interactions by
computation derives from the rapidly increasing amounts of data that can be
included in medical records for individual patients. Devices are being
developed that can monitor physiological functions within bodies, radio the
data out of bodies, and connect to the Web. Similarly, twenty minutes of
functional brain imaging could increase the amount of data in patient records
by orders of magnitude from what is now typical. All such data should be
considered to be part of patient records. DataSpeaks Interactions®
removes a primary bottleneck for using such data to improve health and health
care. Development of data movies such as those provided by these examples can
help make collection of the data worthwhile and increase demand for the data
collection equipment.
A previous section describes
how the measurement of benefit/harm with DataSpeaks Interactions®
can be a guiding force in re-engineering clinical
research. The new vision would help integrate clinical research with
clinical practice. The pharmaceutical industry could be a leader in helping to
bring this integrated vision to fruition, especially if it wants to make money
providing disease management services and needs to learn more about its drugs by
collecting data in secure but centralized repositories.
DataSpeaks Interactions® can help empower
clinicians and put them in charge of their own destinies by providing the tools
that they need to help optimize the care of their individual patients and by
helping to make more clinicians a creative force in generating bodies of
scientific knowledge required to improve health and health care.
6.2.1.2. Health Care Administrators
Administrators are the other key part of health care
providers. Currently, health care administrators and managers appear to be in a
thankless, difficult, high-stakes, no-win, monkey-in-the-middle position
between clinicians that need to be paid and payers that have been given good
reasons to think that they are not getting good value for their money. But
administrators also are in excellent positions to read the future and help
shape a better future for health care.
One way to approach the problem of health care
administrators is to compare two industries, the health care industry and the
credit card industry. I will do this in terms of personal experience that many
people in developed countries can appreciate. Both industries deal with
sensitive confidential data.
I live in Michigan and traveled to Florida. When I got home
I realized that my credit card was missing. During one call, I was able to review
every recent transaction - who was credited, from where, when, and for how
much. Fortunately, the only problem seemed to be a lost card. My new card
arrived the next day. Credit card companies that could not keep track of
transactions efficiently would go out of business.
Compare this to when I go or take my children to a new
doctor. I’m often handed a clipboard with some forms that ask about health
histories and vaccinations. I find the vaccination questions especially troublesome.
I recognize that vaccinations are important. But why are they asking me when doctors
administered the vaccinations and have the records? I do not remember the
details of our vaccinations accurately, fear for the consequences of not
remembering, and suspect that some people have memories and records worse than
mine. These sorts of experiences make some people wonder about the quality of
service for which money is being paid.
Nothing about this comparison of two industries is new.
Administrators and many other people recognize that transaction costs and
problems created by poor transactions account for up to about one third of health
care costs. Efforts are being made to control such costs. Indeed a number of
available information technology services can help with the administrative
business of health care - ordering, billing, paying, prescribing, referrals,
scheduling, access to patient records, access to the medical literature and
treatment guidelines, etc. England is embarking on a $17 billion bet on
information technology primarily to improve the administrative business of
health care.
DataSpeaks
Interactions® is of value here because it offers a new
reason and strategy for bringing health care into the information age.
Measurement of interactions, including the benefit/harm of treatments, will
become vital to providing quality health care. Measurement of interactions
depends on the collection and processing of electronic data.
Part of the administrator’s
problem has been that information technology has been focused primarily on the
administrative business of health care. In this respect, it is the
administrator’s problem. Clinicians, who have been known to be rather headstrong
when it comes to changing their behavior and having their behavior managed, are
left with a good out from embracing information technology to improve health
care. Until now, clinicians could accept primary responsibility for providing
good health care. Administrators only have to provide good pay and
infrastructure for clinician practices.
DataSpeaks Interactions® takes away this
clinicians’ out. Information technology becomes vital and integral to providing
quality health care.
Similar points can be made by comparing diagnostic practice and treatment practice in health care. If
there are important unanswered questions about diagnosis, it is expected that
clinicians will order laboratory tests and other diagnostic procedures. In
contrast, if there are important unanswered questions about the effects of
treatments, it still is acceptable to monitor the levels of a few variables and
to rely primarily upon subjective impressions about benefit/harm. There is no
expectation that clinicians actually measure the benefit/harm of treatments apparently
because no one has known how to measure the benefit/harm of treatments. Now
that MQALA, a procedure
that can be used to measure benefit/harm, is in the public domain, various health
care constituencies will have rising expectations.
Anticipate that objective measurement of benefit/harm might
become as integral to cost-effective health care when there are uncertainties
about the effects of treatments as objective diagnostic procedures are integral
to health care when there are uncertainties about diagnosis. At this early
stage in our development of scientific understanding of how biological systems
work and change, there is no shortage of uncertainty to drive application of information
technology that can reduce uncertainty, improve lives, and control costs.
Clinicians might need good information technology far more
than administrators. As information technology is developed to make diagnosis
and treatment more individualized, scientific, and responsive to patient needs
and preferences, administrative services could be piggybacked on the
central requirement that a specific new type of information technology is
required to provide quality health care in the modern age. This also means that
major information technology initiatives in health care, such as that in England,
should be set up to include capabilities to collect and develop data movies.
6.2.2. Health Care Payers
Both cost-effectiveness and systems of payment are major
issues in health care reform. Payers include insurance companies, governments,
patients, and employers that provide health care benefits. Although DataSpeaks Interactions®
can be used to help investigate impacts of various payment systems on health,
society, jobs, and economic competitiveness when the software is applied to
relevant time series data, the focus here is on the cost-effectiveness of
health care.
People value good health and many people are willing and
able to pay, some more than they are paying now. But people also generally expect
good value for their money. Increasing the value of health care that actually
is provided might be more important to reform than spending more money. More
money is apt to follow reform that provides better value.
There appears to be growing interest in accountability in
health care. This is evidenced by the Foundation for Accountability (FACCT, http://www.facct.org/facct/site/facct/facct/home#webmd).
FACCT advocates for and helps enable a “person-centered
health care system.” However, FACCT appears to be limited by the prevailing scientific worldview
and the statistical
establishment. For example, although FACCT provides a number of
quality measures, it appears that they have yet to realize that it is possible
to measure the benefit/harm of treatments for individual patients as described
in the new vision for
clinical research, Appendix
A, a reprint,
and elsewhere on this Web site. Measurement of benefit/harm can advance
accountability in health care.
DataSpeaks Interactions® will both make it
possible to provide better value and increase expectations to provide better value
in health care. Anticipate that concerns about cost-effectiveness will shift
from specific treatments, diagnostic procedures, and facilities to more global advances
in information technology for health care systems. These advances will be
integral to diagnosis and treatment and help create integrated health
care systems adaptive to the diverse and changing needs of individual patients.
Figuratively, health care needs central nervous systems that can actually
process, rather than merely collect and store, time ordered data. Such systems eventually
will account for time and individuality.
New technologies can raise expectations for more productivity
and better value. DataSpeaks Interactions® could raise these
expectations dramatically. The new vision can raise expectations because it
describes specific steps that can be taken now to solve fundamental and costly
problems in clinical research and practice.
Payers often turn to outcomes research for guidance in
setting health care policy. However, outcomes research has been severely
constrained by the statistical establishment. Now
that MQALA has been
discovered, major constraints can be overcome to improve outcomes research in
at least two major ways. The first
way involves the use of health status measures and measurement of benefit/harm.
The second major way involves
individualization, group averages, treatment guidelines, and payment policies.
6.2.2.1. Health Status Measures and Measurement of Benefit/Harm
As described before, users of the statistical method in clinical trials typically
perform statistical tests on health variables. As examples, trials for blood
pressure drugs usually test measures of blood pressure and trials for clinical
depression usually test depression scores. Among other problems, this makes it
difficult to obtain comprehensive evaluations of treatments and to compare
value with trials that test different treatments and health variables.
Health outcomes researchers have developed generic health measures
such as the SF-36 Health Survey. This includes use of computer adaptive test
technology to provide versions that are short, precise, and valid. Some versions,
especially versions with rather short recall intervals, can be administered
repeatedly. For more information, see www.qualitymetric.com.
Such developments in measurement of health status are major
achievements. However, the value of these achievements is severely limited when
users merely substitute health status scores for values of more conventional
health variables while performing primary statistical tests. Mere substitution
of variables tends to force avoidable and unproductive disputes about whether
primary statistical tests should be performed on generic health status
measures, more disease specific health status measures, or more conventional
diagnostic measures. Testing health variables also fosters an unnecessary and
costly proliferation of clinical trials as when different trials for the same
treatment test different variables.
The alternative to testing health variables is to use DataSpeaks Interactions®
to measure overall benefit/harm and profile benefit/harm across many health
variables before conducting statistical tests. Some advantages of measuring
benefit/harm are included in the “new vision” section.
MQALA also
would make it easier to investigate and understand how benefit/harm with
respect to generic health status measures is coordinated with benefit/harm with
respect to other health measures such as diagnostic measures, laboratory tests,
symptom surveys, measures of physical and mental performance, role functioning,
and health perceptions.
Each type of health measure has its own constituency. Use of
MQALA to investigate
interactions between and among variables at different levels of health measurement
hierarchies could help bridge gaps between clinicians who might favor clinical
trials that measure laboratory variables and patients and employers that might
be more concerned about how treatment affects performance at home and at work.
6.2.2.2. Individualization, Treatment Guidelines, and Payment Policies
Perhaps the greatest impact of the discovery of MQALA on outcomes research and
health care policy is that it provides, largely for the first time, crucial scientific
information about diagnosis and the benefit/harm of treatments that helps enable
individualization of health care. Knowledge of patients’ genomes together with diagnostic
information from snapshots of patients’ conditions seldom is sufficient to
individualize and optimize patients’ treatments at this early stage in our
development of medical understanding.
Customers and clients in other businesses are coming to
expect more individualization in products and services as we move from the
industrial age to the information age. Many people want to individualize their
appearances, their homes, their cars, their lifestyles, and their financial
services. Perhaps patients also will come to expect health care that is both
evidence-based and individualized.
To a considerable extent, health care still is bucking the
trend toward individualization. Major trends in health care policy are towards
treatment guidelines for large groups and towards restrictive formularies and restricted
access. These trends are more akin to public health policy than individualized
health care. The opposing trends tend to pit clinicians and managers against
each other and reduce distinctions between clinicians and public health workers.
Section 2.8.3 of Patent
6,317,700 discusses relationships between the public health and
individualized care approaches to medicine from a historical perspective and
how some conflicts between the two approaches can be resolved by MQALA. A
subsequent section of this page describes how MQALA can help improve public health.
Both public health policy and individualized health care can
improve health. However, the public health approach alone largely is in
conflict with achieving one of the great promises of the Human Genome Project,
which is to help enable more individualized or personalized health care. Such
conflicts need to be resolved in order to help achieve many health breakthroughs.
The apparent conflict between treating individual patients
as individuals or as group averages relates to Paul Meehl’s distinction, dating
back to the mid-1950s, between clinical (subjective, impressionistic) and
actuarial or statistical (mechanical, algorithmic) prediction. Actions based on
actuarial predictions generally yield better outcomes. However, there has been
much resistance to implementing actuarial prediction. Perhaps much of this resistance derives from fundamental
limitations of statistical prediction as well as from people overestimating
their powers of prediction and wanting to feel important.
This illustrates a fundamental limitation of statistical
prediction. Suppose that a particular patient with a chronic disorder has been
treated with a particular drug in accord with a treatment guideline that is
based on a consensus about the results of many randomized group clinical
trials. Also suppose that response of the patient to drug challenge, de-challenge, and re-challenge
suggests that the drug is causing serious liver toxicity. The prudent and
expected choice would be to change or discontinue the recommended treatment.
This may be a rather extreme example. But treatment often needs to be
individualized or personalized both in terms of biological responses of
individual patients and their preferences. In many cases such as this with an
adverse drug effect, essentially clinical prediction trumps actuarial
prediction.
A number of “algorithms” for assessing adverse drug
reactions account for response to drug challenge, de-challenge, and
re-challenge. But since such “algorithms” do not actually measure benefit/harm
as an interaction between measures of treatment and health, they are largely
subjective and impressionistic.
Much resistance to actuarial prediction might be a
convenient and motivated overreaction to situations in which essentially
clinical prediction trumps actuarial prediction. One approach to overcoming this
resistance may be to adopt an additional algorithmic method, MQALA, which uses
time ordered data to account for individuality and time.
MQALA also would advance randomized N-of-1
clinical trials, the gold standard for evidence based medicine (see, for
example, http://www.cche.net/usersguides/applying.asp).
One factor limiting advancement of this gold standard has been the lack of a
better method for analyzing data from N-of-1 clinical trials. The statistical method is not well suited for
analyzing the results of N-of-1 clinical trials, especially when there are more
than two doses. Appendix A
illustrates how MQALA can be used to analyze N-of-1 clinical trials by measuring
the benefit/harm of treatment.
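To make the idea of a randomized N-of-1 trial with more than two doses concrete, here is a minimal Python sketch. It only builds a randomized dose schedule for one patient and summarizes outcomes by dose; it does not implement MQALA or the analysis in Appendix A. The dose levels, the simulated outcomes, and the function names are hypothetical.

```python
import random
from statistics import mean

def randomized_schedule(doses, n_blocks, seed=0):
    """Build a schedule in which each block contains every dose once,
    in random order."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        block = list(doses)
        rng.shuffle(block)
        schedule.extend(block)
    return schedule

def mean_outcome_by_dose(schedule, outcomes):
    """Average the outcome recorded in each treatment period, grouped by dose."""
    grouped = {}
    for dose, outcome in zip(schedule, outcomes):
        grouped.setdefault(dose, []).append(outcome)
    return {dose: mean(values) for dose, values in sorted(grouped.items())}

doses = [0, 10, 20]  # hypothetical dose levels (mg)
schedule = randomized_schedule(doses, n_blocks=4)

# Hypothetical outcomes: simulate a patient who improves by 0.2 units per mg.
outcomes = [5 + 0.2 * d for d in schedule]

profile = mean_outcome_by_dose(schedule, outcomes)
print(schedule)
print(profile)
```

A per-dose profile like this is only a descriptive summary. A measure of benefit/harm computed from the same time ordered data would go further by scoring the treatment-health association for the individual patient.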
MQALA enables a scientific means for personalized prediction
and optimal care for individual patients. MQALA helps enable scientific
treatment of individuals as individuals. The statistical
method helps enable treatment of individuals as group averages. Both
methods often need to be used together since each of us is both an individual
and a member of various groups.
Use of MQALA together with the statistical method can help provide the best of
both worlds in health and health care - optimal treatment of individuals that
is based largely on MQALA and optimal treatment policies for groups that are
based largely on the statistical method. Using the terminology of the dialectic
process, the thesis is the statistical method, the antithesis is MQALA, and the
synthesis toward which we should strive is the complementary
use of MQALA and the statistical method.
In practice, I would expect this synthesis to mean that
treatment episodes often would begin with provisional diagnoses and provisional
treatments in accord with the most relevant treatment guidelines that are
available. Then, especially when there is clinically significant uncertainty
about outcomes, treatment would be optimized based on actual measures of
apparent benefit/harm together with any additional relevant statistical
information. Treatment guidelines would be modified if necessary in accord with
the accumulation of new experience. This entire process would tend to “close
the loop” so that clinical guidelines would continually guide practice and
experience gained from practice would continuously update guidelines. This
process would become part of future health care systems as clinical research
and practice are integrated in accord with the new vision.
If treatment of each individual is optimized with help from
MQALA, average health is apt to improve. At the same time, use of the statistical method may be the best way to bring
experience gained from other patients to bear on the treatment of each
individual.
The development of treatment and health care guidelines is
becoming more challenging as diagnoses become more specific almost every day. Many
guidelines may need to become genome and history specific. Perhaps the only way
to obtain such guidelines is to integrate clinical research and clinical practice.
This requires advances in information technology both to create the guidelines
and to make the guidelines accessible when needed. Quality health care requires
intelligence that goes beyond the experience of individual clinicians. The
health care system itself will have to become intelligent.
Health care providers should anticipate that payers may not
continue to pay providers that do not actually measure the benefit/harm of
treatments when there is important uncertainty about outcomes. In addition, payers
could start to expect the use of randomized N-of-1
clinical trials when such trials are cost-effective. Such trials are the gold
standard of evidence based medicine and help provide valid measures of
benefit/harm. In addition, measurement of benefit/harm could be used to help
hold providers accountable for outcomes. This also could help shift payment
from payment for treatments and other procedures to payment for measured
outcomes.
Many payers are concerned about rising drug prices. Drugs
can be one of the most cost-effective means for improving health. But payers can
and should act to make drugs more cost-effective.
Supporters of the status quo in the pharmaceutical industry often
argue that drugs must be expensive in order to support the huge costs of drug
discovery and development. These supporters suggest that cutting drug prices
would be like killing the goose that lays the golden eggs. This view supports
certain interests of the statistical establishment and
its subjects, which include the pharmaceutical industry.
We must not kill the pharmaceutical industry. The
alternative is to make drugs more cost-effective by revitalizing the pharmaceutical industry.
Payers should challenge the status quo in the pharmaceutical
industry. More specifically, payers should challenge the pharmaceutical
industry to try DataSpeaks Interactions® for measuring the
benefit/harm of treatments and the mechanisms of health, disease, and treatment.
Such trials of the new methodology probably are the most important steps that we
can take now on the way to making drugs more cost-effective by making drug
discovery and development more productive and efficient.
The apparently growing expectation that every health problem
calls for an expensive professional solution could bankrupt payers, especially
if this expectation absolves people from certain responsibilities concerning
their own health and that of their loved ones. There might be limits to how
much collective payers such as governments, insurance companies, and employers should
pay to treat the consequences of overindulgent and self-destructive behaviors.
6.2.3. Patients, Potential Patients, and Lay Caregivers
Perhaps one of the most important steps that can be taken now
to improve the cost-effectiveness of health care is to empower patients and
their families to take more responsibility for their own health. New information
technology, scientific understanding, and scientific worldviews may converge
to help make patients, potential patients, and lay caregivers more responsible.
Leaders have a responsibility
to help make the required technology and scientific understanding available to
the public to improve health and the general welfare.
MQALA can
help empower people to act more responsibly. I will work from the very general
to some of the more specific implications of MQALA and DataSpeaks Interactions®.
As described in the responsible agency section, the MQALA scientific
worldview apparently supports holding people responsible for their behavior,
especially under certain conditions that knowledgeable leaders can help provide.
This contrasts with the prevailing scientific worldview, which tends to view
people as victims and passive respondents, controlled by their environments.
A worldview that enhances responsibility is fundamentally
important to improving the cost-effectiveness of health care. The prevailing
view appears to be that someone or something else is primarily responsible for most
of our health problems, both for causes and for cures. I suspect that many
professionals, including some tort attorneys, as well as some industries such
as the diet industry benefit from the prevailing view. But there are points
beyond which it is counterproductive to pass off responsibility and pay for
others to be responsible.
Everyone is an agent; indeed, even chemicals and
germs are agents. Some people are responsible agents. Some people are more
responsible than others and some people have more responsibilities. Our leaders
have particularly large responsibilities. Agents can be responsible for good
and bad.
Health effect monitoring can help make people more responsible.
Section 4.2.2.2 of Patent
6,317,700 describes health effect monitoring. Health effect monitoring is fundamentally
different from health monitoring in that only the former monitors both
independent (treatment) variables and dependent (health) variables and attempts to elucidate
causal and other predictive interactions for the monitored individual.
MQALA and DataSpeaks Interactions® help enable health effect monitoring at the level of
individuals. The independent variables could be measures of prescription drugs,
over the counter drugs, alternative and complementary medicines, dietary
components, allergens, pollutants, stress, and behaviors - almost anything that
can fluctuate in level over time for individuals, can be monitored over time,
and can affect health.
The dependent variables for health
effect monitoring could be laboratory variables; measures obtained with Web-enabled
health monitoring equipment; symptom and health rating scales including those
that are Web enabled; as well as mental, physical and work performance - almost
anything that can fluctuate in level over time for individuals, can be
monitored over time, and can be considered as a measure of health. Users would
specify whether higher levels of dependent variables were beneficial or harmful.
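The bookkeeping described above can be sketched as follows. This is only an illustrative sketch: the names `MonitoredVariable` and `orient_score` are hypothetical and are not taken from DataSpeaks Interactions®. It also assumes that a raw interaction score is positive when higher levels of the independent variable tend to be followed by higher levels of the dependent variable. The user's statement about whether higher levels are beneficial then determines the sign of the benefit/harm score.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MonitoredVariable:
    """One repeatedly measured variable for one individual (hypothetical structure)."""
    name: str
    role: str                          # "independent" (treatment) or "dependent" (health)
    higher_is_beneficial: bool = True  # only meaningful for dependent variables
    values: List[float] = field(default_factory=list)  # time ordered measurements


def orient_score(raw_score: float, dependent: MonitoredVariable) -> float:
    """Turn a raw interaction score into a benefit/harm score.

    Assumption: raw_score > 0 means higher independent levels go with
    higher dependent levels. Positive output = apparent benefit,
    negative output = apparent harm.
    """
    if dependent.role != "dependent":
        raise ValueError("benefit/harm is defined on dependent (health) variables")
    return raw_score if dependent.higher_is_beneficial else -raw_score


# Example: a pain rating, for which higher levels are harmful.
pain = MonitoredVariable("pain rating", "dependent", higher_is_beneficial=False)
benefit = orient_score(-2.5, pain)  # the drug lowers pain, so this is a benefit
```

Keeping the direction with the variable rather than with each analysis means the user specifies it once per variable.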
It would be best to collect health
effect monitoring data under conditions of experimental control such as randomized N-of-1 clinical trials.
This would help assure that the resulting scores are valid. However, when
experimental control and blinding or masking are not feasible, temporal
analysis parameters that are part of DataSpeaks Interactions® can be
used to evaluate the temporal criterion of causal and other predictive
interactions involving networks of variables.
DataSpeaks Interactions®
would be used to measure benefit/harm in health effect monitoring. Processing
the data with DataSpeaks Interactions® makes the data useful to
advance scientific understanding and improve decision making.
Health effect monitoring
can be a vital new tool for disease management programs and evidence based
medicine. Some disease management programs already emphasize individuality and
responsibility (http://www.healthmedia.com/research/strechers_insights.html).
DataSpeaks would take this further.
DataSpeaks Interactions® helps close the loop
involving clinical research and practice. Episodes of treatment and other
interventions for particular patients or clients would begin and be guided by
best available information from other people. If there is concern and
uncertainty about optimizing response, patients or clients could enter health
effect monitoring programs. The resulting information about benefit/harm could
be used both to optimize the care of particular patients and improve treatment
and care protocols for other patients or clients. Research and practice would
be integrated and the problem of translating research results to clinical
practice would be avoided.
Health effect monitoring may have particular power to
motivate behavioral change because the results are most directly relevant to
the person that the data are about. People often seem to consider themselves immune
from consequences of behavior that befall other people. In addition, results
from health effect monitoring may help modify behavior. Beneficial effects may
reinforce and harmful effects may punish the behaviors that produced them.
6.2.4. Consumer Driven, Market Oriented Health Care Reform
DataSpeaks
Interactions® can help enable consumer driven, market
oriented health care reform by, for example, measuring biological mechanisms
that can become disordered to cause dis-ease, measuring benefit/harm as well as
mechanisms of treatment effect, and enabling health effect monitoring as a
fundamental new tool for disease management and evidence based medicine.
Perhaps one of the most important implications of this
approach is that it enables a bottom up
approach to health care reform. With this bottom up approach, most individual stakeholders
can start to take specific steps now to reform health care in an atmosphere of
experimentation and shared learning. This could be called a Thousand Points of
Light approach to health care reform.
I already offered suggestions about what individual
clinicians and groups of clinicians could do both in the context of diagnosis and treatment.
I made additional suggestions in the context of drug discovery and drug development.
I also made suggestions for health care administrators and managers.
Health care providers should offer health effect monitoring services. Although taking
personal responsibility for health generally may decrease the demand of particular
individuals for professional services, providers that do the best job in
gaining recognition for helping to keep patients healthy are apt to attract
more patients when they need professional services. Health care will continue to require a lot of professional
services. Health care providers could, for example, make health effect
monitoring services available through their Web sites.
Payers have considerable leverage for reforming health care.
The sections on payers
include a number of recommendations. Payers, including charities, foundations
and governments, could channel some of their funds from feeding the beast that
has grown out of the old
vision for clinical research and practice to support pilot programs
that nurture the new
vision of integrated research and practice.
Patients,
potential patients, and lay caregivers also can help drive
cost-effective health care reform. They could use health effect monitoring services and
outcomes data while favoring providers accordingly. Various patient support
groups could organize and use health effect monitoring services both to help
optimize the health of their individual members and contribute to bodies of
scientific information about their health concerns.
In contrast to this bottom up approach, many health care
reform efforts have taken top down
approaches in which conflicting
interests of massive groups collide like galaxies to create great
turmoil but little organized or progressive change. Health care systems appear
to require central nervous systems. This requires information technology that
includes DataSpeaks Interactions®. Development of information
technology systems in health care should proceed with greater urgency as people
recognize that these systems may be at least as critical to clinicians’ tasks of providing quality health
care as they are to administrators’
tasks of processing transactions.
Most constituencies could start pressuring the pharmaceutical industry,
which has the potential to provide some of the most cost-effective treatments,
to reinvigorate itself by starting to measure biological mechanisms as well as
the benefit/harm of treatments and the mechanisms of treatment effect. Agencies
that regulate the pharmaceutical industry also need some encouragement to
foster reform.
Data processing infrastructure and services companies such
as IBM could help make DataSpeaks Interactions® software accessible
to the world with some confidence that their modest but visionary investments
will pay off.
6.3. Improving Public Health
Some of the most dramatic advances in health derive from
practices that have been applied to whole populations. Such public health practices
include providing safe drinking water and sanitary sewage disposal.
The previous market opportunity focused on how MQALA can help reform health care
through applying DataSpeaks
Interactions® to individual patients. In contrast, this section
focuses on how DataSpeaks Interactions® can improve public health when
applied to individual populations.
Populations in this section are collective entities or composite individuals
that are geographically defined.
The market for DataSpeaks Interactions® in
epidemiology is not large compared to the other six market opportunities
addressed by this document. However, this market could have large impact if
policies based on use of DataSpeaks Interactions® improve public
health. This could help advance DataSpeaks Interactions® as well as
health. In addition, this market provides additional opportunities. One
opportunity is to expand the market from epidemiology to clinical epidemiology.
Another opportunity might be a good strategy for advancing
DataSpeaks Interactions®. Although the statistical establishment is
our primary competition, it will be important for DataSpeaks to gain
recognition from the statistical establishment. However, achieving this recognition
can be tricky. Much depends both on how statisticians react and how those who
often defer too much to
statisticians react.
First, I will focus on some strategic considerations of
particular relevance to statisticians.
Multiple time series epidemiological data currently appear to
be some of the most difficult data for statisticians and epidemiologists to
process on the way to scientific understanding. However, over the last decade
or so, much time series epidemiological data has become available from devices
that monitor environments. Much time series data involves the health effects of
ambient air pollution. The Health Effects Institute, which is a partnership of the U.S. Environmental
Protection Agency and industry (http://www.healtheffects.org),
has been a leader in the use of such data. Many articles based on this work have
been published, including “Fine Particulate Air Pollution and Mortality in 20
U.S. Cities: 1987-1994,” which was published in the New England Journal of
Medicine (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=11114312&dopt=Abstract).
The methods used in such studies appear to represent the gold standard for
analyzing multiple time series epidemiological data. In addition, the
results of such studies have been used to establish air pollution policies. My understanding is that much of the data is
publicly available and could be used to demonstrate new data processing methods
such as MQALA.
Section 2.5 of Patent
6,317,700 illustrates how MQALA and the statistical method can be used alone or in
combination for epidemiological investigations. This section, which was written
as a hypothetical example before I was aware of ongoing work that used multiple
time series data in epidemiology, is a distinct alternative to the current gold
standard methods.
In brief, MQALA
would be used to measure interactions between daily air pollution levels and
daily death rates in each of several geographically defined regions. Then a
single group t-test on mean interaction scores would be used in an attempt to
reject the null hypothesis of no relationship. A similar use of the t-test is
shown in Demonstration 1
of Appendix B.
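The second step of this proposal, the single-group t-test, can be sketched as follows. The interaction scores in the example are invented placeholders, one per region, and are not MQALA output. The function name `one_sample_t` is also an assumption made for illustration. The sketch computes only the t statistic and its degrees of freedom; a p-value would come from a t table or from a statistics package such as SciPy (`scipy.stats.ttest_1samp`).

```python
import math
from statistics import mean, stdev


def one_sample_t(scores, mu0=0.0):
    """Single-group t statistic for H0: mean score equals mu0.

    Returns (t, degrees_of_freedom).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    standard_error = stdev(scores) / math.sqrt(n)  # stdev uses n - 1
    if standard_error == 0:
        raise ValueError("scores have zero variance; t is undefined")
    return (mean(scores) - mu0) / standard_error, n - 1


# Hypothetical interaction scores, one per geographically defined region.
region_scores = [1.2, 0.8, 1.5, -0.3, 0.9, 1.1]
t_stat, df = one_sample_t(region_scores)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

Each region contributes a single number to the test, so the inferential step stays the same no matter how long each region's daily series is.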
I propose comparing
the strengths and weaknesses of methods based on MQALA with the strengths and
weaknesses of the gold standard methods that have been published for
understanding relationships involving measures of air pollution levels and
measures of morbidity and mortality.
One basis for
comparing new and gold standard methods would be ease of use and understanding.
After measuring interactions with MQALA, the initial statistical challenge
reduces to performing single-group t-tests on means (see Demonstration 1). The
t-test is one of the simplest, most basic and widely understood tools for
statistical inference. As such, MQALA is not apt to increase demand for
statisticians. However, MQALA provides values of new
measures that often need to be processed in additional ways with the statistical method. This includes development of
mathematical models based on the new measures of interaction. This has the
potential to restore demand for statisticians. But this could be a major and
difficult change from the status quo.
Other bases of
comparison would include how well the different methods account for potential
confounders such as weather, for the effects of various pollutants that might
work together in linear or non-linear ways, for how pollutants might be related
to syndromes of events and for the temporal criterion of causal and other
predictive interactions. As suggested by Section 2.6 of Patent
6,317,700, MQALA has features to address many such issues. However, this
would require substantial additional work.
Second, I will focus on some strategic considerations that
could affect the advancement of MQALA from the perspective of those who often defer too much to statisticians.
Policy makers might favor the relative simplicity enabled by
MQALA. However, policy makers generally are slow to embrace the results of new
methods as a basis for making policies. Perhaps one of the most important
considerations here would be to get buy-in from statisticians who are experts
with the established methods. For such reasons, demonstrations with MQALA
should involve experts with established methods whenever possible. In addition,
demonstrations with other types of complex adaptive systems, perhaps most
especially brains,
could help advance DataSpeaks Interactions®.
Although the market for software in epidemiology is small,
the market could be expanded by extension into clinical epidemiology. In
addition to expanding into health care as discussed above, DataSpeaks Interactions® has many
potential applications that would help users understand how individual people
are affected by pollutants, allergens, dietary components, and other
environmental variables. Web enabled systems could help users to collect and
process data in accord with their own concerns.
6.4. Visualizing How Brains Work, Change, and Adapt
Perhaps the most promising market opportunity for
DataSpeaks, Inc. to pursue first involves use of DataSpeaks Interactions®
to visualize how brains work, change, and adapt. This, by
itself, would not be the biggest market opportunity. But it may be the most
strategic business opportunity. This opportunity appears to be a good choice
because there are a number of specific steps that can be taken now at
relatively low cost to help advance this measurement concept, gain leadership, develop
this market, and enter additional markets. Eventually, DataSpeaks Interactions®
would have to be developed so that it can meet validation requirements of
regulatory agencies such as the U.S. Food and Drug Administration.
DataSpeaks Interactions® can be applied to
currently available data to actually measure mechanisms of brain function.
Functional brain imaging provides some of the best data movies that are
available for multiple individuals. Techniques such as functional magnetic
resonance imaging (fMRI) and positron emission tomography (PET) are providing
data movies with good temporal and spatial resolution. In addition, data are
readily available from public sources such as www.fmridc.org.
Additional modalities such as electroencephalography (EEG)
and magnetoencephalography (MEG) provide data movies of brains in action.
DataSpeaks Interactions® can be used to measure interactions
involving measures of rhythmicity.
Perhaps the simplest initial demonstration would involve
application of DataSpeaks Interactions® to measure function in
individual brains. For this application, all the data would come from the brain
scans themselves - that is, both independent and dependent events would be
determined from the brain scan data alone as in Demonstration
3.
Measurement of brain function as interaction between and
among brain regions goes beyond measurement of action in brain regions. My impression
is that modalities such as functional MRI could be better described, for example,
as visualizing the “brain in action” despite the use of “functional” in
describing the imaging modality. DataSpeaks Interactions® would be
applied to these data to visualize coordination of action between and among
brain regions in accord with Patent
6,516,288, which involves action coordination profiles. Perhaps fMRI
would be described more accurately as “action MRI” to contrast it with
structure MRI. This has precedent (http://www.newhorizons.org/nhfl/about/cornerstone.html).
Each frame in a brain scan data movie is composed of a
number of pixels. Each pixel shows the level of activity in a volume element or
voxel of a brain or some nearby structure. Each pixel or voxel that represents a
particular region is a variable. Repeated measurements of each variable would
form a time series variable, one variable for each voxel. Each time series
variable would have the same number of repeated measurements as there are
frames in the data movie.
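The rearrangement from frames to time series variables can be sketched as follows. The sketch assumes a two-dimensional data movie held as nested Python lists (frames, then rows, then columns). The function name `voxel_time_series` is hypothetical, and real imaging data would normally be read from a scanner file format rather than typed in.

```python
def voxel_time_series(movie):
    """Turn a data movie into one time series per pixel/voxel.

    movie: list of frames; each frame is a list of rows; each row is a
           list of activity levels.
    Returns a dict mapping (row, col) to the list of that voxel's values,
    one value per frame, in frame order.
    """
    if not movie:
        return {}
    n_rows, n_cols = len(movie[0]), len(movie[0][0])
    return {
        (r, c): [frame[r][c] for frame in movie]
        for r in range(n_rows)
        for c in range(n_cols)
    }


# A toy movie: 3 frames of a 2 x 2 slice.
movie = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
    [[9, 10], [11, 12]],
]
series = voxel_time_series(movie)
print(series[(0, 1)])  # values of the top-right voxel across the 3 frames
```

Any pair of these series can then serve as the independent and dependent variables for a measure of interaction.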
Mechanisms of brain function would be visualized by
measuring the interactions between and among brain regions. In this context,
which involves neural circuitry, interactions or temporal contingencies often
are described with terms such as connectivity, interconnectivity, functional
connectivity, effective connectivity and pathways. These are the connections
and pathways that help define our identities as working individuals.
Currently, the relevant literature demonstrates substantial
interest in measuring connectivity, presumably in part because disordered or
diminished connectivity is thought to underlie many functional mental and
neurological disorders. However, to the best of my knowledge, attempts to
measure connectivity apply statistical or related mathematical techniques that
do not actually measure the interactions themselves as functions of relevant
analysis parameters in individual brains. Appendix B shows that interactions can be measured as
functions of analysis parameters that account for levels of the interactants,
the episodic nature of events, delays, and persistencies.
Demonstration 3 of Appendix B includes a small portion of results from a
preliminary demonstration for a small patch of motor cortex. These results
appear to provide strong evidence for layering. These results would appear to
help validate DataSpeaks Interactions®. The striking pattern
observed in these results calls for explanation whether or not the pattern
represents different levels of neural activity.
The use of DataSpeaks Interactions® to visualize
brain function could be done after some modification of the software
demonstrated in Appendix B.
The software would have to be ported to a higher performance computing
environment. Portions of the software would have to be modified to accommodate
more scores. More specifically, the current prototype exports scores and values
of the analysis parameters that yield each score to Excel. As such, the
prototype is limited by the number of rows in one Excel spreadsheet (about
65,000). In addition, it would be
necessary to program a new means for displaying measures graphically.
Appendix A
and Patent
6,317,700 identify additional features of MQALA, perhaps most especially Boolean events, that
could be used to define the presence or absence of an almost endless variety of
events involving assemblies of neurons. These features have the potential to
allow investigators to capture more of the complexity of how brains work,
change, and adapt.
For the sake of simplicity, I recommend limiting initial
demonstrations to two-dimensional cross-sections of brains or to brain slices.
I would start by recommending a cross-sectional image of results color coded so
that the magnitudes of apparently excitatory interactions would be presented
with one spectrum of colors and the magnitudes of apparently inhibitory
interactions would be represented with another spectrum of colors.
Alternatively, the visualization could use two colors displayed with different
degrees of brightness with zero value scores being black. Additionally, I would
recommend an interactive display in which a user
would use a cursor to identify a particular pixel in a brain slice. After a
particular pixel is identified, all other pixels would show how activity levels
in the pixel selected as the independent variable interact with activity levels
in all other pixels.
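The two-color variant of this display can be sketched as follows. The choice of red for apparently excitatory (positive) scores and blue for apparently inhibitory (negative) scores is an assumption made only for illustration, as are the names `score_to_rgb` and `render_slice`. The grid of scores stands for the interaction of one selected pixel with every other pixel in the slice.

```python
def score_to_rgb(score, max_abs):
    """Map one interaction score to an (R, G, B) color.

    Brightness is proportional to |score| / max_abs. Positive scores are
    shown in red, negative scores in blue, and zero is black.
    """
    if max_abs <= 0 or score == 0:
        return (0, 0, 0)
    level = round(255 * min(abs(score) / max_abs, 1.0))
    return (level, 0, 0) if score > 0 else (0, 0, level)


def render_slice(scores):
    """Color a 2-D grid of scores, scaling brightness to the largest |score|."""
    max_abs = max((abs(s) for row in scores for s in row), default=0)
    return [[score_to_rgb(s, max_abs) for s in row] for row in scores]


# Hypothetical scores for the interaction of a selected pixel with a 2 x 2 slice.
image = render_slice([[0.0, 4.0], [-4.0, 1.0]])
```

In an interactive display, the grid would be recomputed each time the user moves the cursor to a different pixel.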
This type of display would be one way of visualizing action
coordination profiles as described in Patent
6,516,288. In this case the interactions need not be causal because
the data were not collected under conditions that experimentally controlled
levels of activity in pixels used to define independent events. It may be
possible to achieve some degree of experimental control with techniques such as
transcranial magnetic stimulation.
The basic interactive display just described would show the
summary scores for each interaction. Each summary score probably would
summarize thousands of scores for that pixel depending on scoring options
selected by investigators. A relatively simple extension of the display that
would be based on capabilities of the available prototype software would allow
users to examine measures of interaction or connectivity as functions of all
levels of analysis parameters that were selected for use in a particular
scoring protocol. For example, the analysis parameter called delay could be
used to help investigate the temporal criterion of causal and other predictive
interactions. The PowerPoint
presentation included to supplement materials for Demonstration
1 includes graphs that show average measures of interaction as a function
of delay.
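The idea of examining an interaction as a function of delay can be sketched with a deliberately simple stand-in measure. This is not the MQALA scoring algorithm, which is specified in the patents and not reproduced here. The stand-in is a difference in conditional proportions: how much more often a dependent event occurs `delay` steps after an independent event than after its absence. The threshold used to define events and all function names are assumptions.

```python
def to_events(series, threshold):
    """Dichotomize a series: 1 where the level reaches the threshold, else 0."""
    return [1 if value >= threshold else 0 for value in series]


def contingency_at_delay(indep, dep, delay):
    """Stand-in contingency measure (NOT the MQALA score).

    P(dep event at t + delay | indep event at t)
      - P(dep event at t + delay | no indep event at t).
    Returns None when either condition never occurs.
    """
    after_event, after_no_event = [], []
    for t in range(len(indep) - delay):
        (after_event if indep[t] else after_no_event).append(dep[t + delay])
    if not after_event or not after_no_event:
        return None
    return (sum(after_event) / len(after_event)
            - sum(after_no_event) / len(after_no_event))


def delay_profile(indep, dep, max_delay):
    """Contingency measure for every delay from 0 through max_delay."""
    return {d: contingency_at_delay(indep, dep, d) for d in range(max_delay + 1)}


# Toy data in which each dependent event follows an independent event by one step.
indep = to_events([9, 1, 1, 9, 1, 1, 9, 1, 1, 9, 1, 1], threshold=5)
dep = to_events([1, 9, 1, 1, 9, 1, 1, 9, 1, 1, 9, 1], threshold=5)
profile = delay_profile(indep, dep, max_delay=2)
print(profile)  # the largest value appears at delay 1
```

A peak at a positive delay is consistent with the temporal criterion, since the independent event precedes the dependent event. By itself it does not establish causation.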
One important reason why the functional brain image analysis
application is particularly desirable as an initial market opportunity is that
the variables are localized in space in a manner that permits the type of visual
display just described. Such a display would be much more difficult with, for
example, time series microarray data because microarray variables generally are
not so clearly organized in space, at least at this time. Good visual displays
of interaction scores can help investigators understand what interaction scores
mean. This is important for a new technology that measures interactions, as the
term is used here, for the first time.
The basic type of demonstration just described for a
particular brain would lend itself to additional extensions that could help
sell DataSpeaks Interactions® to potential users. In particular, I
would recommend preliminary comparisons of different types of brains. For
example, corresponding slices could be compared for normal brains and brains of
people with Alzheimer’s disease. In this example, I would anticipate diminished
or disordered connectivity between the hippocampus and the cortex. Further
demonstrations of this type would benefit greatly from integrating DataSpeaks Interactions®
with software for statistical analyses.
DataSpeaks Interactions®
has the potential to be the basis for using available brain imaging technology
to provide a new and objective means for diagnosing functional brain disorders.
Such an effort probably would involve measuring interactions from thousands of
brains together with statistical analyses of the resulting measures. Effective
service as a diagnostic tool would be a good entrée to the larger health care market.
Companies that make functional imaging devices would be good
potential customers for DataSpeaks Interactions® because our
software has the potential to greatly increase demand for their devices. Some
of these companies such as General Electric and Siemens are active in the
broader medical informatics marketplace. Such companies could benefit from applications
of DataSpeaks Interactions® in health
care.
Companies such as Kodak that offer Picture Archiving and Communication Systems
(PACS) also are potential customers for DataSpeaks Interactions®.
Such companies potentially would have access to functional images from
thousands of patients as required to help make functional imaging a gold
standard for diagnosing functional brain disorders and monitoring responses to
treatments.
Functional brain image analysis with DataSpeaks Interactions®
also may be a good entrée into the pharmaceutical industry. Drugs for
disorders of the central nervous system are a large part of its market and have
much potential for growth. Many such drugs up or down regulate various
neurotransmitter systems. DataSpeaks Interactions® has the potential
to help elucidate such mechanisms by measuring how connectivity is affected by
drugs. Conceptually this would be quite simple - measure connectivity before
and after treatment with a drug or potential drug and analyze any changes in measures
of apparent connectivity.
Behavior
modification apparently involves changes in connectivity. DataSpeaks
Interactions®, combined with functional imaging, has potential value
for investigating the neural basis of learning and other forms of behavior
modification.
DataSpeaks Interactions® also has potential for
investigating the neural control of behavior. The most straightforward uses of this
application would involve defining independent events with brain imaging data
and dependent events with behavioral data.
I anticipate that scientific understanding of brains that
can be obtained with DataSpeaks Interactions® will help inspire
development of artificial
systems that can learn. MQALA,
the methodology embodied by DataSpeaks Interactions®, has a type of
face validity for this purpose. Brains
work largely through discrete events - all or none action potentials.
Similarly, MQALA works by measuring temporal contingencies between and among discrete
events. This suggests that MQALA might be particularly suitable for developing
mathematical models of how brains work, change, and adapt as well as for developing
artificial systems that work and adapt more like brains.
The current state of the art in functional brain imaging
often involves activation studies that are used to identify specific brain
regions that are affected by stimuli and tasks. Activation studies have been
enormously helpful in mapping brains. However, activation studies appear to be
of limited value for elucidating higher order phenomena such as attention.
It appears unlikely that attention
would be localized to specific brain regions. In contrast, attention to stimuli
may be more accurately represented by patterns of connectivity present during
periods of time around the time when stimuli are presented. Such patterns of
connectivity could involve many brain regions simultaneously. DataSpeaks
Interactions® has the potential to elucidate higher order phenomena such
as attention by measuring underlying patterns of connectivity in space and time.
Activation studies, by contrast, do not measure such
patterns.
In a broader context, human brains epitomize complex
adaptive systems. Data processing methods that work for brains might command
attention for application to other types of complex adaptive systems.
Brains clearly manifest all three types of mechanisms by
which complex adaptive systems have been said to work - function, response, and agency. Patent
6,317,700 includes specific claims that address all three types of
mechanism in the context of serial functional imaging.
Compared to the statistical
method, MQALA is superior in accounting for
individuality and time. Scientific methods that work best for average
time-invariant systems would appear to exclude much of what is most interesting
and valuable about brains. Interesting features for which MQALA would appear to
be superior include how brain actions are coordinated and how brains change
through development and aging as well as how brains adapt and show substantial
neuroplasticity.
DataSpeaks Interactions®
would help enable a logical progression in the evolution of the use of imaging
in medicine. Initially, the use of imaging was limited primarily to the visualization
of structures, which included broken bones and tumors. More recently, so-called
functional imaging has been extended to, for example, visualize and map brains
in action. The next step, enabled by DataSpeaks Interactions®, would
be to visualize how actions in various regions are coordinated to form entire
working, changing, and adapting systems. This appears to be analogous to what
can be accomplished with other types of complex adaptive systems.
6.4.1. Visible Brain, Visible Human
DataSpeaks
Interactions® can develop data movies of brains in action to visualize brain function
in terms of interactions between and among brain regions. As such, DataSpeaks
Interactions® can help make brains visible, not just as static
structures but as dynamic functioning systems.
Brain structure already is visible as part of, for example, the
Visible Human Project (http://www.nlm.nih.gov/research/visible/visible_human.html).
This project provides detailed 3-dimensional renderings of the human male and
the human female. Given that the Project is limited to the three spatial dimensions,
the Project could be described more specifically as the Visible Structural
Human Project.
Some years ago after discovering MQALA and first hearing of the Visible Human Project,
I began to speculate about a potential “Visible Functional Human Project.” Such
a project would add a fourth dimension, time, and could begin by addressing
function. The project also could go on to address additional aspects of work, namely response and agency as well as change, adaptation and individuality.
This new project also could go on to address nested hierarchies of systems at
various levels of understanding. A new “Visible
Working Human Project” could help organize scientific knowledge about what it
means to be human.
This would be a research agenda worthy of great nations.
MQALA made the conception possible and makes the agenda feasible.
6.5. Improving Prediction of Economies and Capital Markets
Appendix B
includes Demonstration 2 of how DataSpeaks Interactions® can be
applied to measure interactions between economic time series. Additional
demonstrations using these data suggest that DataSpeaks Interactions®
can be used to develop predictive indices as partially described in Section
4.4.3.8 of Patent
6,317,700. Extensions of this work will require investment in programming
for DataSpeaks Interactions® prototype software. DataSpeaks
Interactions® also can be applied to data about capital markets.
Although application of DataSpeaks Interactions®
to capital markets may do more to redistribute wealth than create wealth, this
application can increase the efficiency of capital markets. But redistribution
of wealth our way does have a certain appeal. Additional wealth could be used
to help finance other applications of DataSpeaks Interactions®.
Being rich is apt to be more fun when it is generally recognized that one deserves
to be rich.
Software that improves prediction of economies and capital
markets has the potential to be a substantial market, particularly through the
improved financial services that it can support. In addition, this application
could be developed in a proprietary manner. Here are some reasons why
DataSpeaks Interactions® may be advantageous for prediction compared
with prevailing practice. These comments offer some rationale for trying
DataSpeaks Interactions®.
Chaos appears to be a genuine component of economies, capital
markets, and other complex adaptive systems. However, there also is order.
DataSpeaks Interactions® can help users capitalize on this order.
Small margins of improved predictability can provide a substantial edge in
investment performance.
Those who favor the “random walk” hypothesis question
whether capital markets are predictable. Predictability may well be a matter of
degree. In addition, it may not be wise to conclude that capital markets are
not predictable to a greater degree until all reasonable predictive strategies
have been evaluated. Although a number of mutual funds started using artificial
intelligence a decade or so ago, none to my knowledge were based on measuring
interactions as described in Appendix
A.
The random walk hypothesis appears to be predicated on a
process similar to Brownian motion. Brownian motion describes the motion of an
extremely small particle as being essentially random because it is subject to
essentially random impingements of other small particles such as molecules.
Many capital market time series may look like random walks
primarily because there has been no adequate way to measure the “impingements”
of other variables. DataSpeaks Interactions® can be used to measure
such “impingements” by measuring interactions. These interactions really are
not just impingements but sustained and variable interactions. The random walk
hypothesis needs to be challenged by processing appropriate data movies with
DataSpeaks Interactions®.
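The argument about unmeasured "impingements" can be illustrated with a small simulation. This is a sketch under stated assumptions, not a demonstration of DataSpeaks Interactions®. It uses ordinary Pearson correlation as a stand-in for a measure of interaction, and the series, the noise level, and the one-period lag are all invented for the example. Each step of the simulated series follows the previous value of a second "driver" variable plus unexplained noise.

```python
import random
import statistics


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=2000, seed=0):
    """Return (driver, steps): each step follows the driver one period earlier."""
    rng = random.Random(seed)
    driver = [rng.gauss(0, 1) for _ in range(n)]
    # The step at time t depends on the driver at t-1 plus unexplained noise.
    steps = [driver[t - 1] + 0.5 * rng.gauss(0, 1) for t in range(1, n)]
    return driver, steps


driver, steps = simulate()

# Looking only at the series' own history: successive steps are nearly unrelated,
# so the cumulative series looks like a random walk.
own_history = pearson(steps[:-1], steps[1:])

# Accounting for the other variable: each step is strongly related to the
# driver value that preceded it.
with_driver = pearson(driver[:-1], steps)
```

Here `own_history` comes out close to zero while `with_driver` is large. The series is unpredictable from its own past yet largely predictable once the other variable is measured, which is the point made above about apparent random walks.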
This appears to contrast sharply with much current practice.
Some strategies for technical analysis are based on trying to read and project histories
of particular series almost as if they were trajectories. (The section on responsible
agency contrasts the prevailing scientific worldview, which emphasizes
trajectories, with the MQALA
scientific worldview, which emphasizes interactions.)
There is little reason why histories of particular variables
should be very predictive of the behavior of complex adaptive systems.
Predictability is more apt to reside in how different variables interact. Science
has made progress by investigating interactions between variables even though
it has lacked adequate methods to measure interactions for individuals. Until
these interactions are measured and the measures combined to make predictions, it
is going to be difficult to make good predictions about economies and for investment
decisions.
One aspect of efficient markets is that investors act on
information and data soon after it becomes publicly available. But much information
in publicly available data is not acted upon because no one has had a good way
to measure and visualize patterns of interaction in the data movies. The DataSpeaks advantage
lies in using the publicly available data more effectively.
The potential value of accounting for such impingements can also be
illustrated by the analysis of functional brain images obtained over a period
of time. Functional brain imaging can yield many potential variables - one for
each voxel corresponding to a particular brain region - much as many variables
can be used to investigate economies and capital markets. I will hazard a
guess that activity in any one voxel might appear to be somewhat like a random
walk or perhaps just rhythmic as long as one fails to account for other
variables such as stimuli, behaviors, physiological conditions, and the effects
of activity in other voxels or brain regions. Yet it seems unlikely that brain
activity is essentially random and without some degree of coordination. Brains
that did not coordinate would not have much survival value.
DataSpeaks Interactions® can help users account
for coordinated action and improve prediction. In many such cases, whether the
cases involve brains, economies or capital markets, progress on the problem of
prediction appears to involve measuring the interactions or temporal
contingencies with DataSpeaks Interactions® so that they can be
investigated scientifically.
The current state of the art in predicting economies and
capital markets appears to rely heavily on charting software. Such software often
can show multiple variables on a common time axis. This helps people visualize
interactions. But, as illustrated in the section about the benefits of developing data movies,
the task of accounting for many interactions simultaneously in one’s head soon becomes
overwhelming as the number of potential predictors increases.
The next steps in the development of software for prediction
are to actually measure interactions and account for many predictors
simultaneously in accord with their predictive power. Then people could do a
better job of predicting largely without looking at charts - much as clinicians
can now do a better job of diagnosing diabetes without tasting urine.
DataSpeaks Interactions®
probably will help make capital markets more efficient and less predictable. In
the meantime, there is a lot of money to be made by development partners and early
adopters.
6.6. Modifying Behavior
Temporal contingencies modify behavior. This is a quick
summary of classical conditioning, instrumental or operant conditioning, paired
associate learning, associative learning, extinction, habituation,
sensitization, and other processes that modify behaviors of organisms and other
complex adaptive systems. Section 4.2.6 of Patent
6,317,700 provides some additional information.
Behavior modification exemplifies adaptation and illustrates
why temporal contingencies matter and
need to be measured. In contrast to temporal contingencies, spatial
contingencies may not matter much unless there are interactions between and
among objects over time, that is, temporal contingencies. It appears as if evolution
might involve temporal contingencies working through mechanisms such as natural
selection that involve gene pools.
Temporal contingencies that describe how behaviors are organized
and how behaviors are modified generally involve stimuli and responses, which
are broadly defined here to include various types of events both internal and
external to individual systems. These include reinforcements and punishments. In
general, it appears as if different behavior modification processes can be
explained in terms of one type of contingency involving stimuli and responses
having capabilities to modify another type of contingency involving stimuli and
responses. For example, a temporal contingency between a conditioned stimulus
and an unconditioned response can change a temporal contingency between the
conditioned stimulus and a conditioned response. This type of example has been
called stimulus substitution.
DataSpeaks
Interactions® provides a fundamentally new way to measure
and describe temporal contingencies including those that involve stimuli and
responses. Appendix A
describes some features of this measurement system. As a new measurement system,
DataSpeaks Interactions® has potential to advance scientific
investigations of behavior organization and modification as well as
applications of the resulting knowledge in fields such as education, training,
and health care.
Bad, dumb, and uncivilized behavior is a big market opportunity.
But my impression is that some other market opportunities should be pursued
first.
6.7. Advancing Responsible Agency
Agents have effects on their environments, including people.
In addition, there appear to be certain conditions under which individuals can
be responsible for themselves and held accountable by others for their
behaviors. When these conditions are present, it is reasonable and often
productive to reward, honor, and punish responsible agents in accord with their
behaviors. Some people have opportunities to exercise responsible agency when
they vote for leaders expected to have favored effects.
Although it appears as if DataSpeaks Interactions®
can help advance responsible agency and that this could be the basis for a
large market, it may not be good strategy to begin by focusing on this market.
For one reason, this opportunity appears to be relatively less amenable to
specific business development steps that can be taken now. In contrast, other sections
of DataSpeaks.com focus on specific, concrete recommendations that should be
acted on now. As examples, DataSpeaks Interactions® can help revitalize the pharmaceutical
industry, reform
health care and visualize
how brains work and adapt.
I will address this opportunity because it might help
potential leaders understand what is at stake as they decide whether or not to
help advance DataSpeaks Interactions®. In addition, if any of this
discussion actually engages people, it is apt to create demand for DataSpeaks
Interactions® and improve human welfare.
6.7.1. Scientific Worldviews
Responsible agency involves issues that cut to the heart of
scientific worldviews or weltanschauungs. These issues often are discussed in
the context of determinism and free will.
Science helps shape worldviews. This includes the effects of
scientists such as Copernicus, Newton and Einstein together with their laws and
theories; discoveries such as germs;
inventions and tools such as microscopes and telescopes as well as methods such
as the statistical method.
MQALA
is a discovery that is being embodied as an invention, DataSpeaks Interactions®,
which is a new set of software tools. Furthermore, it appears to be based on a
rather distinct scientific worldview, which I will call the MQALA worldview. As
such, DataSpeaks Interactions® appears to have important
implications for ethics, personal practices, public policies, political platforms,
legal liability as well as accountability in medicine and other practices.
Both the prevailing and the MQALA scientific worldviews
emphasize measurement, objectivity, and experimental control to elucidate
causal relationships. Of the two, the MQALA view is more dependent on data and software. In
addition, MQALA has special implications for issues such as responsible agency.
I do not have any final answers about these great issues
that continue to challenge humanity. Now I am just trying to advance MQALA,
which is a methodology that we - you and I, my dear reader - can use to help
advance scientific understanding. Generally convincing answers to a number of
great issues do not seem to exist at this time. The real work is just
beginning. MQALA can help advance this work as we seek better answers. I do
hope to prime people to think and share their thoughts as we seek to understand
our role and future in the world.
This, very briefly, is how I am coming to view the history
of our world and scientific methodologies as I continue to develop MQALA. This
presentation emphasizes implications for responsible agency.
I visualize the history of our world as a vast river of
events over time, subject to quantum and relativistic effects that tend to
baffle me. Clusters of events that tend to cohere form various types of objects
such as atoms, molecules, planets, and stars. Objects follow trajectories and
generally behave in accord with the laws of physics and chemistry.
Already some differences between the two worldviews are becoming
apparent. The prevailing view tends to focus on objects that have come to be
understood quite well thanks to a lot of intellectual
heavy lifting, hypothesis
driven science and methods that work best for time invariant systems.
These methods work especially well either when systems have few parts that simply
follow trajectories or when systems have many essentially identical parts that
can be adequately investigated statistically as in statistical mechanics and
thermodynamics. These methods do not work as well to understand traffic on busy
highways when the vehicles largely are controlled by many individual and unique
drivers.
In contrast, the MQALA worldview focuses on events in a
river of time. The shift from objects to events is nothing new. Quantum
mechanics applies to events as when a photon is considered to be a quantum of
action rather than a particle or a wave in a medium. I don’t know if there ever
will be any connection between quantum events and discrete events as used by
MQALA and described in Appendix
A. However, both quantum mechanics and MQALA are fundamentally
probabilistic.
It appears as if a number of attempts have been made to use
quantum mechanical worldviews to explain phenomena such as responsible agency
and consciousness. But these attempts do not appear to be successful both in
explaining higher order phenomena and in advancing research agendas about how
complex systems work, change, and adapt. Speculation often
appears to outstrip data. The physical sciences do appear to provide some
opportunities for the emergence
and evolution of complex adaptive systems through processes such as genetic
mutation. Extreme sensitivity to initial conditions creates examples of chaotic
behavior.
According to the MQALA worldview, somehow over billions of
years at least on earth, complex adaptive systems began to emerge and evolve.
How all this occurs largely remains a mystery. But it is clear that there are
trillions of individual systems of millions of kinds, many of them nested in
systems of various degrees of inclusiveness all the way up to our biosphere.
Understanding these systems goes beyond understanding their trajectories.
Individuals with identities such as particular organisms, people,
species, populations, brains, economies, ecosystems, and other types of systems
and subsystems get organized as they come into existence, sustain themselves
from fractions of a second to millions of years, work, change, and adapt. Many
individuals disappear and some leave legacies. Some individuals adapt so much
that new types of individuals emerge, often retaining aspects of their former
selves so that yeast, mice, and humans share similarities. Simpler systems can
combine to form more complex systems. Hierarchies of systems emerge. People and
organizations of people design systems. Some individuals emerge that seek to
understand themselves, other individuals, and the world.
Being part of a river of events, individuals intermingle, swirl
in eddies, and flow. The flow can become turbulent. Different individuals are
brought into contact and may interact. People with different worldviews,
scientific and non-scientific, may interact.
The acceleration of history appears to be positively related
to the diversity and richness of interactions involving multitudes of different
individuals. Inventions such as the printing press and the Internet, organizations
such as universities, processes such as thinking and globalization and services
such as Google contribute to interactions in ways that appear to accelerate
history. We have reached the present time. We recognize that interactions
matter but we really have not adopted good methods for investigating mechanisms
of interaction over time.
It is here at the point where turbulence creates a great
diversity and richness of interactions involving individuals that the MQALA worldview
offers some of the clearest distinctions compared with the prevailing scientific
worldview. The MQALA worldview says that temporal
contingencies matter. Temporal contingencies are another way of describing
interactions or longitudinal associations. But the implications of the
difference between the two types of description appear to be profound.
The prevailing scientific worldview tends to focus on
immutable laws of nature that apply to everything everywhere. According to
these laws, objects follow trajectories. Objects are considered to interact as
if they were billiard balls. One great advance occurred when it was recognized
that objects are subject to relativity.
The Einsteinian worldview, a particular instance of the
prevailing scientific worldview, is that everything is determined and
predictable. Laws are contrasted with mere contingency. Einstein condemned
contingency by saying that nature does not play with dice. Instead of scoffing
at temporal contingencies, MQALA measures temporal contingencies to help make
them a subject matter for scientific investigation.
One implication of complete predictability from the
immutable laws of nature is that the horizon of unknown future possibilities apparently
would collapse into a single point as scientific understanding advances. This may
be one interpretation of what is illustrated at http://www.singularitywatch.com/index.html.
This contrasts with the MQALA worldview of the future as “an expanding horizon of possibilities,” apparently in
accord with Stephen Hawking in The
Universe in a Nutshell.
MQALA appears to support a relatively unpredictable and
expanding horizon of possibilities. The two opposing positions about
predictable possibilities are reminiscent of the old controversy about whether
the universe will continue to expand forever or if it will collapse. Although
the MQALA worldview does not offer full predictability, responsible agents are
able to create their futures with some regularity but no assurances. Functional relationships
and scientific laws based on measures obtained with MQALA would be
probabilistic laws because MQALA is fundamentally probabilistic.
I already mentioned Wolfram and A
New Kind of Science. Wolfram appears to offer a variation of the prevailing
scientific worldview. Instead of a world operating in accord with knowable and determinate
laws of nature that would enable complete predictability, Wolfram appears to
see the world unfolding in accord with the rules of programs similar to those
used by computers.
According to Wolfram’s variation of the prevailing view, the
future appears to be determined by programs. But the only way to know the
future may be to run the programs. Even simple programs can produce randomness.
This randomness is determined by rules. But all randomness need not be
determined by rules. Randomness can make temporal contingencies particularly
interesting because they appear to be productive in nature as illustrated by
how contingencies modify
behavior.
Wolfram appears not to see or anticipate that temporal
contingencies matter. Neither “contingency” nor “temporal
contingency” is indexed in A
New Kind of Science, which has an index of about 64 pages in quadruple
columns with fine print.
Wolfram does discuss responsibility in the Notes for Chapter
12. For Wolfram, responsibility appears to reduce to issues of “computational
irreducibility.” As such, responsibility and related higher order phenomena would
not appear to be amenable to programs of mathematical and scientific
investigation. Perhaps measurement, as illustrated by MQALA, connects
mathematics to the world, including higher order phenomena.
Scientific worldviews have consequences for the conduct of
scientific investigations. The prevailing view directs investigators to discover
the immutable laws of nature, primarily through hypothesis driven science. The Wolfram variant
directs investigators to discover programs that have produced the history and are
producing the future of our world. In contrast, the MQALA worldview would
direct investigators to collect data movies in accord with the general thrust of
data driven discovery
science. Then investigators would develop data movies by using DataSpeaks Interactions® to
measure patterns of regularity that describe mechanisms.
Perhaps the most pressing practical implication of the MQALA
worldview, which recognizes that temporal contingencies matter,
is that temporal contingencies need to be measured so that they can be
investigated scientifically. This requires the application and development of
data processing and communications infrastructures to collect and develop data movies. These
infrastructures would help us develop a practical scientific understanding of
the world. Development of this understanding would help shape worldviews and
improve human welfare.
6.7.2. Agency
MQALA
has additional consequences for agency and responsible agency. MQALA measures
temporal contingencies between independent events and dependent events (Appendix A). I already
pointed out the three types of mechanisms by which complex systems have been
said to work - function,
response, and agency.
Function describes mechanisms in which both the independent
and dependent events are internal to systems. As such, function can be said to
describe regulatory control or self-control.
Response describes mechanisms in which independent events,
including treatments, are external to systems. Dependent events can be defined
on conventional variables such as those that measure actions as well as
measures of interaction that describe mechanisms. Measures of response that use
measures of interaction as dependent variables describe how mechanisms are
changed.
Of these three types of mechanism, response is most akin to
the prevailing scientific worldview. The prevailing, deterministic scientific worldview
generally appears to treat people as passive respondents and makes them victims
of circumstance, controlled and subject to fate. People often are subject to
control when they are treated by doctors, educated by teachers, and punished by
courts. An emphasis on control often appears to be something that many people
dislike about behaviorism.
In addition, MQALA explicitly recognizes agency. Agency
describes mechanisms in which independent events are internal to systems and
dependent events are external. External events include behaviors that are
publicly observable including effects on measuring instruments. Agency includes
effects on other individuals. Agency also includes effects on other mechanisms
as when doctors prescribe drugs that up or down regulate physiological
mechanisms in a manner that might benefit or harm health. This increased and
explicit emphasis on agency appears to be a significant departure from
prevailing scientific worldviews.
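The three types of mechanism differ only in where the independent and dependent events sit relative to the system. A minimal sketch of that classification follows. The names `Location` and `mechanism_type` are hypothetical and chosen for this example. The text above does not restrict where dependent events lie for response, so the sketch treats any mechanism with external independent events as response.

```python
from enum import Enum


class Location(Enum):
    """Where an event is defined relative to the system under study."""
    INTERNAL = "internal"
    EXTERNAL = "external"


def mechanism_type(independent: Location, dependent: Location) -> str:
    """Classify a mechanism as function, response, or agency."""
    if independent is Location.EXTERNAL:
        # Independent events, including treatments, are external to the system.
        return "response"
    if dependent is Location.INTERNAL:
        # Both kinds of event are internal: regulatory control or self-control.
        return "function"
    # Internal independent events with external dependent events.
    return "agency"
```

For example, `mechanism_type(Location.INTERNAL, Location.EXTERNAL)` returns `"agency"`, as when activity in a brain region is the independent event and a publicly observable behavior is the dependent event.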
There appears to be ample evidence for agency. Consider how
the land now occupied by New York City has changed over the last few centuries.
Human agency affects environments. The questions are how and how much, not if. It
seems fitting to account for agency explicitly.
6.7.3. Responsible Agency
But can agents be responsible? The MQALA worldview
also offers a perspective on this.
There appears to be a difference between agency and
responsible agency. Examples of agents include germs, drugs, digestive systems,
people, and organizations of people. Of these, people and organizations of
people that have or include educated nervous systems apparently are more capable
of being responsible agents. Responsible agency appears to be understandable in
terms of hierarchical organizations of systems.
I already pointed out how systems can be investigated at
different levels of organization and understanding such as physical, chemical,
biological, psychological, social, and cultural. I emphasize this in Patent
6,516,288. I illustrated how interactions can be measured between
events defined at different levels of systems organization. For example, I
mentioned the neural control
of behavior in which action at an apparently lower neural level may
interact with action at an apparently higher behavioral level.
It also appears that higher levels may be able to act down
on apparently lower levels without violating any laws of nature. People create
engines not by violating the laws of nature but by applying the laws of nature.
It appears that people accomplish such feats by controlling temporal
contingencies, often with the aid of machines and devices. For the engine
example, these contingencies can involve physical features such as fuel-air
mixtures, compression ratios and ignition times that need to be controlled
repeatedly, rapidly, automatically and reliably. As an aside, I suspect that
MQALA could be used to make engine controllers more adaptable to variable contingencies
such as weather and variations in fuel.
In contrast to engines that control temporal contingencies,
some engineers help create structures such as bridges, dams, and buildings that
are designed and built to withstand temporal contingencies.
Perhaps of particular relevance for the application of MQALA
to investigate responsible agency is the way that MQALA accounts for patterns
of dynamic interaction that may be the key to describing the workings of higher
order phenomena. For example, I already described measurement and visualization
of patterns of connectivity in brains. Such patterns may be central to
understanding attention.
Similarly, MQALA may be useful in describing the workings of other higher order
phenomena such as self awareness, mind, and consciousness, which may be
important to understand responsible agency.
Responsible agents can be held accountable by others for
their effects. Internalized higher order phenomena that are part of the type of
mechanism called function help make people self aware and responsible for
themselves.
Some higher order phenomena such as self awareness may be
less like objects that can be located in space as with brain activation
investigations but more like particular types of perceived patterns of events
in space and time. Science itself has been described as finding patterns. DataSpeaks Interactions® measures
and visualizes patterns in space and time. Apparently unlike Wolfram’s A New
Kind of Science, DataSpeaks Interactions® can enable research programs
for investigating certain classes of higher order phenomena. Such research programs
can be initiated now. The first steps are to apply DataSpeaks Interactions®
to data movies of brains in action as already described.
The section on behavior modification described how
MQALA can account for an important form of adaptation. Adaptation that crosses
certain thresholds appears to account for emergence of species as well as higher order
phenomena. Responsible agency might represent a type of emergence in complex
adaptive systems. MQALA does appear to contribute to “an
expanding horizon of possibilities” that should be researched.
When I first said that contingencies matter in the Call for Leadership, I referred to the achievement of landing the
Spirit and Opportunity rovers on Mars. This achievement provides an opportunity
to contrast two scientific worldviews - the prevailing and the MQALA views.
The prevailing scientific worldview works best and is
validated primarily for what it can achieve by application of scientific understanding
in the physical sciences. This understanding helps us most with physical
phenomena such as those that involve trajectories, propulsion systems, and what
the physical composition of objects on Mars can tell us about the history of
our universe.
But all of this is just part of the story of the great landing
achievements. In addition to the successful landings themselves, we need to
account for the behavior, motivation, passion, know-how, and intelligence of the
people as well as the organizations, teams, and the culture that made these
achievements possible. The Mars landings are accomplishments of complex
adaptive systems at work.
Great achievements can be contrasted with great disasters such
as the Columbia shuttle and the World Trade Center disasters. Better scientific
understanding of complex adaptive systems can make for more achievements and
fewer disasters.
Nobel Prize winners are honored as if they were responsible.
However, the prevailing scientific worldview appears to be inconsistent with responsible
agency. I anticipate that MQALA and DataSpeaks Interactions® will
enhance the value of honor as the software helps us to understand complex
adaptive systems scientifically.
People and humanity have many choices when the world is
viewed as “an expanding
horizon of possibilities.” Instead of merely seeking to know the
world as it is, we have the choice of accepting responsibility for creating the
future. Hopefully we will pursue this future in accord with fundamental human
values and continue to seed the world with intelligence.
6.7.4. Leadership
Since DataSpeaks.com is opening with a call for leadership,
I will take this opportunity to make a few points about leadership in the
context of responsible agency. The first comments apply to leaders who are
potential customers and might be interested in certain benefits of DataSpeaks Interactions®.
Leaders who, for example, might want to hold clinicians
responsible for the benefit/harm of treatments, the measurement of which is
illustrated in Appendix A, would have a responsibility
to help provide conditions so that clinicians can measure the benefit/harm of
treatments. This includes making DataSpeaks Interactions® available
as part of an adequate data collection, processing, and communications infrastructure.
Similarly, leaders who want to hold people more responsible for their own
health to reduce cost burdens on collective payers have a responsibility to
help provide the required infrastructure. In both cases, this would be similar
to our general acceptance of responsibility to provide schools and teachers for
educating our children (no slight intended) so that children can grow up to be
responsible and effective agents.
Punishments of inadequate, dumb, and untoward behaviors are
not apt to be sufficient when conditions for the desired alternative behaviors
are not available. It can be mean for knowledgeable leaders to hold people
responsible without providing the conditions for them to behave more
responsibly. Leaders, including politicians, have obligations to help provide
suitable conditions. Leaders who execute plans in accord with their
knowledge are apt to be rewarded accordingly.
The remaining comments apply to potential leaders of
DataSpeaks, Inc.
DataSpeaks is giving potential leaders opportunities to be responsible
agents by leading DataSpeaks, Inc. It is in this context that I will quote
James Clerk Maxwell from his essay: Determinism and Free Will (1873). This and
related quotes appear on the Web site of the Princeton Plasma Physics
Laboratory (http://w3.pppl.gov/~hammett/Maxwell/freewill.html).
Maxwell may be best known for his four partial differential
equations of electromagnetism. The following quote suggests that he also
anticipated chaos and offered some good advice for potential leaders. I put the
last line in italics to emphasize his good advice.
Quoting Maxwell: “For example, the rock loosed by frost and
balanced on a singular point of the mountain-side, the little spark which
kindles the great forest, the little word which sets the world a fighting, the
little scruple which prevents a man from doing his will, the little spore which
blights all the potatoes, the little gemmule which makes us philosophers or
idiots. Every existence above a certain rank has its singular points: the
higher the rank the more of them. At these points, influences whose physical
magnitude is too small to be taken account of by a finite being, may produce
results of the greatest importance. All
great results produced by human endeavor depend on taking advantage of these
singular states when they occur.”
The discovery of MQALA
appears to be a “singular state” that I trace back to work I did while
attempting to create a way to analyze health diary data over 20 years ago. I
pursued this effort after reading that no good method existed - an opinion
offered by a leading health diary researcher who also had an appointment in a
statistics department.
Much like the examples cited by Maxwell, MQALA appears to have
unanticipated implications that extend far beyond the triggering event. (For
example, I started to write a 3-page document. See what happened on this Web
site.) These implications deserve to be pursued. Now the impact and success of
DataSpeaks Interactions® depend primarily on leadership.
I have encountered many sources of resistance to MQALA and
DataSpeaks Interactions®. I have learned from some of them. I will
mention some sources of resistance. These points will affect the way I will
spend my time and help choose leaders for DataSpeaks, Inc.
I already described my experience with the statistical
establishment. I will avoid repeating some of these experiences.
Nevertheless, I would be honored to publish with expert statisticians and
experts in other methods of empirical induction once they express real
interest in MQALA and are willing to share the responsibilities and rewards of
leadership.
I used to think that demonstrations of DataSpeaks
Interactions® would help sell the software. Based on this thinking, I
spent over $100,000 out of my own pocket to develop the prototype. So far, I
have been wrong.
The limited effect of demonstrating DataSpeaks Interactions®
has become quite understandable. I already made reference to the discovery of germs. I know little of the
history of those who invented, developed, and used microscopes. But I can
imagine what these inventors might have gone through. Most people upon seeing
germs for the first time, especially around the time that germs were being
discovered, probably would say “So what?” and go about their routines. But a
few people kept looking, working, and understanding more. It took time for the
microscope market to develop.
We have been able to see germs and parasites for a long
time, but the work continues as we still seek to develop practical scientific
understanding. Understanding developed so far and the products of this
understanding have extended millions of lives. Great accomplishments often
require persistent hard work. It may take time for the market for DataSpeaks
Interactions® to develop. But it might also be possible to create an
avalanche effect.
I suspect that most people who look at Appendix B will say “So what?”
The physician/researcher
who asked why it was important to measure the benefit/harm of treatment was
asking the “So what?” question. The answer to the “So what?” question is that
measurement enables scientific investigations and practical applications of
results to improve economic productivity and human welfare. I’m looking for a few good people with
curiosity, guts, know-how and resources to actually work with the software so
that they can answer the “So what?” question to their own satisfaction. Such
thought leaders will make the market grow.
DataSpeaks Interactions®
makes interactions or temporal contingencies visible in ways that they have
never been seen before. Compared with germs, interactions
are more abstract and potentially more difficult to
appreciate when seen. But if a few leaders succeed, many other people are apt
to follow. That would help create a market. The first and best leaders will
have the biggest advantage.
Development and use of the prototype software did help
solidify proof of concept for me so that I could continue working to advance
DataSpeaks Interactions® with passion and vigor.
Scientists who investigate complex adaptive systems often
speak of interactions. But scientists who really need to measure interactions
in order to investigate their subject matter more scientifically and advance
their careers generally refer
me to statisticians. This behavior has become a reliable predictor of failure.
Statisticians do not measure interactions or temporal
contingencies over time and across variables for individuals any more than they are
primary creators of, as examples, laboratory tests in medicine or microarrays.
This falls outside the lens of their
experience. Statisticians are good at analyzing the values of
measures, once variables are measured, to describe groups and to make
inferences from representative samples to populations. DataSpeaks Interactions®
itself is validated by demonstrations such as those shown in Appendix B.
Potential business leaders that I have met locally appear to
be waiting for someone else to lead by expressing demand in a way that would
validate DataSpeaks Interactions® as part of a business concept. Business concepts do need to be
validated. But leadership to provide validation of business concepts may
require guts, curiosity, persistent hard work, and actually working with the
software through scientific, technical, and academic thought leaders that business
leaders trust.
I have worked hard and know that I need help. Now I am
casting a bigger net on the Web with DataSpeaks.com. Many people have suffered
and died because of some combination of my ineffectiveness as a leader and the
unresponsiveness of my audience. (See my discussion of levels.)
The extensive material on DataSpeaks.com provides many
opportunities for me to be found wrong on various particulars. But if
DataSpeaks.com includes some modicum of fundamental innovation and original
truth, DataSpeaks Interactions® deserves to be tested. I have
already experienced people who have tried to nitpick me to death, which
generally is understandable but not productive. I would prefer to avoid
reliving these experiences.
DataSpeaks Interactions® is quite easy to test. I
already described the brain
visualization example.
Fortunately, DataSpeaks Interactions® is software, not a
biopharmaceutical that needs to be manufactured and tested in clinical trials. Lives
and people’s health do not need to be put at risk. In addition, DataSpeaks
Interactions® can be applied to other people’s data. Data are
readily available. All this minimizes business risk. Much additional work needs
to be done, including continued development of the intellectual property
portfolio. But research and development costs can be controlled.
I do enjoy engaging people on the issues. I want to engage
thought leaders with resources and who are personally invested in their own
data for projects that can lead promptly to concrete results that will advance
DataSpeaks Interactions®. At the same time, I want to engage
business leaders who can help make DataSpeaks, Inc. into an outstanding and
successful company.
There are some people who will avoid DataSpeaks because I
lack sufficient authority on the weighty matters discussed on DataSpeaks.com.
To these people I offer a timely reminder. Lord Kelvin said "Heavier-than-air flying machines are impossible." Other authorities
echoed the same opinion. Authorities can be wrong. The Wright Brothers, working
from a bicycle shop, helped prove that some authorities were wrong.
6.8. Reinvigorating Machine Learning and Artificial Intelligence
Section 4.2.6.4 of Patent
6,317,700 includes specifications for a demonstration learning
robot. These specifications are an extension of material presented and referred
to in the behavior
modification section. This demonstration would just begin to show
what is possible with respect to applying DataSpeaks Interactions®
to create artificial learning systems.
It may be relevant to note that I had considerable
difficulty with the patent specifications for the demonstration learning robot
until I introduced the distinction between long-term and short-term memory.
This is an important distinction in real brains.
The Sony AIBO (http://www.us.aibo.com/)
might provide one way to pursue this market opportunity. The first generation
AIBO was released in 1999, well after the patent specifications were written. AIBO
has sensory, motor, and computer processing capabilities that go far beyond
those mentioned in my patent specifications. It is possible that a version of
DataSpeaks Interactions® could be made available to AIBO enthusiasts
to promote both AIBO and DataSpeaks Interactions®. DataSpeaks
Interactions® may well have the capability of enhancing AIBO’s
ability to modify its behavior in accord with its experience in its
environment. AIBO could be a good platform for demonstrating applications of MQALA for machine learning.
DataSpeaks Interactions® embodies MQALA, a computational
method of empirical
induction. Methods of empirical induction are methods for drawing
generalized conclusions and making predictions from data. The ability to draw
generalized conclusions and make predictions from experience is part of
intelligent behavior.
DataSpeaks Interactions® can make an important
contribution to artificial intelligence. I have given a number of examples of
how DataSpeaks Interactions® uses computation to do that which
people often do in their heads. As examples, I described that people and other
organisms often learn from “what
follows what,” one way of describing temporal contingencies. I
described how clinicians can form subjective impressions about the benefit/harm
of treatments from information about how individual patients respond to drug challenge, de-challenge,
and re-challenge. These are examples of behavior often considered to
be intelligent.
The section on predicting economies and capital markets
provided additional examples of how DataSpeaks Interactions® helps
with achievements generally considered to require intelligence. For example,
DataSpeaks Interactions® can go beyond charting software by
measuring potentially predictive interactions that people generally try to
judge in their heads. Appendix B
provides some examples of measuring interactions involving economic data.
The predictive indices feature of MQALA that is described in
Patent
6,317,700 covers various aspects of intelligent behavior. The
feature starts by measuring interactions and identifying conditions that
provide the most predictive power. It combines information from multiple
predictors into a single predictive index. It can measure predictive
performance and adapt to improve predictive performance. It has the potential
to differentially weight predictors in accord with predictive power and select
optimal subsets of predictors. The performance of the whole system adapts
automatically as relationships between predictor and predicted variables change
as economies respond to external conditions including economic policies.
DataSpeaks Interactions® helps make it possible to predict economic
variables without knowing economics.
Given various scandals such as those involving security
analysts and equity analysts, it often may be valuable to let the data speak in
accord with objective, reproducible, transparent operations that can be
specified in protocols. This would raise some professional standards of
operation to be more like scientific standards of operation.
I already described how DataSpeaks Interactions®
can be used to help visualize
and understand how brains work, change, and adapt. I anticipate that
this understanding can help inform the development of artificial systems for
machine learning and artificial intelligence. I also anticipate that
development of artificial systems can enhance understanding of real systems
much as mathematical models can aid understanding. Similarly, DataSpeaks
Interactions® could contribute to a mutually beneficial relationship
between systems biology and synthetic biology.
Much of empirical science can be described as learning from
experience recorded as data. However, at this time in history, there appears to
be a major chasm between the methods of formal science and the methods of
natural learning. Most of our brains do
not work by performing mental t-tests and analyses of variance. In contrast, we
often learn from temporal contingencies. MQALA can help bridge the
chasm between the methods of formal science and the methods of natural
learning. Perhaps this, more than anything else, will help get us out of data
swamps.
It is not clear how scientists and practitioners will fare
after designed systems start doing more of the intellectual heavy lifting that
helps keep us busy now. But in the meantime, there are a lot of discoveries to
be made and services to be provided that can improve economic productivity and
the general welfare.
7. Acknowledgements
I will make some key acknowledgements in advance just in
case this call for leadership succeeds.
First and foremost, I would like to acknowledge my wife,
Cathy Gofrank, and my extended family for putting up with my distractions and being
supportive of me over the years. Special thanks to my children, Stephon,
Alexander, Curtis, and Christie who are doing great despite my distractions.
Recently Christie, age 10, found a “Payday” sticker and put it on a piece of
paper to me between “When is” and a question mark.
I acknowledge
Farideh Bagne, mother of my two oldest children, who helped fire
my ambition but lost patience with my efforts.
I acknowledge Dr. Gordon Guyatt whose publication of an
article on randomized N-of-1 clinical trials in the 1986 New England Journal of
Medicine gave me hope during a difficult time.
I acknowledge Mickey Katz-Pek of Biotechnology Business
Consultants for her support and assistance to me and other entrepreneurs. I
acknowledge Robert Palmerton for assisting during the early unsuccessful years
of trying to turn this technology into a business.
I acknowledge Jill Goldberg who worked so well with me to
engineer the prototype software as a work for hire while she was at Cognitive
Bionics.
I acknowledge David H. Brenner and Tom Edwards of IdeaWorks
LLC for one long meeting that inadvertently initiated most of the material on
this Web site. David chairs the 2003-2004 Great Lakes Entrepreneur’s Quest.
I acknowledge Phillip Covington
for helping to make this Web site a reality.
Finally, I acknowledge Google.
Browsing creates contingencies that have helped advance my work, reinforcing my
conviction that contingencies
matter. Often I have been amazed by what I fail to find. Why aren’t
people using apt phrases such as “computational measurement software,”
“temporal contingency analysis,” “multiple N-of-1 clinical trial design,”
“advancing responsible agency” and “educated nervous systems” along with other
such phrases that are a prominent part of DataSpeaks.com? Why are there about
33,300 hits for “complex adaptive system” but no hits for “simple time invariant
systems,” which appears to be a reasonable contrast? Anyway, thank you, Google!
APPENDIX A
How to Develop Data Movies - A Primer on
How DataSpeaks Interactions® Works
MQALA and DataSpeaks Interactions® work by a basically simple, iterative
process. I’ll start by describing the data that are to be processed. I have
referred to this process as developing
data movies and as temporal contingency analysis.
A data movie consists of a chronological series of data
snapshots. Each snapshot corresponds to a set of values obtained from measurements
made at particular times or measurement occasions. Each picture element or
pixel corresponds to one variable. Ideally, all variables relevant to
particular systems of interest and types of operation are measured often and
periodically. Generally, if you don’t know if a variable is relevant, try to
include it to the extent that data collection and processing resources allow -
results obtained with DataSpeaks Interactions® can help users
determine if and how variables are relevant.
DataSpeaks Interactions® measures interactions or
temporal contingencies. Measurement of interactions requires at least two
variables or sets of variables. However, data movies of many systems can easily
have hundreds or thousands of variables. DataSpeaks Interactions®
can be applied to hundreds or thousands of variables.
DataSpeaks Interactions® works by two major
steps. First, it uses the data to determine the
presence or absence of discrete events at particular times over periods of
time. This contrasts with much conventional software that continues to work
with dimensional variables. Discretization is crucial. We need to define events
in order to measure temporal contingencies between events that describe how
systems work, change, and
adapt.
Second, DataSpeaks Interactions® uses
probabilities and other simple mathematical operations to measure the temporal
contingencies between discrete events for two variables or sets of variables.
This process can be repeated millions of times to cover all interactions of interest.
It helps that key measures, which quantify the amount and direction of evidence
for temporal contingencies, are standardized with respect to all possibilities
given the data and a specified scoring protocol. After this, DataSpeaks
Interactions® works primarily by summarizing and displaying the
results.
Here are a few more details. DataSpeaks Interactions® essentially
begins by using simple binning processes to define potentially hundreds or
thousands of types of discrete events on each variable or set of variables. For
example, with DataSpeaks prototype software, I commonly use up to eight
analysis parameters simultaneously to define discrete events. Additional
parameters are possible. Such parameters account for levels of the variables,
temporal aspects such as delay and persistence of any interaction, and the
episodic nature of many events. Discrete events are determined to be either
present (1) or absent (0) on most measurement occasions to form a time series
of ones and zeros for each type of event.
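The binning step can be pictured with a minimal sketch. It assumes that one type of event is defined per level as "the value is at or above that level." The function name `level_events` and this thresholding rule are illustrative assumptions, not the procedure specified in the patents.

```python
def level_events(values, levels):
    """Return {level: series of 0/1}, with 1 where the value is at or above the level.

    Assumption for illustration only: one event type per level, defined by a
    simple "at or above" threshold.
    """
    return {level: [1 if v >= level else 0 for v in values] for level in levels}


# Hypothetical daily doses for one patient.
doses = [0, 25, 50, 75, 50, 0, 100]
events = level_events(doses, levels=[25, 50, 75])
for level, series in events.items():
    print(level, series)
```

Each level yields its own time series of ones and zeros, so adding levels multiplies the number of event types that can be evaluated.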
Our experience in going from analog to digital devices
suggests that discretization or digitization of data by forming series of ones
and zeros to represent the presence and absence of discrete events is a
reasonable option. The process need not result in any loss of information. Although
analysis parameters can have many levels to account for all of the information
in the data, this often becomes a waste of computational resources after
analysis parameters are represented with about 7 to 12 levels. The binning processes
used here suggest a new form of digital data processing in the temporal domain.
Here is a simple example of defining treatment and health
events starting with the assumption that an investigator has a data movie with
only two variables, daily drug dose and daily blood pressure, over 100 days for
an individual patient. Assume that the data were collected from a randomized N-of-1 clinical trial
with a range of at least several different doses. One type of treatment event
is present, for example, if daily dose is 50 or more on at least 5 out of 7
consecutive days. A type of adverse health event is present if systolic blood
pressure is 140 or more on at least 2 out of 5 consecutive days.
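The two event definitions above can be sketched as follows. The function name `episode_events`, the choice to assign each window to its last day, and the use of `None` for days without a full window are assumptions made for illustration; the data values are hypothetical.

```python
def episode_events(values, threshold, criterion, length):
    """Mark an event on each occasion that ends a window of `length` consecutive
    occasions in which at least `criterion` values are at or above `threshold`.

    Occasions without a full window behind them are marked None (assumption).
    """
    hits = [1 if v >= threshold else 0 for v in values]
    series = []
    for t in range(len(hits)):
        if t + 1 < length:
            series.append(None)  # not enough history for a full window
        else:
            window = hits[t + 1 - length : t + 1]
            series.append(1 if sum(window) >= criterion else 0)
    return series


# Treatment event: dose of 50 or more on at least 5 out of 7 consecutive days.
dose = [50, 50, 0, 50, 50, 50, 25, 50, 0, 0]
treatment = episode_events(dose, threshold=50, criterion=5, length=7)

# Adverse health event: systolic pressure of 140 or more on at least 2 out of 5 days.
systolic = [130, 145, 150, 128, 135, 132, 138, 125]
adverse = episode_events(systolic, threshold=140, criterion=2, length=5)

print(treatment)
print(adverse)
```

Here `length` plays the role of episode length and `criterion` the role of episode criterion; varying both, together with `threshold`, generates many event types from a single variable.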
Users don’t have to know in advance what levels define
events that account for the strongest interaction because users can evaluate
hundreds or thousands of types of such events simultaneously. Users can select
additional analysis parameters to investigate delay and persistence of apparent
drug response. If in doubt about
including or adding an analysis parameter or increasing the number of levels of
any analysis parameter, do so with certain caveats to the extent that computing
resources allow. Results will help tell users how all selected analysis
parameters and levels might be relevant to any interaction.
After discretization, DataSpeaks Interactions® can
measure the interaction between the two variables, drug dose and blood
pressure, for all selected types of discrete events. Assume that in this
example, the user would set DataSpeaks Interactions® to indicate
that higher levels of blood pressure were bad. One step is to cross-classify
all the series of ones and zeros for dose with all the series of ones and zeros
for blood pressure. This yields an array of 2 x 2 tables. The array would have
the same number of dimensions as analysis parameters that the user selected. Such
arrays can easily have thousands of entries as illustrated in Appendix B.
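A minimal sketch of the cross-classification step, assuming only two analysis parameters (one level for dose and one level for blood pressure). The cell labels a, b, c, d, the function name `two_by_two`, and the data are illustrative assumptions.

```python
def two_by_two(iv_events, dv_events):
    """Cross-classify two 0/1 series into the counts (a, b, c, d):

    a = independent event present, dependent event present
    b = independent event present, dependent event absent
    c = independent event absent,  dependent event present
    d = independent event absent,  dependent event absent

    Occasions where either series is None are skipped.
    """
    a = b = c = d = 0
    for iv, dv in zip(iv_events, dv_events):
        if iv is None or dv is None:
            continue
        if iv and dv:
            a += 1
        elif iv:
            b += 1
        elif dv:
            c += 1
        else:
            d += 1
    return a, b, c, d


# Hypothetical data: daily dose and daily systolic blood pressure.
doses = [0, 0, 50, 50, 100, 100, 0, 50]
pressures = [150, 148, 140, 138, 130, 128, 152, 139]
dose_levels = [50, 100]
bp_levels = [135, 145]

# One 2 x 2 table per location in a two-dimensional array of levels.
tables = {
    (dl, bl): two_by_two(
        [int(x >= dl) for x in doses],
        [int(p >= bl) for p in pressures],
    )
    for dl in dose_levels
    for bl in bp_levels
}
for location, table in tables.items():
    print(location, table)
```

With more analysis parameters, such as delay or episode length, the dictionary key would simply gain more entries, one per parameter.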
Each 2 x 2 table in an array for this example is used to
compute values of measures (scores) for each location in the array. Each score
measures either the strength or the amount of evidence for an interaction under
the conditions specified by the location of the score in the array. Positive
scores would indicate apparent benefit. Negative scores would indicate apparent
harm.
The primary scores in this example quantify the amount of
evidence for benefit and harm. Each of these scores is standardized with respect
to all possible 2 x 2 tables given the marginal frequencies of the particular
observed 2 x 2 table. DataSpeaks Interactions® currently
standardizes scores to have mean 0 and standard deviation 1. Appendix B shows that this
procedure can yield large positive and negative scores with low probabilities
of occurring by chance alone. In contrast, the strength scores are not
standardized and can range in value from -1 to +1.
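The idea of standardizing with respect to all possible 2 x 2 tables with the same marginal frequencies can be sketched as follows. This is a stand-in, not the scoring procedure defined in the patents: it assumes the raw score is simply the count in cell a (both events present) and that the possible tables are weighted by their hypergeometric probabilities. The function name and the sign convention flag are also assumptions.

```python
from math import comb, sqrt


def standardized_score(a, b, c, d, dv_event_is_bad=True):
    """Standardize cell a against every 2 x 2 table sharing the observed margins.

    Assumptions (illustration only): raw score = count in cell a; possible
    tables are weighted hypergeometrically. The result has mean 0 and standard
    deviation 1 over those tables. The sign is flipped when the dependent
    event is bad, so positive scores suggest benefit.
    """
    n = a + b + c + d
    iv_present = a + b
    dv_present = a + c
    lowest = max(0, iv_present + dv_present - n)
    highest = min(iv_present, dv_present)
    total = comb(n, iv_present)
    # Probability of each possible value of cell a with the margins held fixed.
    probs = {
        k: comb(dv_present, k) * comb(n - dv_present, iv_present - k) / total
        for k in range(lowest, highest + 1)
    }
    mean = sum(k * p for k, p in probs.items())
    variance = sum((k - mean) ** 2 * p for k, p in probs.items())
    if variance == 0:
        return 0.0  # only one table is possible, so there is no evidence either way
    z = (a - mean) / sqrt(variance)
    return -z if dv_event_is_bad else z


# Hypothetical table: the bad event never occurred on treated occasions.
score = standardized_score(0, 5, 3, 0)
print(round(score, 3))
```

Because each table is compared only with tables that have the same margins, scores from different locations in an array are on a common scale and can be compared directly.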
Arrays of standardized scores are easily summarized by
selecting extreme values to identify the conditions that provide the most
evidence for interactions. Interactions can be summarized and visualized as functions of any analysis parameter such as
dose or any combination of analysis parameters such as dose and delay.
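Summarizing an array by its extreme values might look like the sketch below. The scores are made-up numbers keyed by a hypothetical (dose level, delay) location, and taking the score with the largest absolute value is an assumed reading of "selecting extreme values."

```python
# Made-up standardized scores keyed by (dose level, delay in days).
scores = {
    (50, 0): 1.2,
    (50, 1): 3.4,
    (100, 0): -0.8,
    (100, 1): 2.1,
}

# Location with the most evidence for an interaction in either direction.
best_location = max(scores, key=lambda loc: abs(scores[loc]))

# Interaction as a function of dose: most extreme score at each dose level.
by_dose = {}
for (dose_level, delay), s in scores.items():
    if dose_level not in by_dose or abs(s) > abs(by_dose[dose_level]):
        by_dose[dose_level] = s

print(best_location, scores[best_location])
print(by_dose)
```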
DataSpeaks Interactions® is a data processing
program that both analyzes and synthesizes data. Analysis and synthesis become
two aspects of the same process. Analysis here involves describing interactions
in detail. The arrays of scores, which easily can have thousands or tens of
thousands of interaction scores, allow results to be analyzed in great detail.
Synthesis involves summarization of arrays of standardized scores to draw
generalized conclusions in accord with DataSpeaks Interactions®
being a software system based on a method of empirical induction, MQALA, as described
in the patents.
Synthesis can be extended across dependent variables. Extend
our “drug for blood pressure” example, by assuming that 20 health variables
were measured daily to help evaluate safety and efficacy in the same patient.
All interaction scores and all summary interaction scores can be differentially
weighted in accord with clinical significance or patient preferences and
averaged to draw generalized quantitative conclusions about benefit/harm over
all of the dependent variables for the patient that was investigated.
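A minimal sketch of this weighted averaging, assuming one summary benefit/harm score per health variable and one importance weight per variable. The variable names, scores, and weights are hypothetical.

```python
def overall_score(scores, weights):
    """Weighted average of per-variable benefit/harm scores."""
    total_weight = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total_weight


# Hypothetical summary scores for one patient (positive = benefit).
scores = {"blood_pressure": 4.0, "headache": -1.0, "fatigue": 1.0}
# Hypothetical weights reflecting clinical significance or patient preferences.
weights = {"blood_pressure": 3.0, "headache": 1.0, "fatigue": 1.0}

overall = overall_score(scores, weights)
print(overall)
```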
Generalization
can be extended from individual patients to groups and populations by applying
the statistical method to corresponding
interaction or benefit/harm scores from two or more individuals. This is
illustrated in Appendix B.
Thus the method used by DataSpeaks Interactions® and the statistical
method often can be complementary methods for
drawing generalized conclusions about groups and making inferences from samples
to populations. In such cases, measurement with DataSpeaks Interactions® comes
before statistical analyses.
To continue with our “drug for blood pressure” example,
assume that we have data for a sample of 50 patients - 100 repeated
measurements of dose and each of the 20 health variables for each patient. I
call this the randomized multiple
N-of-1 clinical trial design where “multiple” refers to patients.
The results could be evaluated statistically with a single group t-test on mean
overall benefit/harm scores. Rejection of the null hypothesis in the positive
direction would indicate benefit. Rejection in the negative direction would
indicate harm. One reason why this approach can work so well is that it reduces the number of
variables that need to be analyzed statistically from 21 (one
treatment variable, dose, and 20 health variables) to 1. Of course,
benefit/harm can be profiled across all 20 health variables for each patient,
for any sub-sample or the entire sample of patients. Similar use of the t-test
is shown in Demonstration 1.
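The single group t-test on overall benefit/harm scores can be sketched with the standard library. The scores below are hypothetical, one per patient; converting t to a p-value requires the t distribution, which is left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(scores, null_mean=0.0):
    """Return (t statistic, degrees of freedom) for a single group t-test."""
    n = len(scores)
    standard_error = stdev(scores) / sqrt(n)
    return (mean(scores) - null_mean) / standard_error, n - 1


# Hypothetical overall benefit/harm scores, one per patient.
patient_scores = [1.0, 2.0, 3.0, 2.0, 2.0]
t, df = one_sample_t(patient_scores)
print(round(t, 3), df)
```

A large positive t would point toward benefit and a large negative t toward harm, with only one number per patient entering the statistical analysis.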
DataSpeaks Interactions® has additional features,
described in Patent
6,317,700, to account for phenomena such as those involving sets of
independent variables acting in concert (e.g., drug interactions, protein
complexes). These features work by applying Boolean operators to each of the
series of 1s and 0s for one variable with each such series for one or more
additional variables in a set. For example, a Boolean independent event can be
said to be present when protein A is present and either protein B or protein C
is present and protein D is not present. In addition, Boolean events can be
defined on sets of dependent variables to investigate phenomena such as
syndromes and the effects of master controller proteins in biological networks.
Such events can be defined across all levels of the analysis parameters such as
those previously described. In this manner, DataSpeaks Interactions®
can be used to investigate complex events.
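The protein example above can be written directly with Boolean operators applied occasion by occasion. The series are hypothetical, and the function name is an assumption.

```python
def boolean_event(a, b, c, d):
    """1 where A is present and (B or C) is present and D is not present."""
    return [
        1 if (pa and (pb or pc) and not pd) else 0
        for pa, pb, pc, pd in zip(a, b, c, d)
    ]


# Hypothetical presence (1) / absence (0) series for four proteins.
protein_a = [1, 1, 1, 0, 1]
protein_b = [0, 1, 0, 1, 0]
protein_c = [0, 0, 1, 1, 0]
protein_d = [0, 1, 0, 0, 0]

combined = boolean_event(protein_a, protein_b, protein_c, protein_d)
print(combined)
```

The resulting series of ones and zeros can then be cross-classified with a dependent event series in the same way as an event defined on a single variable.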
Other features of DataSpeaks Interactions® can
measure changes in interactions over time. These changes indicate changes in
the amount, strength, and direction of evidence over a period of time. In
addition, such changes can indicate development, aging, learning, habituation,
sensitization, potentiation, and other time dependent processes.
DataSpeaks Interactions® can be used to
investigate interactions involving variables at different levels of
understanding (e.g., laboratory values, symptoms, health perceptions, and
quality of life; biological, psychological, and social). As such, it can foster
interdisciplinary and
collaborative investigations.
DataSpeaks Interactions® is demanding of
computing resources. An important challenge might be to marshal enough computing
power to process data on a large scale. We can turn this problem into an
opportunity if we want to get help from companies that sell and service
computing infrastructure including grid computing.
The patents
provide more information about the operational details of DataSpeaks Interactions®.
APPENDIX B
Abbreviations for the columns are: LAS = Longitudinal
Association Score; IVEL = Independent Variable Episode Length; IVEC =
Independent Variable Episode Criterion; Persist. = Persistence; DVEL =
Dependent Variable Episode Length; DVEC = Dependent Variable Episode Criterion.
Abbreviations for the rows are: MfHrs = Average weekly
hours, manufacturing; Unemp = Average weekly initial claims for unemployment
insurance; MfCG = Manufacturers’ new orders, consumer goods and materials; Vend
= Vendor performance, slower deliveries diffusion index; MfCap = Manufacturers’
new orders, non-defense capital goods; Bldg = Building permits, new private
housing units; Stock = Stock prices, 500 common stocks; Money = Money supply,
M2; Rate = Interest rate spread, 10-year Treasury bonds less federal funds;
CsExp = Index of consumer expectations; LEI = Index of Leading Economic
Indicators.
Demonstration 3 - Functional Brain Image Analysis
The
Tables for Demonstration
3 provide a small representative subset of results
obtained when MQALA
was applied to functional magnetic resonance imaging (fMRI) data
for a 2-row by 24-column patch of voxels from the motor cortex
for one subject. The data were constructed as described in the
publication at the following link: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&list_uids=10458943&dopt=Abstract
. The data were obtained over the Web from a link that apparently
is no longer available.
This is a very preliminary demonstration of an attempt to
measure apparent functional connectivity in brains. The intent is to identify
thought leaders with resources who will try applying MQALA to functional brain
imaging data as described in one of the market opportunity sections.
The MQALA analysis used only a small portion of the data
from the cited publication, namely data for the first two rows in the “simple
signal” portion of publication Figure 1. These two rows were not affected by
any embedded signal shown in Figure 1. This is in accord with the intent to
measure apparent functional connectivity as distinct from activation. The data
have 384 time-points.
Results from the MQALA analysis illustrated below show
striking evidence for patterns in the data. One purpose of these notes is to
encourage experts in fMRI imaging and analysis to help determine if these
patterns are real. Might the pattern result from some aspect of how the test
data set was constructed by authors of the article? Is the pattern biologically
meaningful? Is it interesting?
Appendix A provides a brief primer on how DataSpeaks Interactions® works.
The data were analyzed with the “action coordination
profile” portion of the DataSpeaks Interactions® prototype software.
User specified options were as follows. Under “Transformation Options,” I selected
“Linear Regression Residuals.” Under “Dimensional Resolution,” I selected “Fine
Z Score (12 levels of resolution).” Under “Scoring Settings” and “Independent
Variables,” I selected 2 levels (1 and 2) for “Episode Length” and checked
“Analyze episode criterion” (a total of 3 combinations of levels: 1,1; 1,2; and
2,2). I selected 6 levels of “Delay” (0 through 5) and 2 levels of
“Persistence” (1 and 2). Under “Scoring Settings” and “Dependent Variables,” I
selected 2 values (1 and 2) for “Episode Length” and checked “Analyze episode
criterion” (also a total of 3 combinations of levels). This means that I
analyzed 15,552 (12x12x3x6x2x3) longitudinal association scores for each
pair-wise directional combination of one independent variable (voxel) and one
dependent variable (voxel). There are 2,256 (48x47) such combinations when there
are 48 variables.
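As a rough check on the arithmetic in the preceding paragraph, the counts can be reproduced with a few lines of Python. This is only a sketch: the dictionary keys are illustrative names, not identifiers from DataSpeaks Interactions®, and the level counts are copied from the options listed above. Note that the product 48 x 47 works out to 2,256.

```python
from math import prod

# Number of levels per analysis dimension, as listed in the text.
# Key names are illustrative only; they are not taken from the software.
levels = {
    "iv_level": 12,     # Fine Z Score resolution, independent variable
    "dv_level": 12,     # Fine Z Score resolution, dependent variable
    "iv_episode": 3,    # episode length/criterion combinations: 1,1; 1,2; 2,2
    "delay": 6,         # 0 through 5
    "persistence": 2,   # 1 and 2
    "dv_episode": 3,    # episode length/criterion combinations: 1,1; 1,2; 2,2
}

# Scores computed for each ordered (independent, dependent) pair of voxels.
scores_per_pair = prod(levels.values())

# Ordered pairs of distinct voxels in the 2 x 24 patch.
n_voxels = 2 * 24
directional_pairs = n_voxels * (n_voxels - 1)

# Total number of scores across the whole patch.
total_scores = scores_per_pair * directional_pairs

print(scores_per_pair, directional_pairs, total_scores)
```

The total, roughly 35 million scores, gives a sense of why the run time on a laptop was measured in hours.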
The analysis, illustrated in part by the Tables
for Demonstration 3, took between 36 and 43 hours to run on a
750 MHz Dell laptop.
It would be far superior to show the results illustrated by
the Tables for
Demonstration 3 as a set of figures that could be presented
interactively as described elsewhere. But we have reached the edge
between the past and the future. The future depends largely upon identifying
thought leaders
with resources who could port DataSpeaks Interactions® to a higher
performance computing environment, obtain suitable data for a whole brain slice
or, better yet, a whole brain, and do additional computer programming to
convert numerical results such as those illustrated here into an interactive
visual display.
Each portion of the Tables
shows results for the 2 by 24 patch of motor cortex. In Table 1 and Table 3,
the voxel identified by “x” functions as the independent variable and the 47
other voxels function as dependent variables. The letter “x” identifies the
voxel selected by the user in the description of the visual display.
For Table 2, “y” identifies the voxel functioning as the
dependent variable and the 47 other voxels function as independent variables.
Table 2 shows, and the corresponding visual display would show, how brain
activity in all other voxels is associated with brain activity in the selected
voxel. In other words, Table 2 shows how activity in voxels other than the
voxel identified with “y” is associated with or may affect activity in the
voxel marked by “y.” Note that the portion of Table 1 for delay = 0 would have
been identical to the corresponding portion of Table 2 if default values had
been selected for Independent Variable
Length, Dependent Variable Length, and Persistence.
Table 1 shows results for the first 4 voxels functioning as
independent variables. Results for the remaining 44 voxels are similar. Table 2
shows results for the first voxel functioning as the dependent variable.
Results for the remaining 47 voxels are similar.
Each summary longitudinal association score (LAS) shown in Table
1 and Table 2 is one score from a standardized distribution (mean 0, standard
deviation 1) of potential scores that is defined by applying the MQALA
algorithm to the data. In this case, each summary score across the 6 levels of
delay, shown in the bottom portions of Table 1 and Table 2, is the most extreme
positive or negative score in the 8-dimensional array of 15,552 scores for each
pair-wise directional combination of two variables. Each summary score for each
delay-specific portion of Table 1 and Table 2 summarizes 2,592 (15,552 / 6
levels of delay) standardized LASs.
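The summarizing step described above, taking the most extreme positive or negative score within each level of delay and then across all levels, can be sketched as follows. This is a minimal illustration under stated assumptions, not the MQALA algorithm itself: the scores are random placeholders standing in for real standardized LASs, and the names `SHAPE`, `DELAY_AXIS`, `most_extreme`, and `summarize` are hypothetical.

```python
import random
from itertools import product

# Levels per dimension, in the order: IV level, DV level, IV episode settings,
# delay, persistence, DV episode settings (counts taken from the text).
SHAPE = (12, 12, 3, 6, 2, 3)
DELAY_AXIS = 3  # position of delay in the index tuple (an assumption)

def most_extreme(values):
    """Return the value with the largest magnitude, keeping its sign."""
    return max(values, key=abs)

def summarize(scores):
    """Summarize the scores for one directional pair of variables.

    `scores` maps an index tuple to a standardized score. Returns the
    delay-specific summary scores and the overall summary score.
    """
    by_delay = {}
    for index, value in scores.items():
        by_delay.setdefault(index[DELAY_AXIS], []).append(value)
    per_delay = {d: most_extreme(v) for d, v in sorted(by_delay.items())}
    overall = most_extreme(per_delay.values())
    return per_delay, overall

# Placeholder data only: random numbers stand in for real MQALA output.
rng = random.Random(0)
scores = {idx: rng.gauss(0, 1) for idx in product(*map(range, SHAPE))}

per_delay, overall = summarize(scores)
print(len(scores), per_delay, overall)
```

Each delay-specific summary draws on 15,552 / 6 = 2,592 scores, and the overall summary is simply the most extreme of the six delay-specific values.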
I tabulated the Tables for
Demonstration 3 by hand. The tables are subject to review for typing
errors.
The Tables show strong evidence of how the data are patterned.
Patterning is the basis of data mining and empirical scientific investigations
generally.
The patterning becomes evident in several ways. First and as
described above, each score is one score from a distribution of potential
scores that has a mean of 0 and a standard deviation of 1. The portion of Table
1 in which voxel 2 functions as the independent variable includes a score of
206.4. Such large standardized scores have a small probability of occurring by
chance alone.
All the scores shown in the Tables for
Demonstration 3 are positive. Given that each score in Table 1 and
Table 2 is from a standardized distribution with a mean of 0, this provides
additional evidence for a pattern. In general, high levels of activity at
particular times in any region are associated with higher levels of activity at
particular times in other regions. The entire analysis did yield 4 negative
summary scores at delay = 5.
Table 1 and Table 2 include delay-specific summary scores.
Delay is one of the six optional analysis parameters that were used in this
analysis to help account for temporal aspects of each measure of apparent connectivity.
Of these, delay is most similar to the familiar procedure of lagging variables
in relation to each other before doing some type of analysis. However, these
delay-specific summary scores are summarized across the five other optional
analysis parameters used to investigate temporal aspects of connectivity as
well as level of the independent variable and level of the dependent variable.
The entire analysis could be rerun much faster by dropping the other five
temporal analysis parameters.
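For readers more familiar with lagging, the following sketch shows the conventional procedure that delay most resembles. It uses an ordinary Pearson correlation on lagged series as a stand-in; it is not a longitudinal association score and does not involve episode length, episode criterion, or persistence. The data are synthetic, and every name in the code is hypothetical.

```python
import math
import random

def pearson(a, b):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def lagged_pairs(iv, dv, delay):
    """Align iv[t] with dv[t + delay] by trimming both series."""
    return iv[:len(iv) - delay], dv[delay:]

# Synthetic example: the dependent series follows the independent series
# by 2 time-points, plus a little noise. 384 matches the number of
# time-points mentioned in the text; everything else is made up.
rng = random.Random(1)
n_points = 384
iv = [rng.gauss(0, 1) for _ in range(n_points)]
dv = [rng.gauss(0, 0.2) for _ in range(2)] + [
    iv[t - 2] + rng.gauss(0, 0.2) for t in range(2, n_points)
]

# Correlation at each of the 6 levels of delay (0 through 5).
by_delay = {d: pearson(*lagged_pairs(iv, dv, d)) for d in range(6)}
best_delay = max(by_delay, key=lambda d: abs(by_delay[d]))
print(by_delay, best_delay)
```

Running only this one temporal dimension is cheap, which is the sense in which the entire analysis could be rerun much faster with delay alone.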
Results for delay = 0 show the most evidence for patterning
in addition to that described above. In general, these results show two modest
scores followed by one larger score. To me, this suggests layering. My understanding
is that the motor cortex is layered.
The apparent layering pattern decreases with larger levels
of delay. I see little evidence for layering at delay = 5. In addition, the
delay = 5 scores tend to be smaller than those for any other level of delay.
Delay, together with reversing independent and dependent
variables, can be used to help evaluate the temporal criterion of causal and
other predictive interactions. Although these results provide strong evidence
for coordinated activity, my first impression is that they provide little
evidence that the coordination is causal. I would expect more evidence for
causal relations when additional brain regions involved in cascades are
included in analyses.
Here is an aside related to delay that may be a source of
concern or interest. I understand that there is a time lag of about 1 to 3 seconds
between neural activity and hemodynamic response. I do not think that this is a
problem with MQALA. I will illustrate my reason with an analogy. Imagine that
you are watching a live interview from China on CNN. The satellite delay can be
annoying. But the order of the words is not changed - the whole response is
just delayed. Similarly, my impression is that MQALA can take advantage of
high temporal resolution of functional imaging, especially if hemodynamic
response time does not vary much across brain regions.
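The satellite analogy can be made concrete with a toy example. The sketch below assumes that the hemodynamic response acts as a pure shift in time, which is a simplification, and it uses a simple sum of products rather than MQALA to find the apparent delay between two series. All names and numbers are hypothetical.

```python
def shift(series, lag, fill=0):
    """Delay a series by `lag` samples, padding the start with `fill`."""
    return [fill] * lag + series[:len(series) - lag]

def best_delay(iv, dv, max_delay=5):
    """Delay d at which the sum of iv[t] * dv[t + d] is largest."""
    def score(d):
        return sum(x * y for x, y in zip(iv, dv[d:]))
    return max(range(max_delay + 1), key=score)

# Toy "neural" activity: region A pulses every 7 samples and region B
# follows A by 1 sample.
neural_a = [1 if t % 7 == 0 else 0 for t in range(70)]
neural_b = shift(neural_a, 1)

# Same response lag in both regions: the whole picture is simply delayed.
same_a, same_b = shift(neural_a, 2), shift(neural_b, 2)

# Different response lags: the apparent delay changes by the difference.
diff_a, diff_b = shift(neural_a, 2), shift(neural_b, 4)

print(best_delay(neural_a, neural_b))  # delay between the neural series
print(best_delay(same_a, same_b))      # unchanged by a common lag
print(best_delay(diff_a, diff_b))      # shifted when lags differ
```

With a common lag the apparent delay between the two regions is unchanged. With lags of 2 and 4 samples it grows by the difference of 2, which is why variation in hemodynamic response time across brain regions is the case to watch.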
Table 3 shows results for one measure of direction and strength of longitudinal association as
contrasted with the other measures that show amount of evidence for longitudinal association. Strength measures
are ratios of the amount of evidence for a longitudinal association over the
amount of evidence that the data could have provided for a longitudinal
association under a particular condition as described in Section 4.1.6 of Patent
6,317,700. Values of the strength measures, which generally are not
affected by the number of repeated measurements in time series, can range from
-1 to +1.
A small sample of representative results for functional brain image analysis.
Page
last revised 3/2/04