CONCEPTS DIFFER for BIOCHEMICAL ENGINEERING ANALYSIS
(This Section is for Researchers and Health Professionals)
Much of today’s Health Research is oriented first to develop population risks by factor (SPR), and second to understand from studies of animals and humans the behavior of related BioChemical mechanisms of the same factors (SBM). A planned objective of most of these types of study is the obtaining of one or more sets of statistics with accompanying error margins. Possible reasons for effects found on populations in SPR studies often are speculated to be results of biochemical processes from SBM studies.
Life Ahead, as a type of BioChemical Engineering (BCE) Global Analysis of major disease and health, aims at quite different objectives and involves different approaches and concepts than does SPR or SBM research. A BCE analysis assumes that a disease is produced by a BioChemical process. Essentially all chemical processes take place according to potentially predictable rates over time. Thus an objective of a BCE Global Analysis is to develop the equations that explain quantitatively how this chemical process will proceed as a function of any environment of varying causative factors. An objective of this quantitative analysis is to identify with better probability the true causal factors involved.
Population health studies usually develop statistical associations between disease and certain characteristics of the populations. No matter how large or sophisticated such a study may be, the result remains a statistical association measured on two or more population segments specific in age, sex, and other characteristics, with the factors involved operating over certain specific durations of time. Multiple SPR or population studies of the same factor, and meta-analyses of them, increase the likelihood that a factor is real and may diminish the margins of error involved. But the result remains a statistical association. There is no assurance that a similar effect will be obtained under conditions different from those involved in developing the association. Meta-analyses usually have overlooked the major factor of risk factor duration. As shown via the diabetes example, this can produce a major error in the result.
A BCE analysis starts with the results of both SPR statistics and SBM mechanisms and carries forward the search toward quantification of probable causal factors. A chemical process often can be quantified usefully by macro equations as well as by more detailed micro equations. For example, petroleum crude oils include hundreds of differing hydrocarbons and other chemicals. Yet chemical engineers can forecast accurately the kinds of products produced by a high temperature or catalytic process by using various physical measurements of the crude, environmental measurements of the process, and macro engineering equations of the process.
The same is true for many chemical processes, for example the production of various plastics. The alternative of identifying every included hydrocarbon and how each changes individually is not needed. Similarly, the process of cancer in the human body probably should be predictable from macro equations of amount and type of carcinogen, duration, and other factors, and probably will not require a complete analysis at the DNA and cellular level for useful quantification. See the discussions on the macro biochemical engineering processes of Atherosclerosis and Cancer.
The type of analysis, concepts, and hypotheses appropriate for BioChemical Engineering analyses can be different from those usual for obtaining a Statistical Result. A summary of some of the differences that were used in the development of Life Ahead follows. It is hoped that others will contribute data, ideas, and theory to Life Ahead.
A BCE Model should provide a mathematical framework that is consistent with ALL available evidence on a factor. This writer feels that any truly scientific analysis should make a serious effort to explain ALL experimental results on factors that are statistically useful. This includes all evidence from both population (SPR) and biochemical (SBM) research. The exceptions in Life Ahead analyses were most risk ratio results having a 5%-95% margin of error larger than 1.00 or 100%, which were eliminated as too inaccurate for serious consideration. Also, results from multivariable regression analyses where variables are not all individually significant or are inter-correlated (and this nearly always is true for Health Research data) usually were not considered as useful values. Further discussion on this follows.
Some analysts reject studies that do not meet pre-conceived ideas of statistical design or type. This immediately raises a question of study selection and bias. Nearly all health studies have substantial margins of error. A question that should be carefully and thoughtfully answered before rejection of evidence is “What effect will an alleged study error have on the overall known study margin of error?” If this question cannot be reasonably quantified and the possible error cannot be shown to be substantial vs. known study error, rejection of a study result having a useful statistical margin of error is inappropriate. A problem here is the never-ending and often vigorous controversy about nearly every factor pertaining to health, and the ease with which advocates of any position can use conjecture to select studies that confirm their position and reject those that do not confirm it.
It is felt that science has a responsibility for explaining any major difference in results of research studies. This includes being able to explain adequately any difference found among differing types of studies such as case control, prospective, and clinical. Reliance on speculation such as self selection or "One study type is more valid than another" without concrete quantitative evidence of why this is true is not appropriate. As an example, the acceptance of conclusions from the WHI study of hormones without a good understanding and quantitative explanation of why results of this study differed so much from those of many other and much larger population studies would not be considered appropriate in a BCE global analysis.
A BCE and Global Model Requires Analysis of Multiple Studies: A real key in the development of causal relations is to identify those variables that explain best the widest range of research results. As an example, there is a usual statistical relationship of Physical Activity Calories with heart disease and cancer. Meta-analysis endorses and refines this relationship, but this still remains only an association. But a broader analysis shows that differences in Cardiofitness not only are similarly related to heart disease, but are more strongly related and explain much better the results of multiple research studies. Further, a broad range of other research points to the fact that exercise intensity, which is not recognized adequately via Physical Activity Calories, also is involved, and this factor is recognized via Cardiofitness. It remains possible that a still better causative variable can be found than presently measured Cardiofitness. But at this point in time the broad range of research results confirms Cardiofitness as the most probable known causative variable. It will be impossible to reach a conclusion on a subject such as this from results of any practical individual population study. Only Global Analysis of ALL and sufficient available research study results can identify a best global and probable causative variable.
Proof and Likelihood: Arguably, the concept of proof of benefit from any health factor is impossible. Humans have such a vast range of susceptibilities and differences that an individual can have no assurance that any intended health action or combination of actions will produce benefit, or that it will not even cause harm. The often cited statement of health advocates “It isn’t proved” has no relevance. Nothing in Health ever is except perhaps death. The BioChemical models that Life Ahead provides from available research identify only what is felt to be a most probable causal sequence for an average population. Individuals will have risks varying much from that of an average population. Research evidence about the benefit of health actions accompanies the program, but individuals must make their own judgments about benefit vs. deprivation. Alternate models for any factor that are quantitatively consistent with all SPR and SBM facts can be considered as alternates to those now proposed. But until other quantitatively consistent (and not speculative) alternates become available, those proposed herein are regarded as most probable.
A BCE Model provides a quantification of one or more causative factors. The term GLOBAL ANALYSIS is used by this author to identify an analysis of statistical results from all available evidence aimed at identifying likely causative BioChemical factors. A causative factor should have an identified zero or initiation value, and be defined by an appropriate mathematical path. A likely causative relationship should be able to forecast values much beyond the range of data used in its development whereas a statistical relationship usually is useful only within the range of data used. A polynomial rarely is broadly useful in causative factor analysis, and the linear relation of cause to result commonly used in textbook statistics rarely is useful in the real world of chemical kinetics. Linear logistic or log relations often are useful, however.
Mathematical relationships of chemical processes should represent those appropriate in Chemical Engineering. Also, factors should be mathematically and causally consistent. For example, use of a ratio or product of variables requires first a supporting demonstration that each change of result from the denominator value is accompanied by a similar or opposite change in result from the numerator value (i.e., use of a statistical ratio such as Cholesterol / HDL, which does not qualify on this, would be inappropriate in a BCE model).
Causative Factors in a BCE Model should be Quantitatively Consistent: Most SPR study discussions speculate about possible SBM causes of the statistical result obtained. A BCE Model should require that such SBM causes explain a result quantitatively. For example, it often has been speculated that a reason for the benefit of physical activity on heart disease is that this activity produces a higher HDL and/or lower blood pressures. A test of this hypothesis based on actual available data reveals that HDL and blood pressure could explain only a very small fraction of the actually measured effects of Cardiofitness on coronary risk. Thus these factors are not quantitatively adequate. A review of other possible causative mechanisms reveals only one now known that can explain quantitatively the remaining deficiency: the effect of cardio-type exercise in enlarging coronary arteries. This is discussed in the section on Exercise, Cardiofitness and Heart Disease.
This suggested Model as the one used in Life Ahead is not necessarily "Proved", and perhaps no model on this ever will be absolutely proved. But it does provide a tenable quantitative explanation of a result that at this time no other scenario known to this author can accomplish. Thus until some alternate model of quantification can accomplish quantification more positively and concisely, this Model remains as a most likely causatively based scenario. The example Global Analysis provided a quantitatively consistent model for diabetes and heart disease. No quantitative Model and scenario explaining the effect of exercise on cancer has yet been proposed. That model must be able to explain why the higher intensity level of exercise that also improves cardiofitness best also best reduces risk of cancer. It always is possible to speculate scenarios of causative factors that might be possible, but until a scenario provides potential quantitative consistency it remains as conjecture.
Statistically Consistent vs. Statistically Significant. "Statistically Significant" is the usual test in analyzing individual SPR study results. A chemical process should produce a consistent result for a given level of cause. Thus an appropriate hypothesis for testing such a result is “Is a new result statistically consistent with that of all other known results?” An observation that the 5% to 95% margins of individual results overlap the average value obtained, and thus are consistent with some average result, identifies what is herein termed ‘Statistical Consistency.’ This concept used in Life Ahead actually is the same logic often used in Meta-Analyses of multiple studies. In viewing results in the Health Library, note how the various risk ratios average, and how error margins straddle this average. If nearly all error margins straddle an average, the results are ‘Statistically Consistent.’ One or more results that are ‘Statistically Inconsistent’ suggest that another unknown factor probably is present, and that this deserves further study.
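The straddle test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the `min_fraction` cutoff standing in for "nearly all", and the study values are hypothetical and are not taken from Life Ahead.

```python
def consistency_check(studies, min_fraction=0.9):
    """Check whether study error margins straddle the simple average.

    studies: list of (risk_ratio, lower_margin, upper_margin) tuples.
    min_fraction: assumed cutoff for "nearly all" (not from the source).
    """
    average = sum(rr for rr, _, _ in studies) / len(studies)
    # A study is consistent when its margin contains the average value.
    straddles = [lower <= average <= upper for _, lower, upper in studies]
    fraction = sum(straddles) / len(studies)
    return average, straddles, fraction >= min_fraction


# Hypothetical risk ratios with 5%-95% margins; the last one is an outlier.
example_studies = [
    (0.55, 0.30, 0.95),
    (0.70, 0.45, 1.10),
    (0.60, 0.35, 1.02),
    (0.50, 0.28, 0.90),
    (1.40, 1.10, 1.80),
]

average, straddles, consistent = consistency_check(example_studies)
print("average risk ratio:", round(average, 2))
print("straddles average:", straddles)
print("statistically consistent:", consistent)
```

In this made-up set the first four margins contain the average and the fifth does not, which is the kind of result the text suggests should be singled out for further study.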
This concept of Statistical Consistency differs much from a present Health Research dogma that assumes that a study result must be ‘Statistically Significant’ from a base of null, or of complete ignorance. The test value of 95% is arbitrary and has no relevance to real world decisions about death. If a possibly defective plane has been determined to have an 80% chance of crashing, the statistically significant dogma assures us that this is "Not statistically significant", thus implying that riding the plane presents no problem. This of course is absurd. But the researcher who finds that a vitamin supplement has only an 80% chance of halving the risk of a heart attack now similarly tells readers that "No statistically significant effect was found", implying that nothing useful was found and that the health interested person should not take it. Yet most health interested people would be very interested in taking a supplement that has only a 50% or even a 20% chance of preventing a heart attack death.
A most serious problem with the "Statistically Significant" dogma at the 95% level is that this level is incompatible with, and too demanding for, the usual margin of error of population health studies. It incorrectly assigns a quarter to a third of all practical studies of health to be "Not significant" when they really are quite consistent with, and really confirm, an important beneficial effect. For example:
Using an average Health Study error margin of 0.60, it can be shown that the hypothesis of ‘Statistically Significant’ inevitably consigns about 1/4 to 1/3 of all studies of a factor having an actual risk ratio of 0.60 to be ‘Not significant’ even though all are statistically consistent, and really confirm an average value of 0.60 with high significance. The problem is that for an average risk ratio of 0.60 and an error margin of 0.60, results of similar repeat studies will by statistical definition scatter from below 0.30 to above 0.90. And those with measured risks above about 0.70 will qualify as ‘Not significant’, because their upper error margins then reach 1.00. Continuing research forever just will produce the same percentage of ‘significant’ and ‘not significant’ results. This problem exists for research on nearly every conceivable health factor. The percentages of confirming and non-confirming studies will be predictable from average risk ratios obtained and their individual margins of error. See a more specific example of this in the analysis of Vitamin C on Risk of Heart Disease. A justification for nearly every new study is "Past results are not consistent" when this may not really be true. This poorly recognized defect of the 95% significant thesis may be a key reason why results from Health Research on such major factors as Cholesterol, Exercise, and Smoking took 20 to 40 years of research for a consensus.
The dogma of ‘Statistically Significant’ creates even further confusion and inefficiency. Two studies, each showing exactly the same result, are judged ‘Significant’ and ‘Not significant’ only because one included a bit more data than the other. An actual example used in Life Ahead included four studies, each ‘Not significant’ individually. Yet all showed the same beneficial result and in total indicated a quite significant benefit. Use of the null hypothesis for a statistical analysis fails further to identify those studies that are really ‘Inconsistent’ with the average of others, and that really deserve more thought and consideration. The Life Ahead analyses found that all but a very few study results claimed by authors to be "Not statistically significant" really were "Statistically consistent" with the majority.
The arbitrary use of a 95% level of significance to state a fact that may not be true produces what is herein called Statistical Confusion. Starting from a recognition that the progress of disease is a BioChemical process, and that such processes when understood are both consistent and largely predictable, it is felt that the concept of ‘Statistically Consistent’ is a far more useful basis for BCE analysis. All data used in Life Ahead were analyzed using their actual values and margins of error. No acceptance was given to a statement that a result was not "Statistically significant" that was not further accompanied by more specific quantification of result and variability.
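The "1/4 to 1/3" figure can be checked with a short calculation. The sketch below rests on assumptions that the text does not state explicitly: that repeat study results are normally distributed around the true risk ratio on an arithmetic scale, that the error margin of 0.60 is the full width of the 5% to 95% interval (0.30 either side), and that a study counts as 'Not significant' when its upper margin reaches 1.00. The function name is hypothetical.

```python
from statistics import NormalDist


def fraction_not_significant(true_rr=0.60, margin_width=0.60):
    """Expected share of repeat studies whose upper margin reaches 1.00."""
    half_width = margin_width / 2
    # Assumption: the margin runs from the 5th to the 95th percentile.
    z95 = NormalDist().inv_cdf(0.95)
    sd = half_width / z95
    # A measured risk ratio above this value has an upper margin >= 1.00.
    threshold = 1.0 - half_width
    return 1.0 - NormalDist(mu=true_rr, sigma=sd).cdf(threshold)


share = fraction_not_significant()
print(f"share judged 'Not significant': {share:.2f}")
```

Under these assumptions the share comes out at roughly 0.29, which falls inside the quarter-to-third range quoted above. Treating the margin as a two-sided 95% interval instead lowers the figure somewhat but leaves it in the same range.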
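The four-study situation can be illustrated with a standard fixed-effect (inverse-variance) pooling on the log scale. This is a common meta-analysis technique swapped in for illustration, not necessarily the method used in Life Ahead, and the four identical studies are hypothetical. The sketch also assumes the quoted margins are two-sided 95% intervals.

```python
from math import exp, log, sqrt

Z95 = 1.96  # two-sided 95% normal quantile (assumed interval type)


def log_rr_and_se(rr, lower, upper):
    """Recover the log risk ratio and its standard error from an interval."""
    se = (log(upper) - log(lower)) / (2 * Z95)
    return log(rr), se


def pool_fixed_effect(studies):
    """Inverse-variance pooling of (rr, lower, upper) tuples on the log scale."""
    weights, weighted_logs = [], []
    for rr, lower, upper in studies:
        log_rr, se = log_rr_and_se(rr, lower, upper)
        weight = 1.0 / se ** 2
        weights.append(weight)
        weighted_logs.append(weight * log_rr)
    pooled_log = sum(weighted_logs) / sum(weights)
    pooled_se = sqrt(1.0 / sum(weights))
    return (
        exp(pooled_log),
        exp(pooled_log - Z95 * pooled_se),
        exp(pooled_log + Z95 * pooled_se),
    )


# Hypothetical: four studies, each with an upper margin above 1.00.
four_studies = [(0.75, 0.50, 1.125)] * 4
pooled_rr, pooled_lower, pooled_upper = pool_fixed_effect(four_studies)
print(round(pooled_rr, 2), round(pooled_lower, 2), round(pooled_upper, 2))
```

Each made-up study is 'Not significant' on its own because its margin includes 1.00, yet the pooled margin lies entirely below 1.00.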
Arguably, the use of "Statistically significant" from null is useful - although by no means needed - only for a first study of a particular relationship. Once a next study becomes available, a most likely risk becomes some average or mean value of the two. From this point on, the results of further studies should identify their result in terms of its relationship to and departure from some agreed to and identified average risk of all studies to date. This is a function of importance that the NIH should consider initiating. Numerous analyses provided in the Life Ahead research library provide average risk values for accumulated research to date that could be useful for this purpose.
The Risk of Most Factors Producing Major Disease Depends on their Duration of Exposure. A factor of importance revealed by the BCE analyses of Life Ahead is that the risk of most major factors producing cardiovascular diseases and cancer will increase with their duration of exposure. A long term chemical process takes place at some given rate per year, and any rate modifying agent operates to change this rate. Consider any chemical process that moves forward at say 4% per year, and a modifying factor that, starting at time zero, reduces this rate by one percentage point to 3% per year. A statistical risk ratio measured at year 1 will be 1.03/1.04 or 0.99. After 10 years the risk ratio will be 1.03^10/1.04^10 or 0.91. After 30 years the risk ratio will become 0.75. Most health SPR research has measured a risk at some undetermined time of disease development and thus risk ratios measured are of uncertain BioChemical significance. This duration effect occasionally has been noted by researchers in some studies of vitamins, and is well recognized by the accepted ‘Pack-years’ effect of cigarette smoking on heart disease and cancer. And this duration effect is striking in size and significance in the example Global Analysis of diabetes.
But the fact that nearly every health risk should depend on duration of exposure is poorly recognized. A recent example of this was the quick acceptance given the WHI study on female hormones, which really studied usefully their effect over an average time of only about 5 years. As a rough generality, the effects of carcinogens, atherosclerosis, cholesterol, antioxidants, estrogen, and other factors tend to increase over time at a typical compounded rate of around 3-5% per year of exposure. A minimum duration of 10 years of factor exposure thus may be needed for a measured risk ratio to become adequately measurable within usual study margins of error. These effects of factor duration and of the often inadequate related research on this are discussed extensively in the analyses of various factors in the Health Research Library.
The Problem of the Randomized Clinical Study. This type of study, the accepted basis for drug testing and results on essentially every health problem, is now also a presumed ‘gold standard’ for SPR type research on Life Style Habits. Yet on subjects relating to heart disease and cancer some of the clinical studies performed to date have failed to find the effects expected from other research. There is no question here about the fundamental value of the direct randomized clinical study in measuring effects of drugs and other factors that produce changes near immediately. Rather, a finding of Life Ahead is that many if not most clinical studies related to heart disease and cancer were performed for far too short a time period to value usefully the effects of the long term BioChemical processes involved. Others were done on heart disease patients that provided a differing chemical environment than that usual for otherwise healthy individuals. Problems encountered in clinical studies that can destroy their validity are discussed elsewhere on this site.
The research on cigarettes provides by far the most extensive available evidence about how carcinogens produce cancer in humans. The accompanying analysis of Cancer suggests that all carcinogens probably will act over similar long time periods, with time for first events depending on potency of the carcinogen. It takes 40 pack-years for full development of cancer from cigarettes. Thus a 5 or even 10 year clinical type study of cigarettes in producing cancer predictably will fail to find any effect on cancer within the usual error margins of SPR research. A question thus becomes: "Can we really consider as useful a study method for measuring the risk of cancer that would be incapable even of finding how cancer can develop from cigarettes?" The problem here is not the concept of the clinical trial. Rather it is the time that is required to measure a particular BioChemical process.
Similarly, atherosclerosis usually builds up in arteries gradually over life. An antioxidant can slow the rate of the buildup, but only at a usual slowing of perhaps 2-3% per year of its exposure. After 5 years of a clinical study of this antioxidant, a series of disease or death events is obtained that really was for an average exposure time of only about 2.5 years. If disease risk is roughly proportional to atherosclerosis, such a study would have little chance of finding a "Statistically Significant" measurement.
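The compounding arithmetic in this example can be reproduced directly. The sketch below uses the rates given in the text (4% per year unmodified, 3% per year with the modifying factor); the function name is hypothetical.

```python
def duration_risk_ratio(years, base_rate=0.04, modified_rate=0.03):
    """Risk ratio after `years` of exposure when both processes compound yearly."""
    return ((1 + modified_rate) / (1 + base_rate)) ** years


for years in (1, 10, 30):
    print(years, "years:", round(duration_risk_ratio(years), 2))
```

The printed ratios match the 0.99, 0.91, and 0.75 quoted above, and show how the same factor yields a very different measured risk ratio depending on when it is observed.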
This expected result is just what has been found from most present clinical studies of antioxidants. In contrast, a usual individual compared in a case control study may have taken the antioxidant for 10 years, and obtained a reduction in atherosclerosis of 20-30% and a reduction in disease risk of perhaps 30%. A result from a prospective study of 10 years may have measured the effect of the agent for users vs. non-users for an initial 10 years at study baseline plus half of the study duration, or 15 years. Use of the antioxidant for 10 to 15 years could reduce net atherosclerosis narrowing substantially. Using this logical and confirmed mechanism, Life Ahead now forecasts well actual results for nearly all clinical studies found, and shows that it would take a 20 year clinical study to obtain a really useful result for an antioxidant.
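The exposure-time comparison between study types can be sketched as follows. The average-exposure rule (prior use plus half of the study duration) comes from the text; the compounded form of the slowing and the 2.5% per year midpoint are assumptions made only for illustration, and both function names are hypothetical. This is not the Life Ahead model itself.

```python
def average_exposure_years(study_years, prior_use_years=0.0):
    """Average exposure behind the events counted during a study."""
    return prior_use_years + study_years / 2


def buildup_reduction(exposure_years, slowing_per_year=0.025):
    """Assumed compounded reduction in buildup after a given exposure."""
    return 1 - (1 - slowing_per_year) ** exposure_years


designs = {
    "5-year clinical trial": average_exposure_years(5),
    "10 years of prior use (case control)": 10.0,
    "10-year prospective study, 10 years prior use": average_exposure_years(10, 10),
}
for name, exposure in designs.items():
    print(f"{name}: {exposure} years, reduction {buildup_reduction(exposure):.0%}")
```

Under these assumptions the short clinical trial sees a reduction of only a few percent, while the longer exposures behind case control and prospective results give reductions several times larger.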
Other problems that determine the usefulness of a clinical trial are user compliance over long periods of time, the placebo effect, and establishing correct populations for a study. Users gradually fail to conform to their required conditions over time. Those that exercise the most start exercising less; those that don't exercise start exercising. More and more participants stop taking the pills required in a study after years pass. Users often respond to "feeling better" after taking the placebo pill. And the people that choose to join a clinical trial have to be obtained from the population and rarely will be "average" people. The population used in the Women's Health Initiative tested for the benefit of Vitamin E was found to have a vastly lower risk of heart disease than was usual. The risk measured on this population for any new factor hardly could be representative of that for a more usual or average individual. And participants that are involved in a health trial know this, and this makes them more careful of all other health habits.
The present life cycle Life Ahead Model provides a convenient method for forecasting a likely result from any randomized clinical trial of its included factors, and especially those that have risks based on a percent per year factor. (See Health Library). Simply set up a test case with average habit and risk factor measurements of the starting population at its age at trial start. Run the case. Then set the two optional 'change ages for risk measurement' options to two year periods ahead to be used in the trial, and re-compute result for these years. Copy or screen-print the Complete Listing Results of risk by disease. This is an expected result for the placebo group. Then ask to change any amount of any supplement, food, or other model risk factor and compute this result. The comparison of results in the Complete Listing with that for the placebo group will show the expected result of the proposed clinical trial. This can be done for any number of years of trial. Because Life Ahead computes results by year and sums these results in its output, it should forecast the actual amount of disease differences that accrue during the time of the trial from present age.
A serious research need today is to understand WHY different kinds of study methods produce different apparent results. Life Ahead provides a quantitative understanding of why results from clinical trials differ from results of observational studies. The problem of too-short clinical trials is discussed in the analysis of research results reviewed in the Health Research Library. See particularly the Global Analysis and Vitamin E and Heart Disease.
Weighting of Individual Study Results: Individual research studies have widely varying identified margins of error. There also are many non-identified margins of study error and controversies about statistical design. Time was spent early in the project attempting to weight studies inversely by their margins of error and other considerations via logarithmic vs. arithmetic averages, etc. It was finally concluded that this effort was fruitless. The weighted averages ended up being too close to simple averages for the differences in result to mean anything. More important, risk ratios change continuously with age, with duration of exposure to agents, and are lower for healthy than for less healthy populations. Focus finally centered on finding how biochemical processes determined risks. Average population risks selected for health factors are developed and described in the Life Ahead Health Research Library.
Multiple Regression: Too Often an Invalid Method. Many health researchers today appear to have fallen in love with Multiple Regression. With modern computers, seemingly sophisticated results from a dozen complex variables can be produced in seconds with incredible ease. Four decades of experience with thousands of regressions on hundreds of problems have convinced this writer that this method has a near incredible ability to produce incorrect and misleading results. Many years ago a colleague, Meyer (Mike) Efroymsen, and I started a book on the use of regression for analyzing complex industrial problems. Mike, a brilliant chemical engineer who sadly died young, published the landmark paper on the stepwise method for doing regressions that since has been used extensively throughout the world. The book sadly never was completed and the partial script and data are now long lost.
But a chapter example of how regressions can obtain wrong results is well remembered. Data used were the top speed, engine horsepower, weight and price for about 50 automobiles of 1950’s vintage. This Car Example was intended to show logically that top speed depended primarily on engine horsepower, would be slowed by vehicle weight, and that price was irrelevant. The wonderful 'expertise' of multiple regression claimed that the key factor producing automobile speed was vehicle weight. The heavier the car, the faster it would go. Price was second in significance, and engine horsepower a distant third. Regression found the wrong variables, with a key wrong one also in the wrong direction, together with supposed values of significance. How could such an absurd result be obtained?
The answer was variable inter-correlation. Within that data set the cars that had more horsepower also weighed more and cost more. Regression simply could not tell which was the most important, and selected the wrong answers. And much use of regressions on complex industrial problems showed that this type of erroneous result from regression on multivariable problems was all too usual. Regressions are valid and very useful for defining effects of independent factors of identifiable significance on a dependent result. The problem is that in the real world data are rarely independent and only a few variables included are really significant. This is particularly true in the health research area. Vitamins usually are taken in multiples, nearly every diet item is related to some other items, every habit is related to another habit.
A major use of regression in the SPR data reviewed for use in the Health Research Library was for adjusting results for up to 10 or more different factors. First, it seems likely that nearly all variables used as adjustments were inter-correlated. If a variable is not individually significant at least to a t value of 1.5 or 2, or has an inter-correlation coefficient with other variables larger than 0.2, its inclusion in a factor adjustment arguably is invalid. It will produce the above "Car Example" type of major error.
Second, experience here suggests that if properly independent and significant, an inclusion of an individual variable adjustment should reduce the variance of the total result. Inclusion of either inter-correlated or non-significant factors can increase the result variance. Nearly every published adjustment in SPR type studies showed an increase in the error margin of the overall result, with further increases in error margins and poorer overall correlation as added variables were included. This reveals variable confounding. The Life Ahead analyses usually disregarded adjusted results that showed a higher overall variance of total result unless it seemed clear from data provided that the adjustment was reasonable. Also, research study results developed from multiple regression that did not also include results for first order measurements were not considered to be useful.
This writer feels that perhaps 90% of all "Adjusted" results provided in health research papers produced via the usual Statistical Regression method are statistically confused, misleading, and probably meaningless. Study after study, and this includes some large meta-analyses, finds first a useful effect using either actual or age adjusted results, and then finds from multiple other statistically invalid adjustments "No Effect of this Was Found." This is bad science that is leading to the earlier disease and death of countless individuals.
But regressions used carefully can be of great help in analyzing both individual research and Global results. Many hundreds of statistical regressions were used in the development of Life Ahead. Some important results from these are included for review in the Life Ahead Health Library. All were done stepwise with close recognition of problems of inter-correlation, and results at each variable step were carefully examined for possible confounding. Regressions that increased variance of a total result were not accepted. Dozens of regressions often were used to explore and identify most probable causal relationships among factors. Regressions often were used to explore objectively the most probable Global causes of variable behavior from results of all individual studies found.
Researchers can have difficulty accepting Global Analysis: Experience in analyzing data from different areas of research over several decades has taught that the scientists that develop individual research studies often have difficulty understanding and accepting results of Global analyses that include their studies. A research study can take a dozen years of hard work to complete, and it is natural that those involved will have pride and faith in its results. The use of their study result as just another statistic in a broader analysis of many more studies can demean a better than usual study. There can be the feeling that "My result is better than the result of certain others and it is not correct to consider them equal." And expertise in study design and development does not extend to that of Global analysis that involves experienced full time analysis of multiple research results over many years of a career.
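The inter-correlation screen described here can be sketched with a plain Pearson correlation. The 0.2 limit comes from the text; the function names and the six-car data set are hypothetical and stand in for the lost Car Example data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def intercorrelated_pairs(variables, limit=0.2):
    """List variable pairs whose absolute correlation exceeds the limit."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson(a, b)
        if abs(r) > limit:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Hypothetical cars: more horsepower goes with more weight and higher price.
cars = {
    "horsepower": [90, 110, 140, 160, 200, 230],
    "weight_lb": [2800, 3000, 3400, 3600, 4100, 4400],
    "price_usd": [1800, 2100, 2600, 2900, 3800, 4300],
}

flagged = intercorrelated_pairs(cars)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r}")
```

All three pairs in this made-up set are flagged, which is the condition under which the text argues that a regression cannot tell which variable is responsible.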
Because Global Analysis as used in Life Ahead appears to be a relatively new concept in health research, and background in using this type of analysis is not widespread, it is anticipated that its results will not be accepted by everyone. But because results of these analyses can help prolong millions of lives, it is hoped that researchers will review findings herein objectively and raise questions thoughtfully and directly about any results that need review or improvement. There is little question that many areas in Life Ahead will need to be updated substantially as more evidence and ideas accumulate. This writer will be seeking ideas that can accomplish this.