Monday, September 24, 2007

My understanding of the Universal Probability Bound

RE: the Universal Probability Bound:

- given the sequence of prime numbers: “12357 ...”

According to probability theory, a random digit has a one-in-ten chance of matching the first digit of the prime-number sequence; the first two digits together have a one-in-100 chance of matching, the first three a one-in-1,000 chance, and so on. So, how far up the pattern of prime numbers will chance take us before making a mistake? The further you go, the more likely chance processes are to deviate from the specified pattern. It’s bound to happen eventually, as the odds against a continued match increase dramatically and quickly. But how do we know where the cutoff is?

Dembski has introduced a very “giving the benefit of the doubt to chance” type of calculation, based on the age of the known universe and other known factors, and borrowing from Seth Lloyd’s calculations. It must be noted that as long as the universe is understood to be finite (having a beginning), there will be a probability bound. The number may increase or decrease with future knowledge of the age of the universe, but a UPB will exist, and a scientific understanding can only be based on present knowledge.

As far as I understand, this number, once calculated, allows chance to produce less than 500 bits of specified information before chance is cut off. Anything that is specified, algorithmically complex, and above that 500-bit bound is most reasonably beyond the scope of chance operating anywhere within the universe for the duration of the universe; it is therefore complex specified information and the result of intelligence.

Now, let’s take a closer look at the Universal Probability Bound of 500 bits. What would it take for pure random chance to cover all possible combinations of a 500-bit sequence? Any given 500-bit sequence is 1 of 2^500 possible combinations; that is, 1 out of more than 3.27 x 10^150 possible sequences. Now let’s look at the age of the universe. Taking it to be 15.7 billion years, that is approximately 4.95 x 10^17 seconds. A few simple calculations show that the whole universe would have to be flipping 6.61 x 10^132 sets of 500 coins every second for 15.7 billion years in order to generate 3.27 x 10^150 sequences of 500 bits.
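The arithmetic is easy to check. Here is a minimal sketch in Python; the only inputs are the figures used above (a 500-bit bound and an assumed age of 15.7 billion years), so a different age estimate can be swapped in:

```python
# Sanity check on the arithmetic above: how many sets of 500 coins would the
# whole universe have to flip per second to enumerate every 500-bit sequence
# within its lifetime? Uses the figures quoted in the text (500 bits, 15.7 Gyr).

SECONDS_PER_YEAR = 3.156e7            # approximate length of a year in seconds
age_years = 15.7e9                    # age of the universe as assumed above
age_seconds = age_years * SECONDS_PER_YEAR

total_sequences = 2 ** 500            # all possible 500-bit sequences
flips_per_second = total_sequences / age_seconds

print(f"possible 500-bit sequences  : {total_sequences:.3e}")    # ~3.27e150
print(f"age of universe in seconds  : {age_seconds:.3e}")        # ~4.95e17
print(f"sets of 500 coins per second: {flips_per_second:.3e}")   # ~6.6e132
```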

But even after all is said and done, all possible combinations will not have been generated, because there is no way to guarantee that no pattern appears twice. Since probabilities deal with averages, it is only after many sets of 15.7 billion years that we would expect, on average, to see every possible combination appear. And of course, this assumes that there are indeed that many “sets of coins” being flipped at the above rate in the first place.

And still, there is no guarantee that, even with that many random “flips of a coin,” a pattern such as “10” repeated 250 times will ever be generated. In fact, it is not in the nature of pure random processes to match patterns which can be described and formulated by a system of rules. Furthermore, science always looks for the best explanation, and law and intelligence (teleological processes) are already available as better explanations than chance for the creation of specified patterns – patterns which can be described and formulated by a system of rules. The 500-bit limit simply provides a very generous Universal Probability Bound, based on known measurements of the universe, that restricts the invocation of “chance of the gaps” when better and more reasonable explanations, based on observation, are available.

In fact, here is a little test. Take a 20-character pattern (roughly 100 bits, counting letters, spaces, and ending punctuation at about 5 bits per character), such as “aaaaaaaaaaaaaaaaaaaa”, and randomly “spin” the letters to your heart’s content for as long as you like, and see if you ever get a specified pattern.
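For anyone who would rather let a computer do the spinning, here is a rough Python version of the test. It is my own sketch: the 27-symbol alphabet, the million-trial budget, and the example target phrase are arbitrary choices, not anything taken from Dembski.

```python
# "Spin the letters" test: draw random 20-character strings from a 27-symbol
# alphabet (lowercase letters plus space) and check whether a short
# pre-specified phrase ever comes up. Purely illustrative.
import random
import string

ALPHABET = string.ascii_lowercase + " "
TARGET = "methinks it is like "          # 20 characters, chosen only as an example
TRIALS = 1_000_000

hits = 0
for _ in range(TRIALS):
    candidate = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))
    if candidate == TARGET:
        hits += 1

# The chance of an exact match on any one trial is (1/27)**20, about 2.4e-29,
# so even a million trials is expected to produce zero hits.
print(f"matches in {TRIALS:,} trials: {hits}")
```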

Again, as I’ve stated before, ID Theory provides a best-explanation hypothesis about the nature of the cause behind the ‘Big Bang’ model, based upon observation and upon the elimination of alternatives that posit unreasonable chance-based gaps -- gaps not based on observation, postulated to circumvent observed cause-and-effect relations.

Where is the CSI necessary for evolution to occur?

First, read through this extremely informative article and the abstracts to these three articles:

here

here

here

then continue ...

According to Dr. Marks’s work with evolutionary algorithms, computing, and intelligent systems, evolving functional information is always guided by previous functional information toward a solution within a search space, in order to solve a previously known problem. [endogenous information] = [active information] + [exogenous information], or as “j” at Uncommon Descent explained, “[the information content of the entire search space] equals [the specific information about target location and search-space structure incorporated into a search algorithm that guides a search to a solution] plus [the information content of the remaining space that must be searched].”
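To make the bookkeeping concrete, here is a minimal sketch using the definitions from the Marks/Dembski active-information papers (endogenous, exogenous, and active information as negative log probabilities). The two probabilities below are invented for illustration only; they are not measurements of any real search.

```python
# Active-information bookkeeping for the identity quoted above:
#   [endogenous information] = [active information] + [exogenous information]
from math import log2

p = 2 ** -100   # probability that a blind (unassisted) search hits the target
q = 2 ** -10    # probability that the assisted search hits the target

I_endogenous = -log2(p)                # information content of the entire search space (100 bits)
I_exogenous = -log2(q)                 # information content of the remaining space to be searched (10 bits)
I_active = I_endogenous - I_exogenous  # information built into the search algorithm itself (90 bits)

assert abs(I_endogenous - (I_active + I_exogenous)) < 1e-9
print(I_endogenous, I_active, I_exogenous)   # 100.0 90.0 10.0
```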

Intelligence is capable of the guiding foresight that is necessary, and a sufficient level of intelligence is able to set up this type of system. This is based on observational evidence. So, if functional information only comes from intelligence and previous information, then where does the information necessary for abiogenesis (the original production of replicating information-processing systems) and for evolutionary change come from?

Evolution seems to be, for the most part, guided by the laws which allow biochemistry and natural selection, both of which result from the laws of physics. The laws of physics are at the foundation of our universe, which is now seen to be an information processing system. If the universe processes information, and if biochemistry and natural selection are a result of the laws of physics, then the information for evolution by natural selection [and other necessary mechanisms] is at the foundation of our universe (as an information processing system) and is represented in the finely tuned relationship between the laws of physics and life’s existence and subsequent evolution. IOW, the universe is fine-tuned, that is, intelligently programmed, for life and evolution.

My point is that abiogenesis and evolution are not accidental; they are necessarily programmed into our universe, arising from the fine-tuned information at the foundation of our universe, yet they do not arise strictly from laws and chance (stochastic processes) alone, since information processors and functional information are not definable in terms of theoretical law.

This is similar to the ending pattern of a shot in a pool game. The pattern of balls is created by the laws of physics once the first pool ball is set in motion; however, the ending pattern itself is not describable by natural law. It is a random pattern/event. But a “trick shooter” can fine-tune both the initial setup and the starting shot in order to create a desired pattern in the form of a trick shot. Just like the ending pattern of the shot in the pool game, information and information processing systems are not describable by law; again, the ending pattern is a random pattern/event. However, information processing systems, which are necessary for evolutionary algorithms to search a given space, the functional information to search that space, and the information to guide the search do not arise by random accident within any random set of laws. Along with the argument from CSI and my other argument (scroll down to first comment), the best explanation is that these factors of the appearance of life (an information processing system) and evolution (the production of CSI) are programmed into our universe by intelligence.

That can be falsified by creating a program which generates random laws. If such random laws cause any information processing system which generates functional information to randomly self-organize, just as any pattern of pool balls will self-organize randomly after any and every shot, then the above hypothesis is falsified. Dr. Robert Marks is already examining these types of claims and, along with Dr. Dembski, is refuting the claim that evolution creates CSI for free, by critically examining the evolutionary algorithms which purportedly show how to get functional information for free. Dr. Marks is using the concepts of CSI and Conservation of Information, and experimenting in his field of expertise – “computational intelligence” and “evolutionary computing” – to discover how previously existing information guides the evolution of further information.
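To illustrate the kind of comparison involved (my own toy sketch, not Dr. Marks’s code or experiments), here is a blind random search over 40-bit strings set beside a hill climber that is allowed to query a fitness function. The fitness function encodes knowledge of the target, which is the whole point: the guided search succeeds quickly only because that information was supplied up front.

```python
# Toy comparison: blind sampling vs. a fitness-guided hill climber on a 40-bit target.
import random

N = 40
TARGET = [random.randint(0, 1) for _ in range(N)]

def fitness(candidate):
    """Number of bits matching the target -- this is where target knowledge enters."""
    return sum(c == t for c, t in zip(candidate, TARGET))

def blind_search(max_queries=100_000):
    """Random guessing; a hit has probability 2**-40 per query, so expect failure."""
    for q in range(1, max_queries + 1):
        guess = [random.randint(0, 1) for _ in range(N)]
        if guess == TARGET:
            return q
    return None

def guided_search():
    """Flip one random bit at a time, keeping any change the fitness function approves."""
    current = [random.randint(0, 1) for _ in range(N)]
    queries = 0
    while current != TARGET:
        i = random.randrange(N)
        mutant = current.copy()
        mutant[i] ^= 1
        queries += 1
        if fitness(mutant) >= fitness(current):
            current = mutant
    return queries

print("blind search queries :", blind_search())    # almost certainly None
print("guided search queries:", guided_search())   # typically a few hundred
```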

Complex Specified Information ... Simplified with filter included

The 6 levels for determining Complex Specified Information.

Before I begin: the reason CSI points to previous intelligence is that intelligence has been observed creating this type of information, while CSI, owing to its pseudo-random properties, cannot be described by law and is not reasonably attributed to, nor has it been observed to arise from, random chance. CSI is primarily based on specificity. A specified pattern is described, independently of the event in question, by the rules of a system. As such, explanations other than chance are to be posited for informational patterns that are described by the rules of a system. Dr. Dembski describes specified patterns as those patterns which can be described and formulated independently of the event (pattern) in question. The rest of CSI is briefly and simply explained within the following filter.

Clarification: I have no problem with an evolutionary process creating CSI; the question is “how?” First, evolution takes advantage of an information processing system, and this is a very important observation -- read “Science of ID” and the first comment. Second, it is obvious that an evolutionary process must freeze each step leading to CSI, through natural selection and other mechanisms, in order to generate CSI. It is thus obvious that the laws of physics contain the fine-tuned CSI necessary to operate upon the information processing system of life and cause it to generate further CSI. This can be falsified by showing that any random set of laws acting on any random information processing system will cause it to evolve CSI. For more, and to comment on this idea, refer to “Where is the CSI necessary for evolution to occur.”

Now for the steps for determining CSI (a rough code sketch of the whole filter appears after step 6):

1. Is it Shannon information? (Is it a sequence of discrete units chosen from a finite set in which the probability of each unit occurring in the sequence can be measured? Note: Shannon information is a measure of the decrease in uncertainty.)

Answer:

No – it’s not even measurable information, much less complex specified information. Stop here.

...or...

Yes – it is at least representable and measurable as communicated data. Move to the next level.


2. Is it specified? (Can the given event (pattern) be described independently of itself by being formulated according to a system of rules? Note: this concept can include, but is not restricted to, function and meaning.)

I.e.:
- “Event (pattern) in question” – independent description [formulated according to the rules of a system]
- “12357111317” – sequence of whole numbers divisible only by themselves and one [formulated according to mathematical rules]
- “101010101010” – print ‘10’ x 6 [formulated according to algorithmic information theory and the rules of an information processor]
- “can you understand this” – meaningful question in which each word can be defined [formulated according to the rules of a linguistic system (English)]
- “‘y(x)’ functional protein system” – ‘x’ nucleotide sequence [formulated according to the rules of the information processing system in life]
- “14h7d9fhehfnad89wwww” – (not specified as far as I can tell)

Answer:

No – it is most likely the result of chance. Stop here.

...or...

Yes – it may not be the result of chance; we should look for a better explanation. Move to the next level.

3. Is it specified because of algorithmic compressibility? (Is it a repetitious/regular pattern?)

Answer:

Yes – it is most likely the result of law, such as the repetitious patterns which define snowflakes and crystals. The way to attribute an event to law (natural law) as opposed to random chance is to discover regularities which can be defined by equation/algorithm. Stop here.

...or...

No – the sequence is not describable as a regular pattern, thus tentatively ruling out theoretical natural laws -- natural laws being fundamentally bound to laws of attraction (i.e., electrical, magnetic, gravitational, etc.) and thus producing regularities. Law can only be invoked to describe regularities. The sequence is algorithmically complex, may be pseudo-random, and we may have a winner, but let’s be sure. If the pattern is short, then it may still be the result of chance occurrences and may be truly random. Our universe is huge beyond comprehension, after all. It may be bound to happen somewhere, sometime. Move to the next level.

4. Is it a specification? (Is its complex specificity beyond the Universal Probability Bound (UPB) – in the case of information, does it contain more than 500 bits of information?)

Answer:

No – it may be the result of intelligence, but we cannot be sure, as random occurrences (possibly “stretching it”) might still be able to produce this sequence somewhere, sometime. We’ll defer to chance on this one. Stop here.

...or...

Yes – it is pseudo-random and is complex specified information and thus the best (most reasonable) explanation is that of previous intelligent cause. If you would still like to grasp at straws and arbitrarily posit chance as a viable explanation, then please move to the next level.

5. Congratulations, you have just resorted to a “chance of the gaps” argument. You have one last chance to return to the previous level. If not, move on to the last level.

6. You seem to be quite anti-science as you are proposing a non-falsifiable model and this quote from Professor Hasofer is for you:

“"The problem [of falsifiability of a probabilistic statement] has been dealt with in a recent book by G. Matheron, entitled Estimating and Choosing: An Essay on Probability in Practice (Springer-Verlag, 1989). He proposes that a probabilistic model be considered falsifiable if some of its consequences have zero (or in practice very low) probability. If one of these consequences is observed, the model is then rejected.

‘The fatal weakness of the monkey argument, which calculates probabilities of events “somewhere, sometime”, is that all events, no matter how unlikely they are, have probability one as long as they are logically possible, so that the suggested model can never be falsified. Accepting the validity of Huxley’s reasoning puts the whole probability theory outside the realm of verifiable science. In particular, it vitiates the whole of quantum theory and statistical mechanics, including thermodynamics, and therefore destroys the foundations of all modern science. For example, as Bertrand Russell once pointed out, if we put a kettle on a fire and the water in the kettle froze, we should argue, following Huxley, that a very unlikely event of statistical mechanics occurred, as it should “somewhere, sometime”, rather than trying to find out what went wrong with the experiment!’”
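As promised above, here is a rough decision-procedure version of the six-step filter. It is my own simplification, not Dembski’s: zlib compression stands in as a crude proxy for “regular/algorithmically compressible,” the specification judgement and the chance-hypothesis probability are supplied by the caller rather than computed, and 500 bits serves as the Universal Probability Bound.

```python
# A rough sketch of the six-step CSI filter described above (my own simplification).
import math
import random
import zlib

UPB_BITS = 500   # universal probability bound, expressed in bits

def filter_verdict(sequence: bytes, is_specified: bool, p_chance: float) -> str:
    # Step 1: is it measurable (Shannon-style) information at all?
    if len(sequence) == 0:
        return "not measurable information"

    # Step 2: specification is a qualitative judgement (independent description),
    # so it is passed in by the caller rather than computed here.
    if not is_specified:
        return "unspecified: chance is an adequate explanation"

    # Step 3: specified but highly compressible -> regular pattern -> law.
    if len(zlib.compress(sequence)) < 0.5 * len(sequence):
        return "specified but regular: law is the better explanation"

    # Step 4: specified, complex, and beyond the universal probability bound?
    information_bits = -math.log2(p_chance)
    if information_bits <= UPB_BITS:
        return "specified and complex, but within the UPB: deferred to chance"

    # Steps 5-6: past the bound, "chance of the gaps" is all that is left.
    return "complex specified information: design inferred"

# Illustrative calls; the inputs are arbitrary and only exercise the branches.
regular = b"10" * 62                                          # repetitive, compresses well
irregular = bytes(random.randrange(256) for _ in range(125))  # incompressible
print(filter_verdict(regular, is_specified=True, p_chance=2.0 ** -992))
print(filter_verdict(irregular, is_specified=True, p_chance=2.0 ** -1000))
```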

Therefore, ID Theory provides a best-explanation hypothesis about the nature of the cause behind the ‘Big Bang’ model, based upon observation and upon the elimination of alternatives that posit unreasonable chance-based gaps -- gaps not based on observation, postulated to circumvent observed cause-and-effect relations.

Saturday, September 15, 2007

Concept of CSI (part 2)

Continuation from here.

Zachriel:
“You keep talking about CSI and complexity, but the only issue at this point is the definition of “specificity”. Your meandering answer is evidence of this extreme overloading of even basic terminology.”

My example of defining “complexity” was to show that even in information theory, some concepts can and must be defined and quantified in different ways.

... and I have given definitions of specificity in different words (pertaining to the definition which aids in ruling out chance occurrences), hoping that you would understand them. However, you continually ignore them. Or did you just miss these?

Meandering, nope. Trying to explain it in terms you will understand (also borrowing from Dembski’s terminology), yep.

Zachriel:
“This is Dembski’s definition of specificity:

Thus, for a pattern T, a chance hypothesis H, and a semiotic agent S for whom ϕS measures specificational resources, the specificity σ is given as follows:

σ = –log2[ϕS(T)·P(T|H)].”

First, you do realize that in order to measure something’s specificity, the event must first qualify as specific, just as, in order to measure an event using Shannon information (using the equation which defines and quantifies Shannon information), the event must first meet certain qualifications. I’ve already discussed this.

Now, yes, you are correct. However, you stopped halfway through the article and seem to have arbitrarily pulled out one of the equations. Do you even understand what Dembski is saying here? You do also realize that specificity and a specification are different, correct?

Dembski was almost done building his equation, but not quite. You obviously haven’t read through the whole paper. Read through it, then get back to me. You will notice that Dembski later states, in regard to your referenced equation:

“Is this [equation] enough to show that E did not happen by chance? No.” (Italics added)

Why not? Because he is not done building the equation yet. He hasn’t factored in the probabilistic resources. I’ll get back to this right away, but first ...

The other thing that you must have missed, regarding the symbols used in the equation, directly follows your quote of Dembski’s equation. Here it is:

“Note that T in ϕ S(T) is treated as a pattern and that T in P(T|H) is treated as an event (i.e., the event identified by the pattern).”

It seems that the above-referenced equation compares the event in question (the event identified by the pattern), together with its independently given pattern, against its chance hypothesis. This actually shows that we were both wrong in thinking that just any pattern (event) could be shoved into the above equation: the equation only works on events which have an independently given pattern (and thus already qualify as specific), and it gives a measurement of specificity, but not a specification. You will notice, if you continue to read the paper, that a specified complexity greater than 1 equals a specification, and thus CSI.

Dembski does point out that, in the completed equation, when the specified complexity produces a greater-than-1 result, you have CSI. As far as I understand, this is a result of inputting all available probabilistic resources, which is something normal probability theory does not take into consideration. Normally, probability calculations give you a number between 0 and 1, showing a probability, but they do so without consideration of probabilistic resources or of the qualifier that the event conform to an independently given pattern. Once this is all calculated and the measurement is greater than one (that is, once the event is beyond the reach of the UPB’s probabilistic resources), then you have CSI.
Moreover, the specification is a measurement in bits of information and as such cannot be less than 1 anyway, since 1 bit is the smallest amount of measurable information (this has to do with the fact that measurable information must have at least two states -- hence the base unit of the binary digit (bit), which distinguishes between those two states).

You must have seriously missed where Dembski, referencing pure probabilistic methods in “teasing” out non-chance explanations, said (and I already stated a part of this earlier):

“In eliminating H, what is so special about basing the extremal sets Tγ and Tδ on the probability density function f associated with the chance hypothesis H (that is, H induces the probability measure P(.|H) that can be represented as f.dU)? Answer: THERE IS NOTHING SPECIAL ABOUT f BEING THE PROBABILITY DENSITY FUNCTION ASSOCIATED WITH H; INSTEAD, WHAT IS IMPORTANT IS THAT f BE CAPABLE OF BEING DEFINED INDEPENDENTLY OF E, THE EVENT OR SAMPLE THAT IS OBSERVED. And indeed, Fisher’s approach to eliminating chance hypotheses has already been extended in this way, though the extension, thus far, has mainly been tacit rather than explicit.” [caps lock added]

Furthermore ...

Dr. Dembski: “Note that putting the logarithm to the base 2 in front of the product ϕ S(T)P(T|H) has the effect of changing scale and directionality, turning probabilities into number of bits and thereby making the specificity a measure of information. This logarithmic transformation therefore ensures that the simpler the patterns and the smaller the probability of the targets they constrain, the larger specificity.”

Thus, the full equation, which you haven’t even referenced yet, gives us a measurement (quantity), in bits, of the specified information, as a result of the log base 2.

In fact, here is the full equation and Dembski’s note:

“The fundamental claim of this paper is that for a chance hypothesis H, if the specified complexity χ = –log2[10^120 · ϕS(T)·P(T|H)] is greater than 1, then T is a specification and the semiotic agent S is entitled to eliminate H as the explanation for the occurrence of any event E that conforms to the pattern T.”

In order to understand where this 10^120 comes from, let’s look at a sequence of prime numbers:

“12357" – this sequence is algorithmically complex and yet specific (as per the qualitative definition) to the independently given pattern of prime numbers (stated in the language of mathematics as the sequence of whole numbers divisible only by itself and one), however there is not enough specified complexity to cause this pattern to be a specification greater than 1. It does conform to an independently given pattern, however, it is relatively small and could actually be produced randomly. So, we need to calculate probabilistic resources and this is where the probability bound and the above equation comes into play.

According to probability theory, the first digit has a one-in-10 chance of matching the sequence of prime numbers (or a pre-specification); the first two digits together have a one-in-100 chance, the first three a one-in-1,000 chance, and so on. So, how far up the pattern of prime numbers will chance take us before making a mistake? The further you go, the more likely chance processes are to deviate from the specified (or pre-specified) pattern. It’s bound to happen eventually, as the odds against a continued match increase dramatically and quickly. But how do we know where the cutoff is?

Dembski has introduced a very “giving the benefit of the doubt to chance” type of calculation, based on the age of the known universe and other known factors, and borrowing from Seth Lloyd’s calculations. It must be noted that as long as the universe is understood to be finite (having a beginning), there will be a probability bound. The number may increase or decrease with future knowledge of the age of the universe, but a UPB will exist, and a scientific understanding can only be based on present knowledge.

This number, as far as I understand, allows chance to produce less than 500 bits of specified information before cutting chance off; everything else that is already specified and above that 500-bit bound is definitely beyond the scope of chance operating anywhere within the universe and is thus complex specified information and the result of intelligence. I dare anyone to produce even 100 bits of specified information completely randomly, much less anything on the order of complex specified information.
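For readers who like to see the formula run, here is a direct transcription of the quoted equation χ = –log2[10^120 · ϕS(T)·P(T|H)] as a small Python function. This is a sketch only: the example values of ϕS(T) and P(T|H) are invented for illustration, and only the structure of the calculation follows the paper.

```python
# Specified complexity per the quoted formula: chi = -log2(10**120 * phi_S(T) * P(T|H)).
from math import log2

LLOYD_BOUND = 10 ** 120   # the 10^120 factor discussed above (from Seth Lloyd's calculation)

def specified_complexity(phi_S, p_T_given_H):
    """chi > 1 marks T as a specification, per the quoted claim."""
    return -log2(LLOYD_BOUND * phi_S * p_T_given_H)

# A short, easily described pattern (like "12357") whose chance probability is not
# tiny: the 10^120 factor swamps it, and chi comes out hugely negative.
print(specified_complexity(phi_S=1e5, p_T_given_H=1e-5))     # about -398.6

# A pattern whose descriptive resources and chance probability are small enough to
# overcome the 10^120 factor: chi climbs above 1, i.e. a specification.
print(specified_complexity(phi_S=1e5, p_T_given_H=1e-160))   # about +116.3
```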

Clarification: I have no problem with an evolutionary process creating CSI as it makes use of a replicating information processing system. As I have said earlier, it is the “how” that is the real question and the present problem. IMO, the present observations actually seem to support a mechanism which produces CSI in sudden leaps rather than gradually. In fact, Dr. Robert Marks is presently working on evolutionary algorithms to test their abilities and discover experimentally what is necessary to create CSI and how CSI guides evolutionary algorithms towards a goal.

So, do you understand, yet, how to separate the concept of specificity (of which I have provided ample definitions and examples previously) from the measurement of complex specified information as a specification greater than 1 after factoring in the UPB in the completed equation?

Zachriel:
“Dembski’s definition has a multitude of problems in application,”

So, you are now appealing to “fallacy by assertion?” (yes, I think I just made that up)

fallacy by assertion: “the fallacy which is embodied within the idea that simply asserting something as true will thus make it to be true.”

Come on, Zachriel, this is a debate/discussion; not an assertion marathon.

I have already shown you how to apply it earlier, and you just conveniently chose not to respond. Remember matching the three different patterns with the three different causal choices? If there are a multitude of problems in the definition of specificity, please do bring them forward. You’ve already brought some up, but after I answered those objections, you haven’t referred to them again.

Actually, to be honest with you, since this is quite a young and recently developed concept, there may be a few problems with the concept of specificity and I welcome the chance to hear from another viewpoint and discuss these problems and see if they are indeed intractable.

Zachriel:

“but grappling with those problems isn’t necessary to show that it is inconsistent with other uses of the word within his argument. This equivocation is at the heart of Dembski’s fallacy.”

Have you ever “dumbed something down” for someone -- put it into wording and examples they would understand -- because they couldn’t comprehend the full, detailed explanation? Furthermore, have you ever approached a concept from more than one angle in order to help someone fully comprehend it? This is indeed a cornerstone principle in teaching. I have employed it countless times, as I’ve worked with kids for ten years.

You have yet to show me where any of Dembski’s definitions of specificity are equivocations, rather than the same thing said in different wording for audiences of differing aptitudes, or reworded as a method of clarification.

Zachriel:
“Dembski has provided a specific equation. This definition should be consistent with other definitions of specificity, as in “This is how specification is used in ID theory..."
Do you accept this definition or not?”

I agree with the concept of specificity and its qualitative definition. As for the equation, I do not understand all of the math involved, but from what I do understand it seems to make sense. You do understand the difference between an equation that defines something, such as *force* in f = ma, and a qualitative definition of what “force” is?

But, then again, I’ve already been over this with you in discussing shannon information and you chose to completely ignore me. Why should it be any different now?

This definitional equation, which provides a quantity of complex specified information with all available probabilistic resources factored in, is consistent with all of the other qualifying definitions of specificity, since it contains them within it.

You have yet to show anything to the contrary.

As far as application goes, I do think that the equation may be somewhat ambiguous to use on an event which is not based on measurable information. But, then again, I’d have to completely understand the math involved in order to pass my full judgement on the equation.

Furthermore, do you understand the difference between a pre-specification, a specification, specified information, specified complexity, and complex specified information? I ask because you don’t seem to understand these concepts. If information is specified/specific (which I’ve already explained) and complex (which I’ve already explained), then you can measure its specificity (which is the equation that you have referenced). However, this doesn’t give us a specification, since the probabilistic resources (UPB) are not yet factored in. Once the UPB is factored in, you can measure the specified complexity and test for a specification. If the specified complexity is greater than one, then you have a specification and you are dealing with complex specified information.

It is a little confusing, and it has taken me a while to process it all, but how can YOU honestly go around with obfuscating arguments and false accusations (which you haven’t even backed up yet) of equivocations when it is obvious that you don’t even understand the concepts?

Do you do that with articles re: quantum mechanics just because you can’t understand the probabilities and math involved or the concept of wave-particle duality or some other esoteric concept?

I will soon be posting another blog post re: CSI (simplified) and the easy to use filter for determining if something is CSI. Here it is.

P.S. If you want to discuss the theory of ID and my hypothesis go to “Science of Intelligent Design” ...