## Wednesday, November 5, 2014

### More on Erdős–Kac Theorem

One of the valuable features of the Erdős–Kac Theorem is that it enables us to calculate, using the normal distribution, probability estimates for the various numbers of (distinct) primes that might be associated with a given large number.

Now it is amazing how quickly such numbers grow relative to the number of primes involved. For example, as the Wikipedia reference makes clear, numbers with 10,000 digits would require (on average) just 10 (distinct) primes for their construction. Thus if n is such a number (with 10,000 digits), the mean, i.e. log log n, is approximately 10, and the standard deviation, √(log log n), is just slightly in excess of 3.

We can then say that just over 2/3 (approximately 68%) of such numbers would fall within one standard deviation either side of this average, which would entail that about 7 to 13 (distinct) primes would be involved in the construction of a 10,000 digit number!
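These figures can be reproduced with a short Python sketch (the calculation is my own illustration; the 10,000-digit assumption enters only through log n ≈ 10,000 × ln 10):

```python
import math

# For a number n with roughly 10,000 digits, estimate the Erdős–Kác
# mean and standard deviation of its count of distinct prime factors.
log_n = 10_000 * math.log(10)   # natural log of n, for n ~ 10^10000
mean = math.log(log_n)          # log log n, approximately 10
sd = math.sqrt(mean)            # sqrt(log log n), slightly above 3

# Roughly 68% of such numbers should have a count of distinct prime
# factors within one standard deviation either side of the mean.
low, high = mean - sd, mean + sd
print(f"mean {mean:.2f}, sd {sd:.2f}, 68% interval ({low:.1f}, {high:.1f})")
```

The interval printed comes out at roughly 7 to 13, matching the range quoted above.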

We could equally use the normal distribution to calculate the probability of 1, 2, 3, …, k (distinct) primes comprising such a number (where k represents the maximum number of distinct primes that could be involved).

It is important to realise, however, that as we keep ascending the number scale, distinct normal distributions apply.

Thus, for example, the normal distribution that describes the typical behaviour of a 100,000 digit number would necessarily be of a different shape from that for the corresponding 10,000 digit number!

In general terms, as the standard deviation progressively falls relative to the mean (in percentage terms), we can expect ever stronger clustering of values for the number of (distinct) primes around a central average value as the size of n increases.
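This shrinking relative spread can be seen numerically. A small sketch (the digit counts are chosen by me purely for illustration); since sd/mean = 1/√(log log n), the ratio falls, though very slowly, as n grows:

```python
import math

# Coefficient of variation (sd / mean) of the count of distinct prime
# factors, for numbers of increasing digit length.
for digits in (100, 10_000, 1_000_000):
    log_n = digits * math.log(10)   # natural log of a typical n with this many digits
    mean = math.log(log_n)          # log log n
    sd = math.sqrt(mean)            # sqrt(log log n)
    print(f"{digits:>9} digits: mean {mean:5.2f}, sd {sd:4.2f}, sd/mean {sd/mean:.3f}")
```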

I suggested in the last entry that a complementary Erdős–Kac (Type 1) Theorem should equally exist.

Once again we can have two complementary opposite notions of the relationship of the primes to the natural numbers.

In Type 1 terms - which represents the standard approach - the primes are viewed in an (individual) manner as having a random existence with respect to the (collective) natural number system.

However in Type 2 terms - which relates directly to the Erdős–Kac Theorem - the primes are now viewed in a (collective) manner as factors with respect to an (individual) natural number!

As Type 1 and Type 2 are directly complementary (mutually implying each other), this suggests that a corresponding Type 1 Erdős–Kac Theorem should apply (complementing the recognised Type 2 version).

Misleadingly, I suggested that this alternative distribution might apply to the gaps between the primes. However, following some empirical analysis, it quickly became apparent to me that the measurement of the gaps between primes (with respect to a typical average gap given as log n) would not correspond to a normal distribution.

Paradoxically, though the individual primes are indeed random with respect to the (collective) natural number system, the actual gaps between primes are highly ordered. This highlights again the fact that notions of randomness and order can only be properly understood in a dynamic interactive context, where they mutually imply each other.

Thus the very reason why the (individual) primes can be random (in this Type 1 manner) is due to the perfect order with respect to the respective gaps between primes. Without such complementary order with respect to the distance between successive primes, the primes could not preserve their random nature!

So the complementary Erdős–Kac (Type 1) Theorem does not apply directly to the gaps between primes, but rather to the frequency of primes.

As we know, this frequency of primes up to n can be simply approximated as n/log n (an approximation which steadily improves in relative terms as n increases).

Thus, once more, when n is very large, we can take successive random samples, each of t consecutive natural numbers (with t large in absolute terms though very small relative to n), and count the primes occurring in each.

The actual number of primes occurring in each sample will tend to cluster around an average value, with deviations occurring with respect to that average.

Now the contention is that these deviations from the average value will be normally distributed!

Thus, through knowledge of the standard deviation, the distribution can be standardised so that in principle we can predict the probability that the actual number of primes falls within any given distance of the mean.

The mean value for the samples will be given as t/log n (where t is the sample size).

The standard deviation seems to be somewhat less than √(t/log n), and is approximated roughly by √(t/(2 log n)).

So to illustrate: in the region of n = 9,000,000, log n ≈ 16.

Therefore, if we were to take repeated samples of, say, 2,000 in this region of n, we would expect the mean number of primes to be about 2,000/16 = 125.

Now the actual numbers of primes occurring will deviate from this average, with the standard deviation roughly √(2,000/(2 × 16)) = 7.9 (rounded to 8).

Therefore, we could use the normal distribution to estimate that about 2/3 (68%) of the recorded counts of primes in each sample would lie between 117 and 133.
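The illustration above can be checked empirically with a short Python sketch. The sieving window, number of samples, and random seed here are my own choices for illustration, not part of the original argument:

```python
import math
import random

def segment_prime_flags(lo, hi):
    """Flag the primes in [lo, hi) using a segmented sieve of Eratosthenes."""
    limit = math.isqrt(hi)
    base = bytearray([1]) * (limit + 1)          # sieve the base primes first
    base[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if base[p]:
            base[p * p :: p] = bytearray(len(base[p * p :: p]))
    flags = bytearray([1]) * (hi - lo)
    for p in range(2, limit + 1):
        if base[p]:
            start = max(p * p, (lo + p - 1) // p * p)   # first multiple >= lo
            flags[start - lo :: p] = bytearray(len(flags[start - lo :: p]))
    return flags

# Region and sample size follow the post: n ~ 9,000,000, samples of t = 2,000.
n, t = 9_000_000, 2_000
flags = segment_prime_flags(n, n + 200_000)

random.seed(0)
starts = [random.randrange(0, 200_000 - t) for _ in range(50)]
counts = [sum(flags[s : s + t]) for s in starts]
mean = sum(counts) / len(counts)
sd = (sum((c - mean) ** 2 for c in counts) / len(counts)) ** 0.5
print(f"predicted mean {t / math.log(n):.1f}, sample mean {mean:.1f}, sample sd {sd:.1f}")
```

In runs of this kind the sample mean sits close to the predicted 125; the sample standard deviation is the more delicate quantity, which is precisely the point left open in what follows.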

Needless to say, I am merely suggesting here that a complementary Erdős–Kac (Type 1) Theorem should in principle apply to the primes, one that can be precisely represented (in a relative manner) by the normal distribution.

However, I have offered here neither a proof nor a clarification of the precise standard deviation.

Addendum (8/10/2016): As mentioned in a later entry, I have since accepted that the standard deviation should be √(t/log n).