For a given world, a good set of axioms would be one that would allow us to make all possible correct predictions for that world. The term prediction is not a clear one. To make it clearer, let us restrict again the \ghilimele{possible worlds} term. One option would be to make it denote all possible worlds which have a concept of time and a concept of the state of the world at a given time, and for which describing the state of the world at all possible times is equivalent to describing the world. This ignores some important issues, like the fact that it's reasonable to have a concept of time without having a well defined concept of \ghilimele{the state of the world at a given time}, so we could rephrase the definition above to include many other reasonable notions of space and time, e.g. we can include worlds where \ghilimele{point} is a concept and we can know which (point, time) pairs are before a given (point, time) pair. Then, in the following, we will say that we can \definitie{predict} something ($S$) whenever we have a set of axioms for which $S$ is uniquely determined by the state of the world at a subset of the previous points in time\footnote{This will be extended to statistical predictions in the next paragraph.}. If we are interested in predicting the state of the world at a given point $P$ and time $t$, a good choice for this subset could be a full section through $P$'s past (e.g. a plane which intersects its past cone), i.e. a subset that separates $P$'s past in two parts, one which is before the subset and one which is after the subset\footnote{This means that all lines which fully lie in $P$'s past and connect a point which is before the subset with a point which is after the subset must go through the subset.}. One could think of similar definitions for predicting the entire state of the world. If needed, this definition could be changed to work for more concepts of space and time.

In a deterministic universe, if we know the laws of the universe and its full state at a given time, we could, in theory, fully predict any future state. But a universe does not have to be deterministic and, even if it is, one could have only a statistical model for it. Then we will allow using a set of axioms which only gives a statistical distribution for the state of the universe given its past (I'll call this a \definitie{statistical axiom set}). For the purpose of this document we don't need to distinguish between a non-deterministic universe and a deterministic one for which we only have a statistical model.

Let us restrict the possible worlds term to the worlds where we can make predictions and let us use only sets of axioms that allow predictions. As mentioned above, for a given world, a good set of axioms is one which allows us to make all possible correct predictions for that world (statistical or not). Using only good sets of axioms solves the \ghilimele{too-general problem}, since such a set would describe its possible worlds in a way that does not leave out important things. Still, there is nothing that prevents such a set from going into too much detail.

Let us choose any formalism for specifying axioms that uses a finite alphabet. Then, for each possible world, we could say that the best set of axioms is the smallest good one, smallest being defined as having the smallest length when written on paper.
This is not a well defined notion for a few reasons. First, there could be multiple sets with the smallest length (one obvious case is given by reordering the axioms). In such a case, we could define an order for the symbols that we are using in our formalism and we could pick the system with the smallest length which is also the smallest in the lexicographic order. Second, there could be systems of axioms of infinite length. [TODO: These can't really be sorted. I can have an infinite sequence of sets of axioms $s_1, s_2, \ldots$ where each element is smaller than its predecessor, but which has no limit. I think I can't reject all systems which include other systems, I may again have an infinite chain. I can reject systems for which some part is implied by a smaller finite part in another system and the remainder is the same.] For these, we will only consider systems which, when written on an infinite paper, use a countable number of symbols. This means that all of them will have the same length, but we can still use the lexicographic order to compare them. We will ignore systems which need an uncountable set of symbol places. With an axiom system chosen in this way we would also solve the \ghilimele{too-specific problem} [TODO: I only solved it for finite systems. Do I also need to solve it for infinite systems?], since we would remove any axiom that's not absolutely needed. If $U$ is a universe and $A$ is the smallest set of predictive axioms as described above, then we would say that $A$ is the \definitie{optimal set of axioms for $U$}. If $A$ is a set of axioms which is optimal for some universe $U$ then we say that $A$ is an \definitie{optimal set of axioms}.

Now let us see if we actually need infinite length systems. We can have infinite systems of axioms, and there is no good reason to reject such systems and to ignore their possible worlds, so we will take them into account. It is less clear that we can't replace these infinite systems with finite ones. Indeed, let us use any binary encoding allowing us to represent these systems as binary strings, i.e. as binary functions over the set of natural numbers. The encoding of a set of axioms $A$, $encoding(A)$, would be a function from the natural numbers to a binary set, giving the value of the bit for each position in the encoding, $encoding(A):\naturale\longrightarrow \multime{0, 1}$. Then the following scenario becomes plausible: for any universe $U$ with an infinite system of axioms $A$, we can consider $U+encoding(A)$ to be a universe in itself which has $encoding(A)$ as part of its state at any moment in time. Then it's likely that we can find a finite system of axioms which allows predictions for such a universe. While, strictly speaking, this would be a different universe than the one we had at the beginning, it may seem similar enough to it that one may be tempted to use only finite systems of axioms.

On the other hand, using only finite systems of axioms in this way seems to be some sort of cheating. In order to get a more honest system of axioms, we could work with very specific systems of axioms, e.g. we could only talk about worlds which have $\reale^4$ as their space and time, whose objects are things similar to the wave functions used by quantum mechanics, and so on.

\section{Modelling from inside}

We will assume that those intelligent beings are continuously trying to find better models for their world and that they are reasonably efficient at this.

As a parenthesis, note that until now we restricted the possible world concept several times. The argument below also works with larger possible world concepts, as long as those worlds have a few basic properties (e.g. one can make predictions in them and they can contain intelligent beings) and, at the same time, it is plausible that our world is such a possible world.

First, let us note that having intelligent beings in a universe likely means that their intelligence is needed to allow them to live in that universe, which likely means that they can have a partial model of the universe. That model does not have to be precise (it could be made of simple rules like \ghilimele{If I pick fruits then I can eat them. If I eat them then I live.}) and it can cover only a small part of their world, but it should predict something. Of course, these predictions do not have to be deterministic. Also, the beings might not be able to perceive the entire universe. Note that the previous definition of prediction does not say that it is feasible to actually predict everything, it only means that prediction is possible for an all-knowing being. [TODO: say why this does not matter: requiring that prediction is actually possible could only make the paper stronger. However, I chose to argue about a better result.] A related case is the following: it is possible that almost all macroscopic events can be predicted very precisely using quantum physics. Assuming that this is indeed the case, many of these predictions would still require too many computational resources, making them infeasible. I am requiring even less than this: I am allowing axiom systems where there is no way to infer a prediction from the axiom system, but if one checks all possible models of that system, the prediction turns out to be true.
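One possible way to make this last requirement precise (a sketch, assuming only that the axiom system lives in some logic that comes with a derivability relation $\vdash$ and a class of models, neither of which is fixed by this paper) is to require semantic consequence rather than derivability:
\[
A \models S \qquad \mbox{rather than} \qquad A \vdash S,
\]
where $A \models S$ means that $S$ holds in every model of $A$, and $A \vdash S$ means that $S$ can be derived from $A$. Depending on the logic and on the proof system, the two notions need not coincide, and the gap between them is exactly what is being allowed here.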

We can use any of these definitions (and many other reasonable ones) for the remainder of this paper. Then we would have three possible cases.

First, those intelligent beings could, at some point in time, find an axiom system which gives the best predictions that they could have for their world, i.e. which predicts everything that they can observe. In other words, they wouldn't be able to find anything which is not modelled by their system. We could relax this \ghilimele{best axiom system} condition by only requiring an axiom system that is good enough for all practical purposes. As an example, for a universe based on real numbers, knowing the axioms precisely except for some constants, and measuring all constants with a billion digits of precision, might (or might not) be good enough. Only caring about things which occur frequently enough (e.g. more than once in a million years) could also be good enough.

Second, those intelligent beings could reach a point where their theory clearly does not fully model the world, but where it is also impossible to improve it in a meaningful way. This could be the case if, e.g., they can model a part of their world, but modelling any part of the remainder would require adding an infinite set of axioms, and no finite set of axioms would yield a better model.

In order to make this split into cases clearer, let us assume that those intelligent beings would study their universe and would try to improve their axiom systems in some essential way forever. Since they have infinite time available to them, they could use strategies like generating possible theories in order (using the previously defined order), checking whether they seem to make sense and testing their predictions against their world, so let us assume that if there is a possible improvement to their current theory, they will find it at some point (a sketch of this strategy is given after the example below).

Note that the fraction of the world that can be modelled is increasing, but it is bounded, so it converges to some value. Also, the prediction error (it's not important to define it precisely here) is decreasing and is bounded, so it also converges. If the fraction converges to $1$ and the prediction error converges to $0$, then we are in the first case, because we reach a point where the fraction is so close to $1$ and the error is so close to $0$ that one would find them good enough. If the fraction or the error converges to different values, then we are in the second case.

There is also a third case, when one can improve the axiom system in ways that seem meaningful, without growing the fraction of the world that is covered by the system and without decreasing the prediction error. As an example, imagine a world with an infinite number of earth-like planets that lie on a line, with humans living on the first one. The laws of this hypothetical world, as observed by humans, would be wildly different from one planet to the other. As an example of milder differences, starting at $10$ meters above ground, gravity would be described by a different function on each planet. On some planets it would follow the inverse of a planet-specific polynomial function of the distance, on others it would follow the inverse of an exponential function, on others it would behave in one way if the distance to the center of the planet in meters is even and in another way if the distance is odd, and so on.
Let us also assume that humans can travel between these planets freely in some bubble that preserves the laws of the first planet well enough that humans can live, but that also lets them observe what happens outside. 
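To make the enumerate-and-test strategy mentioned above a bit more concrete, here is a minimal sketch in Python; the predicates \texttt{seems\_consistent} and \texttt{predicts\_better} are hypothetical placeholders for whatever checking and testing against the world those beings could actually perform, and the sketch only illustrates enumerating candidate theories in the length-then-lexicographic order defined earlier, with no claim about feasibility.

\begin{verbatim}
# A sketch of the enumerate-and-test strategy; `seems_consistent` and
# `predicts_better` are hypothetical placeholders, not real checks.
from itertools import count, product

ALPHABET = "01"  # any fixed finite alphabet would do

def candidate_theories():
    """Yield all finite strings over ALPHABET, shortest first,
    ties broken lexicographically (the order defined earlier)."""
    for length in count(1):
        for symbols in product(ALPHABET, repeat=length):
            yield "".join(symbols)

def improve_forever(theory, seems_consistent, predicts_better):
    """Scan the candidates forever; adopt any candidate that seems to
    make sense and predicts strictly better than the current theory."""
    for candidate in candidate_theories():
        if seems_consistent(candidate) and predicts_better(candidate, theory):
            theory = candidate
            yield theory
\end{verbatim}

Since the beings have unlimited time, every candidate theory is eventually examined, which is all that the argument above requires.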

\section{Description probabilities}

For the purpose of this paper, let us denote by \definitie{finite property} of something any property of that something which can be written using a finite number of words. Since we will use only finite properties here, let us drop \ghilimele{finite} and call any of them simply \definitie{property}. These observable descriptions of possible worlds are general enough and different enough that it's hard to say something about them, except that they make sense in a mathematical way. Still, given any property $X$, we could try to see what the chance is that it is true in the set of observable descriptions.
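One minimal way to make this question precise (a sketch, assuming only that the set $D$ of observable descriptions carries some probability measure $\mu$ and that the set below is measurable) would be
\[
\Pr(X) = \mu\bigl(\multime{d \in D : X(d)}\bigr),
\]
where $X(d)$ means that the property $X$ is true for the description $d$. The arguments below only need the qualitative distinction between zero-probability and non-zero-probability properties.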

We can then say that for virtually all descriptions [TODO: Make sure I want descriptions here and not optimal axiom sets or something. Probably I want descriptions], only properties with non-zero probability are true. This means that, if the probability of our world being designed is non-zero, the only rational choices are that either our world is designed or only non-zero probability properties are true.

Now, let us return to the issue of observable descriptions being finite or infinite. With a finite alphabet, only a countable set of models have a finite observable description. Then the \ghilimele{has a finite description} property [TODO: is it a property of the description or of the universe? Does that match what I said above about only inferring things about descriptions? Ah, it's a property of an axiom set, I think.] is a zero-probability one, so either our universe is designed, or at any point in time there will be an important part of our universe that we can observe but can't model no matter how hard we try.

\section{Approximations}

\begin{itemize}

\item Fix $\delta > 0$ and say that we care about measuring things which are larger than $\delta$ [TODO: replace epsilon with delta where needed]. This means that we can have three sizes $a$, $b$ and $c$ with $a=b$ and $b=c$ but $a\not=c$ (for instance, with $\delta = 1$, the sizes $10$, $10.6$ and $11.2$ behave this way). This should be fine as long as we're aware that equality here actually means that the difference is smaller than $\delta$.

\item Fix a time length $s$ and ignore things which happen rarely.

\end{itemize}

We could use any reasonable definition of \ghilimele{measuring} and \ghilimele{happen rarely}. Then we could say that the important things are the ones which are larger than $\delta$ and which do not happen rarely. Let us also fix an arbitrary time length $t\ge 0$, an acceptable error [TODO: use the right term here. I had a note to use precision/accuracy] $\epsilon \ge 0$ and a probability $q\ge 0$ which is the probability that a random prediction is successful [TODO: did I define this? Should I move this at the end and say that this is the probability given the previous constraints?], and let us denote by $f$, with $0 \le f \le 1$, the fraction of the world where we can make predictions about what happens after the given time length $t$, with the acceptable error $\epsilon$ and having a probability $q$ that the prediction is correct. Then, if the world is not designed, we have a countable number of finite observable [TODO: is observable the right term?] descriptions out of a total number of descriptions which has the cardinality of $\reale$. Then, for any continuous distribution, the probability of having a finite description with which we can make predictions for a time length of $t$, with an error $\epsilon$, with a probability $q$ and for a fraction of the world $f$, is $0$. To have a non-zero probability we would need either $t = 0$ (which means that we are not making any prediction, we are just restating the present), $\epsilon = \infty$ (which means that our predictions have no connection to reality), $q=0$ (which means that our predictions always fail) or $f=0$. We can discard the first option since then we would have no predictions. We can also discard the second and the third, since such a description would not be useful in any way. The only remaining option is that $f=0$; as argued above, a description with $f=0$ can actually make sense. Therefore, with probability $1$, we have $f=0$ and the world has an infinite model.

[TODO: Should I replace $f=0$ with \ghilimele{the minimal fraction absolutely needed}, because having a space-time is a property of the entire universe, so $f$ may not be zero? On the other hand, it does not allow any prediction. Should I add a footnote?]

There is a distinction that we should make. When predicting (say) the weather we can't make long-term precise predictions, and this happens because weather is chaotic, that is, a small difference in the start state can create large differences over time. This could happen even if the universe is deterministic and we know the laws of the universe perfectly, as long as we don't know the full current state of the universe. However, as argued above, with probability $1$, our hypothetical intelligent beings would not be able to make predictions for a significant part of the universe because they would have no idea about how their universe works, not because they don't know its state precisely enough.

[TODO: I should think about what happens when replacing $q$ with a distribution probability.]
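The countability step used above can be spelled out explicitly (a sketch, assuming, as above, that descriptions are drawn according to a continuous, i.e. non-atomic, distribution $\mu$): if $F$ is the countable set of descriptions that are finite and allow predictions with the parameters above, then countable additivity gives
\[
\mu(F) = \sum_{d \in F} \mu(\multime{d}) = \sum_{d \in F} 0 = 0,
\]
since a continuous distribution assigns probability $0$ to each individual description.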
Besides the \ghilimele{finite description for a non-zero fraction of the observable universe} property, we can look at some of the properties of our universe like homogeneity, isotropy, or having the same forces acting through the entire space or for all moments in time [TODO: Make sure that these are distinct]. It is harder to give a mathematical proof that these are zero-probability properties, but if we consider that, given a set of universes having any of these properties, sharing the same (mathematical) space and having at least two distinct elements, one can slice and recombine them in infinitely many ways, it is likely that these properties are also zero-probability ones. An example of such a combined possible universe is the one with infinite planets on a line mentioned above. In other words, the cosmological principle is (very) likely to be a zero-probability property. Similarly, if we take the rules for how the universe works as we perceive them, most likely there is a zero chance that they apply through the entire universe and a very low chance that they apply outside of Earth / our solar system. In other words, if our world is not designed, there is a good chance that we may know a lot about what happens on Earth and maybe something about what happens in our solar system, but we almost surely don't know what happens in our galaxy and outside of it. [TODO: put this below and link it to the conclusion.]
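To see heuristically why slicing and recombining points to probability zero (a sketch, assuming two universes $U_1$ and $U_2$ which share the same space, both have one of these properties, differ somewhere, and whose common space is cut into countably many regions): each assignment of $U_1$-behaviour or $U_2$-behaviour to the regions yields a recombined universe, so there are
\[
\bigl|\multime{c : \mbox{Regions} \longrightarrow \multime{U_1, U_2}}\bigr| = 2^{\aleph_0}
\]
such assignments, while typically only the two constant assignments keep the same laws holding uniformly across the whole space. A property that survives on such a vanishingly small part of the recombinations is plausibly a zero-probability one.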

We seem to be able to make predictions for almost everything that we can observe, even if we may not be able to make many predictions for very distant things. We also have no sign that the laws of the universe would be significantly different outside of Earth, so it seems that the limiting factor is that we don't know the state of the universe. Then the second option is probably false and the first one is probably true.

[TODO: Fix ``quotes".]

[TODO: Make sure I'm using quotes correctly and consistently.]

[TODO: Fix spaces between math mode and punctuation.]

[TODO: Fix the usage of I and we.]

[TODO: Decide when I use axiom set and when axiom system. Say explicitly that they mean the same thing.]