Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Axioms - the choice & meaning

  1. Jun 26, 2006 #1
    Hi! I'm a high school student and I've been interested in Logic for some time... Although I read some books and acquired some knowledge, I still have one question that remains unanswered in spite of my hard work... My tutor told me that the three axioms of Propositional Logic (see them for example at http://en.wikipedia.org/wiki/Axioms) were chosen independently from the (semantic) meaning of the "implication" connectives and thus independently from the number of truth values (true/false; 0/0.5/1 and so on)...

    Is it true? I didn't understand his explanation and so I began my own study of this subject. Does anyone of you know any Internet source or a book that deals with the choice of axioms of Propositional Logic in detail? Thanks a million!
     
  2. jcsd
  3. Jun 26, 2006 #2
    This is also related with the question - what led logicians to choose these three axioms, what is their history...? Because in most Logic books, these three axioms are presented as a definition, they are put down to be learnt by heart but without any deep explanation of their meaning, their history etc.
     
  4. Jun 26, 2006 #3

    honestrosewater

    User Avatar
    Gold Member

    There are no three special axioms of propositional logic. There are calculi for propositional logic that have infinitely many axioms, a billion axioms, five hundred axioms, a dozen axioms, seven axioms, one axiom (e.g., [(((B -> C) -> (~D -> ~E)) -> D) -> F] -> [(F -> B) -> (E -> B)] - from Mendelson, Introduction to Mathematical Logic, 4th ed.), no axioms, and anything in between.

    People might use certain sets of axioms for many reasons: convenience, elegance, custom. I imagine you usually want two things from your axioms. You want them to correspond to tautologous formulas and you want to be able to use them to prove other fomulas. Don't forget about the other part of the calculus: the set of inference rules.

    Propositional logic is a two-valued logic. It only has two truth-values.

    There are some 'meta-laws', you might call them, that you might want to check out. For starters, you coudl google some of these: law of the excluded middle, law of noncontradiction, principle of bivalence, law of identity.

    For resources, see this: https://www.physicsforums.com/showthread.php?t=108151

    Are you using a primary logic text?
     
  5. Jun 27, 2006 #4
    Thanks for your reply!

    Stefan Bilaniuk states in his Problem Course in Mathematical Logic (ver. 1.6) for example this: [​IMG]

    The thing I was talking about is - why he chose these three axioms, how did he proceed in choosing them, what is their significance (they should have some intuitive meaning) etc.

    I use a lot of textbooks, among them e.g. A Problem Course in Mathematical Logic by Stefan Bilaniuk; Language, Proof and Logic by J. Barwise & J. Etchemendy and Klasicka matematicka logika by Antonin Sochor (Classical Mathematical Logic, it's a book in Czech)...
     
  6. Jun 28, 2006 #5

    honestrosewater

    User Avatar
    Gold Member

    Sorry for the sloppy post yesterday. I'm kicking myself that someone comes along who's interested in logic, and I barely say two words. Oy. Start over?

    Welcome to PF, dobry_den! :smile:

    I was thinking in the back of my head most of the day about how to answer these questions, and I'm still not sure where to start.

    The first problem is that the best that I can offer is of course just a guess at what might have been going on in the brain of this particular author, and I suspect that's not really what you're interested in anyway. They might have borrowed this axiom-schema set from another author, which doesn't tell you much at all.

    That aside, if you know what you're using the axioms or axiom schemas to accomplish, the reasons for choosing certain sets fall out naturally. Do you know what you're using the axioms to do? Creating all of the pieces for a propositional logic is easy enough that we can run through it.

    You have 4 basic ingredients working together: (i) the formal, object language that contains of all of the strings that you're playing with, (ii) the semantic components that assign meanings and truth-values to strings, (iii) the calculus that generates certain strings from other strings, by certain rules, and (iv) the group of meta-somethings: the metalanguage, metatheory, and meta-whatever-else that you are actually using to define, discuss, and prove things about the other 3 parts. Are you already comfortable with each of these pieces?
     
  7. Jun 28, 2006 #6
    Thanks for your welcoming! :smile:

    Concerning those four "ingredients" - well, yes, i think that I'm familiar with each of them.

    I'll try to restate my question, maybe it'll help:

    I know that to construct a formal proof, we have to start somewhere.. it's not possible to prove something "out of nothing". Thus, we have to state basic, "self-evident" axioms (or rather - axiom schemes). We subsequently build on these axiom schemes.

    As there are millions of different axioms&rules systems for propositional logic (although they are equivalent in the sense that we can prove the same formulas with all of them), I'll try to stick on Bilaniuk's one (see the link above) from now on.

    Is the "rightness" of this system (the three axiom schemes plus the modus ponens rule) taken from the fact that it is possible to prove self-evident (I mean really self-evident) formulas (e.g. A->A [in other "words" A or ~A; A or not A]) with its help?

    Or in one sentence - what governs us in choosing the axioms?

    (I hope I haven't confused you even more...)
     
    Last edited: Jun 28, 2006
  8. Jun 28, 2006 #7
    Last edited: Jun 28, 2006
  9. Jun 29, 2006 #8

    honestrosewater

    User Avatar
    Gold Member

    Well, judging from your questions, the relationships among those ingredients aren't clear to you yet. Just quickly running off a list...
    * You can infer a formula from the empty set of formulas, which I guess might count as a scare-quotes kind of nothing. You do need to define what a proof is, though, so you need to have at least one language before you start trying to generate proofs. Your use of 'formal' suggests that you want a formal language to boot.

    * Axioms don't have truth-values, so I'm not sure what 'self-evident' means here, unless it is the definition of 'axiom'. Axioms are just formulas of your object language; they are meaningless, uninterpreted strings of symbols. This detail can be safely omitted elsewhere, but it's important here. It seems that it might have left you looking for something that isn't there.

    * Axioms don't need to be basic or simple or short or anything of that sort. If you're smart, you will choose ones that will be useful to you.
    * There are not merely millions, or any other large but finite amount, of calculi for propositional logic (PL) -- there are infinitely many of them. This follows at once by noting that (i) there exists a calculus for PL that has a finite set of axioms and generates an infinite set of theorems, and (ii) if C is a calculus for PL and you create a new calculus C+ by copying C and adopting one of C's theorems as an axiom, then C+ will also be a calculus for PL. You can just keep adding new axioms, creating new calculi. You can do the same with inference rules too.
    * 'The three axiom schemes plus the modus ponens rule' is a calculus. (I don't know about you, but 'system' is usually what I call something when I don't know what else to call it. In case that's what you're doing, this 'system' is a calculus, given the details for replacing the axiom schemas with axioms.)

    * Do you know what you want to do with the calculus, what properties you want it to have? What else could 'rightness' mean? You want semantic entailment to agree with syntactic entailment. Do you know what I mean by that?
    * Well, again, formulas are just strings of symbols. They don't come with inherent, built-in truth-value assignments. The semantic components are what assign truth-values to formulas. The only truth that a formula can be said to have is the truth-value that some valuation assigns to it. You have to actually specify that a formula has been assigned a truth-value of 'true', 'false', '1', 'martha', or whatever names you choose for your truth-values.

    That said, sure, you can demonstrate in an indirect way that every valuation v that assigns a truth-value of 'true' to every member of some set S of formulas also assigns a truth-value of 'true' to a formula f. For example, you can do this by proving that a proof of f from S exists in a calculus that has been proven to be sound. This is one reason for creating a calculus in the first place.

    * The construction of truth tables is just an algorithm, a mechanical procedure for determining the truth-value of a formula given the truth-values of its subformulas. Truth tables are to the semantics what long division, calculators, and abaci are to arithmetic. They are calculating devices. Incidentally, in more powerful logics that include quantifiers that can range over infinite classes, the construction of truth tables for some formulas would go on forever just as a long division would go on forever for nonterminating decimals. Truth tables are of limited use.

    Anywho, if that doesn't clarify things, as I said, creating all of the pieces for propositional logic is easy, so if you are willing, we can walk through each step, starting with the specification of the formal, object language.
     
    Last edited: Jun 29, 2006
  10. Jun 29, 2006 #9
    Thanks a lot for your answer(s)!

    This would be probably the best thing to do as it seems my basics in logic are not as "firm" as I thought :blushing:

    I'd be really glad if we could walk through each step and - at the same time - discuss the issues that are not clear to me...
     
    Last edited: Jun 29, 2006
  11. Jun 30, 2006 #10
    Maybe I'm getting on the right track now... :smile: - after having read "CLASSICAL AND NONCLASSICAL LOGICS", an overview of a book by Eric Schechter (here) - mainly the part about axiom systems. Here's a short excerpt:

    But wait! Why do we need to prove X -> X? Isn’t it obviously true?
    Only if we assume that the symbol “->” has some meaning close
    to the usual meaning of “implies.” But we don’t want to assume
    that. In axiomatic logic, we start with no meaning at all for
    symbols such as “->”; they’re just symbols. They obtain only the
    meanings given to them by our axioms.

    I think I now do understand what you meant by the four ingredients you mentioned earlier. So - we choose such axioms of propositional logic and such rules (i.e. such calculus) that makes syntactic entailment to agree with semantic entailment.

    First, we build up the language and semantic components. Then we contrive calculus that, as you wrote, "generates certain strings from other strings, by certain rules". Having accomplished this, we can "test" our calculus by constructing entailments both in semantic and syntactic part of propositional logic (obviously, they should agree). Having accepted modus ponens rule, we can deduce from the veracity of {A, A->B} the veracity of formula B. This holds also from the semantic point of view - when formulas A and (A->B) have the truth value of 1 (true), formula B has to be true as well...

    So what do you think? Am I getting on the right track? ;o)
     
    Last edited: Jun 30, 2006
  12. Jul 1, 2006 #11

    honestrosewater

    User Avatar
    Gold Member

    Yep, it sounds that way. You could of course also define the calculus first and the semantics to match. Their 34th slide reveals the main goal: to make the set of tautologies equal to the set of theorems. This is actually a special case of a pair of theorems that we can prove soon. (One theorem says roughly that if we can prove a formula f from a set of formulas F, then f is true whenever all members of F are true. The other theorem says roughly that if f is true whenever every member of F is true, then we can prove f from F. Tautologies and theorems are the cases where F is the empty set. Anywho, we need a language first...)

    I. Our Formal, Object Language L​

    We get to create our own language from scratch, but to avoid all kinds of confusion, we need to clearly distinguish -- so pound into your skull the difference -- between the language that we are creating and the language that we are using to create it. The former is called the object language, and the latter is called the metalanguage.

    Our metalanguage is going to be English, with some special symbols and terms added as necessary. In Klasicka matematicka logika, I imagine the metalanguage is Czech, again with some special symbols and terms added in. For short, we'll call our object language L. So

    0. Our languages
    (a) Metalanguage: English with some special symbols and terms.
    (b) Object language: L.

    To drive this home, I'll prefix meta- and L- to the relevant units for a while, starting with a correction of my first paragraph:
    Yep, it sounds that way. You could of course also define the calculus first and the semantics to match. Their 34th slide reveals the main goal: to make the set of L-tautologies equal to the set of L-theorems. This is actually a special case of a pair of meta-theorems that we can prove soon. (One meta-theorem says roughly that if we can prove an L-formula f from a set of L-formulas F, then f is true whenever all members of F are true. The other meta-theorem says roughly that if f is true whenever every member of F is true, then we can prove f from F. L-tautologies and L-theorems are the cases where F is the empty set. Anywho, we need an object language first...)​

    That L is a formal language (as opposed to a natural language, like English or Czech) just means that we made it up for a special purpose, it's precise, and all that good stuff. What we ultimately want from L is a set of L-formulas -- our L-tautologies and L-theorems (and L-axioms) are all L-formulas. Since L is currently empty, we need to give L some symbols. We choose the L-symbols already knowing what types of L-formulas we want to build with them. L-symbols are strung together to yield a set of L-strings, and then we choose some of the L-strings to be L-formulas. That's really all there is to it.

    1. The symbols of L
    (a) a countably infinite set of propositional symbols.
    (b) a negation symbol.
    (c) an implication symbol.

    What do these L-symbols look like? We don't say. We don't care. They might not have a physical form at all. But so that we can talk about them, we give them names in our metalanguage, and for convenience, we adopt some special meta-symbols to use as shorter names. For a propositional symbol of L, we'll use the meta-symbol 'p' with indexes as needed (i.e., p1, p2, etc.); for the negation symbol of L, we'll use the meta-symbol '¬' (as most people do); for the implication symbol of L, we'll use the meta-symbol '→' (again, as most do).

    We could of course use 'the mystery symbol of Xanadu', 'the implication symbol of L' or '$%#**bad_idea**&%' as names for the negation symbol of L, but those have some obvious drawbacks. We chose 'negation' because we intend to associate the negation symbol of L with the negation operation later on. The same goes for the other L-symbols.

    2. The strings of L
    (a) if l is a natural number and m1, m2, ..., ml are L-symbols, their concatenation m1m2...ml is an L-string whose length = l.
    (b) notably, the empty string has length = 0.

    So the length of an L-string must be finite, but we don't put any other bound on how long they can be. The set of L-strings is also countably infinite and includes, for example, ¬, p1, ¬p1, p1¬, →¬→→p23957→→p64→¬, and p1→p1.

    3. The formulas of L
    (a) a propositional symbol is an L-formula.
    (b) if φ is an L-formula, then ¬φ (i.e., the L-string formed by concatenating ¬ and φ in that order) is an L-formula.
    (c) if φ and χ are L-formulas, then →φχ (i.e., the L-string formed by concatenating →, φ, and χ in that order) is an L-formula.

    The set of L-formulas is infinite and a subset of the set of L-strings, so it is also countably infinite. The set of L-formulas includes, for example, p1, ¬p1, →p1p2, and ¬→p1¬p2. And with that, L is finished.

    You'll run into this same basic pattern of specifying symbols, strings, and formulas in creating other formal languages in logic, computer science, and linguistics. Though for other languages, you might have more types of symbols, different rules for strings, more steps in building formulas, and so on.

    By the bye, we used only negation and implication and wrote implication formulas as →φχ in order to keep L simple. We can add meta-names for disjunction, conjunction, and whatever other shortcuts we want. We can also change the order of formulas of the form in (3c) to the more customary (φ → χ). This is just a matter of adding some definitions to our meta-language. For example, we can say

    →φχ =df (φ → χ)

    and

    (φ → χ) =df (¬φ ∨ χ)

    This only adds more names to our meta-language. It doesn't change L at all.

    Well, there's plenty more that I could say about L and other formal languages, but I'll stop myself. :biggrin: Questions? Everything good so far?

    (By the bye, I'm using the same basic system as is in Moshé Machover's Set theory, logic, and their limitations, 1st ed., if you want to check it out. It's one of my favorite logic books, along with Wilfrid Hodges' Logic -- and I've read dozens of them.)
     
  13. Jul 1, 2006 #12
    Wow, thanks for your great post...! I think it's mostly clear up to this point - I have only one question:

    Do I get it right that "(¬φ ∨ χ)" is a meta-string (meta-formula?) that represents the L-formula "→φχ"?
     
    Last edited: Jul 1, 2006
  14. Jul 1, 2006 #13

    honestrosewater

    User Avatar
    Gold Member

    Haha, perhaps yes, perhaps no. I might not have been as careful as I was trying to be. Just bear with me for a sec; this might be a tad painful.

    Remember that saying '→' is just another way of saying 'the implication symbol of L' -- we don't know if the symbol '→' is itself a symbol of L -- we never actually write down any symbols of L. We use names to refer to symbols of L. So to be horribly precise, we could say things like 'the L-formula that "→p1p2" refers to' or 'the L-formulas that "→φχ" refers to', but that is a huge pain and totally unnecessary besides.*

    In their normal capacity, names refer to other things. When I say that Bob is a boy with blonde hair, the string of English symbols 'Bob' refers to a boy, not to itself, a string of English symbols. To make a name refer to itself, i.e., to talk about the string instead of its referent, people conventionally put the string in quotes or italics or use some similar device (I'm using quotes), as you would do when saying that 'Bob' is a string with three letters. Putting a name in quotes takes away its normal capacity to refer. →p1p2 is an L-formula, just as Bob is a boy. '→p1p2' and 'Bob' are both strings of symbols in our metalanguage, English.

    So the rules are simple:
    If you're talking about the string itself, put it in quotes.
    If you're talking about the thing to which the string refers, don't use quotes.

    I might still bungle this occasionally, though -- let me know if you catch me. It sometimes isn't of much consequence or is presumably already clear from the context. However, sometimes it does make a difference, so returning to your question...
    Yes, '(¬φ ∨ χ)' is a meta-string. Symbols in quotes should not ever be L-symbols (or L-strings or L-formulas) since we don't know what L-symbols (or L-strings or L-formulas) look like or how to write them. So '→φχ' is not an L-formula. Both '→φχ' and '(¬φ ∨ χ)' are on equal footing as meta-strings.

    Yes, '(¬φ ∨ χ)' does refer to the same thing that '→φχ' refers to. They're just different names for the same thing. They both refer to a set of L-formulas (see the endnote below).

    Yes, you could call '(¬φ ∨ χ)' a meta-formula, though that's a little misleading since '(¬φ ∨ χ)' contains variables and refers to more than one L-formula. Something like 'an L-formula of the form (¬φ ∨ χ)' might be less confusing.

    Sorry if this was already clear to you -- I wanted to be sure. The details of the languages are perhaps the least fun parts for some, but I think that's about all I needed to say. I'll try to make time to post about the next parts overnight or tomorrow.

    * Note that '¬', '→', 'p1', 'p2', etc. are constants; they refer to one L-symbol each. By contrast, 'φ' and 'χ' are variables; they range over the whole set of L-formulas. (These may be more specifically called 'syntactic constants' and 'syntactic variables' in order to distinguish them from other kinds of constants and variables. But you don't have to worry about that quite yet.) So '→p1p2' refers to a single L-formula, whereas '→φχ' refers to infinitely many L-formulas (every L-formula that begins with an implication symbol).
     
  15. Jul 2, 2006 #14

    honestrosewater

    User Avatar
    Gold Member

    It was bothering me that I didn't have a better analogy-type explanation for what is fundamentally a simple distinction. This just struck me as I was nodding off. Let me know if it's an improvement.

    Suppose that I write a paper describing all of the animals at the local zoo. Well, obviously, none of the animals are actually exhibited in the paper. If you want to see the animals themselves, you have to go to the zoo.

    Suppose now that I write a paper describing the object language L. Although it might be possible to exhibit L in the paper since L is itself a language, we choose not to do this. Instead, we treat the parts of L as if they are animals in a zoo. No part of L is actually exhibited in the paper. If you want to see the parts of L themselves, you have to go to the L-zoo.

    P.S. We don't say where the L-zoo is. :tongue2: Capiche?

    I don't mean to keep harping on this. I'd just like to find a good conceptual hook.
     
  16. Jul 2, 2006 #15
    I have a question about this,

    Why do we do it this way? Why not let the symbols we draw be the actual symbols of L? Couldn't the same symbols be in both the object language and the metalanguage?
     
  17. Jul 2, 2006 #16

    honestrosewater

    User Avatar
    Gold Member

    Sure. For example, in linguistics, when we study English using English (or Japanese using Japanese or Rotokas using Rotokas), our metalanguage and object language are the same. This doesn't make things any less complex or confusing, by the bye. By definition, whenever we study a language and our study itself takes place in a language, we have an object language and a metalanguage. Whether they are the same language or not, or whether we choose to specify any physical forms for the object language, the two roles of object language and metalanguage are still present and distinct.

    When studying formal logic, the particular implementations, physical forms, or applications of the language, the shape of the implication symbol, and suchlike just aren't relevant. I imagine this is clear enough from the fact that logicians already define and study formal languages without ever specifying any of those things. You can specify them if you want to, but doing so just makes your language, well, more specific. Philosophers, computer scientists, linguists, engineers, and others can and do worry about such things. Logicians don't need to.

    If you've never thought about it before, specifying a sign system -- or several of them if we want to communicate in different modes, e.g., through speech, writing, gesture, Braille, chemicals, etc. -- can be a lot of work. Even if we use existing systems, we need to define the maps between our symbols and their physical forms. Until now, I haven't bothered to distinguish between these two parts. Suppose we say that one of the propositional symbols of L has the physical form 'p1'. Is 'p1' the same form (does it also represent that propositional symbol)? What about 'p1' or 'P1' or 'p1' or 'pone' or 'b1' or 'tree' or ':yuck:'? Clearly, our definition needs to be more precise, but we run into problems when we want our forms to be discrete and we have a continuous channel. We are really specifying ranges of values or clusters around norms rather than single forms (the 'single form' is another abstraction). At any rate, when specifying these things, we don't want to say that the implication symbol actually is some physical form. We want to keep the symbol separate from its form(s). And since the forms don't affect the aspects that we're interested in, we don't care what the forms are.

    Also, L has infinitely many symbols. Do we want each symbol to have a unique form? Can we do that? What about for strings? Can we exhibit them all at the same time? And so on. The physical parts of sign systems, and communication systems, come with a whole new bunch of questions to answer and problems to solve -- and some answers can cause other problems. Luckily, we can shift these problems to our metalanguage and take advantage of its answers without getting the object language all tangled up in the physical world.

    If you are curious, you can see some of these ideas in action, for example, in this specification of Extended Bachus-Naur Form (PDF) (HTML), a formal language developed to act as a metalanguage for specifying other formal languages. (It's used in computer science and other areas that would suffer from the ambiguities that can arise from using natural languages as metalanguages.) I'm thinking specifically about Table 1 on page 7. Notice that it lists three things: (i) the metalanguage symbol, (ii) a name for the character representation, and (iii) a glyph. For example, (i) the repetition-symbol is represented by (ii) the asterisk, which usually looks something like (iii) '*'.
     
  18. Jul 3, 2006 #17
    Thanks for making this first part clear for me! I think I did grasp the point now.

    Unfortunately, I won't have access to the Internet during the following 4 days. But I'm really looking forward to reading your next posts...:smile:
     
  19. Jul 4, 2006 #18

    0rthodontist

    User Avatar
    Science Advisor

    Sorry to interject on such a productive conversation. I have little knowledge of mathematical logic, but I'm pretty sure it's accurate to say that for propositional logic at least, the original axioms were chosen because the scheme that results does make intuitive sense, i.e. the arrow of implication was intended to signify the mental meaning of "implies." Boole called his work an investigation into the laws of thought, not the laws of symbols. There are ways to specify propositional logic so that all the rules make immediate intuitive sense.

    Then logicians played around with this intuitive system, and created alternate sets of axioms and rules of inference that specify the same system. The origin of the system is still that it makes sense when interpreted in a certain way.
     
  20. Jul 9, 2006 #19

    CRGreathouse

    User Avatar
    Science Advisor
    Homework Helper

    Sorry, going off-topic for a moment.

    Woah, you're into linguistics? Are there any forums you go to? ("... to which you go", if you're a prescriptivist.)

    I love the subject, but I don't know too much about it; I've been teaching myself by picking up books on the subject and reading them, no formal basis. I'd like to learn more.
     
  21. Jul 10, 2006 #20

    honestrosewater

    User Avatar
    Gold Member

    Hey, sorry, I haven't forgotten about this thread, just had a bad week. I'll get back to this ASAP. :smile:
     
Know someone interested in this topic? Share this thread via Reddit, Google+, Twitter, or Facebook

Have something to add?



Similar Discussions: Axioms - the choice & meaning
  1. Choices (Replies: 41)

  2. Axiom of Causality? (Replies: 13)

  3. Are axioms subjective? (Replies: 38)

  4. Axioms and Faith (Replies: 33)

Loading...