Axioms - the choice & meaning

Hi! I'm a high school student and I've been interested in Logic for some time... Although I read some books and acquired some knowledge, I still have one question that remains unanswered in spite of my hard work... My tutor told me that the three axioms of Propositional Logic (see them for example at http://en.wikipedia.org/wiki/Axioms) were chosen independently from the (semantic) meaning of the "implication" connectives and thus independently from the number of truth values (true/false; 0/0.5/1 and so on)...

Is it true? I didn't understand his explanation and so I began my own study of this subject. Does anyone of you know any Internet source or a book that deals with the choice of axioms of Propositional Logic in detail? Thanks a million!

This is also related with the question - what led logicians to choose these three axioms, what is their history...? Because in most Logic books, these three axioms are presented as a definition, they are put down to be learnt by heart but without any deep explanation of their meaning, their history etc.

honestrosewater
Gold Member
There are no three special axioms of propositional logic. There are calculi for propositional logic that have infinitely many axioms, a billion axioms, five hundred axioms, a dozen axioms, seven axioms, one axiom (e.g., [(((B -> C) -> (~D -> ~E)) -> D) -> F] -> [(F -> B) -> (E -> B)] - from Mendelson, Introduction to Mathematical Logic, 4th ed.), no axioms, and anything in between.

People might use certain sets of axioms for many reasons: convenience, elegance, custom. I imagine you usually want two things from your axioms. You want them to correspond to tautologous formulas and you want to be able to use them to prove other fomulas. Don't forget about the other part of the calculus: the set of inference rules.

Propositional logic is a two-valued logic. It only has two truth-values.

There are some 'meta-laws', you might call them, that you might want to check out. For starters, you coudl google some of these: law of the excluded middle, law of noncontradiction, principle of bivalence, law of identity.

Are you using a primary logic text?

honestrosewater
Gold Member
Sorry for the sloppy post yesterday. I'm kicking myself that someone comes along who's interested in logic, and I barely say two words. Oy. Start over?

Welcome to PF, dobry_den!

dobry_den said:
The thing I was talking about is - why he chose these three axioms, how did he proceed in choosing them, what is their significance (they should have some intuitive meaning) etc.
I was thinking in the back of my head most of the day about how to answer these questions, and I'm still not sure where to start.

The first problem is that the best that I can offer is of course just a guess at what might have been going on in the brain of this particular author, and I suspect that's not really what you're interested in anyway. They might have borrowed this axiom-schema set from another author, which doesn't tell you much at all.

That aside, if you know what you're using the axioms or axiom schemas to accomplish, the reasons for choosing certain sets fall out naturally. Do you know what you're using the axioms to do? Creating all of the pieces for a propositional logic is easy enough that we can run through it.

You have 4 basic ingredients working together: (i) the formal, object language that contains of all of the strings that you're playing with, (ii) the semantic components that assign meanings and truth-values to strings, (iii) the calculus that generates certain strings from other strings, by certain rules, and (iv) the group of meta-somethings: the metalanguage, metatheory, and meta-whatever-else that you are actually using to define, discuss, and prove things about the other 3 parts. Are you already comfortable with each of these pieces?

honestrosewater said:
That aside, if you know what you're using the axioms or axiom schemas to accomplish, the reasons for choosing certain sets fall out naturally. Do you know what you're using the axioms to do? Creating all of the pieces for a propositional logic is easy enough that we can run through it.

You have 4 basic ingredients working together: (i) the formal, object language that contains of all of the strings that you're playing with, (ii) the semantic components that assign meanings and truth-values to strings, (iii) the calculus that generates certain strings from other strings, by certain rules, and (iv) the group of meta-somethings: the metalanguage, metatheory, and meta-whatever-else that you are actually using to define, discuss, and prove things about the other 3 parts. Are you already comfortable with each of these pieces?

Concerning those four "ingredients" - well, yes, i think that I'm familiar with each of them.

I'll try to restate my question, maybe it'll help:

I know that to construct a formal proof, we have to start somewhere.. it's not possible to prove something "out of nothing". Thus, we have to state basic, "self-evident" axioms (or rather - axiom schemes). We subsequently build on these axiom schemes.

As there are millions of different axioms&rules systems for propositional logic (although they are equivalent in the sense that we can prove the same formulas with all of them), I'll try to stick on Bilaniuk's one (see the link above) from now on.

Is the "rightness" of this system (the three axiom schemes plus the modus ponens rule) taken from the fact that it is possible to prove self-evident (I mean really self-evident) formulas (e.g. A->A [in other "words" A or ~A; A or not A]) with its help?

Or in one sentence - what governs us in choosing the axioms?

(I hope I haven't confused you even more...)

Last edited:
Another example - J. Donald Monk in his book Mathematical Logic:

http://i83.photobucket.com/albums/j315/dobry_den/monk_axioms.jpg" [Broken]

He just states these three axiom without any discussion. Is there anything else that demonstrates their truth besides semantics (i.e. besides their truth tables)?

Last edited by a moderator:
honestrosewater
Gold Member
dobry_den said:
Concerning those four "ingredients" - well, yes, i think that I'm familiar with each of them.
Well, judging from your questions, the relationships among those ingredients aren't clear to you yet. Just quickly running off a list...
I know that to construct a formal proof, we have to start somewhere.. it's not possible to prove something "out of nothing". Thus, we have to state basic, "self-evident" axioms (or rather - axiom schemes). We subsequently build on these axiom schemes.
* You can infer a formula from the empty set of formulas, which I guess might count as a scare-quotes kind of nothing. You do need to define what a proof is, though, so you need to have at least one language before you start trying to generate proofs. Your use of 'formal' suggests that you want a formal language to boot.

* Axioms don't have truth-values, so I'm not sure what 'self-evident' means here, unless it is the definition of 'axiom'. Axioms are just formulas of your object language; they are meaningless, uninterpreted strings of symbols. This detail can be safely omitted elsewhere, but it's important here. It seems that it might have left you looking for something that isn't there.

* Axioms don't need to be basic or simple or short or anything of that sort. If you're smart, you will choose ones that will be useful to you.
As there are millions of different axioms&rules systems for propositional logic (although they are equivalent in the sense that we can prove the same formulas with all of them), I'll try to stick on Bilaniuk's one (see the link above) from now on.
* There are not merely millions, or any other large but finite amount, of calculi for propositional logic (PL) -- there are infinitely many of them. This follows at once by noting that (i) there exists a calculus for PL that has a finite set of axioms and generates an infinite set of theorems, and (ii) if C is a calculus for PL and you create a new calculus C+ by copying C and adopting one of C's theorems as an axiom, then C+ will also be a calculus for PL. You can just keep adding new axioms, creating new calculi. You can do the same with inference rules too.
Is the "rightness" of this system (the three axiom schemes plus the modus ponens rule) taken from the fact that it is possible to prove self-evident (I mean really self-evident) formulas (e.g. A->A [in other "words" A or ~A; A or not A]) with its help?
* 'The three axiom schemes plus the modus ponens rule' is a calculus. (I don't know about you, but 'system' is usually what I call something when I don't know what else to call it. In case that's what you're doing, this 'system' is a calculus, given the details for replacing the axiom schemas with axioms.)

* Do you know what you want to do with the calculus, what properties you want it to have? What else could 'rightness' mean? You want semantic entailment to agree with syntactic entailment. Do you know what I mean by that?
Is there anything else that demonstrates their truth besides semantics (i.e. besides their truth tables)?
* Well, again, formulas are just strings of symbols. They don't come with inherent, built-in truth-value assignments. The semantic components are what assign truth-values to formulas. The only truth that a formula can be said to have is the truth-value that some valuation assigns to it. You have to actually specify that a formula has been assigned a truth-value of 'true', 'false', '1', 'martha', or whatever names you choose for your truth-values.

That said, sure, you can demonstrate in an indirect way that every valuation v that assigns a truth-value of 'true' to every member of some set S of formulas also assigns a truth-value of 'true' to a formula f. For example, you can do this by proving that a proof of f from S exists in a calculus that has been proven to be sound. This is one reason for creating a calculus in the first place.

* The construction of truth tables is just an algorithm, a mechanical procedure for determining the truth-value of a formula given the truth-values of its subformulas. Truth tables are to the semantics what long division, calculators, and abaci are to arithmetic. They are calculating devices. Incidentally, in more powerful logics that include quantifiers that can range over infinite classes, the construction of truth tables for some formulas would go on forever just as a long division would go on forever for nonterminating decimals. Truth tables are of limited use.

Anywho, if that doesn't clarify things, as I said, creating all of the pieces for propositional logic is easy, so if you are willing, we can walk through each step, starting with the specification of the formal, object language.

Last edited:

honestrosewater said:
Anywho, if that doesn't clarify things, as I said, creating all of the pieces for propositional logic is easy, so if you are willing, we can walk through each step, starting with the specification of the formal, object language.
This would be probably the best thing to do as it seems my basics in logic are not as "firm" as I thought

I'd be really glad if we could walk through each step and - at the same time - discuss the issues that are not clear to me...

Last edited:
honestrosewater said:
* Do you know what you want to do with the calculus, what properties you want it to have? What else could 'rightness' mean? You want semantic entailment to agree with syntactic entailment. Do you know what I mean by that?
Maybe I'm getting on the right track now... - after having read "CLASSICAL AND NONCLASSICAL LOGICS", an overview of a book by Eric Schechter ("www.math.vanderbilt.edu/~schectex/logics/clubtalk.pdf"[/URL]) - mainly the part about axiom systems. Here's a short excerpt:

[INDENT][I]But wait! Why do we need to prove X -> X? Isn’t it obviously true?
Only if we assume that the symbol “->” has some meaning close
to the usual meaning of “implies.” But we don’t want to assume
that. In axiomatic logic, we start with no meaning at all for
symbols such as “->”; they’re just symbols. They obtain only the
meanings given to them by our axioms.[/I][/INDENT]

I think I now do understand what you meant by the four ingredients you mentioned earlier. So - we choose such axioms of propositional logic and such rules (i.e. such calculus) that makes syntactic entailment to agree with semantic entailment.

First, we build up the language and semantic components. Then we contrive calculus that, as you wrote, "generates certain strings [I]from other strings, by certain rules[/I]". Having accomplished this, we can "test" our calculus by constructing entailments both in semantic and syntactic part of propositional logic (obviously, they should agree). Having accepted modus ponens rule, we can deduce from the veracity of {A, A->B} the veracity of formula B. This holds also from the semantic point of view - when formulas A and (A->B) have the truth value of 1 (true), formula B has to be true as well...

So what do you think? Am I getting on the right track? ;o)

Last edited by a moderator:
honestrosewater
Gold Member
dobry_den said:
So what do you think? Am I getting on the right track? ;o)
Yep, it sounds that way. You could of course also define the calculus first and the semantics to match. Their 34th slide reveals the main goal: to make the set of tautologies equal to the set of theorems. This is actually a special case of a pair of theorems that we can prove soon. (One theorem says roughly that if we can prove a formula f from a set of formulas F, then f is true whenever all members of F are true. The other theorem says roughly that if f is true whenever every member of F is true, then we can prove f from F. Tautologies and theorems are the cases where F is the empty set. Anywho, we need a language first...)

I. Our Formal, Object Language L​

We get to create our own language from scratch, but to avoid all kinds of confusion, we need to clearly distinguish -- so pound into your skull the difference -- between the language that we are creating and the language that we are using to create it. The former is called the object language, and the latter is called the metalanguage.

Our metalanguage is going to be English, with some special symbols and terms added as necessary. In Klasicka matematicka logika, I imagine the metalanguage is Czech, again with some special symbols and terms added in. For short, we'll call our object language L. So

0. Our languages
(a) Metalanguage: English with some special symbols and terms.
(b) Object language: L.

To drive this home, I'll prefix meta- and L- to the relevant units for a while, starting with a correction of my first paragraph:
Yep, it sounds that way. You could of course also define the calculus first and the semantics to match. Their 34th slide reveals the main goal: to make the set of L-tautologies equal to the set of L-theorems. This is actually a special case of a pair of meta-theorems that we can prove soon. (One meta-theorem says roughly that if we can prove an L-formula f from a set of L-formulas F, then f is true whenever all members of F are true. The other meta-theorem says roughly that if f is true whenever every member of F is true, then we can prove f from F. L-tautologies and L-theorems are the cases where F is the empty set. Anywho, we need an object language first...)​

That L is a formal language (as opposed to a natural language, like English or Czech) just means that we made it up for a special purpose, it's precise, and all that good stuff. What we ultimately want from L is a set of L-formulas -- our L-tautologies and L-theorems (and L-axioms) are all L-formulas. Since L is currently empty, we need to give L some symbols. We choose the L-symbols already knowing what types of L-formulas we want to build with them. L-symbols are strung together to yield a set of L-strings, and then we choose some of the L-strings to be L-formulas. That's really all there is to it.

1. The symbols of L
(a) a countably infinite set of propositional symbols.
(b) a negation symbol.
(c) an implication symbol.

What do these L-symbols look like? We don't say. We don't care. They might not have a physical form at all. But so that we can talk about them, we give them names in our metalanguage, and for convenience, we adopt some special meta-symbols to use as shorter names. For a propositional symbol of L, we'll use the meta-symbol 'p' with indexes as needed (i.e., p1, p2, etc.); for the negation symbol of L, we'll use the meta-symbol '¬' (as most people do); for the implication symbol of L, we'll use the meta-symbol '→' (again, as most do).

We could of course use 'the mystery symbol of Xanadu', 'the implication symbol of L' or '\$%#**bad_idea**&%' as names for the negation symbol of L, but those have some obvious drawbacks. We chose 'negation' because we intend to associate the negation symbol of L with the negation operation later on. The same goes for the other L-symbols.

2. The strings of L
(a) if l is a natural number and m1, m2, ..., ml are L-symbols, their concatenation m1m2...ml is an L-string whose length = l.
(b) notably, the empty string has length = 0.

So the length of an L-string must be finite, but we don't put any other bound on how long they can be. The set of L-strings is also countably infinite and includes, for example, ¬, p1, ¬p1, p1¬, →¬→→p23957→→p64→¬, and p1→p1.

3. The formulas of L
(a) a propositional symbol is an L-formula.
(b) if φ is an L-formula, then ¬φ (i.e., the L-string formed by concatenating ¬ and φ in that order) is an L-formula.
(c) if φ and χ are L-formulas, then →φχ (i.e., the L-string formed by concatenating →, φ, and χ in that order) is an L-formula.

The set of L-formulas is infinite and a subset of the set of L-strings, so it is also countably infinite. The set of L-formulas includes, for example, p1, ¬p1, →p1p2, and ¬→p1¬p2. And with that, L is finished.

You'll run into this same basic pattern of specifying symbols, strings, and formulas in creating other formal languages in logic, computer science, and linguistics. Though for other languages, you might have more types of symbols, different rules for strings, more steps in building formulas, and so on.

By the bye, we used only negation and implication and wrote implication formulas as →φχ in order to keep L simple. We can add meta-names for disjunction, conjunction, and whatever other shortcuts we want. We can also change the order of formulas of the form in (3c) to the more customary (φ → χ). This is just a matter of adding some definitions to our meta-language. For example, we can say

→φχ =df (φ → χ)

and

(φ → χ) =df (¬φ ∨ χ)

This only adds more names to our meta-language. It doesn't change L at all.

Well, there's plenty more that I could say about L and other formal languages, but I'll stop myself. Questions? Everything good so far?

(By the bye, I'm using the same basic system as is in Moshé Machover's Set theory, logic, and their limitations, 1st ed., if you want to check it out. It's one of my favorite logic books, along with Wilfrid Hodges' Logic -- and I've read dozens of them.)

Wow, thanks for your great post...! I think it's mostly clear up to this point - I have only one question:

honestrosewater said:
By the bye, we used only negation and implication and wrote implication formulas as →φχ in order to keep L simple. We can add meta-names for disjunction, conjunction, and whatever other shortcuts we want. We can also change the order of formulas of the form in (3c) to the more customary (φ → χ). This is just a matter of adding some definitions to our meta-language. For example, we can say

→φχ =df (φ → χ)

and

(φ → χ) =df (¬φ ∨ χ)

This only adds more names to our meta-language. It doesn't change L at all.
Do I get it right that "(¬φ ∨ χ)" is a meta-string (meta-formula?) that represents the L-formula "→φχ"?

Last edited:
honestrosewater
Gold Member
dobry_den said:
Do I get it right that "(¬φ ∨ χ)" is a meta-string (meta-formula?) that represents the L-formula "→φχ"?
Haha, perhaps yes, perhaps no. I might not have been as careful as I was trying to be. Just bear with me for a sec; this might be a tad painful.

Remember that saying '→' is just another way of saying 'the implication symbol of L' -- we don't know if the symbol '→' is itself a symbol of L -- we never actually write down any symbols of L. We use names to refer to symbols of L. So to be horribly precise, we could say things like 'the L-formula that "→p1p2" refers to' or 'the L-formulas that "→φχ" refers to', but that is a huge pain and totally unnecessary besides.*

In their normal capacity, names refer to other things. When I say that Bob is a boy with blonde hair, the string of English symbols 'Bob' refers to a boy, not to itself, a string of English symbols. To make a name refer to itself, i.e., to talk about the string instead of its referent, people conventionally put the string in quotes or italics or use some similar device (I'm using quotes), as you would do when saying that 'Bob' is a string with three letters. Putting a name in quotes takes away its normal capacity to refer. →p1p2 is an L-formula, just as Bob is a boy. '→p1p2' and 'Bob' are both strings of symbols in our metalanguage, English.

So the rules are simple:
If you're talking about the string itself, put it in quotes.
If you're talking about the thing to which the string refers, don't use quotes.

I might still bungle this occasionally, though -- let me know if you catch me. It sometimes isn't of much consequence or is presumably already clear from the context. However, sometimes it does make a difference, so returning to your question...
dobry_den said:
Do I get it right that "(¬φ ∨ χ)" is a meta-string (meta-formula?) that represents the L-formula "→φχ"?
Yes, '(¬φ ∨ χ)' is a meta-string. Symbols in quotes should not ever be L-symbols (or L-strings or L-formulas) since we don't know what L-symbols (or L-strings or L-formulas) look like or how to write them. So '→φχ' is not an L-formula. Both '→φχ' and '(¬φ ∨ χ)' are on equal footing as meta-strings.

Yes, '(¬φ ∨ χ)' does refer to the same thing that '→φχ' refers to. They're just different names for the same thing. They both refer to a set of L-formulas (see the endnote below).

Yes, you could call '(¬φ ∨ χ)' a meta-formula, though that's a little misleading since '(¬φ ∨ χ)' contains variables and refers to more than one L-formula. Something like 'an L-formula of the form (¬φ ∨ χ)' might be less confusing.

Sorry if this was already clear to you -- I wanted to be sure. The details of the languages are perhaps the least fun parts for some, but I think that's about all I needed to say. I'll try to make time to post about the next parts overnight or tomorrow.

* Note that '¬', '→', 'p1', 'p2', etc. are constants; they refer to one L-symbol each. By contrast, 'φ' and 'χ' are variables; they range over the whole set of L-formulas. (These may be more specifically called 'syntactic constants' and 'syntactic variables' in order to distinguish them from other kinds of constants and variables. But you don't have to worry about that quite yet.) So '→p1p2' refers to a single L-formula, whereas '→φχ' refers to infinitely many L-formulas (every L-formula that begins with an implication symbol).

honestrosewater
Gold Member
It was bothering me that I didn't have a better analogy-type explanation for what is fundamentally a simple distinction. This just struck me as I was nodding off. Let me know if it's an improvement.

Suppose that I write a paper describing all of the animals at the local zoo. Well, obviously, none of the animals are actually exhibited in the paper. If you want to see the animals themselves, you have to go to the zoo.

Suppose now that I write a paper describing the object language L. Although it might be possible to exhibit L in the paper since L is itself a language, we choose not to do this. Instead, we treat the parts of L as if they are animals in a zoo. No part of L is actually exhibited in the paper. If you want to see the parts of L themselves, you have to go to the L-zoo.

P.S. We don't say where the L-zoo is. :tongue2: Capiche?

I don't mean to keep harping on this. I'd just like to find a good conceptual hook.

honestrosewater said:
Suppose now that I write a paper describing the object language L. Although it might be possible to exhibit L in the paper since L is itself a language, we choose not to do this.

Why do we do it this way? Why not let the symbols we draw be the actual symbols of L? Couldn't the same symbols be in both the object language and the metalanguage?

honestrosewater
Gold Member
Cincinnatus said:
Why do we do it this way? Why not let the symbols we draw be the actual symbols of L? Couldn't the same symbols be in both the object language and the metalanguage?
Sure. For example, in linguistics, when we study English using English (or Japanese using Japanese or Rotokas using Rotokas), our metalanguage and object language are the same. This doesn't make things any less complex or confusing, by the bye. By definition, whenever we study a language and our study itself takes place in a language, we have an object language and a metalanguage. Whether they are the same language or not, or whether we choose to specify any physical forms for the object language, the two roles of object language and metalanguage are still present and distinct.

When studying formal logic, the particular implementations, physical forms, or applications of the language, the shape of the implication symbol, and suchlike just aren't relevant. I imagine this is clear enough from the fact that logicians already define and study formal languages without ever specifying any of those things. You can specify them if you want to, but doing so just makes your language, well, more specific. Philosophers, computer scientists, linguists, engineers, and others can and do worry about such things. Logicians don't need to.

If you've never thought about it before, specifying a sign system -- or several of them if we want to communicate in different modes, e.g., through speech, writing, gesture, Braille, chemicals, etc. -- can be a lot of work. Even if we use existing systems, we need to define the maps between our symbols and their physical forms. Until now, I haven't bothered to distinguish between these two parts. Suppose we say that one of the propositional symbols of L has the physical form 'p1'. Is 'p1' the same form (does it also represent that propositional symbol)? What about 'p1' or 'P1' or 'p1' or 'pone' or 'b1' or 'tree' or ':yuck:'? Clearly, our definition needs to be more precise, but we run into problems when we want our forms to be discrete and we have a continuous channel. We are really specifying ranges of values or clusters around norms rather than single forms (the 'single form' is another abstraction). At any rate, when specifying these things, we don't want to say that the implication symbol actually is some physical form. We want to keep the symbol separate from its form(s). And since the forms don't affect the aspects that we're interested in, we don't care what the forms are.

Also, L has infinitely many symbols. Do we want each symbol to have a unique form? Can we do that? What about for strings? Can we exhibit them all at the same time? And so on. The physical parts of sign systems, and communication systems, come with a whole new bunch of questions to answer and problems to solve -- and some answers can cause other problems. Luckily, we can shift these problems to our metalanguage and take advantage of its answers without getting the object language all tangled up in the physical world.

If you are curious, you can see some of these ideas in action, for example, in this specification of Extended Bachus-Naur Form (PDF) (HTML), a formal language developed to act as a metalanguage for specifying other formal languages. (It's used in computer science and other areas that would suffer from the ambiguities that can arise from using natural languages as metalanguages.) I'm thinking specifically about Table 1 on page 7. Notice that it lists three things: (i) the metalanguage symbol, (ii) a name for the character representation, and (iii) a glyph. For example, (i) the repetition-symbol is represented by (ii) the asterisk, which usually looks something like (iii) '*'.

Thanks for making this first part clear for me! I think I did grasp the point now.

Unfortunately, I won't have access to the Internet during the following 4 days. But I'm really looking forward to reading your next posts...

0rthodontist
Sorry to interject on such a productive conversation. I have little knowledge of mathematical logic, but I'm pretty sure it's accurate to say that for propositional logic at least, the original axioms were chosen because the scheme that results does make intuitive sense, i.e. the arrow of implication was intended to signify the mental meaning of "implies." Boole called his work an investigation into the laws of thought, not the laws of symbols. There are ways to specify propositional logic so that all the rules make immediate intuitive sense.

Then logicians played around with this intuitive system, and created alternate sets of axioms and rules of inference that specify the same system. The origin of the system is still that it makes sense when interpreted in a certain way.

CRGreathouse
Homework Helper
Sorry, going off-topic for a moment.

honestrosewater said:
Sure. For example, in linguistics, when we study English using English (or Japanese using Japanese or Rotokas using Rotokas), our metalanguage and object language are the same.
Woah, you're into linguistics? Are there any forums you go to? ("... to which you go", if you're a prescriptivist.)

I love the subject, but I don't know too much about it; I've been teaching myself by picking up books on the subject and reading them, no formal basis. I'd like to learn more.

honestrosewater
Gold Member

loseyourname
Staff Emeritus
Gold Member
[Note: I apologize for my crappy imitations of the symbols for logical connectives. For whatever reason, unicode isn't showing in my post preview (except for negation), and I also can't get LaTex to work. Hopefully I managed to get the message across either way.]

dobry_den said:
Hi! I'm a high school student and I've been interested in Logic for some time... Although I read some books and acquired some knowledge, I still have one question that remains unanswered in spite of my hard work... My tutor told me that the three axioms of Propositional Logic (see them for example at http://en.wikipedia.org/wiki/Axioms) were chosen independently from the (semantic) meaning of the "implication" connectives and thus independently from the number of truth values (true/false; 0/0.5/1 and so on)...

Is it true? I didn't understand his explanation and so I began my own study of this subject. Does anyone of you know any Internet source or a book that deals with the choice of axioms of Propositional Logic in detail? Thanks a million!
I think I can help with this. First, I presume that by the "three axioms of Propositional Logic," you mean what have traditionally been referred to as the "laws of thought." That is, identity, noncontradiction, and excluded middle. The reason these were called as such is that they were thought to be self-evident in the sense that they actually apply to reality and it is impossible for a human being to envision a reality in which they do not apply.

I am guessing, of course, but I would think the reason your tutor says they were chosen "independently from the (semantic) meaning of the "implication" connectives" is that, historically speaking, they were. They were chosen, depending on which development of a propositional logic we're thinking of, either because they were thought to be proved inductively through the empirical observation of the world, or they were thought to be necessary strictures on the possible ways in which a human being could conceive of logical relations.

In modern propositional logic, however, they are merely three arbitrarily chosen tautologies that are consequences of the semantic rules applied to logical connectives. They are not axioms at all, but rather are necessarily true sentences, an assertion proven by their truth tables.

Identity

p->p
Code:
p p->p
T T
F T

¬(pA¬p)

Code:
p ¬p (pA¬p) ¬(pA¬p)

T  F    F      T
F  T    F      T
Excluded Middle

pV¬p

Code:
p ¬p pV¬p

T  F   T
F  T   T
These tautologies, rather than being axiomatic, arise as a consequence of the semantic meaning assigned to the connectives ->, A, V, and ¬.

However, as honestrosewater has already pointed out, the semantic meaning assigned to these connectives can be anything. It should be pointed, however, that they are not exactly arbitrary. The standard language used in modern propositional logic is meant to be translatable into natural languages. The semantic meaning of the connective A is meant to translate into English as "and." It is given the truth functional behavior it is given based on the fact that the English sentence ''Adam is a man' and 'Rachel is a woman'' is true if and only if 'Adam is a man' is true and 'Rachel is a woman' is also true. As such, the semantic meanings defined in modern propositional calculus are meant to be derivative on the naturally occuring semantic meanings of connectives in pre-existing languages. We can create other logical languages in which this is not the case, but the entire point of creating this logical language was to aid in evaluating arguments made in a natural language.

honestrosewater said:
Hope everything's OK now! ;o)

I have another question, somewhat - I think - relevant to the topic we've been discussing so far.

Deduction (meta)theorem is usually (meta)proved by mathematical induction (see for example Mendelson's Introduction to Mathematical Logic). I am a little bit confused about this, since mathematical induction is based on Peano's axioms - thus apparently a loop occurs here.

Could you please explain this somehow to me? Thanks a lot in advance!

P.S. I also encountered "metamathematical induction". What does it mean exactly? What is the clear distinction between meta- and mathematical induction?

Robert R Stoll in his book "Set theory and Logic" states that
"In brief, metamathematics is the study of formal theories by methods which should be convincing to everyone qualified to engage in such activities."
Is (meta)mathematical induction part of this?

Last edited:
I'd like to rewrite my question, since it's probably not as clear as it was intended to be.

When talking about logic, when studying logic, we use numerous concepts whose precise definitions make use of logic. For example sets (axiomatic set theory) or numbers (Frege's concept of numbers).

We use these concepts on an intuitive base while describing logic - i.e. we use these concepts without defining them rigorously. But this is not worying since we're using only methods and concepts that should be - as mentioned in my previous post - "convincing to everyone qualified to engage in such activities." Thus the way how to prevent loops* from occurring. Their meaning is considered to be (intuitively) clear to everyone.

*
a) metalanguage - using sets and numbers while desribing logic
b) using logic to define sets and consequently numbers => a loop occurs.

This is the way I see and understand it. Kindly correct me if I'm (completely) wrong - any help will be appreciated since I'm stuck on this for quite a while.

And by the way - if you know any books or articles dealing with this subject, please let me know!

Last edited:
Different Perspective on the Frege-Church Axioms

Hello everyone-

I think that I have a much more direct and accurate answer to the original question on this post. My answer comes in three parts, interspersed with (what I hope to be helpful) commentary.

The axioms cited are known as the Frege-Church axioms (after two famous logicians); and the axiom system that contains these exactly three axioms sometimes is known as FC (for Frege-Church). This way of presenting a logical system is how most logical systems were presented before natural deduction and sequent calculi became popular (a technical point).

FC has exactly two primitive logical connectives (the "primitive" is important: stay tuned), one for negation and one for the material conditional (if...then...). Other logical connectives, like those for "and" and "or", can be defined in terms of these two connectives. For instance, "or" can be defined as follows: (p OR q) = (if (NOT-p), then NOT-q). (I use the "=" to represent a definition.)

Here is the first part of my answer to your original question: using two logical operators as primitive and defining the rest as derived rules is simpler than taking many operators as primitive, because if one wants to prove something about one's logical system the proofs are much shorter (there are fewer operators to consider).

At this point, you might be wondering: "Yeah, but maybe there are a ton of operators than can't be defined in terms of negations and material conditionals. (Consider, for example, counterfactual conditionals.) So maybe it's not so simple after all."

To this the reply is: any formal logic system that has a chance of representing the logic of natural language (or approximating some idealized version thereof) is going to include at least negation and material conditionals. FC is something like a "base" logic, that we can then complicate as much as we'd like. And it turns out that there are many operators that CAN be defined in terms of negation and the material conditional: conjunction, disjunction, material equivalence, the Sheffer stroke, NAND, .... These are pretty much all the operators computer programmers need to make all those cool games we like to play.

So the question "why think the three FC axioms are the only axioms for any logical system" is not well-posed, because the number of axioms one needs depends upon what one is trying to model (or otherwise do) with one's axiom system.

On the other hand, the question "why think that the FC system has only those three axioms" can be given a good answer -- to which I now proceed.

Here is what the second axiom tells us about the material conditional: it distributes over itself.

The third axiom tells us how to contrapose a material conditional.

All three axioms of FC are tautologies: this can be verified by truth-tables.

FC has exactly one rule of inference, namely, modus ponens: from P and P --> Q, infer Q.

Now for the second part of my answer: when we build a logical system, we want to maximize both the simplicity of the system and the scope of the theory. We want to maximize simplicity, because this means that it is easier to prove things about the system. (Which factors count towards simplicity? Number of axioms, number of primitive operators, and number of rules of inference. FC has three axioms, two primitive operators, one rule of inference.) We want to maximize scope, because we want our logical system to apply to as much reasoning as possible.

It is provable that FC is equivalent, in both expressive resources and deductive power, to standard classical first-order propositional logic (e.g., systems presented by the likes of Copi and Cohen or in most basic logic books).

The benefit of FC is that it is simpler than those logical systems; the drawback is that it is much less intuitive to prove things using FC.

But from the perspective of someone interested in studying the properties of logical systems, FC is much less of a headache than systems presented in most textbooks.

If you were able to come up with a logical system that is simpler than FC but the same scope, that'd be great. Lots of fine minds have tried: FC seems to be as simple as we can make a system that covers what we want it to cover.

Now for the third part of my answer: There is one more reason for wanting precisely the three axioms of FC. As I noted, we need at least one axiom that gets negations into the system (since we need negations in order to define other non-primitive logical operators); this is accomplished with axiom 3.

The first two axioms allow us to prove something called the Deduction Theorem. Roughly, this theorem states that if we add "p" as an extra premise to some premise set S and go on to derive "q" (in FC), then we can also derive "if p then q" from S (in FC). If we don't include both axioms 1 and 2 in FC, we can't prove this result. But it is really a nice property for a logical system to have! (This is one of the considerations about "scope" I alluded to earlier.)

Of course, FC is not an axiom system for "all of logic", whatever that means. For instance, it doesn't help us in formalizing our reasoning that involves counterfactuals (if p were the case, then q would be the case) or modal operators (e.g.: it is necessary that p only if it is possible that p), etc. But FC is a pretty good axiom system for covering whatever reasoning one is able to formalize after taking one's first course in formal logic.

Finally, a word about what an axiom is. By axiom, I mean "axiom schema" -- this is a schema that does not contain any sentence letters but can be used to generate arbitrarily many "axioms" (this is why the axioms are given as Greek letters rather than english letters). So it will not do as an objection to my post to say that FC contains infinitely many "axioms": in a sense it does, but these are all generated by three primitive axiom schemes -- "primitive" is what matters.

Hope this helps, or at least is a different perspective than previous posts. (I tried to give more of a meta-logical perspective, less of a philosophical one.)

Special Note on the Deduction Theorem

Yes, the deduction theorem is proven using mathematical induction.

But mathematical induction does not involve the axioms of arithmetic or "math". (I'll substantiate this in a moment.) So there is no need to presuppose the Peano axioms in the proof of the deduction theorem.

The best way to substantiate the claims I've made is to present to you a proof of the deduction theorem; you can inspect for yourself that no appeal is made to Peano's axioms.

-----------
*I'll use capital English letters to stand for generic sentences, except for "G", which I use to refer to a set of premises (for "gamma").
* I'll use "-->" to stand for the material conditional.
*This is proven for derivations in FC (the axioms system that invovles the three axioms linked to at the original message on this thread.)

-------------
Theorem: If "P union G" entails Q, then G entails (P --> Q).

Proof: Suppose we have an FC-derivation of Q from "P union G". Let X be any line in this derivation. Then we can show that there is a derivation of (P --> X) from G. This will establish the theorem, because Q is itself one of the lines of the derivation in question (namely, the last line). We proceed by strong mathematical induction on the line number of X.

Induction Hypothesis: Let Y be any line in the derivation in question that appears before the given instance of X in the derivation of Q from "P union G". Then G entails (P --> Y).

To complete the induction, we need to show that G entails (P --> X).

There are several cases.

Case 1: X is an axiom. Then the following is a derivation of (P --> X) from G:

1. X (axiom, given by the case)
2. (X --> (P --> X)) axiom in form of axiom schema 1
3. (P --> X) from #1 and #2 by modus ponens

Note that this case does not use the induction hypothesis, and this derivation does not use any members of G.

Case 2: X is a member of G. Then the following is a derivation of (P --> X) from G:

1. X (member of G, given by the case)
2. (X --> (P --> X)) axiom in form of axiom schema 1
3. (P --> X) from #1 and #2 by modus ponens

Case 3: X is P. Then we need to show that G entails (P --> P). The proof is unwieldy and is available elsewhere: it won't look nice if I type it here. Suffice it to say the proof only uses axioms of FC, no numbers of anything.

Case 4: X follows from previous lines by modus ponens. This is the only place we'll use the induction hypothesis (because this is the only case that refers to previous lines) -- so if you have worries about mathematical induction, focus them here.

The induction hypothesis gives us that G entails (P --> Z) and that G entails (P --> (Z --> X)). Write out two derivations of these entailments, one after the other; suppose that (P --> Z) appears at line 247 and (P --> (Z --> X)) appears at line 723 of the combined derivation. (The numbers here are arbitrary: they are ways of identifying the line number at which a certain formula occurs; we could equally well have used other letters, or colors, or names, or emoticons.)

Then continue this combined derivation in teh following manner:

724. (P --> (Z --> X)) --> ((P --> Z) --> (P --> X)) (this is an instance of axiom schema 2 of FC)
725. ((P --> Z) --> (P --> X), from lines 724 and 723 by modus ponens.
726. (P --> X), from lines 725 and 247 by modus ponens.

This completes the proof, since there are no more cases to consider for what "Q" might be.

----------------------
Notice the absence of any appeal to arithmetic in this proof, and in particular the absence of any appeal to Peano's axioms (all I used were axioms from FC and modus ponens).

Mathematical induction generally proceeds in accordance with the following sort of procedure:

Find some way of "ordering" what you are taking about (e.g., by complexity of formula or whatever).

Show that some property P holds of the first element in this ordering.

Show that if P holds of one element in the ordering, it holds of the "next" element.

Infer that P holds of every element in the ordering.

----------

"Math" is involved in a loose and informal sense in representing the ordering: for instance, we can say that some formulas are more complex than others because they contain "more" logical operators or what-not.

But we don't need anything as sophisticated as Peano's axioms in order to accomplish this. We can count and order things (e.g., in terms of "more" and "less" without using axioms for arithmetic.