and then a few lines after that, the expected value of a function g(x) is said to be given by:
∫f(x)g(x)dx. However, if g(x) is not explicitly known, how does one calculate the integral?

You didn't quote anything that said g(x) was not explicitly known. It is the distribution of g(x) that is not explicitly known. If X is a random variable with probability density f(x) and g(x) is a known function then Y = g(X) is another random variable. Often you can write the probability density h(y) of Y by doing various calculations. But the integral you wrote doesn't mention h(y).

I got a vague idea about what you're talking about. Tell me if I'm wrong:
According to what you say, if Y=g(x) is another random variable and h(y) is its distribution, then
E[Y] = Ʃ h(y)*y for the discrete case, i.e.
E[g(x)] = Ʃ h(g(x))*g(x).
And this is said to be equal to
E[g(x)] = Ʃ f(x) * g(x).
Doesn't this make h and f equal?

Suppose [itex] X [/itex] is a discrete random variable that is equally likely to be one of the values -1,0,1.
Let [itex] g(x) = x^2 [/itex] and let [itex] Y = g(X) [/itex]

Then [itex] h(y) [/itex] is the function whose non-zero values are [itex] h(0) = 1/3 , \ h(1) = 2/3 [/itex]

The definition of the expectation of [itex] Y [/itex] is not [itex] \sum h(g(x)) g(x) [/itex]. The definition of the expectation of [itex] Y [/itex] involves summing over all possible values of [itex] Y [/itex], not summing over all the possible values of some other random variable.
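To make the two sums concrete, here is a minimal Python sketch of the example above. The variable names are just illustrative choices. It builds h by pooling the probability of every x that maps to the same y, then computes the expectation once by summing over the values of Y and once by summing over the values of X.

```python
from collections import defaultdict
from fractions import Fraction

# Density of X: equally likely to be -1, 0, or 1.
f = {x: Fraction(1, 3) for x in (-1, 0, 1)}


def g(x):
    return x ** 2


# Build the density h of Y = g(X) by pooling the probability of every x
# that maps to the same y.
h = defaultdict(Fraction)
for x, p in f.items():
    h[g(x)] += p

# Definition of E[Y]: sum over the possible values of Y.
expectation_from_h = sum(y * p for y, p in h.items())

# Sum over the possible values of X instead, weighting g(x) by f(x).
expectation_from_f = sum(g(x) * p for x, p in f.items())

print(dict(h))
print(expectation_from_h, expectation_from_f)
```

The two sums have a different number of terms (two values of Y versus three values of X), yet they agree, because the terms for x = -1 and x = 1 are pooled into the single term for y = 1.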

When the random variable [itex] Y [/itex] is defined by [itex] Y = g(X) [/itex] , there is no simple formula that uses the density of [itex] X [/itex] to compute the density of [itex] Y [/itex] and works in all cases. For example, if the density of [itex] X [/itex] is [itex] f(x) [/itex], the density of [itex] Y [/itex] is not necessarily given by [itex] h(y) = f(g(y)) [/itex]
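As a quick check of that last point, here is a small Python sketch using the -1, 0, 1 example with g(x) = x^2. The names `candidate` and `mismatches` are only illustrative. It evaluates f(g(y)) at each possible value of Y and compares it with the true h(y).

```python
from fractions import Fraction

# Density of X and the true density of Y = g(X) from the example.
f = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}
h = {0: Fraction(1, 3), 1: Fraction(2, 3)}


def g(x):
    return x ** 2


# Evaluate the candidate formula f(g(y)) at each possible value y of Y.
candidate = {y: f.get(g(y), Fraction(0)) for y in h}

# Collect the values of y where the candidate disagrees with the true h(y).
mismatches = {y: (h[y], candidate[y]) for y in h if h[y] != candidate[y]}

print(mismatches)
print(sum(candidate.values()))
```

The candidate happens to agree with h at y = 0 but not at y = 1, and its values do not even sum to 1, so it is not a density at all in this case.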

I think the best way for you to sharpen your intuition on the topic is to glance at material about solving problems like: "Let X be uniformly distributed on the interval [0,1]. Let g(x) = x^2 - 2. Let Y = g(X). Find the distribution of Y". This is the topic of "distributions of functions of random variables". Such problems can be very tedious. Even if you don't work out the answer to such a problem, considering what is involved will make you appreciate the usefulness of the Law of the Unconscious Statistician, which lets us sidestep the problem of finding the distribution of Y.
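For the curious, here is a rough Python sketch of that uniform example, computing E[Y] both ways. The density h below is my own hand derivation, not something quoted from a book: for y between -2 and -1, P(Y <= y) = P(X <= sqrt(y + 2)) = sqrt(y + 2), and differentiating gives h(y) = 1/(2 sqrt(y + 2)). The helper `midpoint_integral` is a plain midpoint rule written for this post, not a library function.

```python
import math


def midpoint_integral(func, a, b, n=200_000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    width = (b - a) / n
    return width * sum(func(a + (k + 0.5) * width) for k in range(n))


def g(x):
    return x ** 2 - 2


def f(x):
    # Density of X, uniform on [0, 1].
    return 1.0


def h(y):
    # Hand-derived density of Y = g(X) on (-2, -1]; it blows up at y = -2.
    return 1.0 / (2.0 * math.sqrt(y + 2.0))


# Sidestep h entirely: integrate g(x) f(x) over the values of X.
lotus = midpoint_integral(lambda x: g(x) * f(x), 0.0, 1.0)

# Definition of E[Y]: integrate y h(y) over the values of Y.
via_h = midpoint_integral(lambda y: y * h(y), -2.0, -1.0)

print(lotus, via_h)
```

Both numbers should come out close to the exact value -5/3. The second is noticeably less accurate because h is unbounded near y = -2, which is one more reason to be glad you can avoid computing with h in the first place.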

In books that deal with applied math, the terminology used in probability theory is often ambiguous and imprecise. (In statistics, the situation is even worse!) What are we to understand when someone uses the phrase "a function of a random variable"? Is such a thing a "function", a "variable", or a "random variable"? There is the famous truism: "Random variables are not random and they are not variables".

The random variable Y is not a variable if you take "variable" to mean a symbol that represents a number. We abuse terminology when we say things like "Suppose the random variable Y is 3". What we should say is "Suppose a realization of the random variable Y is 3". A random variable is not a particular realization, even if we think of that realization as being "unknown". The way to completely specify a random variable is to specify its distribution, so we might think of a "random variable" as simply being the distribution. However, when you must talk about two random variables that are related, you have to worry about the joint distribution. Anyway, the basic idea is that a random variable is just the information in some probability distributions.

Taking for granted that we know the meaning of "function", such as the function g(x) = x^2, we can use a "function" to define a random variable Y in terms of another random variable X. We can define Y as Y = g(X). However, Y is not the "function" g(x) even though it seems to make sense to say that "Y is a function of X". Saying that "Y = g(X) is a function of X" conveys the non-technical idea that Y is defined in terms of X using g(x). But technically, Y is not the function g(X) in the sense of being a mapping whose ordered pairs are of the form (x, g(x)). Y is a random variable. A random variable is essentially a function, but that function is its distribution. The distribution of Y is not g().

Liberal arts students are taught to analyze the meaning of phrases by analyzing the meaning of each word in the phrase. In mathematics, this often does not work. A specific "random variable" is not "random" because it is defined by a specific distribution. It is not a "variable" if you mean a "variable" to be a symbol representing an unknown number. Mathematical definitions deal with the equivalence of two statements. There is no guarantee that you can break down a statement in a mathematical definition into its component words and give each isolated word a specific interpretation. For example, the definition of "The limit of f(x) as x approaches a = L" does not define "approaches" nor does it define "x approaches a". Only the entire statement has a definition. When we say "the definition of limit", we aren't really talking about a definition of the single word "limit". We are using sloppy language to refer to a definition for a complete statement.

This and the accompanying statements cleared up a lot. I was confusing the two. And yes, according to the law, E[g(x)] = Ʃ f(x) * g(x) does give the same result. In the example you provided, where f(x) = 1/3 for x ∈ {-1, 0, 1} and g(x) = x^2,
E[g(x)] does indeed give 2/3. It is clear now.