# Isoelectric point

I am looking to understand how the isoelectric point is defined and calculated. I read the paper "Isoelectric Point Calculator" (Kozlowski 2016), which gives the definition found everywhere: 'isoelectric point pI is the pH value at which the net charge of a macromolecule is zero, and therefore its electrophoretic mobility is stopped'.

So far so good. But every recent paper I can find on the subject, like Kozlowski, goes on to define this mathematically as the pH at which

$$\sum_{i=1}^{n} \frac{1}{1+10^{pK_{n_i}-pH}} = \sum_{j=1}^{m} \frac{1}{1+10^{pH-pK_{p_j}}}$$
which, apparently, states that the summed degree of ionization (in magnitude, the total charge) of the n negatively charged residues, with constants $pK_{n_i}$, equals that of the m positively charged residues, with constants $pK_{p_j}$.
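For concreteness, here is a minimal Python sketch of how this condition is typically solved numerically, assuming fully independent groups: each acidic group contributes a charge of -1/(1+10^(pK-pH)) and each basic group +1/(1+10^(pH-pK)). The function names `net_charge` and `isoelectric_point` and the glycine pK values (2.34 and 9.60, common textbook figures) are illustrative assumptions, not taken from Kozlowski's paper.

```python
from typing import Sequence


def net_charge(pH: float, acidic_pKs: Sequence[float], basic_pKs: Sequence[float]) -> float:
    """Net charge when every group ionizes independently (Henderson-Hasselbalch)."""
    # Fraction of each acidic group that is deprotonated (charge -1).
    negative = sum(1.0 / (1.0 + 10.0 ** (pK - pH)) for pK in acidic_pKs)
    # Fraction of each basic group that is protonated (charge +1).
    positive = sum(1.0 / (1.0 + 10.0 ** (pH - pK)) for pK in basic_pKs)
    return positive - negative


def isoelectric_point(acidic_pKs: Sequence[float], basic_pKs: Sequence[float],
                      lo: float = 0.0, hi: float = 14.0, tol: float = 1e-6) -> float:
    """Find the pH of zero net charge by bisection (net charge falls as pH rises)."""
    if net_charge(lo, acidic_pKs, basic_pKs) < 0 or net_charge(hi, acidic_pKs, basic_pKs) > 0:
        raise ValueError("no sign change of the net charge inside [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if net_charge(mid, acidic_pKs, basic_pKs) > 0:
            lo = mid  # still positively charged: pI lies at higher pH
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative (assumed) pK values for glycine: carboxyl 2.34, amino 9.60.
pI = isoelectric_point([2.34], [9.60])
print(round(pI, 2))
```

With one acidic and one basic group the condition reduces to pH = (pK_n + pK_p)/2, so the sketch should return roughly the midpoint of the two pK values.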

Curious as to the origin of this formula, I found no explanation except that of Levene and Simms (1923: https://www.jbc.org/content/55/4/801.full.pdf). I quote their definition first. Then they say 'The following derivations of formulas for the calculation of isoelectric points are based on the assumption that the ionization of each group (acid or basic) takes place independently of the degree of ionization of other groups in the molecule.' On that basis they find it valid to write the isoelectric condition above as follows; (7) and (11) are just the standard equations for the degree of dissociation of monovalent positive (basic) and acidic (negative) components. This equation is obviously identical to the one from recent papers like Kozlowski.

I have two questions:

(1) If we are concerned with net charge, rather than net concentration, should we not be weighting by the charges on the forms? For example, consider a simple ampholyte like histidine (let's call it HX). Shouldn't the condition be 2[H3X^2+] + [H2X^+] = [X^-] rather than, as suggested above, [H3X^2+] + [H2X^+] = [X^-]? That is what David Hitchcock did, for example, in his paper https://www.jbc.org/content/114/2/373.full.pdf (see p. 376).

(2) I am not sure that the form of the Henderson-Hasselbalch (HH) approximation is valid for matters of charge here. The sum of a single ampholyte's forms' degrees of ionization should not be going above 1. But here the left-hand side and right-hand side could sum to (nearly) n and m respectively, since even a single term in either summation can approach 1. Thus I question whether treating the ampholyte as a mixture of independent monovalent acids and bases is a reasonable approximation to summing the degrees of ionization using the exact formulae from acid-base theory (i.e., the concentrations of each form obtained by treating the ampholyte as a single (n+m)-protic substance).

Am I missing something here in the mathematical derivation of the equation?

Borek
Mentor
Do they use overall or stepwise constants? I feel like the weighting should be different in each case.

> The sum of a single ampholyte's forms' degrees of ionization should not be going above 1.

Do they sum forms, or charges? This is in a way similar to the first point.

I haven't seen the papers, so these don't have to address the problem, but they definitely need clarification.

epenguin
Homework Helper
Gold Member
For your question (1), notice that they are talking about macromolecules, having principally proteins in mind (what else?). The histidines in a protein are all in peptide bonds, so except for a possible N- or C-terminal one there is only one protonation/deprotonation to worry about. And even for the free amino acid, over the wide range around neutrality that most biochemists are interested in, there is only one change in the degree of protonation.

Secondly, what use is it? Maybe you think if you know the aminoacid composition of a protein you can predict the IEP. But firstly, what real scientific or technological use is that? And secondly you can't really do it anyway because you do not know all these K's! You only know the K's of amino acids (or just maybe of the groups in small peptides). The K's of side groups in proteins are different from those of the free amino acids, sometimes by only a little, sometimes (as for residues buried inside the protein structure) by a lot. (Whilst we are at it, each residue has not just one K, but this is at least a little different for each ionisation state of the others, so each has dozens of K's!)

Unless you are particularly interested in the physical chemistry of amino acids as opposed to proteins, I would not be much concerned with what a paper of the 1930s says. In fact I personally would not be very interested in the question at any time. Nowadays with techniques like NMR you can directly study the ionisation state of a single particular residue. Though for an overall picture of these ionisation states of proteins you very soon find quite a lot of literature e.g.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679426/
https://www.osti.gov/servlets/purl/1183892

The second of these gives you the K's of all the groups of a small protein, lysozyme, of which there are 'only' 20. As one of these publications I think mentions, as well as protonation/deprotonation there are usually also interactions of importance with cations (such as Mg2+, necessary for the action of at least half of all enzymes) not to mention even anions such as Cl-.

I think I have a better handle on where this equation comes from; it was a little unintuitive. The question now is whether, and to what extent, such a model is justifiable.

> Do they use overall or stepwise constants? I feel like the weighting should be different in each case.

> Do they sum forms, or charges? This is in a way similar to the first point.

This hits at the crux of the matter. Though both use stepwise constants, they are using different equilibrium constants from the common approach to acid-base systems. The Levene-Simms formula (let's call it) uses Henderson-Hasselbalch and considers every group an independent dissociation. That way you sum charges (positive minus neutral and neutral minus negative).
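A minimal sketch of that 'sum charges' reading, using the histidine example from question (1). It assumes the three groups ionize independently and uses common textbook pK values for free histidine (1.82, 6.00, 9.17), which are illustrative assumptions, as are all function names. It builds the fractions of X-, HX, H2X+ and H3X2+ from the independent groups and checks that the charge-weighted sum over forms, 2[H3X2+] + [H2X+] - [X-], equals the sum of the group-wise HH terms. The doubly charged form is counted twice in the group sum because both basic groups are protonated in it, which is also why the positive side can approach m = 2.

```python
from itertools import product

# Assumed textbook pK values for free histidine (illustrative only).
PK_CARBOXYL, PK_IMIDAZOLE, PK_AMINO = 1.82, 6.00, 9.17


def protonated_fraction(pK: float, pH: float) -> float:
    """Fraction of a single group that carries its proton (HH form)."""
    return 1.0 / (1.0 + 10.0 ** (pH - pK))


def species_fractions(pH: float) -> list:
    """Fractions of X-, HX, H2X+, H3X2+ (indexed by proton count 0..3)."""
    thetas = [protonated_fraction(pK, pH)
              for pK in (PK_CARBOXYL, PK_IMIDAZOLE, PK_AMINO)]
    fractions = [0.0] * 4
    # Enumerate all 2^3 microstates; independence means probabilities multiply.
    for state in product((0, 1), repeat=3):
        p = 1.0
        for occupied, theta in zip(state, thetas):
            p *= theta if occupied else 1.0 - theta
        fractions[sum(state)] += p
    return fractions


def charge_from_species(pH: float) -> float:
    """Charge-weighted sum over forms: 2[H3X2+] + [H2X+] - [X-]."""
    x, hx, h2x, h3x = species_fractions(pH)
    return 2.0 * h3x + h2x - x


def charge_from_groups(pH: float) -> float:
    """Levene-Simms style sum: protonated basic groups minus ionized carboxyl."""
    positive = protonated_fraction(PK_IMIDAZOLE, pH) + protonated_fraction(PK_AMINO, pH)
    negative = 1.0 - protonated_fraction(PK_CARBOXYL, pH)
    return positive - negative


for pH in (1.0, 4.0, 7.0, 10.0):
    print(pH, charge_from_species(pH), charge_from_groups(pH))
```

The two columns should agree at every pH, which suggests that, under the independence assumption, the group-wise sum already is the charge-weighted sum rather than an unweighted sum over forms.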

The formula cannot be exact because chemical equilibrium is guaranteed only for macromolecules, not individual groups. What do we need to assume, for the model to be valid? Obviously K's need to be 'constant' for each group's dissociation, whatever the values. How do we know if they will be so?

> And secondly you can't really do it anyway because you do not know all these K's! You only know the K's of amino acids (or just maybe of the groups in small peptides). The K's of side groups in proteins are different from those of the free amino acids, sometimes by only a little, sometimes (as for residues buried inside the protein structure) by a lot. (Whilst we are at it, each residue has not just one K, but this is at least a little different for each ionisation state of the others, so each has dozens of K's!)

Thank you for the detailed comments. What I wrote above is addressed to you too.

To answer your first question (what use is it): my interests are mainly theoretical.

Yes, so for exact calculations we'd need K's we don't have (the stepwise constants, in deprotonation order). For Levene-Simms style calculations we need a different set of K's, which we also don't have (?), or maybe we do. But, as in my other comments, I'm trying to figure out from a theoretical perspective how accurate the equation can be, given this second set of K's.
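How the two sets of K's relate in the idealized case can be sketched as follows. This is a minimal Python sketch for a hypothetical molecule with two independent acidic groups (the pK values 4.0 and 4.5 and all names are made up for illustration). For independent groups with group constants k_a and k_b, the stepwise constants are K1 = k_a + k_b and K2 = k_a·k_b/(k_a + k_b), and the mean charge from the exact diprotic formula then coincides with the sum of the two HH terms. Any disagreement in a real molecule would therefore come from the groups not being independent, not from the form of the equation.

```python
def charge_from_groups(pH: float, pK_a: float, pK_b: float) -> float:
    """Mean charge of two independent acidic groups (sum of HH terms)."""
    return -sum(1.0 / (1.0 + 10.0 ** (pK - pH)) for pK in (pK_a, pK_b))


def stepwise_constants(pK_a: float, pK_b: float) -> tuple:
    """Stepwise (macroscopic) K1, K2 implied by two independent groups."""
    k_a, k_b = 10.0 ** -pK_a, 10.0 ** -pK_b
    K1 = k_a + k_b                # first proton can leave from either group
    K2 = k_a * k_b / (k_a + k_b)  # chosen so that K1 * K2 = k_a * k_b
    return K1, K2


def charge_from_stepwise(pH: float, K1: float, K2: float) -> float:
    """Mean charge of H2A / HA- / A2- from the exact diprotic expressions."""
    h = 10.0 ** -pH
    denominator = h * h + K1 * h + K1 * K2
    fraction_HA = K1 * h / denominator
    fraction_A = K1 * K2 / denominator
    return -(fraction_HA + 2.0 * fraction_A)


# Hypothetical group pK values, for illustration only.
K1, K2 = stepwise_constants(4.0, 4.5)
for pH in (2.0, 4.25, 7.0):
    print(pH, charge_from_groups(pH, 4.0, 4.5), charge_from_stepwise(pH, K1, K2))
```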

> Unless you are particularly interested in the physical chemistry of amino acids as opposed to proteins, I would not be much concerned with what a paper of the 1930s says. In fact I personally would not be very interested in the question at any time. Nowadays with techniques like NMR you can directly study the ionisation state of a single particular residue. Though for an overall picture of these ionisation states of proteins you very soon find quite a lot of literature e.g.
>
> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2679426/
> https://www.osti.gov/servlets/purl/1183892
>
> The second of these gives you the K's of all the groups of a small protein, lysozyme, of which there are 'only' 20. As one of these publications I think mentions, as well as protonation/deprotonation there are usually also interactions of importance with cations (such as Mg2+, necessary for the action of at least half of all enzymes) not to mention even anions such as Cl-.

Yet, as you can see from the papers, some research interest in this persists, and all characterizations of the isoelectric point in biochemical papers seem to use the 'Levene-Simms' equation. That's why I'm interested in figuring out (1) how far the approach is valid compared to the exact calculations and (2) what requirements the K's must meet for this approach to be valid.

Borek
Mentor
I am afraid I don't follow what you are saying:

> Though both use stepwise constants, they are using different equilibrium constants from the common approach to acid-base systems.

Please elaborate. In what way are their constants 'different'?

> The Levene-Simms formula (let's call it) uses Henderson-Hasselbalch

HH is just a rearranged dissociation constant, in no way different from the common approach to the acid-base system.

> and considers every group an independent dissociation. That way you sum charges (positive minus neutral and neutral minus negative).
>
> The formula cannot be exact because chemical equilibrium is guaranteed only for macromolecules, not individual groups.

We do simplify the reality a bit by assigning dissociation constants to individual groups, but we measure them experimentally, so they are in fact properties of the whole molecules.

> I am afraid I don't follow what you are saying:
>
> Please elaborate. In what way are their constants 'different'?
>
> HH is just a rearranged dissociation constant, in no way different from the common approach to the acid-base system.

It seems, from what follows, that you did understand:

> We do simplify the reality a bit by assigning dissociation constants to individual groups, but we measure them experimentally, so they are in fact properties of the whole molecules.

Yes. The issue is that they are then not constants. Chemical equilibrium ratios are not guaranteed to hold for individual groups within a molecule (as far as I know), which means these 'constants' will actually vary with the degree of ionization of the other groups. For such details I can refer you to the literature, e.g., https://onlinelibrary.wiley.com/doi/abs/10.1016/0307-4412(86)90176-7.

The question remains: what conditions do we need for the group-wise 'constants' to remain approximately constant? That is the same as asking what assumptions lie behind using this model for real-world proteins, and what makes researchers who start from this model confident that their particular protein obeys those assumptions.
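To make the question concrete, here is a minimal sketch of a hypothetical two-site model, not taken from any of the cited papers. Each site i has an intrinsic pK_i, and an assumed interaction factor `w` multiplies the statistical weight of the state in which both sites are protonated (w = 1 means no interaction; w < 1 would mimic repulsion between two protonated groups). The apparent pK of site 1 is then read off from its protonated/unprotonated ratio at each pH.

```python
import math


def apparent_pK_site1(pH: float, pK1: float, pK2: float, w: float) -> float:
    """Apparent pK of site 1 in a two-site model with interaction factor w."""
    x1 = 10.0 ** (pK1 - pH)  # weight for protonating site 1 alone
    x2 = 10.0 ** (pK2 - pH)  # weight for protonating site 2 alone
    # States with site 1 protonated: site 2 empty, or both occupied (factor w).
    protonated_weight = x1 + w * x1 * x2
    # States with site 1 unprotonated: site 2 empty or occupied.
    unprotonated_weight = 1.0 + x2
    # HH rearranged: pK_app = pH + log10([protonated] / [unprotonated]).
    return pH + math.log10(protonated_weight / unprotonated_weight)


# Hypothetical values: intrinsic pK 6 and 8, tenfold penalty for double protonation.
for pH in (2.0, 6.0, 8.0, 12.0):
    print(pH, apparent_pK_site1(pH, 6.0, 8.0, 1.0), apparent_pK_site1(pH, 6.0, 8.0, 0.1))
```

In this toy model the apparent pK of site 1 works out to pK1 + log10((1 + w·x2)/(1 + x2)), so it stays at pK1 for w = 1 and otherwise drifts between pK1 and pK1 + log10(w) as site 2 titrates. On that reading, the group-wise 'constants' are approximately constant only when the interaction factors are close to 1, or when the interacting groups titrate in well-separated pH ranges.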