Help with the Tight Binding model

VortexLattice · Apr 30, 2013

Hi everyone, I'm trying to fit the Tight Binding model for a more complicated system, so I'm first trying to understand it for a simpler one, graphene. I've read several guides but they're all confusing me.

Right now, I'm trying to understand the graphene example on this site. My biggest confusion seems to be about how since graphene is a honeycomb lattice, it has two sublattices, or, two atoms per unit cell (as does my more complicated system, so I need to understand this well).

The guide says that each carbon atom has one ##2p_z## valence orbital, so the ##\phi_{2p_{z1}}## orbital is centered at one of the atoms in the primitive cell and ##\phi_{2p_{z2}}## is centered at the other. The primitive lattice vectors are ##\vec a_1## and ##\vec a_2##.

So, going over every primitive cell in the lattice (though of course we're going to cut it off at nearest neighbors), the total wave function is:

##\psi_{\vec{k}}\left(\vec{r}\right)=\frac{1}{\sqrt{N}}\sum\limits_{h,j}e^{i\left(h\vec{k}\cdot\vec{a}_1 + j\vec{k}\cdot\vec{a}_2\right)} \left( c_1\phi_{\text{2p}_{z1}}\left(\vec{r}-h\vec{a}_1-j\vec{a}_2\right) + c_2 \phi_{\text{2p}_{z2}}\left(\vec{r}-h\vec{a}_1-j\vec{a}_2\right) \right)
##

I don't understand why we do this, though. I understand that this isn't a real Bravais lattice, it's a lattice with a basis, but why does that matter if the atoms are all the same and we're just going to look at a few nearest neighbors?

Anyway, then they do this little trick (using orthogonality of the functions) to get some equations that can be solved:

##\begin{array}{l} \langle\phi_{\text{2p}_{z1}}|\hat{H}|\psi_{k}\rangle = E\langle\phi_{\text{2p}_{z1}}|\psi_{k}\rangle , \\ \langle\phi_{\text{2p}_{z2}}|\hat{H}|\psi_{k}\rangle = E\langle\phi_{\text{2p}_{z2}}|\psi_{k}\rangle . \end{array}##

And here's where my confusion starts. They get this as the result of those two lines (and continue on to get the determinant and such), keeping only on-site and nearest neighbor terms:

##\begin{array}{l} \epsilon c_1 -tc_2\left(1+e^{-i\vec{k}\cdot\vec{a_1}} + e^{-i\vec{k}\cdot\vec{a_2}}\right) = Ec_1 ,\\ \epsilon c_2 -tc_1\left(1+e^{i\vec{k}\cdot\vec{a_1}} + e^{i\vec{k}\cdot\vec{a_2}}\right) = Ec_2. \end{array}##

(where ##\epsilon = \langle\phi_{\text{2p}_{z1}}\left(\vec{r}\right)|\hat{H}|\phi_{\text{2p}_{z1}}\left(\vec{r}\right)\rangle## and ##t = - \langle\phi_{\text{2p}_{z1}}\left(\vec{r}\right)|\hat{H}|\phi_{\text{2p}_{z2}}\left(\vec{r}\right)\rangle##)
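As a numerical sanity check on the two coupled equations, here is a minimal sketch that writes them as a 2x2 Hermitian matrix acting on ##(c_1, c_2)## and diagonalizes it. The primitive vectors, the lattice constant of 1, and the values ##\epsilon = 0##, ##t = 1## are illustrative assumptions, not values taken from the guide.

```python
import numpy as np

# Assumed for illustration only: lattice constant 1 and this particular
# choice of primitive vectors; eps and t defaults are arbitrary.
A1 = np.array([np.sqrt(3) / 2, 0.5])
A2 = np.array([np.sqrt(3) / 2, -0.5])


def graphene_bands(k, eps=0.0, t=1.0):
    """Return the two band energies at wave vector k, in ascending order."""
    k = np.asarray(k, dtype=float)
    # Geometric factor multiplying c_2 in the first equation.
    f = 1 + np.exp(-1j * k @ A1) + np.exp(-1j * k @ A2)
    # Rows correspond to the two projected equations for c_1 and c_2.
    h = np.array([[eps, -t * f],
                  [-t * np.conj(f), eps]])
    return np.linalg.eigvalsh(h)


gamma_point = np.array([0.0, 0.0])
k_point = np.array([0.0, 4 * np.pi / 3])  # zone corner for the vectors above

print(graphene_bands(gamma_point))  # bands split by 6t at the zone centre
print(graphene_bands(k_point))      # bands touch at eps at the zone corner
```

With these assumptions the closed form is ##E = \epsilon \pm t\,|1+e^{-i\vec{k}\cdot\vec{a}_1}+e^{-i\vec{k}\cdot\vec{a}_2}|##, so the two bands meet wherever the geometric factor vanishes.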

First of all, this seems to have only used ##\vec a_1## and ##\vec a_2## for the nearest neighbor terms...but looking at the lattice diagram above, doesn't any atom in one of the sublattices have 6? For example, if you look at the middle C2 atom in that diagram, there are 6 other equidistant C2 atoms.

The answer to that might explain this as well, but if I do the algebra out (even just using ##\vec a_1## and ##\vec a_2## for nearest neighbors the way they seem to) on the left hand sides to try and get to those last equations, I get:

##\psi_{\vec{k}}\left(\vec{r}\right)=e^0 \left( c_1\phi_{\text{2p}_{z1}}\left(\vec{r}\right) + c_2 \phi_{\text{2p}_{z2}}\left(\vec{r}\right) \right) +
\\
e^{i\left(\vec{k}\cdot\vec{a}_1\right)} \left( c_1\phi_{\text{2p}_{z1}}\left(\vec{r}-\vec{a}_1\right) + c_2 \phi_{\text{2p}_{z2}}\left(\vec{r}-\vec{a}_1\right) \right) +
\\
e^{i\left(\vec{k}\cdot\vec{a}_2\right)} \left( c_1\phi_{\text{2p}_{z1}}\left(\vec{r}-\vec{a}_2\right) + c_2 \phi_{\text{2p}_{z2}}\left(\vec{r}-\vec{a}_2\right) \right)##

So

##\langle\phi_{\text{2p}_{z1}}|\hat{H}|\psi_{k}\rangle = c_1 \langle\phi_{\text{2p}_{z1}}(\vec r)|\hat{H}|\phi_{\text{2p}_{z1}}(\vec r)\rangle + c_2 \langle\phi_{\text{2p}_{z1}}(\vec r)|\hat{H}|\phi_{\text{2p}_{z2}}(\vec r)\rangle
\\
+ e^{i\left(\vec{k}\cdot\vec{a}_1\right)}(c_1 \langle\phi_{\text{2p}_{z1}}(\vec r)|\hat{H}|\phi_{\text{2p}_{z1}}(\vec r - \vec {a}_1)\rangle + c_2 \langle\phi_{\text{2p}_{z1}}(\vec r )|\hat{H}|\phi_{\text{2p}_{z2}}(\vec r- \vec {a}_1)\rangle)
\\
+ e^{i\left(\vec{k}\cdot\vec{a}_2\right)}(c_1 \langle\phi_{\text{2p}_{z1}}(\vec r)|\hat{H}|\phi_{\text{2p}_{z1}}(\vec r - \vec {a}_2)\rangle + c_2 \langle\phi_{\text{2p}_{z1}}(\vec r )|\hat{H}|\phi_{\text{2p}_{z2}}(\vec r- \vec {a}_2)\rangle)##

Which certainly has the terms they have, but a few more that they seem to have dropped. Unless I'm understanding this really horribly, they seem to have dropped all the terms with a ##|\phi_{\text{2p}_{z1}}\rangle## ket except the on-site term.

What's going on? I'm so confused...

Thank you!

daveyrocket · May 1, 2013

In their example, they are not considering hopping from C1 -> C1 or C2 -> C2. They are only considering the very nearest neighbor, which for C1 is the three C2 atoms nearby.
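The neighbor counting can be checked with a short geometric sketch. It assumes a lattice constant of 1, C1 at the origin of each cell, and C2 displaced by one third of ##\vec a_1 + \vec a_2##; these vectors are illustrative choices and are not taken from the guide.

```python
import numpy as np

# Assumed geometry (illustrative): lattice constant 1, C1 at each lattice
# point, C2 displaced by D inside the same cell.
A1 = np.array([np.sqrt(3) / 2, 0.5])
A2 = np.array([np.sqrt(3) / 2, -0.5])
D = (A1 + A2) / 3


def neighbour_shells(max_index=3):
    """Count atoms around the C1 atom at the origin.

    Returns a dict {(sublattice, rounded distance): number of atoms}.
    """
    shells = {}
    for h in range(-max_index, max_index + 1):
        for j in range(-max_index, max_index + 1):
            cell = h * A1 + j * A2
            for name, offset in (("C1", np.zeros(2)), ("C2", D)):
                dist = round(float(np.linalg.norm(cell + offset)), 6)
                if dist > 0:  # skip the reference atom itself
                    key = (name, dist)
                    shells[key] = shells.get(key, 0) + 1
    return shells


# Print the three closest shells: sublattice, distance, number of atoms.
for (name, dist), count in sorted(neighbour_shells().items(),
                                  key=lambda item: item[0][1])[:3]:
    print(name, dist, count)
```

Under these assumptions the closest shell is three C2 atoms, and the six equidistant same-sublattice atoms form the next shell, which is the one the nearest-neighbor truncation drops.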

I always found tight binding to be much easier to understand from a general formulation rather than a specific one (and taking understanding from a specific example and trying to generalize is often dangerous). First, you want to view the wavefunction as an expansion in localized orbitals in each unit cell:
[tex]|\psi_{\mathbf{k}m}\rangle = \sum_{n\mathbf{R}} c_{n\mathbf{R}}^{m\mathbf{k}} |{n\mathbf{R}\mathbf{k}}\rangle[/tex]
This is a very general formula. m is a band index, and n runs over all the orbitals within your unit cell. These orbitals can be on different atoms or the same ones. The vector R locates the position of that orbital in your crystal (it is a lattice vector plus the vector from the basis). The basis function [itex]|{n\mathbf{R}\mathbf{k}}\rangle = e^{i\mathbf{k}\cdot\mathbf{R}} |{n\mathbf{R}}\rangle[/itex] satisfies the Bloch condition. (I'll drop the k indices from here on out for clarity, and my own sanity in writing all this Latex.)

Then you want to expand the Schrodinger equation:
[tex]\langle n'\mathbf{R}' | H | \psi \rangle = E \langle n'\mathbf{R}' | \psi \rangle[/tex]

Looking at the inner product on the right side first:
[tex]\langle n'\mathbf{R}' | \psi \rangle = \langle n'\mathbf{R}' | \sum_{n\mathbf{R}} c_{n\mathbf{R}}^{m} |{n\mathbf{R}}\rangle = \sum_{n\mathbf{R}} c_{n\mathbf{R}}^{m} \langle n'\mathbf{R}' |{n\mathbf{R}}\rangle[/tex]

Now do the trick of inserting the identity operator in the form of an integral over all space into the product:
[tex]\langle n'\mathbf{R}' |{n\mathbf{R}}\rangle =
e^{i\mathbf{k}\cdot (\mathbf{R}-\mathbf{R}')} \int d^3r \phi_{n'}(\mathbf{r}-\mathbf{R}') \phi_n (\mathbf{r} - \mathbf{R})[/tex]
The integral is called the overlap matrix and is often written [itex]s_{nn'}(\mathbf{R}-\mathbf{R'})[/itex] in real space or [itex]S_{nn'}^{\mathbf{k}}[/itex] in k space. It is usually neglected by assuming that it is the identity matrix in k space.

If you perform the same expansion of the left hand side you will find the product
[tex]\langle n'\mathbf{R}' | H |{n\mathbf{R}}\rangle =
e^{i\mathbf{k}\cdot (\mathbf{R}-\mathbf{R}')} \int d^3r d^3r' \phi_{n'}(\mathbf{r'}-\mathbf{R}') \langle r'|H|r\rangle \phi_n (\mathbf{r} - \mathbf{R})[/tex]
It looks complicated so we do what any good physicist will do and give it a name and a symbol and then pretend we have good ways to approximate it. That is the hopping matrix element [itex]t_{nn'}(\mathbf{R}-\mathbf{R'})[/itex].

So when you put this all together, you will have an equation that looks like the standard Schrodinger equation in matrix form: [itex] \hat{H}^\mathbf{k} |{\mathbf{k}m} \rangle = E_{\mathbf{k}m} \hat{S}^\mathbf{k} |{\mathbf{k}m}\rangle[/itex] except we have not assumed that the basis is orthogonal, so we have introduced the overlap matrix and it is now a generalized eigenvalue problem. The matrix elements of H are given by:
[tex]H_{n'n}^\mathbf{k} = \sum_{\mathbf{R}} t_{nn'}(\mathbf{R} - \mathbf{R}') e^{i\mathbf{k}\cdot(\mathbf{R} - \mathbf{R}')} [/tex]
where R' and n' label an orbital in your unit cell, and the sum runs over R only, since n is a free index. The term with R = R' and n = n' is your on-site energy [itex]\epsilon[/itex] in their example.

For S, you have:
[tex]S_{n'n}^\mathbf{k} = \sum_{\mathbf{R}} s_{nn'}(\mathbf{R} - \mathbf{R}') e^{i\mathbf{k}\cdot(\mathbf{R} - \mathbf{R}')} [/tex]
but S is usually neglected.
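For readers who want to see the generalized eigenvalue problem with S kept, here is a minimal sketch for nearest-neighbor graphene. The lattice vectors, the parameter values, the nearest-neighbor overlap `s`, and the helper names are all assumptions made for illustration; nothing here comes from the guide or from a fitted band structure.

```python
import numpy as np

# Assumed for illustration: lattice constant 1 and this choice of vectors.
A1 = np.array([np.sqrt(3) / 2, 0.5])
A2 = np.array([np.sqrt(3) / 2, -0.5])


def bloch_sum(k, real_space):
    """Return sum_R M(R) exp(i k.R) for a dict {(h, j): 2x2 matrix M(R)}."""
    k = np.asarray(k, dtype=float)
    total = np.zeros((2, 2), dtype=complex)
    for (h, j), block in real_space.items():
        total += np.asarray(block) * np.exp(1j * k @ (h * A1 + j * A2))
    return total


def nearest_neighbour_blocks(onsite, offsite):
    """Real-space blocks; entry [n', n] couples orbital n in cell R to n' at home."""
    up = np.array([[0.0, offsite], [0.0, 0.0]])  # orbital 2 in cell R to orbital 1
    down = up.T                                  # orbital 1 in cell R to orbital 2
    return {
        (0, 0): np.array([[onsite, offsite], [offsite, onsite]]),
        (-1, 0): up, (0, -1): up,
        (1, 0): down, (0, 1): down,
    }


def generalized_bands(k, eps=0.0, t=1.0, s=0.1):
    """Solve H^k c = E S^k c; s is a hypothetical nearest-neighbour overlap."""
    h_k = bloch_sum(k, nearest_neighbour_blocks(eps, -t))
    s_k = bloch_sum(k, nearest_neighbour_blocks(1.0, s))
    # Reduce to an ordinary Hermitian problem using S = L L^dagger.
    chol_inv = np.linalg.inv(np.linalg.cholesky(s_k))
    reduced = chol_inv @ h_k @ chol_inv.conj().T
    return np.linalg.eigvalsh(reduced)


print(generalized_bands([0.0, 0.0], s=0.0))  # overlap neglected
print(generalized_bands([0.0, 0.0], s=0.1))  # overlap kept
```

The Cholesky step needs S to be positive definite, which here means `s` times the magnitude of the geometric factor must stay below 1. If SciPy is available, `scipy.linalg.eigh(h_k, s_k)` solves the same generalized problem directly.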

VortexLattice · May 1, 2013

Thank you, I think I'm starting to get it! But I'm still confused about a couple of things. I'm still trying to get my head around the unit cell deal. My textbook (Ashcroft & Mermin) says (talking about the example of a hexagonal close packed metal, which has a unit cell of 2 atoms):

...one can treat the two-point basis as a molecule, whose wave functions are assumed to be known, and proceed as [you would for a regular Bravais lattice], using molecular instead of atomic wave functions.

...

Alternatively, one can proceed by continuing to construct linear combos of atomic levels centered at the Bravais lattice points and at the basis points, generalizing [the previous, normal equation for the wave function of an electron that's built out of localized atomic wave functions] to ##\psi(\vec r) = \sum_R e^{i \vec k \cdot \vec R} (a\phi(\vec r - \vec R - \vec d) + b\phi(\vec r - \vec R))## (##d## is the vector pointing from one atom to the other).

I'm trying to understand a paper and embarrassingly, it's still not clear to me which of these they're actually doing. In the paper, it also has a unit cell of just two identical atoms. They say that there are 4 spatial orbitals, s, x (=p_x), y (=p_y), z (=p_z) available, and we denote the orbitals on the atoms of the primitive cell by ##|s i\rangle##, ##|x i\rangle##, ##|y i\rangle##, ##|z i\rangle##, where ##i## is 1 or 2, indicating which of the atoms in the unit cell it is.

Can you tell me if the following is correct?

I think each of those is a "Lowdin orbital" (that's what your ##|\mathbf k m\rangle## is, right?). The appendix gives the matrix elements of ##\hat H## in terms of ##\langle s i|\hat H|x j\rangle## for all the combinations of {s,x,y,z} and ##i,j##. But because each of those is a Lowdin orbital, and thus a sum of non-orthogonal states, we get some linear combination of a geometric factor (the exponent) times some constant to be determined (the hopping matrix element?). This would explain why the matrix elements have multiple terms.

However, I still kinda don't get the role of having different wave functions for the different atoms in the unit cell.

Thank you!

daveyrocket · May 1, 2013

The exponential comes from the distance in space between the two orbitals. It has nothing to do with the non-orthogonality and it is possible (but non-trivial) to choose basis orbitals which are orthogonal. Each term in the matrix element comes from a hopping between two orbitals, and how many terms you have comes from how many neighbors there are for an orbital and how far you are willing to go until you cut off the expansion.

|km> is a solution to the Schrodinger equation, not a Lowdin orbital. Lowdin's method of obtaining orthogonal orbitals could be used to choose basis functions so that the overlap matrix is indeed the identity. That is, if you're actually calculating the hopping parameters from the wavefunctions. If you're fitting the hoppings to the band structure then you don't really need to think about this.

What do you mean by "different wave functions for the different atoms in the unit cell?" Are you referring to having different basis functions for each atom? We are creating an expansion of the wavefunction in terms of some basis states. To get something general, you need some measure of completeness in the basis states. So from that perspective, you need orbitals from all the atoms to create a complete basis.

In practice, this is usually done for a small set of bands near the Fermi level. So what you do first is to come up with some sort of model for what orbitals make up that band. From a DFT calculation this is done usually by looking at fat bands and partial density of states, along with some chemistry insight for the atoms involved. You don't necessarily use every atom... for instance a system which has an alkali metal that gives up its electron and doesn't participate much in your valence band states, you probably won't need to include any function from that atom. But if you're doing something like graphene, where each atom contributes you need basis functions centered on each atom.

sam_bell · May 5, 2013

That was enlightening Davey. One correction, I didn't follow this statement:

daveyrocket said:

The basis function [itex]|{n\mathbf{R}\mathbf{k}}\rangle = e^{i\mathbf{k}\cdot\mathbf{R}} |{n\mathbf{R}}\rangle[/itex] satisfies the Bloch condition.

I believe you need to sum over [itex]\mathbf{R}[/itex] before the theorem is satisfied.

VortexLattice, if you forgo choosing a Bravais lattice + basis description, you will also forgo the opportunity to choose solutions that satisfy Bloch's theorem. I assume that would also make the problem more difficult, because then your solutions will not be unique and you'd have to sort them out.

daveyrocket · May 5, 2013

Ah, you are correct. I should point out there that the notation I've used is a bit sloppy, there should be a differentiation between the vector which locates an atom in the basis and the lattice vectors of the Bravais lattice. The summation should only be over the Bravais lattice vectors, of course, and the vector [itex]|n\mathbf{R}\mathbf{k}\rangle[/itex] should be labeled only by the vector pointing to the atom in the basis. (I've used R for both.)
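To make the corrected notation concrete, here is a sketch with the two vectors separated. The symbol [itex]\mathbf{d}_n[/itex] (the position of orbital n inside the unit cell) and the normalization over N cells are labels introduced only for this sketch, and periodic boundary conditions are assumed. R now runs over Bravais lattice vectors only:
[tex]|n\mathbf{k}\rangle = \frac{1}{\sqrt{N}} \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot(\mathbf{R}+\mathbf{d}_n)} |n, \mathbf{R}+\mathbf{d}_n\rangle , \qquad \langle \mathbf{r} | n, \mathbf{R}+\mathbf{d}_n\rangle = \phi_n(\mathbf{r}-\mathbf{R}-\mathbf{d}_n)[/tex]
Shifting the argument by any lattice vector a and relabeling the sum (R goes to R + a) gives the Bloch condition:
[tex]\phi_{n\mathbf{k}}(\mathbf{r}+\mathbf{a}) = \frac{1}{\sqrt{N}} \sum_{\mathbf{R}} e^{i\mathbf{k}\cdot(\mathbf{R}+\mathbf{d}_n)} \phi_n(\mathbf{r}+\mathbf{a}-\mathbf{R}-\mathbf{d}_n) = e^{i\mathbf{k}\cdot\mathbf{a}} \phi_{n\mathbf{k}}(\mathbf{r})[/tex]
A single term of the sum does not have this property, which is why the sum over R is needed before the theorem is satisfied.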
