Using crontab to clone private GitHub repo and run lualatex on Ubuntu

  • Thread starter Wrichik Basu
  • Tags
    Linux
In summary, the conversation discusses setting up an hourly cron job that clones a private GitHub repository, runs lualatex, and saves the compiled PDF to Google Drive. The script works from a terminal, but under cron git fails with "Permission denied (publickey)" and lualatex is not found. Suggestions include running the job through su, or switching to GitHub Actions or a VPS; one reply also warns about monitoring disk usage when repositories are cloned repeatedly. The cause turns out to be the SSH key's passphrase, which cron cannot prompt for because it has no terminal; removing the passphrase makes the job work.
  • #1
Wrichik Basu
My distro is Ubuntu 22.04.2 LTS. I am trying to set up a cron job that will clone a private GitHub repo via git every hour, run lualatex, and save the compiled PDF to Google Drive. I have written a script for this purpose:

myscript.sh:
#!/bin/bash

cd /home/wrichik-basu/opt/cron_scripts

if [ ! -d "./temp" ]; then
    mkdir temp
fi

cd temp

if [ -d "./myrepo" ]; then
    rm -rf ./myrepo
fi

echo "Cloning repo..."
git clone -q --depth=1 <SSH_Link_to_private_GitHub_repo>

cd myrepo

echo "Running LuaLaTeX #1..."
lualatex -interaction=nonstopmode formulae_main.tex >/dev/null

status=$?

if [ $status -eq 0 ]; then

    echo "Running LuaLaTeX #2..."
    lualatex -interaction=nonstopmode formulae_main.tex >/dev/null
 
    status=$?
 
    if [ $status -eq 0 ]; then
     
        echo "Copying generated PDF to Google Drive..."
        cp ./main.pdf /path/to/Google/Drive/mount
    fi
fi

echo "Cleaning up..."
cd ../..
rm -rf ./temp

The script works fine when I run it from the terminal.

The cron job is set up as follows:

Bash:
10 */1 * * * /full/path/to/script/myscript.sh >/dev/null

This throws the following error, that I can retrieve via postfix:

Error:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Cron-Env: <SHELL=/bin/sh>
X-Cron-Env: <HOME=/home/wrichik-basu>
X-Cron-Env: <LOGNAME=wrichik-basu>
Message-Id: <redacted>
Date: Wed,  7 Jun 2023 11:10:03 +0530 (IST)
Status: R
X-IMAPbase:           1686118612                    3
X-UID: 1

git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
/full/path/to/script/myscript.sh: line 18: cd: myrepo: No such file or directory
/full/path/to/script/myscript.sh: line 21: lualatex: command not found

This answer on Serverfault suggested that I should put the path explicitly. I did that. It threw the same error again.
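
The "lualatex: command not found" line in the mail is consistent with cron starting jobs with a much shorter PATH than a login shell. A minimal sketch for checking which commands resolve under such a PATH follows. The helper name missing_commands and the value /usr/bin:/bin are assumptions for illustration; running env as a cron job shows the real value on a given machine.

```shell
#!/bin/bash
# Hypothetical helper: print each command that cannot be found when
# PATH is set to the given value.
# Usage: missing_commands "<PATH value>" cmd1 cmd2 ...
missing_commands() {
    local search_path="$1"
    shift
    local cmd
    for cmd in "$@"; do
        # Do the lookup in a subshell so the caller's PATH stays untouched.
        if ! (PATH=$search_path; command -v "$cmd" >/dev/null 2>&1); then
            echo "$cmd"
        fi
    done
}

# Assumed cron default; confirm it by running `env` from a cron job.
CRON_PATH="/usr/bin:/bin"

# Prints the names cron would not find; prints nothing if all resolve.
missing_commands "$CRON_PATH" git ssh lualatex
```

Any name it prints needs either its absolute path in the script or a PATH= line at the top of the crontab.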

Note that I am accessing crontab via crontab -e or crontab -l, i.e. not as sudo.

Any idea how I can set this up?
 
  • #2
cron tasks run as root so won't be using your private key. Try inserting su wrichik-basu into the script.

Alternatively I have done something similar with a GitHub action similar to the one described at https://blog.benoitblanchon.fr/github-action-run-ssh-commands/. This has the advantage that you can run it on every push (or whatever) instead of every hour regardless of changes, but has the disadvantage that you must expose an ssh login to your machine to the interweb. I run a cheap VPS that does this sort of thing.

Another alternative for this could be e.g. https://github.com/marketplace/actions/latex-compilation.
 
  • #3
One thing to consider is how much disk you will be using each time crontab clones the repo.

Years ago, our company's build czar had a fully automated build process that he was quite proud of. It was a simple run-this-one-script-and-you're-done sort of thing. Then he left for vacation.

I, as project lead, was given the task of running his script. All remaining code was checked in and our final project build was kicked off. It worked really well, but during final product sniff tests we noticed a small problem with the code for one country and decided we must fix it before shipping.

Everything went into the gutter: the build just failed. We didn't know why until he came back and said we had run out of disk space on the drive where he backed up every copy of the build output (maybe a year's worth of builds). He had forgotten to tell us, to check beforehand, or to have a script contingency that deleted old images when disk space ran low.

Git repos can grow very large if you don't monitor their disk usage.

Little mistakes can cause an avalanche of problems -- Aesop the Programmer
 
  • #4
pbuk said:
Try inserting su wrichik-basu into the script.
Don't you mean try inserting su wrichik-basu into the crontab command?
https://serverfault.com/questions/352835/crontab-running-as-a-specific-user#352837 said:
Only /etc/crontab and the files in /etc/cron.d/ have a username field. In that file you can do this:

1 1 * * * username /path/to/your/script.sh

From root's crontab sudo crontab -e you can use:

1 1 * * * su username -c "/path/to/your/script.sh"

Or you can use the user's actual crontab like this:

sudo crontab -u username -e
 
  • #5
pbuk said:
cron tasks run as root so won't be using your private key. Try inserting su wrichik-basu into the script.
Nope, that doesn't work. It throws the following error: Password: su: Authentication failure.

pbuk said:
Alternatively I have done something similar with a GitHub action similar to the one described at https://blog.benoitblanchon.fr/github-action-run-ssh-commands/. This has the advantage that you can run it on every push (or whatever) instead of every hour regardless of changes, but has the disadvantage that you must expose an ssh login to your machine to the interweb. I run a cheap VPS that does this sort of thing.

Another alternative for this could be e.g. https://github.com/marketplace/actions/latex-compilation.
My repo is private, so I cannot use GH Actions.

jedishrfu said:
One thing to consider is how much disk you will be using each time crontab clones the repo.

...

Git repos can grow very large if you don't monitor their disk usage.
(Abridged by me)
I don't think that's going to happen here anytime soon. If you look at my script, it clones only up to depth=1, i.e. only the last commit. I am also deleting all the files that I have downloaded at the end (unless the script is killed midway, in which case the clean-up is done the next time the script is executed).

jack action said:
Don't you mean try inserting su wrichik-basu into the crontab command?
I did that too; I modified the job to:

Bash:
10 */1 * * * wrichik-basu /full/path/to/script/myscript.sh >/dev/null

This resulted in the following error, and the script never got executed:

Bash:
/bin/sh: 1: wrichik-basu: not found
 
  • #6
By the way, if I execute the following command (which I can't paste here because XF throws an error):



I am getting the following output:

Bash:
Jun  7 13:10:01 PC_name CRON[781083]: (wrichik-basu) CMD (/full/path/to/script/myscript.sh >/dev/null)

All other jobs have (root) instead of (wrichik-basu). Does this somehow mean that the job is being run as my user rather than root?
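
The name in parentheses before CMD in such a log line is the account cron ran the job as, so (wrichik-basu) does indicate that the job ran as that user and not as root. A small sketch that pulls this field out of cron log lines follows; the function name cron_job_user is made up for illustration, and the sample line is the one quoted above.

```shell
#!/bin/bash
# Hypothetical helper: read cron log lines on stdin and print the user
# named in the "(user) CMD (...)" part of each matching line.
cron_job_user() {
    sed -n 's/.*CRON\[[0-9]*]: (\([^)]*\)) CMD.*/\1/p'
}

# Sample line copied from the post above.
line='Jun  7 13:10:01 PC_name CRON[781083]: (wrichik-basu) CMD (/full/path/to/script/myscript.sh >/dev/null)'
printf '%s\n' "$line" | cron_job_user   # prints: wrichik-basu
```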
 
  • #9
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Doesn't this mean the problem is with git and its credentials, not any local permissions?
 
  • #10
Vanadium 50 said:
Doesn't this mean the problem is with git and its credentials, not any local permissions?
As I said, the script executes fine if I run it on a terminal. Without any intervention. If cron can run the script exactly like this, things should go fine. I can't make cron do that.
 
  • #11
That doesn't preclude the problem being one of authentication. Is the environment the same? Is the directory this is being run in the same? And so on. I don't know what authentication git is using, but working interactively does not mean it will work non-interactively.

As a general rule, one's starting point should be that the error message is correct.
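
Reading the cron mail from post #1 in that light: the first message, the publickey failure, is the root cause, and the later cd error only appears because the script carries on after git clone has failed. A minimal sketch of stopping at the first failing step follows. run_step is a hypothetical helper, and true and false stand in for the real commands so that the sketch runs anywhere.

```shell
#!/bin/bash
# Hypothetical helper: announce a step, run it, and report failure.
# Usage: run_step "description" command [args...]
run_step() {
    local name="$1"
    shift
    echo "$name..."
    if ! "$@"; then
        echo "$name failed; stopping." >&2
        return 1
    fi
}

# Stand-ins: `false` plays a failing git clone, `true` a later step.
if run_step "Cloning repo" false &&
   run_step "Running LuaLaTeX" true; then
    echo "Copying generated PDF..."
else
    echo "Build aborted after the first error."
fi
```

With this shape the mail from cron would contain only the error that actually needs fixing.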
 
  • #12
Vanadium 50 said:
Is the environment the same?
This is where the issue lies, I believe. I guess the environments are not the same, even though the user is the same. I put a simple whoami in crontab, and it told me that the user is me, not root. I put git config -l as a cron job, and it echoed the same config that I use, including the email, signing key, etc. What other things can I check to verify the environment?
Vanadium 50 said:
Is the directory this is being run in the same?
I believe so. In my script, I am changing directories into the directory in which the script lies before doing anything else.
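
One way to compare the two environments is to dump env from both places and list the variables that exist in the terminal but not under cron. A minimal sketch, assuming the two dumps were saved to files, follows. vars_only_in is a hypothetical helper, and the sample contents are invented for illustration (the cron side mirrors the X-Cron-Env headers from post #1).

```shell
#!/bin/bash
# Hypothetical helper: print names of variables that are set in the
# first env dump but absent from the second.
# Usage: vars_only_in terminal_env.txt cron_env.txt
vars_only_in() {
    cut -d= -f1 "$1" | sort -u | while read -r name; do
        grep -q "^${name}=" "$2" || echo "$name"
    done
}

# Invented sample dumps; real ones would come from running
#   env > some_file
# once in a terminal and once as a cron job.
demo_dir=$(mktemp -d)
printf '%s\n' 'HOME=/home/wrichik-basu' 'SHELL=/bin/bash' \
    'SSH_AUTH_SOCK=/tmp/example-agent.sock' > "$demo_dir/terminal_env.txt"
printf '%s\n' 'HOME=/home/wrichik-basu' 'SHELL=/bin/sh' \
    'LOGNAME=wrichik-basu' > "$demo_dir/cron_env.txt"

vars_only_in "$demo_dir/terminal_env.txt" "$demo_dir/cron_env.txt"
rm -rf "$demo_dir"
```

With the sample data this prints SSH_AUTH_SOCK. A variable like that showing up in a real comparison would suggest that the SSH agent is not reachable from cron.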
 
  • #13
It's the SSH passphrase authentication issue because the environments are not exactly the same. After executing ssh -vT git@github.com as a cron job, this is what I got from the output:

Bash:
debug1: Offering public key: /path/to/key <key_name> <key_SHA>
debug1: Server accepts key: /path/to/key <key_name> <key_SHA>
debug1: read_passphrase: can't open /dev/tty: No such device or address
debug1: Trying private key: /path/to/key
debug1: Trying private key: /path/to/key
debug1: Trying private key: /path/to/key
debug1: No more authentication methods to try.
git@github.com: Permission denied (publickey).

It seems that when I am working from the terminal, it's not asking for the passphrase because I have used the key recently. That's not the case when working from cron. As per this comment on r/linuxquestions, cron runs without a console, so there is no /dev/tty on which to prompt for the passphrase.

So, the option is probably to disable password authentication, as stated here: https://serverfault.com/a/1110635. I don't want to do that.

Is there a way I can pass the passphrase from the keyring?
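
One possible route, sketched here under assumptions, is to let the cron job talk to the SSH agent of the logged-in desktop session, which already holds the unlocked key. The default socket path is GNOME Keyring's usual location on an Ubuntu desktop and is an assumption, as is the helper name use_session_agent. It only helps while that session is logged in and the keyring is unlocked, so it does not suit unattended machines.

```shell
#!/bin/bash
# Hypothetical helper: export SSH_AUTH_SOCK if an agent socket exists.
# The default path is an assumption (GNOME Keyring on an Ubuntu desktop).
use_session_agent() {
    local sock="${1:-/run/user/$(id -u)/keyring/ssh}"
    if [ -S "$sock" ]; then
        export SSH_AUTH_SOCK="$sock"
        return 0
    fi
    echo "No agent socket at $sock" >&2
    return 1
}

if use_session_agent; then
    echo "Agent socket found: $SSH_AUTH_SOCK"
    # git clone / ssh started from here would ask this agent for the key.
else
    echo "No session agent; ssh would have to prompt and would fail under cron."
fi
```

Running ssh -vT git@github.com as a cron job after calling the helper would show whether the agent is actually being used.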
 
  • #14
After reading some answers to this question on ServerFault, I decided to remove the passphrase. The cron job is now working as expected.

As a side note, it is regrettable that answers from 2010 stating that automated software can't use SSH keys with a passphrase are still valid today.
 
  • #15
Some answers never expire and some do in Computer Science. Your task, should you decide to take it, Mr. Phelps, is to determine which is which.

(...Mission Impossible theme plays in the background while we all see a smoking tape recorder...)

PS: I've run into that scenario many times too.
 
  • #16
I modified the script quite a bit to make sure that the compilation is done only if any changes have been made since the last time the file was uploaded.
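
The modified script is not shown in the post. As an illustration only, and not the author's version, one way to skip unchanged builds is to compare the remote's current commit hash with the hash recorded after the last successful upload. The names remote_head, needs_rebuild and record_build, and the state file, are hypothetical; git ls-remote is the only Git command used.

```shell
#!/bin/bash
# Print the commit hash HEAD points to on the remote, without cloning.
remote_head() {
    git ls-remote "$1" HEAD | cut -f1
}

# Succeed if no hash is recorded yet or the recorded one differs.
# Usage: needs_rebuild <latest_hash> <state_file>
needs_rebuild() {
    [ ! -f "$2" ] || [ "$(cat "$2")" != "$1" ]
}

# Remember the hash of the commit that was just built and uploaded.
record_build() {
    printf '%s\n' "$1" > "$2"
}

# Demo with made-up hashes, so no network access is needed.
demo_dir=$(mktemp -d)
state="$demo_dir/last_built_commit"
needs_rebuild abc123 "$state" && echo "first run: build"
record_build abc123 "$state"
needs_rebuild abc123 "$state" || echo "same commit: skip"
needs_rebuild def456 "$state" && echo "new commit: build"
rm -rf "$demo_dir"
```

In a real job the first argument would be the output of remote_head for the repository's SSH link, and record_build would run only after the PDF has been copied.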

 
  • #17
Wrichik Basu said:
My repo is private, so I cannot use GH Actions.
I think that that is no longer the case.
 

What is crontab and how do I use it?

Crontab is a command in Linux and Unix operating systems that lets users schedule tasks to run at specific times. To use it to clone a private GitHub repository and run lualatex on Ubuntu, edit your own crontab with crontab -e and add a line that gives the schedule followed by the command or script to run.

How do I clone a private GitHub repository using crontab?

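To clone a private GitHub repository from a cron job, put the git clone command in the script that cron runs and use the repository's SSH link. Authentication must work without a terminal: as the thread above shows, cron cannot prompt for an SSH key's passphrase, so a passphrase-protected key fails with "Permission denied (publickey)" unless the key is made available some other way.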

Can I schedule crontab to run lualatex at specific intervals?

Yes, you can schedule crontab to run lualatex at specific intervals by specifying the desired time intervals in your crontab file. Make sure to also include the full path to the lualatex executable, since cron may not search the same directories as your terminal does.

How do I ensure that my crontab job runs successfully?

To ensure that your crontab job runs successfully, make sure to check the log files for any errors or issues. You can also specify an email address in your crontab file to receive notifications about the status of your job.

Are there any security concerns when using crontab to clone private GitHub repositories?

Yes, there are some security concerns when using crontab to clone private GitHub repositories. Make sure to use secure authentication methods and keep your crontab file and repository credentials private to prevent unauthorized access.
