Automatic indi-allsky backup on external drive (Fritzbox NAS) – Phase 2: Files

After the indi-allsky database, configuration and migrations had already been backed up automatically to an external drive, the next step was to tackle the actual files: the night images from which the timelapses, keograms and star trails are later generated.

The basis for this is phase 1 of the backup, which I have described here:

Automatic indi-allsky backup on external drive (Fritzbox NAS) – Phase 1

Phase 2 is much more demanding. While database backups are comparatively small and deterministic, with image data we are quickly talking about several gigabytes per night – and about data that is not all equally valuable.

Why not simply “back up everything”?

An all-sky system produces images every night – regardless of whether the sky is clear, partially cloudy or completely unusable. If you backed up all the files blindly, even a large external hard disk would fill up very quickly.

That’s why it was clear from the start: the backup must be quality-based.
Not every night is equally valuable – and with indi-allsky this can be derived very well from the database.

And this is how I proceeded…

Quality criteria directly from the indi-allsky database

indi-allsky stores, among other things, per image:

  • the number of stars detected (stars)
  • whether moon mode was active (moonmode)
  • whether the image was taken at night (night)

From this I calculate for each night:

  • the average number of stars
  • the average share of images taken in moon mode

Only nights that meet these minimum criteria are backed up at all. Bad nights automatically fall through the cracks.

To do this, I first looked at how many stars were detected per image on particularly clear nights, to get a feel for what counts as “gigantic” and what as “quite okay”, and derived the limit values from that.
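Those limit values can be explored with sqlite3 before touching the script. Here is a self-contained sketch – using a throwaway demo table with the same columns, not the real indi-allsky.sqlite – of the per-night aggregation the backup decision is based on:

```shell
# Demo only: a throwaway table with the columns the backup relies on,
# and the per-night aggregation (avg stars, moon share in %, image count).
DB="$(mktemp)"
sqlite3 "$DB" <<'SQL'
CREATE TABLE image (dayDate TEXT, night INT, exclude INT, moonmode INT, stars INT);
INSERT INTO image VALUES
  ('2025-12-25', 1, 0, 0, 1500),
  ('2025-12-25', 1, 0, 0, 1520),
  ('2025-12-26', 1, 0, 1, 300);
SQL

sqlite3 -noheader -separator '|' "$DB" "
  SELECT dayDate,
         ROUND(AVG(stars),1),
         ROUND(100.0*AVG(moonmode),1),
         COUNT(*)
  FROM image
  WHERE night=1 AND exclude=0
  GROUP BY dayDate
  ORDER BY dayDate;"
# prints:
# 2025-12-25|1510.0|0.0|2
# 2025-12-26|300.0|100.0|1

rm -f "$DB"
```

The same query, restricted to one dayDate and one camera prefix, is what the script runs per night.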

Only the original photos and timelapses are backed up – no keograms or star trails (both are commented out in the script!) and no thumbnails (these can easily be regenerated in indi-allsky with a script).

“Great nights” vs. rolling nights

The real data very quickly resulted in a clear separation:

  • Great nights: Ø ≥ 1000 stars → keep permanently (FOREVER)
  • Okay nights: good but not perfect conditions → may be deleted later (ROLLING)
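This split can be sketched as a tiny helper function (my own illustration, not part of the script; the 1000-star threshold corresponds to GREAT_NIGHT_MIN_AVG_STARS further down):

```shell
# Illustrative helper: classify a night by its average star count.
# awk handles the float comparison; bash itself only does integers.
classify_night() {
  local avg_stars="$1" threshold="${2:-1000}"
  if awk -v a="$avg_stars" -v t="$threshold" 'BEGIN { exit !(a >= t) }'; then
    echo "FOREVER"
  else
    echo "ROLLING"
  fi
}

classify_night 1509.3   # prints: FOREVER
classify_night 812.6    # prints: ROLLING
```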

This classification is stored in the backup as part of a manifest file for each night and forms the basis for the subsequent retention.

The manifest.txt is a small metadata file that is written into the backup for every night that gets backed up. It documents the most important quality and context information for that night – date, average number of stars, moon share, number of images – in a structured form and serves as the decision basis for the backup system. On this basis it can be decided automatically which nights are kept permanently (“FOREVER”) and which may be deleted when storage space becomes scarce (“ROLLING”). At the same time, the manifest makes it possible to trace later why a night was backed up or deleted – independently of log files or database states.

Example of a manifest.txt:

night=2025-12-25
avg_stars=1509.3
moon_pct=12.4
images=1854
class=FOREVER
verified=yes
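Because the manifest is plain key=value text, it can be read back with standard tools. A small sketch with a hypothetical helper, assuming the format shown above:

```shell
# Hypothetical helper: extract one key=value field from a manifest.txt
manifest_get() {  # usage: manifest_get FILE KEY
  grep -m1 "^$2=" "$1" | cut -d'=' -f2-
}

# demo with a throwaway manifest
M="$(mktemp)"
printf 'night=2025-12-25\navg_stars=1509.3\nclass=FOREVER\n' > "$M"
manifest_get "$M" class       # prints: FOREVER
manifest_get "$M" avg_stars   # prints: 1509.3
rm -f "$M"
```

The retention step in the script does essentially this with grep -q to skip FOREVER nights.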

Hard check – is the number of files correct?

A backup is only a backup if it is complete.
That is why there is a hard verification step after every night of backup:

Number of images according to the database = number of files in the backup

If these figures do not match exactly, the backup is deemed to have failed.
There is no such thing as “almost complete”.

This prevents:

  • incomplete nights from being silently accepted
  • retention from being based on faulty backups
  • data loss from only becoming apparent weeks later
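Stripped down to its core, the check is a single comparison. A sketch with a hypothetical helper (the real script gets the expected count from the database):

```shell
# Hypothetical helper: a night passes verification only if the backed-up
# file count exactly matches the image count the database reports.
verify_night() {  # usage: verify_night BACKUP_DIR EXPECTED_COUNT
  local actual
  actual="$(find "$1" -type f | wc -l)"
  [ "$actual" -eq "$2" ]
}

# demo: 3 files in the backup directory
D="$(mktemp -d)"
touch "$D/img_1.jpg" "$D/img_2.jpg" "$D/img_3.jpg"
verify_night "$D" 3 && echo "backup OK"        # prints: backup OK
verify_night "$D" 4 || echo "backup FAILED"    # prints: backup FAILED
rm -rf "$D"
```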

Dynamic retention instead of fixed limits

As database backups are also stored on the same drive, retention is deliberately dynamic:

  • deletion kicks in above 80 % disk usage
  • or when less than 20 GB is free

This means that the script works regardless of whether someone uses a small SSD or a large external HDD.
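The trigger condition boils down to a small predicate (illustrative only; the 80 % and 20 GB values are the thresholds above):

```shell
# Illustrative predicate: retention runs above 80 % usage
# or below 20 GB free, whichever comes first.
retention_needed() {  # usage: retention_needed USED_PCT FREE_GB
  [ "$1" -gt 80 ] || [ "$2" -lt 20 ]
}

retention_needed 85 200 && echo "delete rolling nights"   # usage too high
retention_needed 50 10  && echo "delete rolling nights"   # too little free
retention_needed 50 200 || echo "nothing to do"           # healthy disk
```

In the script the first value comes from df via the disk_used_pct helper, so the logic adapts to any drive size.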

Only the following are deleted:

  • verified rolling nights
  • never the last (still running) night
  • never nights classified as “FOREVER”

Every deletion is documented by e-mail – including the night, number of stars, moon share and the freed storage space.

Automatic confirmation e-mail

After each backup (manual or via cronjob), an e-mail is sent with statistics – including the size of the backup, the best starry night to date (measured by the average number of stars), free storage space and whether anything has been deleted:

indi-allsky batch backup completed

Host: allsky
Time: Sun Dec 28 00:01:30 CET 2025

Nights backed up:
2025-12-24 | 1902 images | 2915 MB images | Ø stars 1170.4
2025-12-25 | 1854 images | 3038 MB images | Ø stars 1509.3
2025-12-26 | 1677 images | 2783 MB images | Ø stars 1483.3

Timelapse total:
501 MB

Best night so far:
2025-12-25 | Ø stars 1509.3

Backup HDD occupancy:
Total: 232G
Used: 11G (5%)
Free: 221G

The complete backup script (phase 2)

The following script is production-ready and can be adopted 1:1.
A dry run is possible via --dry-run.

#!/bin/bash
set -euo pipefail

# =========================================================
# Configuration
# =========================================================
DB="/var/lib/indi-allsky/indi-allsky.sqlite"
SRC_BASE="/var/www/html/allsky/images"

BACKUP_MOUNT="/mnt/backup_allsky"
BACKUP_BASE="$BACKUP_MOUNT/backup_allsky"
NIGHTS_DIR="$BACKUP_BASE/nights"
TL_DIR="$BACKUP_BASE/timelapse"

LOG="/var/log/backup_allsky_phase2_batch.log"
MAIL_TO="mail@domain.tld"   # adjust
HOST="$(hostname)"
LOCKFILE="/var/lock/backup_allsky_phase2.lock"

START_DATE="2025-12-24"   # optional, leave empty for all: START_DATE=""

# Criteria
MIN_AVG_STARS_PRIMARY=700
MIN_AVG_STARS_SECONDARY=500
MIN_MOON_PCT_SECONDARY=50
GREAT_NIGHT_MIN_AVG_STARS=1000
# adjust if needed

# Retention
RETENTION_MAX_USED_PCT=80
# adjust if needed

DRY_RUN=0
[[ "${1:-}" == "--dry-run" ]] && DRY_RUN=1

RSYNC_OPTS=(-a)
[[ "$DRY_RUN" -eq 1 ]] && RSYNC_OPTS=(-a --dry-run)

exec >>"$LOG" 2>&1

# =========================================================
# Error reporting (prevents silent "start -> end" without info)
# =========================================================
fail() {
  local rc=$?
  {
    echo "indi-allsky batch backup FAILED"
    echo "Host: $HOST"
    echo "Time: $(date)"
    echo "Exit code: $rc"
    echo "Line: ${BASH_LINENO[0]}"
    echo "Command: ${BASH_COMMAND}"
    echo
    echo "Last log lines:"
    tail -n 80 "$LOG" || true
  } | mail -s "indi-allsky batch backup ERROR ($HOST)" "$MAIL_TO"
  exit "$rc"
}
trap fail ERR

# =========================================================
# Helpers
# =========================================================
disk_used_pct() { df -P "$BACKUP_MOUNT" | awk 'NR==2 {gsub("%","",$5); print $5}'; }
disk_human()    { df -h "$BACKUP_MOUNT" | awk 'NR==2 {print $2, $3, $4, $5}'; }

echo "=== Phase 2 batch start: $(date) ==="

# =========================================================
# Locking (no parallel execution)
# =========================================================
exec 9>"$LOCKFILE" || exit 1
if ! flock -n 9; then
  echo "Lock active, exiting: $(date)"
  exit 0
fi

# =========================================================
# Ensure backup mount
# =========================================================
mountpoint -q "$BACKUP_MOUNT" || mount "$BACKUP_MOUNT"
mountpoint -q "$BACKUP_MOUNT" || { echo "ERROR: $BACKUP_MOUNT not mounted"; exit 1; }

mkdir -p "$NIGHTS_DIR" "$TL_DIR"

# =========================================================
# Determine active night camera (from DB)
# =========================================================
ACTIVE_CCD="$(
  sqlite3 -readonly -noheader "$DB" \
  "SELECT substr(filename,1,instr(filename,'/')-1)
   FROM image
   WHERE night=1 AND exclude=0
   ORDER BY dayDate DESC, filename DESC
   LIMIT 1;"
)"
[[ -z "$ACTIVE_CCD" ]] && { echo "ERROR: No active night camera found."; exit 1; }

SRC_CCD="$SRC_BASE/$ACTIVE_CCD"

# =========================================================
# Determine nights
# =========================================================
DATE_FILTER=""
[[ -n "${START_DATE:-}" ]] && DATE_FILTER="AND dayDate>='$START_DATE'"

mapfile -t NIGHTS < <(
  sqlite3 -readonly -noheader "$DB" "
    SELECT DISTINCT dayDate
    FROM image
    WHERE night=1 AND exclude=0 $DATE_FILTER
    ORDER BY dayDate;
  "
)

[[ ${#NIGHTS[@]} -eq 0 ]] && { echo "No nights found, exiting."; exit 0; }
LAST_NIGHT="${NIGHTS[-1]}"

MAIL_LINES=""
BEST_NIGHT=""
BEST_STARS="0"
TL_TOTAL_MB=0

# =========================================================
# Process each night
# =========================================================
for NIGHT in "${NIGHTS[@]}"; do
  # Read stats cleanly using | separator
  IFS='|' read -r AVG_STARS MOON_PCT IMG_COUNT <<<"$(
    sqlite3 -readonly -noheader -separator '|' "$DB" "
      SELECT COALESCE(ROUND(AVG(stars),1),0),
             COALESCE(ROUND(100.0*AVG(moonmode),1),0),
             COUNT(*)
      FROM image
      WHERE night=1 AND exclude=0
        AND dayDate='$NIGHT'
        AND filename LIKE '$ACTIVE_CCD/%';
    "
  )"
  AVG_STARS="${AVG_STARS:-0}"
  MOON_PCT="${MOON_PCT:-0}"
  IMG_COUNT="${IMG_COUNT:-0}"

  # Backup decision (bc always receives numeric values)
  SECURE_NIGHT=0
  if (( $(echo "$AVG_STARS >= $MIN_AVG_STARS_PRIMARY" | bc -l) )); then
    SECURE_NIGHT=1
  elif (( $(echo "$AVG_STARS >= $MIN_AVG_STARS_SECONDARY && $MOON_PCT >= $MIN_MOON_PCT_SECONDARY" | bc -l) )); then
    SECURE_NIGHT=1
  fi

  if [[ "$SECURE_NIGHT" -eq 0 ]]; then
    continue
  fi

  # Target directory
  NIGHT_DST="$NIGHTS_DIR/$NIGHT"
  mkdir -p "$NIGHT_DST"

  # Files from DB (active camera only)
  mapfile -t FILES < <(
    sqlite3 -readonly -noheader "$DB" "
      SELECT filename
      FROM image
      WHERE night=1 AND exclude=0
        AND dayDate='$NIGHT'
        AND filename LIKE '$ACTIVE_CCD/%';
    "
  )

  # Copy files
  for FILE in "${FILES[@]}"; do
    SRC="$SRC_BASE/$FILE"
    DST="$NIGHT_DST/$FILE"
    mkdir -p "$(dirname "$DST")"
    rsync "${RSYNC_OPTS[@]}" "$SRC" "$DST"
  done

  # Image size (exposures only, no thumbnails)
  IMG_MB="$(du -sm "$NIGHT_DST/$ACTIVE_CCD/exposures" 2>/dev/null | awk '{print $1+0}')"
  IMG_MB="${IMG_MB:-0}"

  # Best night so far
  if (( $(echo "$AVG_STARS > $BEST_STARS" | bc -l) )); then
    BEST_STARS="$AVG_STARS"
    BEST_NIGHT="$NIGHT"
  fi

  # Manifest (retention protection)
  CLASS="ROLLING"
  if (( $(echo "$AVG_STARS >= $GREAT_NIGHT_MIN_AVG_STARS" | bc -l) )); then
    CLASS="FOREVER"
  fi

  # Hard verification: image count in DB must match file count in backup
  FILE_COUNT="$(find "$NIGHT_DST" -type f 2>/dev/null | wc -l)"
  if [[ "$DRY_RUN" -eq 0 && "$FILE_COUNT" -ne "$IMG_COUNT" ]]; then
    echo "ERROR: Night $NIGHT incomplete: DB=$IMG_COUNT files, backup=$FILE_COUNT files"
    exit 1
  fi

  cat >"$NIGHT_DST/manifest.txt" <<EOF
night=$NIGHT
avg_stars=$AVG_STARS
moon_pct=$MOON_PCT
images=$IMG_COUNT
class=$CLASS
verified=yes
EOF

  MAIL_LINES+="$NIGHT | $IMG_COUNT images | $IMG_MB MB images | Ø stars $AVG_STARS"$'\n'
done

# =========================================================
# Timelapse backup
# =========================================================
# adjust the glob if your timelapse file layout differs
find "$SRC_CCD" -type f -name '*timelapse*.mp4' 2>/dev/null | while read -r TL; do
  rsync "${RSYNC_OPTS[@]}" "$TL" "$TL_DIR/"
done
# Keograms and startrails are deliberately NOT backed up:
# rsync "${RSYNC_OPTS[@]}" "$SRC_CCD"/*keogram* "$TL_DIR/"
# rsync "${RSYNC_OPTS[@]}" "$SRC_CCD"/*startrail* "$TL_DIR/"
TL_TOTAL_MB="$(du -sm "$TL_DIR" 2>/dev/null | awk '{print $1+0}')"

# =========================================================
# Retention (dynamic, usage-based)
# =========================================================
USED_PCT="$(disk_used_pct)"
if (( USED_PCT >= RETENTION_MAX_USED_PCT )); then
  while (( USED_PCT >= RETENTION_MAX_USED_PCT )); do
    CANDIDATE="$(
      ls -1 "$NIGHTS_DIR" 2>/dev/null | sort | while read -r N; do
        [[ -z "$N" ]] && continue
        [[ "$N" == "$LAST_NIGHT" ]] && continue
        [[ -f "$NIGHTS_DIR/$N/manifest.txt" ]] && grep -q "^class=FOREVER$" "$NIGHTS_DIR/$N/manifest.txt" && continue
        echo "$N"
        break
      done
    )"

    [[ -z "$CANDIDATE" ]] && break

    SIZE_H="$(du -sh "$NIGHTS_DIR/$CANDIDATE" | awk '{print $1}')"

    if [[ "$DRY_RUN" -eq 1 ]]; then
      echo "[DRY-RUN] Retention would delete: $CANDIDATE ($SIZE_H)"
    else
      rm -rf "$NIGHTS_DIR/$CANDIDATE"
      {
        echo "indi-allsky retention: night deleted"
        echo
        echo "Host: $HOST"
        echo "Time: $(date)"
        echo
        echo "Deleted: $CANDIDATE"
        echo "Freed: $SIZE_H"
        echo
        echo "Disk status (total used free usage%):"
        disk_human
      } | mail -s "indi-allsky retention: night $CANDIDATE deleted ($HOST)" "$MAIL_TO"
    fi

    USED_PCT="$(disk_used_pct)"
  done
fi

# =========================================================
# Final report mail
# =========================================================
read -r SIZE USED AVAIL PERC <<<"$(df -h "$BACKUP_MOUNT" | awk 'NR==2 {print $2,$3,$4,$5}')"

mail -s "indi-allsky batch backup completed" "$MAIL_TO" <<EOF
indi-allsky batch backup completed

Host: $HOST
Time: $(date)

Nights backed up:
${MAIL_LINES:-}

Total timelapse size:
$TL_TOTAL_MB MB

Best night so far:
${BEST_NIGHT:-} | avg stars ${BEST_STARS:-0}

Backup disk usage:
Total: $SIZE
Used:  $USED ($PERC)
Free:  $AVAIL

Note:
Thumbnails are not backed up.
EOF

echo "=== Phase 2 batch end: $(date) ==="
exit 0

Running?

If you are impatient, you can check whether the script is still running with ps -ef | grep backup_allsky | grep -v grep.

Cronjob

I run the script once a day at 11:30 in the morning:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
30 11 * * * root /usr/local/bin/backup_allsky_phase2.sh >> /var/log/backup_allsky_phase2_cron.log 2>&1

Log rotation in /etc/logrotate.d/backup_allsky

/var/log/backup_allsky_phase2*.log {
    weekly
    rotate 4
    compress
    missingok
    notifempty
    copytruncate
}

Conclusion

Phase 2 of the indi-allsky backup is deliberately more complex than simply copying all files via rsync.
But it is:

  • data-based
  • verifying
  • storage-efficient
  • and maintainable in the long term

For me, this is the crucial difference between “somehow backed up” and a backup that you can trust in an emergency.

Enjoyed this post?

You can support allsky-rodgau.de with a small coffee on BuyMeACoffee.

Buy me a coffee!