Imagemagick

Image manipulation

Change orientation

Some imaging systems store the image already rotated; others store the pixels as captured and add EXIF orientation metadata, relying on the image display software to rotate for presentation. To apply that rotation to the pixel data and reset the orientation metadata:

convert INPUT.jpg -auto-orient OUTPUT.jpg
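To see why the orientation tag matters, note that it takes values 1 to 8, and values 5 to 8 involve a quarter turn, so the displayed width and height are swapped relative to the stored ones. A minimal Python sketch of that rule (the function name `displayed_size` is made up for illustration, it is not part of ImageMagick):

```python
def displayed_size(stored_width, stored_height, orientation=1):
    """Return (width, height) as shown by a viewer that honours EXIF orientation."""
    if orientation not in range(1, 9):
        raise ValueError(f"EXIF orientation must be 1..8, got {orientation}")
    # Values 5-8 include a 90 degree rotation (with or without mirroring),
    # so the stored width and height swap on display.
    if orientation >= 5:
        return stored_height, stored_width
    return stored_width, stored_height


# A landscape sensor image tagged "rotate 90 degrees" displays as portrait.
print(displayed_size(4000, 3000, orientation=6))
```

After -auto-orient the pixels themselves are stored in the displayed layout, so software that ignores the tag shows the same result.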

Dokuwiki image rendering

Resizing images

To resize while preserving the aspect ratio, shrinking only images larger than the given dimensions and never enlarging smaller ones (the > flag):

convert INPUT.jpg -resize 1024x1024\> OUTPUT.jpg
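The arithmetic behind the > flag can be sketched in Python as below. This is an approximation under stated assumptions: `shrink_to_fit` is a hypothetical helper, and ImageMagick's own rounding may differ by a pixel in edge cases.

```python
def shrink_to_fit(width, height, max_width, max_height):
    """Fit (width, height) inside a box, keeping aspect ratio, never enlarging."""
    scale = min(max_width / width, max_height / height)
    if scale >= 1:
        # Image already fits: the '>' flag leaves it untouched.
        return width, height
    # Round to the nearest pixel, but never collapse a side to zero.
    return max(1, int(width * scale + 0.5)), max(1, int(height * scale + 0.5))


print(shrink_to_fit(4000, 3000, 1024, 1024))
print(shrink_to_fit(800, 600, 1024, 1024))
```

The longer side lands on the limit and the shorter side follows from the aspect ratio.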

To resize to a pixel area limit (for example, when adhering to document submission file size limits), use the '@' flag, which takes the maximum total number of pixels:

convert INPUT.jpg -resize 4096@\> OUTPUT.jpg
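Note that the number is a total pixel count, so 4096@ corresponds to roughly a 64x64 image. A rough Python sketch of the scaling follows; `shrink_to_area` is a hypothetical helper, and it rounds down so the limit is never exceeded, which may not match ImageMagick's rounding exactly.

```python
import math


def shrink_to_area(width, height, max_area):
    """Scale (width, height) so width * height <= max_area, never enlarging."""
    if width * height <= max_area:
        return width, height
    # Both sides shrink by the same factor, so the area shrinks by factor**2.
    factor = math.sqrt(max_area / (width * height))
    return max(1, int(width * factor)), max(1, int(height * factor))


print(shrink_to_area(4000, 3000, 4096))
```

For a real file size target, the area would have to be tuned empirically, since bytes per pixel depend on the format and the image content.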

Tip

In practice, it is useful to either constrain the maximum dimensions with -resize 1024x1024, or use area-based sizing when a file size limit applies.

In-place modification can be used to simplify workflow, using the mogrify tool that comes with IM:

mogrify -auto-orient -resize 1024x1024\> *.jpg

Tip

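To force a crop to a fixed size, resize to fill the target box (the ^ flag) and then cut the overflow around the center with -extent. Note that this will overwrite the files: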

mogrify -resize 512x512^ -gravity Center -extent 512x512 *.jpg
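The two steps of that command (fill, then center crop) amount to the following arithmetic. This is a sketch only: `fill_and_crop` is a hypothetical name, and the pixel rounding is an assumption rather than ImageMagick's exact behavior.

```python
def fill_and_crop(width, height, target_w, target_h):
    """Return ((resized_w, resized_h), (offset_x, offset_y)) for a fill + center crop."""
    # '^' picks the LARGER scale factor, so the image covers the whole target box.
    scale = max(target_w / width, target_h / height)
    resized_w = max(target_w, int(width * scale + 0.5))
    resized_h = max(target_h, int(height * scale + 0.5))
    # '-gravity Center -extent' keeps the middle target_w x target_h region.
    offset_x = (resized_w - target_w) // 2
    offset_y = (resized_h - target_h) // 2
    return (resized_w, resized_h), (offset_x, offset_y)


print(fill_and_crop(4000, 3000, 512, 512))
```

For a 4000x3000 input the height lands exactly on 512 and the excess width is cut evenly from both sides.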

The ImageMagick documentation on resize is extremely helpful in detailing the different options.

Cropping whitespace

-trim works by removing edges whose color matches the corner pixels. Compression can sometimes leave interpolated pixel colors, e.g. gray borders in a black-and-white or greyscale image, which survive the trim. An additional crop along the edges will fix this, via -crop, -chop (with gravity specified), or -shave (all around).

user:~$ mogrify -trim *.png       # remove corner colors
user:~$ mogrify -shave 1x1 *.png  # remove single pixel border all-around
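The idea behind trimming can be illustrated on a tiny grid of pixel values. This is a simplified Python sketch, not ImageMagick's implementation: `trim_bounds` is a hypothetical helper that treats only the top-left corner value as background and requires an exact match.

```python
def trim_bounds(pixels):
    """Return (left, top, right, bottom) of the non-background region.

    right/bottom are exclusive; returns None if every pixel is background.
    """
    background = pixels[0][0]
    rows = [y for y, row in enumerate(pixels) if any(p != background for p in row)]
    cols = [x for x in range(len(pixels[0]))
            if any(row[x] != background for row in pixels)]
    if not rows:
        return None
    return cols[0], rows[0], cols[-1] + 1, rows[-1] + 1


image = [
    [255, 255, 255, 255, 255],
    [255,   0,   0, 255, 255],
    [255,   0,   0, 255, 255],
    [255, 255, 255, 255, 255],
]
print(trim_bounds(image))
```

With an exact match, a single off-white pixel (say 250) on the outer edge would stop the trim at that edge, which is the situation the extra -shave pass works around.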

Screenshot video snippets

View EXIF data

identify -format '%[exif:*]' [FILENAME]
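If the listing needs to be consumed by a script, it can be parsed into a dictionary. The sketch below assumes the output is one `exif:Name=value` pair per line; the sample text is illustrative, not captured from a real file, and `parse_exif_listing` is a hypothetical helper.

```python
def parse_exif_listing(text):
    """Turn 'exif:Name=value' lines into a {Name: value} dictionary."""
    tags = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        # Skip blank lines and anything that is not an exif: property.
        if not sep or not key.startswith("exif:"):
            continue
        tags[key[len("exif:"):]] = value.strip()
    return tags


# Illustrative sample only (assumed output shape).
sample = "exif:DateTime=2022:12:07 09:23:06\nexif:Orientation=6\n"
print(parse_exif_listing(sample))
```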

File format conversion

Concatenate PDF

This is a Ghostscript thingy:

gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=combine.pdf -dBATCH 1.pdf 2.pdf
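When the list of input files varies, the same command can be assembled from Python. A minimal sketch, assuming `gs` is on the PATH; the helper name `gs_concat_command` is made up for illustration, and the flags are exactly those of the command above.

```python
import shlex


def gs_concat_command(output_pdf, input_pdfs):
    """Build the Ghostscript argument list that concatenates PDFs in order."""
    inputs = [str(p) for p in input_pdfs]
    if not inputs:
        raise ValueError("need at least one input PDF")
    return [
        "gs", "-dNOPAUSE", "-sDEVICE=pdfwrite",
        f"-sOUTPUTFILE={output_pdf}", "-dBATCH",
        *inputs,
    ]


cmd = gs_concat_command("combine.pdf", ["1.pdf", "2.pdf"])
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it without going through a shell, so file names containing spaces need no extra quoting.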

Convert to PDF

No idea how I took this long to search for an open source solution to convert anything PDF-related.

Rather than editing a PDF file directly (unless the text and everything else, including metadata, must be preserved), it is better to convert it into a lossless image, edit that image, then recompose it into a PDF file. Do this with ImageMagick, which delegates to Ghostscript when handling PDF files:

> magick -quality 100 -density 600 -depth 16 "path/to/input.pdf" "path/to/output.png"
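To get a feel for how large the output becomes at a given -density, recall that PDF page sizes are measured in points (1/72 inch), so the raster size is roughly points * density / 72. A small Python sketch, with a hypothetical helper name `raster_size`; the exact rounding used by the renderer may differ by a pixel.

```python
def raster_size(width_pt, height_pt, density):
    """Approximate pixel dimensions of a PDF page rendered at `density` DPI."""
    # 72 points make one inch; density is pixels per inch.
    return round(width_pt * density / 72), round(height_pt * density / 72)


# An A4 page is about 595 x 842 points.
print(raster_size(595, 842, 600))
```

At 600 DPI a single A4 page is already around 35 megapixels, so lower densities are worth considering for multi-page documents.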

Installers for ImageMagick and Ghostscript can be found on their respective download pages.

A word of caution: Ghostscript has been the target of severe Remote Code Execution (RCE) vulnerabilities, so it is best to keep this piece of software consistently patched.


Hacked together a script that automates image downscaling and conversion:

#!/usr/bin/env python3
# Rename files in "Screenshot from 2022-12-07 09-23-06.png" format to "20221207_092306_" 
# Justin, 2022-12-12, 2023-03-01
 
import datetime as dt
import pathlib
import os
import re
 
PROFILE_TMP = ".profile.icm"
EXIF_TMP = ".exif.tmp"
 
# Process Ubuntu screenshot images
# into 'YYYYMMDD_HHMMSS_.jpg' format
for path in pathlib.Path().glob("Screenshot from *.png"):
 
    # Extract all numbers
    target = "".join([c for c in str(path) if "0" <= c <= "9"])
    target = target[:8] + "_" + target[8:] + "_.png"
 
    # Rename
    path.rename(path.with_name(target).absolute())
 
SUFFIXES = {
    ".jpg": ".jpg",
    ".png": ".png",
    ".gif": ".jpg",
    ".jpeg": ".jpg",
    ".heic": ".jpg",
}
# Shrink other unprocessed camera photos for wiki
# Notably those not in format listed above in for loop
for path in pathlib.Path().glob("*"):
    print(path.name, end=": ")
 
    # Ignore non-image stuff
    if path.suffix.lower() not in SUFFIXES:
        print("non-image")
        continue
 
    # Ignore files that have stuff appended after the timestamp
    if re.search("^[0-9]+_[0-9]+_[A-Za-z0-9]+", path.stem):
        print("parsed, and commented")
        continue
 
    # Ignore files ending in an underscore, i.e. not commented
    if path.stem.endswith("_"):
        print("parsed, but uncommented")
        continue
 
    # Attempt to convert
    print("not parsed - converting...")
 
    # For files that do not come in timestamped format, we use EXIF metadata
    # to retrieve the datetime
    if not re.search("^[0-9]+_[0-9]+$", path.stem):
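        # Quote the file name so that names containing spaces survive the shell
        if os.system(f"identify -format '%[exif:Datetime]' '{path.name}' > {EXIF_TMP}") != 0:
            print("... EXIF read of datetime failed\n")
            continue
        with open(EXIF_TMP) as f:
            date = f.read().strip()
        try:
            # EXIF timestamps look like "2022:12:07 09:23:06"
            target = dt.datetime.strptime(date, "%Y:%m:%d %H:%M:%S").strftime("%Y%m%d_%H%M%S")
        except ValueError:
            print("... EXIF datetime missing or malformed\n")
            continue
        target = path.with_name(target)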
    else:
        target = path
 
    target_suffix = SUFFIXES[path.suffix.lower()]
    target = path.with_stem(target.stem + "_").with_suffix(target_suffix)
 
    # Ignore files which already have converted forms
    is_converting = True
    for _path in pathlib.Path().glob("*"):
        if target.suffix == _path.suffix and _path.stem.startswith(target.stem):
            is_converting = False
            break
    if not is_converting:
        print(f"... converted form '{target.name}' exists\n")
        continue
 
    # https://stackoverflow.com/questions/13646028/how-to-remove-exif-from-a-jpg-without-losing-image-quality/17516878#17516878
    profile_option = ""
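    # Save the ICC color profile (if any) so that it survives -strip
    if os.system(f"convert '{path.name}' {PROFILE_TMP}") == 0:
        profile_option = f"-profile {PROFILE_TMP}"
    # Quote the geometry so the shell does not treat '>' as a redirect
    # (a bare \> inside a Python string is also an invalid escape sequence)
    os.system(f"convert '{path.name}' -strip {profile_option} -resize '1024x1024>' '{target.name}'")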
    path.rename(target.stem[:-1] + path.suffix)
    print("... conversion complete\n")
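The naming rules used by the script can be isolated into small functions, which makes them easier to check without touching any files. The following is a sketch of one possible refactoring, not the script's actual structure: `screenshot_stem`, `exif_stem` and `resize_command` are hypothetical helpers, and the command is built as an argument list intended for `subprocess.run`, which avoids shell quoting altogether.

```python
import datetime as dt
import re

SCREENSHOT_RE = re.compile(
    r"Screenshot from (\d{4})-(\d{2})-(\d{2}) (\d{2})-(\d{2})-(\d{2})\.png"
)


def screenshot_stem(name):
    """'Screenshot from 2022-12-07 09-23-06.png' -> '20221207_092306_' (else None)."""
    match = SCREENSHOT_RE.fullmatch(name)
    if match is None:
        return None
    year, month, day, hour, minute, second = match.groups()
    return f"{year}{month}{day}_{hour}{minute}{second}_"


def exif_stem(exif_datetime):
    """'2022:12:07 09:23:06' -> '20221207_092306_'; raises ValueError if malformed."""
    parsed = dt.datetime.strptime(exif_datetime.strip(), "%Y:%m:%d %H:%M:%S")
    return parsed.strftime("%Y%m%d_%H%M%S_")


def resize_command(source, target, profile=None):
    """Argument list for the strip + downscale step (no shell involved)."""
    cmd = ["convert", source, "-strip"]
    if profile:
        cmd += ["-profile", profile]
    return cmd + ["-resize", "1024x1024>", target]


print(screenshot_stem("Screenshot from 2022-12-07 09-23-06.png"))
print(resize_command("IMG 0001.jpg", "20221207_092306_.jpg"))
```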

If conversion from PDF is blocked by the ImageMagick security policy, temporarily lift the restriction by commenting out the relevant Ghostscript policy lines (e.g. the coder entry for PDF) in /etc/ImageMagick-6/policy.xml. Make sure to restore them afterwards to avoid an unintended security vulnerability :)
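To check which formats are currently blocked before editing anything, the policy file can be inspected from Python. The sketch below runs on an illustrative sample string rather than the real file; the sample entries are assumptions about what a typical policy.xml contains, and `blocked_coders` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only; a real policy.xml has more entries.
SAMPLE_POLICY = """<policymap>
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="coder" rights="none" pattern="PS"/>
  <policy domain="coder" rights="none" pattern="PDF"/>
</policymap>"""


def blocked_coders(policy_xml):
    """List the coder patterns whose rights are set to 'none'."""
    root = ET.fromstring(policy_xml)
    return [
        policy.get("pattern")
        for policy in root.iter("policy")
        if policy.get("domain") == "coder" and policy.get("rights") == "none"
    ]


print(blocked_coders(SAMPLE_POLICY))
```

Running the same function before and after editing the real file gives a quick confirmation that the restriction was restored.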