Syncing iCal/ICS feeds into Discourse topics (simple Python script, cron-friendly)

Suggested changes to ICS → Discourse sync script

Here are my proposed changes to harden the script for production use.
Each change shows the new code first, followed by the old code for comparison where applicable.


Change 1 — Use JSON "tags" (not "tags[]") when creating topics

Why: When sending JSON to /posts.json, Discourse expects tags as an array. tags[] is for form-encoded payloads.

New

def create_topic(s, title, raw, category_id, tags):
    payload = {
        "title": title,
        "raw": raw,
        "category": int(category_id) if category_id else None,
        "tags": tags or []   # JSON array key
    }
    r = s.post(f"{BASE}/posts.json", json=payload, timeout=30)
    r.raise_for_status()
    data = r.json()
    return data["topic_id"], data["id"]
Old
def create_topic(s, title, raw, category_id, tags):
    payload = {
        "title": title,
        "raw": raw,
        "category": int(category_id) if category_id else None,
        "tags[]": tags or []
    }
    r = s.post(f"{BASE}/posts.json", json=payload)
    r.raise_for_status()
    data = r.json()
    return data["topic_id"], data["id"]
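
For contrast, if you ever send a form-encoded payload instead of JSON, the bracketed key is the one Discourse expects. A minimal sketch, reusing the config names above (it deliberately avoids the JSON session, whose preset Content-Type header would override the form encoding):

import requests

# Hypothetical form-encoded variant, for contrast with the JSON payload above.
r = requests.post(
    f"{BASE}/posts.json",
    headers={"Api-Key": API_KEY, "Api-Username": API_USER},
    data={"title": title, "raw": raw, "tags[]": tags},  # encodes repeated tags[]=... pairs
    timeout=30,
)
r.raise_for_status()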

Change 2 — Robust HTTP with retries & timeouts (+ use for ICS fetch)

Why: Cron runs shouldn’t fail on transient 429/502/503 or slow endpoints.

New

from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

DEFAULT_TIMEOUT = 30

class _TimeoutSession(requests.Session):
    def request(self, *args, **kwargs):
        kwargs.setdefault("timeout", DEFAULT_TIMEOUT)
        return super().request(*args, **kwargs)

def _session():
    s = _TimeoutSession()
    s.headers.update({
        "Api-Key": API_KEY,
        "Api-Username": API_USER,
        "Content-Type": "application/json"
    })
    retry = Retry(
        total=5, backoff_factor=0.5,
        status_forcelist=(429, 500, 502, 503, 504),
        allowed_methods=frozenset(["GET", "POST", "PUT"])
    )
    s.mount("https://", HTTPAdapter(max_retries=retry))
    s.mount("http://", HTTPAdapter(max_retries=retry))
    return s

Replace urllib ICS fetch with session:

s = _session()
if args.ics_url:
    log.info(f"Fetching ICS: {args.ics_url}")
    resp = s.get(args.ics_url)
    resp.raise_for_status()
    data = resp.content
else:
    with open(args.ics_file, "rb") as f:
        data = f.read()

cal = Calendar.from_ical(data)
Old
if args.ics_url:
    import urllib.request
    log.info(f"Fetching ICS: {args.ics_url}")
    with urllib.request.urlopen(args.ics_url) as resp:
        data = resp.read()
else:
    with open(args.ics_file, "rb") as f:
        data = f.read()

cal = Calendar.from_ical(data)
s = _session()

Change 3 — Namespaced UID tags with feed + UID hashes (collision-safe)

Why: Different ICS feeds may reuse the same UID values. Namespacing by feed plus UID hash prevents collisions.
Also enforces a tag length limit (default 30, override with DISCOURSE_TAG_MAX_LEN env).

New

# Tag length cap: defaults to 30, override with DISCOURSE_TAG_MAX_LEN env
TAG_MAX_LEN = max(int(os.environ.get("DISCOURSE_TAG_MAX_LEN", "30")), 15)

def _short_hash(s: str, n: int = 8) -> str:
    return hashlib.sha1((s or "").encode("utf-8")).hexdigest()[:n]

def _feed_namespace(args) -> str:
    if getattr(args, "feed_id", None):
        return _short_hash(args.feed_id)
    if getattr(args, "ics_url", None):
        return _short_hash(args.ics_url)
    if getattr(args, "ics_file", None):
        return _short_hash(args.ics_file)
    return _short_hash("default-ics-namespace")

def _uid_tag(feed_ns: str, uid: str, max_len: int = None) -> str:
    max_len = max_len or TAG_MAX_LEN
    base = _sanitize_tag(f"ics-{feed_ns}-uid")
    uid8 = _short_hash(uid)
    tag = f"{base}-{uid8}"
    if len(tag) <= max_len:
        return tag
    overflow = len(tag) - max_len
    base_trim = max(len(base) - overflow, 8)
    tag = f"{base[:base_trim].rstrip('-')}-{uid8}"
    return tag[:max_len].rstrip("-")

# inside process_vevent
feed_ns = _feed_namespace(args)
uid_tag = _uid_tag(feed_ns, uid, TAG_MAX_LEN)
Old
uid_tag = _sanitize_tag(f"uid-{uid}")
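
For illustration (hash values invented): with TAG_MAX_LEN=30 and a derived feed hash of 1a2b3c4d, _uid_tag returns something like ics-1a2b3c4d-uid-9f8e7d6c (25 characters), safely under the limit no matter how long the original UID is.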

Change 4 — Skip unsupported recurrence masters

Why: If the ICS has RRULE but no expanded instances, importing the master is misleading.

New


def _is_recurrence_master(vevent):
    return bool(vevent.get('rrule')) and not vevent.get('recurrence-id')

# inside process_vevent, after UID check
if _is_recurrence_master(vevent):
    log.info(f"Skipping RRULE master (no expansion implemented) UID={uid}")
    return

Change 5 — Deterministic tag order

Why: Avoid churn when Discourse reorders tags.

New

extra_tags = [t for t in (args.tags or []) if t]
tags = sorted(dict.fromkeys(DEFAULT_TAGS + extra_tags + [uid_tag]))
Old
extra_tags = [t for t in (args.tags or []) if t]
tags = list(dict.fromkeys(DEFAULT_TAGS + extra_tags + [uid_tag]))

Change 6 — Enforce tag length for user/default tags

Why: Prevent API errors if any human-provided or default tag is too long. Overlong tags are skipped with a warning.

New (inside process_vevent, when building tags)

extra_tags_raw = [t for t in (args.tags or []) if t]
extra_tags = []
for t in extra_tags_raw:
    st = _sanitize_tag(t)
    if len(st) > TAG_MAX_LEN:
        log.warning(f"Skipping overlong tag (> {TAG_MAX_LEN}): {st}")
        continue
    extra_tags.append(st)

default_tags_sane = []
for t in DEFAULT_TAGS:
    st = _sanitize_tag(t)
    if len(st) > TAG_MAX_LEN:
        log.warning(f"Skipping overlong default tag (> {TAG_MAX_LEN}): {st}")
        continue
    default_tags_sane.append(st)

tags = sorted(dict.fromkeys(default_tags_sane + extra_tags + [uid_tag]))

(No “old” here — this is a new safeguard.)


:white_check_mark: With these changes:
• Each feed+UID has a unique tag, always within length limits.
• Transient failures on flaky networks are retried automatically.
• No collisions across different ICS sources.
• Human-provided tags are respected but skipped if invalid.
• Safe to run in cron without churn or surprises.

1 Like

Just to update on my own setup:

My network at IONOS is not flaky, so I won’t be needing Change 2 (the retry/backoff logic).
The rest of the changes are still useful in my case.

Here’s the script I will be using (post #20 with changes 1, 3, 4 and 5 from #21 applied):

ics2disc.py
#!/usr/bin/env python3
# Sync ICS -> Discourse topics (create/update by UID)
# Preserves human-edited titles; never moves categories on update.
# Requirements: requests, python-dateutil, icalendar
import os, sys, argparse, re, logging, hashlib
from datetime import datetime, date, timedelta
from dateutil.tz import gettz
from icalendar import Calendar
from urllib.parse import urlparse
import requests

log = logging.getLogger("ics2disc")
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

# --- Config from environment ---
BASE = os.environ.get("DISCOURSE_BASE_URL", "").rstrip("/")
API_KEY = os.environ.get("DISCOURSE_API_KEY")
API_USER = os.environ.get("DISCOURSE_API_USERNAME", "system")
CATEGORY_ID = os.environ.get("DISCOURSE_CATEGORY_ID")  # numeric (string ok) - used on CREATE only
DEFAULT_TAGS = [t for t in os.environ.get("DISCOURSE_DEFAULT_TAGS", "").split(",") if t]
SITE_TZ = os.environ.get("SITE_TZ", "Europe/London")

# Prefer Meta-style env name; fall back to TAG_MAX_LEN; default 30.
TAG_MAX_LEN = int(os.environ.get("DISCOURSE_TAG_MAX_LEN", os.environ.get("TAG_MAX_LEN", "30")))

# --- HTTP helpers (Discourse API) ---
def _session():
    s = requests.Session()
    s.headers.update({
        "Api-Key": API_KEY,
        "Api-Username": API_USER,
        "Content-Type": "application/json"
    })
    return s

# --- Tag helpers (namespace + safe names + length handling) ---
_TAG_SAFE_RE = re.compile(r"[^a-z0-9\-]+")
_TAG_DASHES_RE = re.compile(r"-{2,}")

def _short_hash(text, n=10):
    return hashlib.sha1((text or "").encode("utf-8")).hexdigest()[:n]

def _sanitize_tag_base(s: str) -> str:
    """Lowercase, replace invalid chars with '-', squeeze dashes, trim."""
    s = (s or "").strip().lower()
    s = _TAG_SAFE_RE.sub("-", s)
    s = _TAG_DASHES_RE.sub("-", s).strip("-")
    return s or "event"

def _enforce_len_or_truncate(tag: str) -> str:
    """Truncate *safely* (used only for our internal UID tag)."""
    if len(tag) <= TAG_MAX_LEN:
        return tag
    # leave room for suffix "-hXXXXXXXXXX" (12 chars with dash + 10 hex)
    suffix = "-h" + _short_hash(tag, 10)
    keep = max(1, TAG_MAX_LEN - len(suffix))
    return (tag[:keep].rstrip("-") + suffix)[:TAG_MAX_LEN]

def _sanitize_user_tag_or_skip(tag: str):
    """
    For human/default tags: sanitize, then if still too long -> skip with warning.
    This matches the "don't silently mutate user tags" guidance.
    """
    st = _sanitize_tag_base(tag)
    if len(st) > TAG_MAX_LEN:
        log.warning(f"Skipping overlong tag (> {TAG_MAX_LEN}): {st}")
        return None
    return st

def _sanitize_tag_list_user(tags):
    out = []
    for t in (tags or []):
        st = _sanitize_user_tag_or_skip(t)
        if st:
            out.append(st)
    return out

def _derive_namespace(args, ics_source_kind, ics_source_value) -> str:
    """
    Namespace priority:
      1) --namespace (CLI)
      2) ICS_NAMESPACE (env)
      3) Derived from URL host+tail or local filename stem
    """
    if getattr(args, "namespace", None):
        return _enforce_len_or_truncate(_sanitize_tag_base(args.namespace))
    env_ns = os.environ.get("ICS_NAMESPACE")
    if env_ns:
        return _enforce_len_or_truncate(_sanitize_tag_base(env_ns))

    if ics_source_kind == "url":
        u = urlparse(ics_source_value)
        host = (u.netloc or "ics").replace(".", "-")
        path_bits = [p for p in (u.path or "").split("/") if p]
        tail = path_bits[-1] if path_bits else "feed"
        base = f"{host}-{tail}"
        return _enforce_len_or_truncate(_sanitize_tag_base(base))
    else:
        fname = os.path.basename(ics_source_value)
        stem = os.path.splitext(fname)[0] or "ics"
        return _enforce_len_or_truncate(_sanitize_tag_base(stem))

def _build_uid_tag(namespace: str, uid: str) -> str:
    # Per-feed namespace + hashed UID; enforce length on the final tag.
    base = f"{namespace}-uid-{_short_hash(uid, 10)}"
    base = _sanitize_tag_base(base)
    return _enforce_len_or_truncate(base)

# --- Time helpers ---
def _as_dt(value, site_tz):
    tz = gettz(site_tz)
    if isinstance(value, date) and not isinstance(value, datetime):
        return datetime(value.year, value.month, value.day, 0, 0, 0, tzinfo=tz)
    if isinstance(value, datetime):
        return value if value.tzinfo is not None else value.replace(tzinfo=tz)
    raise TypeError(f"Unsupported dt value type: {type(value)}")

def _is_all_day(vevent):
    dtstart_prop = vevent.get('dtstart')
    if not dtstart_prop:
        return False
    try:
        if getattr(dtstart_prop, 'params', {}).get('VALUE') == 'DATE':
            return True
    except Exception:
        pass
    val = vevent.decoded('dtstart', None)
    return isinstance(val, date) and not isinstance(val, datetime)

def _fmt_iso_z(dt):
    return dt.astimezone(gettz('UTC')).strftime("%Y-%m-%dT%H:%M:%SZ")

def _is_recurrence_master(vevent):
    # Skip master if it has RRULE but no specific RECURRENCE-ID (no expansion here).
    return bool(vevent.get('rrule')) and not vevent.get('recurrence-id')

# --- Body builder ([event] BBCode) ---
def build_body(vevent, site_tz, rsvp=False):
    title = str(vevent.get('summary', 'Untitled')).strip() or "Untitled"
    desc = str(vevent.get('description', '')).strip()
    url = str(vevent.get('url', '')).strip()
    location = str(vevent.get('location', '')).strip()

    allday = _is_all_day(vevent)
    dtstart_raw = vevent.decoded('dtstart')
    dtend_raw = vevent.decoded('dtend', None)

    start_dt = _as_dt(dtstart_raw, site_tz)
    if dtend_raw is None:
        dtend_raw = (start_dt + (timedelta(days=1) if allday else timedelta(hours=1)))
    end_dt = _as_dt(dtend_raw, site_tz)

    if allday:
        start_attr = start_dt.strftime("%Y-%m-%d")
        if (end_dt - start_dt) >= timedelta(days=1):
            end_attr = (end_dt - timedelta(days=1)).strftime("%Y-%m-%d")
        else:
            end_attr = start_attr
        event_open = f'[event status="{"public" if rsvp else "standalone"}" timezone="{site_tz}" start="{start_attr}" end="{end_attr}"'
    else:
        event_open = f'[event status="{"public" if rsvp else "standalone"}" timezone="{site_tz}" start="{_fmt_iso_z(start_dt)}" end="{_fmt_iso_z(end_dt)}"'
    if location:
        event_open += f' location="{location}"'
    if url:
        event_open += f' url="{url}"'
    event_open += ' minimal="true"]'

    lines = [event_open, title, '[/event]']
    if desc:
        lines += ["", "---", "", desc]
    body = "\n".join(lines).strip()
    return title, body

# --- Marker to preserve human title edits ---
MARKER_RE = re.compile(r'<!--\s*ics-sync:title="(.*?)"\s*-->')

def add_marker(body, auto_title):
    marker = f'\n\n<!-- ics-sync:title="{auto_title}" -->'
    return (body + marker).strip()

def strip_marker(text):
    return MARKER_RE.sub("", text or "").strip()

def extract_marker_title(text):
    m = MARKER_RE.search(text or "")
    return m.group(1) if m else None

# --- Discourse API helpers ---
def find_topic_by_uid_tag(s, uid_tag):
    r = s.get(f"{BASE}/tags/{uid_tag}.json")
    if r.status_code == 404:
        return None
    r.raise_for_status()
    data = r.json()
    topics = data.get("topic_list", {}).get("topics", [])
    if not topics:
        return None
    return topics[0]["id"]

def read_topic(s, topic_id):
    r = s.get(f"{BASE}/t/{topic_id}.json")
    r.raise_for_status()
    return r.json()

def create_topic(s, title, raw, category_id, tags):
    payload = {
        "title": title,
        "raw": raw,
        "category": int(category_id) if category_id else None,
        "tags": tags or []
    }
    r = s.post(f"{BASE}/posts.json", json=payload, timeout=30)
    r.raise_for_status()
    data = r.json()
    return data["topic_id"], data["id"]

def update_topic_title_tags(s, topic_id, title=None, tags=None):
    payload = {}
    if title is not None:
        payload["title"] = title
    if tags is not None:
        payload["tags"] = tags
    if not payload:
        return
    r = s.put(f"{BASE}/t/-/{topic_id}.json", json=payload)
    r.raise_for_status()

def update_first_post(s, post_id, new_raw, reason="ICS sync update"):
    r = s.put(f"{BASE}/posts/{post_id}.json", json={"raw": new_raw, "edit_reason": reason})
    r.raise_for_status()

# --- Per-event processing ---
def process_vevent(s, vevent, args, feed_namespace):
    uid = str(vevent.get('uid', '')).strip()
    if not uid:
        log.warning("Skipping event without UID")
        return

    if _is_recurrence_master(vevent):
        log.info(f"Skipping RRULE master (no expansion) UID={uid}")
        return

    uid_tag = _build_uid_tag(feed_namespace, uid)

    # Human/default tags: sanitize and SKIP if too long; then add UID tag.
    extra_tags = _sanitize_tag_list_user(args.tags or [])
    default_tags = _sanitize_tag_list_user(DEFAULT_TAGS or [])
    tags = default_tags + extra_tags + [uid_tag]

    # De-dupe and sort for deterministic order
    tags = sorted(set(tags))

    if args.future_only:
        now = datetime.now(gettz(SITE_TZ))
        dtstart = _as_dt(vevent.decoded('dtstart'), SITE_TZ)
        if dtstart < now - timedelta(hours=1):
            return

    auto_title, fresh_body_no_marker = build_body(vevent, SITE_TZ, rsvp=args.rsvp)
    fresh_body = add_marker(fresh_body_no_marker, auto_title)

    topic_id = find_topic_by_uid_tag(s, uid_tag)
    if topic_id is None:
        if args.dry_run:
            log.info(f"[DRY] CREATE: {auto_title}  tags={tags}")
            return
        log.info(f"Creating new topic for UID {uid} …")
        created_topic_id, first_post_id = create_topic(s, auto_title, fresh_body, CATEGORY_ID, tags)
        log.info(f"Created topic #{created_topic_id}")
        return

    topic = read_topic(s, topic_id)
    first_post = topic["post_stream"]["posts"][0]
    first_post_id = first_post["id"]
    old_raw = first_post["raw"]
    old_title_visible = topic["title"]
    old_marker_title = extract_marker_title(old_raw)

    old_raw_stripped = strip_marker(old_raw)
    need_post_update = (old_raw_stripped.strip() != fresh_body_no_marker.strip())

    can_update_title = (old_marker_title is not None and old_title_visible.strip() == old_marker_title.strip())
    need_title_update = (can_update_title and old_title_visible.strip() != auto_title.strip())

    old_tags = topic.get("tags", [])
    need_tags_update = (sorted(old_tags) != sorted(tags))

    if not (need_post_update or need_title_update or need_tags_update):
        log.info(f"No changes for UID {uid} (topic #{topic_id})")
        return

    if args.dry_run:
        what = []
        if need_post_update: what.append("post")
        if need_title_update: what.append("title")
        if need_tags_update: what.append("tags")
        log.info(f"[DRY] UPDATE ({', '.join(what)}): topic #{topic_id} -> {auto_title} tags={tags}")
        return

    log.info(f"Updating topic #{topic_id} for UID {uid} …")
    if need_post_update:
        update_first_post(s, first_post_id, fresh_body, reason="ICS sync update")
    if need_title_update or need_tags_update:
        update_topic_title_tags(
            s, topic_id,
            title=(auto_title if need_title_update else None),
            tags=(tags if need_tags_update else None)
        )
    log.info(f"Updated topic #{topic_id}")

# --- Main (category only used at CREATE, never on update) ---
def main():
    ap = argparse.ArgumentParser(
        description="Sync ICS feed into Discourse topics (create/update by UID)."
    )
    ap.add_argument("--ics-url", help="URL to ICS feed")
    ap.add_argument("--ics-file", help="Path to local .ics")
    ap.add_argument("--future-only", action="store_true", help="Only import future events")
    ap.add_argument("--rsvp", action="store_true", help="Use status=\"public\" instead of standalone")
    ap.add_argument("--dry-run", action="store_true", help="Print actions without calling the API")
    ap.add_argument("--skip-errors", action="store_true", help="Continue on event errors")
    ap.add_argument("--tags", help="Comma-separated extra tags to add", default="")
    ap.add_argument("--namespace", help="Namespace for UID tags (defaults to derived from feed URL or filename)")
    args = ap.parse_args()
    args.tags = [t.strip() for t in (args.tags.split(",") if args.tags else []) if t.strip()]

    for var in ("DISCOURSE_BASE_URL", "DISCOURSE_API_KEY", "DISCOURSE_API_USERNAME"):
        if not os.environ.get(var):
            log.error(f"Missing env: {var}")
            sys.exit(1)

    if not args.ics_url and not args.ics_file:
        log.error("Provide --ics-url or --ics-file")
        sys.exit(1)

    # Determine source and derive namespace accordingly
    if args.ics_url:
        ics_kind = "url"
        ics_value = args.ics_url
        feed_namespace = _derive_namespace(args, ics_kind, ics_value)
        # Simple urllib fetch (no retries), as requested
        import urllib.request
        log.info(f"Fetching ICS: {args.ics_url}")
        req = urllib.request.Request(args.ics_url, headers={"User-Agent": "ics2disc/1.0"})
        with urllib.request.urlopen(req, timeout=30) as resp:
            data = resp.read()
    else:
        ics_kind = "file"
        ics_value = args.ics_file
        feed_namespace = _derive_namespace(args, ics_kind, ics_value)
        with open(args.ics_file, "rb") as f:
            data = f.read()

    log.info(f"Using namespace: {feed_namespace}")
    cal = Calendar.from_ical(data)
    s = _session()
    for comp in cal.walk("VEVENT"):
        try:
            process_vevent(s, comp, args, feed_namespace)
        except Exception as e:
            if args.skip_errors:
                log.error(f"Error on event UID={comp.get('uid')}: {e}")
                continue
            raise

if __name__ == "__main__":
    main()

And here’s how I’ll run it every hour with cron:

0 * * * * /usr/bin/python3 /srv/ics2disc.py --ics-file /srv/calendar.ics --future-only

Note: --future-only is optional — it just avoids syncing past events.

Common cron schedules

Expression        Meaning
*/15 * * * *      Every 15 minutes
0 * * * *         Every hour on the hour
0 6 * * *         Once daily at 06:00
0 0 * * 0         Once a week, midnight Sunday
2 Likes

Yes, you can use the Discourse API to create a UID tag (or any tag) with up to 30 characters—as long as you first increase the max_tag_length site setting to 30 in your site’s admin settings.

By default, Discourse limits tag length to 20 characters. If you try to create a tag longer than that (whether via the API or the UI), it will be truncated or rejected to fit your current max_tag_length setting.

How to support tags up to 30 characters:

  1. Go to your site’s admin panel
  2. Search for max_tag_length in site settings
  3. Set it to 30 and save

After updating this, new tags created through the API or your scripts can be up to 30 characters.
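
If you mirror the setting in your script, a small client-side guard can fail fast before the API rejects a tag. A minimal sketch (TAG_MAX_LEN is assumed to equal your max_tag_length site setting):

TAG_MAX_LEN = 30  # keep equal to the max_tag_length site setting

def check_tag(tag: str) -> str:
    # Fail fast locally instead of letting Discourse reject or truncate the tag.
    if len(tag) > TAG_MAX_LEN:
        raise ValueError(f"tag {tag!r} is {len(tag)} chars; site limit is {TAG_MAX_LEN}")
    return tag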

1 Like

Extra environment variable:

I’ve added a namespacing prefix for UID-based tags so different feeds can’t collide.

# set this in your shell / systemd unit / cron line
export ICS_NAMESPACE="uon-mycal"

And in the script config (right next to the other env reads):

ICS_NAMESPACE = os.environ.get("ICS_NAMESPACE", "ics")  # e.g. "uon-mycal"

Tags are then generated like:

# example shape: "<namespace>-<short_uid>"
tag = f"{ICS_NAMESPACE}-{short_uid}".lower()
Why the length limit matters (max tag length = 30)

If your site’s max tag length is 30, the namespace must leave room for the hyphen and the UID suffix.

Let:
• Lmax = 30 (your tag length limit),
• S = len(short_uid),
• 1 char for the hyphen.

Then the maximum namespace length is:

len(ICS_NAMESPACE) ≤ Lmax − 1 − S

Examples:
• If short_uid is 12 chars → len(namespace) ≤ 17 (30 − 1 − 12).
• If short_uid is 10 chars → len(namespace) ≤ 19.
• If short_uid is 16 chars → len(namespace) ≤ 13.

To be safe, pick a short namespace like uon-mycal (8–10 chars).

Optional safeguard in code (enforce the limit):

TAG_MAX_LEN = 30  # match your site setting
S = len(short_uid)
max_ns = TAG_MAX_LEN - 1 - S
ns = ICS_NAMESPACE[:max(0, max_ns)]
tag = f"{ns}-{short_uid}".lower()[:TAG_MAX_LEN]

That guarantees every generated tag respects the 30-char site limit even if someone sets a very long ICS_NAMESPACE.


You can run multiple schedules of the same script, each with its own environment variables (e.g., ICS_NAMESPACE, CATEGORY_ID, DEFAULT_TAGS, feed URL, etc.). Here are reliable patterns:

Cron (simple & effective)

Edit your crontab (crontab -e) and define one entry per feed. Use inline env vars and a lock so runs don’t overlap. (The entries below are wrapped with backslashes for readability; each crontab entry must be a single line, as cron does not support line continuations.)

# m h dom mon dow  command

# UoN MyCal – every 15 min
*/15 * * * * ICS_NAMESPACE="uon-mycal" \
  DISCOURSE_BASE_URL="https://forum.example.com" \
  DISCOURSE_API_USERNAME="system" \
  DISCOURSE_API_KEY="****" \
  CATEGORY_ID="42" \
  DEFAULT_TAGS="timetable,nottingham" \
  /usr/bin/flock -n /tmp/ics2disc-uon-mycal.lock \
  /usr/local/bin/ics2disc.py --ics-url "https://mycal.nottingham.ac.uk/ics.ics" >> /var/log/ics2disc-uon-mycal.log 2>&1

# Department feed – hourly at :07
7 * * * * ICS_NAMESPACE="uon-cs" \
  DISCOURSE_BASE_URL="https://forum.example.com" \
  DISCOURSE_API_USERNAME="system" \
  DISCOURSE_API_KEY="****" \
  CATEGORY_ID="88" \
  DEFAULT_TAGS="cs,events" \
  /usr/bin/flock -n /tmp/ics2disc-uon-cs.lock \
  /usr/local/bin/ics2disc.py --ics-url "https://dept.nottingham.ac.uk/cs.ics" >> /var/log/ics2disc-uon-cs.log 2>&1

Notes:
• flock prevents overlapping runs per feed.
• Keep separate logs per feed for easy debugging.
• Cron inherits your system timezone (Europe/London for you).
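
If flock isn’t available, or you’d rather keep the guard inside the script itself, a minimal in-process lock using Python’s fcntl could look like this (lock path and message are illustrative):

import fcntl, sys

def acquire_lock(path="/tmp/ics2disc.lock"):
    # Non-blocking exclusive lock; bail out if another run already holds it.
    f = open(path, "w")
    try:
        fcntl.flock(f, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        sys.exit("Another ics2disc run is in progress; exiting.")
    return f  # keep the file object alive for the lifetime of the run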

systemd timers (more control, better logs)

Create one unit/timer pair per feed, or use a template.

Option A: One unit per feed

/etc/systemd/system/ics2disc-uon-mycal.service

[Unit]
Description=ICS→Discourse (UoN MyCal)

[Service]
Type=oneshot
Environment=ICS_NAMESPACE=uon-mycal
Environment=DISCOURSE_BASE_URL=https://forum.example.com
Environment=DISCOURSE_API_USERNAME=system
Environment=DISCOURSE_API_KEY=**** 
Environment=CATEGORY_ID=42
Environment=DEFAULT_TAGS=timetable,nottingham
ExecStart=/usr/local/bin/ics2disc.py --ics-url https://mycal.nottingham.ac.uk/ics.ics

/etc/systemd/system/ics2disc-uon-mycal.timer

[Unit]
Description=Run ICS→Discourse (UoN MyCal) every 15 minutes

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target

Repeat for other feeds (change ICS_NAMESPACE, CATEGORY_ID, etc.). Then:

sudo systemctl daemon-reload
sudo systemctl enable --now ics2disc-uon-mycal.timer
sudo systemctl enable --now ics2disc-uon-cs.timer
journalctl -u ics2disc-uon-mycal.service -f

Option B: Template + EnvironmentFile

Great if you’ll have many feeds.

/etc/systemd/system/ics2disc@.service

[Unit]
Description=ICS→Discourse (%i)

[Service]
Type=oneshot
EnvironmentFile=/etc/ics2disc/%i.env
ExecStart=/usr/local/bin/ics2disc.py --ics-url $FEED_URL

Env files:

/etc/ics2disc/uon-mycal.env

ICS_NAMESPACE="uon-mycal"
DISCOURSE_BASE_URL="https://forum.example.com"
DISCOURSE_API_USERNAME="system"
DISCOURSE_API_KEY="****"
CATEGORY_ID="42"
DEFAULT_TAGS="timetable,nottingham"
FEED_URL="https://mycal.nottingham.ac.uk/ics.ics"

Timers:

/etc/systemd/system/ics2disc@uon-mycal.timer

[Unit]
Description=Run ICS→Discourse (uon-mycal) every 15 minutes

[Timer]
OnCalendar=*:0/15
Persistent=true

[Install]
WantedBy=timers.target

enable:

sudo systemctl daemon-reload
sudo systemctl enable --now ics2disc@uon-mycal.timer ics2disc@uon-cs.timer
Bash wrapper (optional)

If you prefer a single cron entry that calls a wrapper which iterates feeds:

/usr/local/bin/ics2disc-run-all.sh

#!/usr/bin/env bash
set -euo pipefail

run_feed () {
  # Capture args in locals first: env-var prefixes are not visible to
  # expansions on the same command line, so "$FEED_URL" there would be empty.
  local ns="$1" cat_id="$2" tags="$3" url="$4"
  ICS_NAMESPACE="$ns" CATEGORY_ID="$cat_id" DEFAULT_TAGS="$tags" \
  DISCOURSE_BASE_URL="https://forum.example.com" \
  DISCOURSE_API_USERNAME="system" \
  DISCOURSE_API_KEY="****" \
  /usr/local/bin/ics2disc.py --ics-url "$url"
}

run_feed "uon-mycal" "42" "timetable,nottingham" "https://mycal.nottingham.ac.uk/ics.ics"
run_feed "uon-cs"    "88" "cs,events"           "https://dept.nottingham.ac.uk/cs.ics"

cron:

*/10 * * * * /usr/bin/flock -n /tmp/ics2disc-all.lock /usr/local/bin/ics2disc-run-all.sh >> /var/log/ics2disc-all.log 2>&1

Practical tips
• Per-feed namespaces: e.g., uon-mycal, uon-cs, uon-maths keep tags distinct.
• Tag length: If your site’s tag max length is 30, ensure your script enforces it so f"{ICS_NAMESPACE}-{short_uid}" never exceeds 30 chars.
• API rate limiting: Stagger timers (e.g., :00, :02, :04…) if you have many feeds.
• Isolation: Separate lockfiles/logs per feed improve reliability and observability.
• Secrets: Prefer EnvironmentFile with correct file permissions over hard-coding keys in unit files or crontab.


Why do this?

Good “why?” :slightly_smiling_face:

I suggested the bash wrapper because it gives you an alternative to managing lots of separate cron lines or systemd timers.

Here’s the trade-off:

  • Without a wrapper:
    You add one cron entry (or one .service/.timer pair) per ICS feed. That’s clean if you only have a couple, but gets messy if you need to juggle 5–10 feeds, because every one has to be separately defined, logged, and locked.

  • With a wrapper:
    You keep one cron job (or one systemd service/timer), and inside the wrapper script you define all the feeds you want. That way:

    • Fewer moving parts in cron/systemd.
    • Centralised logging (one log file).
    • You can loop over feeds, run them in sequence, and guarantee they don’t overlap.
    • Easy to add/remove feeds by editing the wrapper, not cron/systemd.

So it’s just about convenience and maintainability. If you only have, say, uon-mycal + uon-cs, it might be simpler to keep them as two systemd timers. But if you expect many feeds or frequent changes, a wrapper script reduces duplication.

2 Likes

I recorded a run-through of setting up the ICS → Discourse sync on a fresh DigitalOcean droplet.
This reply shares a timeline of what I did, with clickable YouTube timestamps so you can jump straight to the relevant parts of the video:

:play_button: Full video here


Timestamp Description/Command Issued
0:12 DO Droplet image selected as Ubuntu 24.04 (LTS), not 25.04 :slight_smile:
1:33 ufw enabled with my usual configuration
2:47 Discourse main repository with standalone.yml cloned to DO droplet
3:01 Created an A record from namecheap to ipv4 of DO droplet
3:35 A record didn’t propagate in time, so ./discourse-setup connection to the allowed https port fails
4:03 Manually change DISCOURSE_HOSTNAME in app.yml, forgot to uncomment 2 Let’s Encrypt lines
4:28 Changed SMTP settings to dummy values, might not be necessary?
4:55 ./launcher rebuild app starts
9:46 ./launcher rebuild app finishes
10:18 ran rake admin:create because didn’t bother setting up SMTP
12:57 arrived in discourse but without https
13:32 calendar_enabled admin site setting
13:39 added general as a calendar category
14:03 Discourse post event enabled. Next time, change max tag length from 20 to 30 as well
14:15 was made aware that general has category id 4
15:33 apt install -y python3 python3-venv python3-pip curl nano ca-certificates
16:46 mkdir -p /opt/ics_sync
16:58 chown $USER:$USER /opt/ics_sync
17:08 cd /opt/ics_sync
17:19 python3 -m venv venv
17:35 source venv/bin/activate
17:47 pip install --upgrade pip
17:59 pip install requests python-dateutil icalendar
18:33 cat > /opt/ics_sync/.env <<'EOF'
18:51 export DISCOURSE_BASE_URL="https://your.forum.url"
19:18 export DISCOURSE_API_KEY="YOUR_DISCOURSE_API_KEY"
19:27 setup API key for system user
19:34 make it granular, should have also ticked “tags → list”
20:38 ran command with relevant API key
20:43 export DISCOURSE_API_USERNAME="system"
20:51 export ICS_SOURCE="https://example.com/feed.ics"
21:21 export DISCOURSE_CATEGORY_ID=4
21:23 export SITE_TZ="Europe/London"
21:27 export DEFAULT_TAGS="events,ics"
(missing) export ICS_NAMESPACE="uon-mycal"; however, the script did not yet read this variable at that point
21:38 EOF
22:00 cd /opt/ics_sync
22:07 set -a
22:12 source .env
22:17 set +a
22:25 nano /opt/ics_sync/ics_to_discourse.py
22:47 chmod +x /opt/ics_sync/ics_to_discourse.py
curl -sS -X POST "$DISCOURSE_BASE_URL/posts.json" \
  -H "Api-Key: $DISCOURSE_API_KEY" \
  -H "Api-Username: $DISCOURSE_API_USERNAME" \
  -F "title=API test $(date +%s)" \
  -F 'raw=[event start="2025-10-10T10:00:00Z" end="2025-10-10T11:00:00Z" timezone="Europe/London"]Test[/event]' \
  -F "category=$DISCOURSE_CATEGORY_ID"

If https is working on your Discourse, the above API test works fine.


Timestamp Description/Command Issued
24:19 crontab -e
25:00 selected editor option 1 (nano) and added the cron job
26:35 check the status of cron in systemctl
27:18 Destroy DO Droplet

Conclusion:
This exercise shows that the script works end-to-end with a clean Ubuntu 24.04 droplet.
I still need to add support for ICS_NAMESPACE in the script (to avoid tag collisions across feeds), but otherwise the setup went smoothly.

Big thanks to everyone who contributed improvements on this thread — hopefully the timestamps + video help others get it running more quickly.

3 Likes

Doesn’t look like it. You have support already…

2 Likes

Hi Cathyy — thanks for your careful read! :raising_hands:
You were right that ICS_NAMESPACE support was already there. Since the last public ics2disc.py in this topic, here’s what has changed:

  • Namespace handling: Clear priority order: --namespace → ICS_NAMESPACE → derived from URL host/path or file stem.
  • UID tag generation: Per-feed namespace + short UID hash, with enforced tag length and hashed suffix if needed.
  • Tag safety: Human/default tags are sanitised; if still over length they are skipped (not mutated). UID tags are truncated with hash.
  • Deterministic tags: De-duped and sorted to avoid churn.
  • Topic lookup: First try /tag/{uid_tag}.json, fall back to search.json.
  • First post fetch: Safe retrieval with include_raw=1, fallback to /posts/{id}.json.
  • Title preservation: Auto title stored in an HTML comment marker. The script only updates the title if the visible title still matches that marker (see the sketch after this list).
  • Event body builder: Better [event] BBCode — handles all-day vs timed correctly, includes timezone, location, url, minimal="true", RSVP mode (--rsvp).
  • Future-only import: --future-only skips past events, with ~1h grace.
  • Recurrence masters: Skips unexpanded RRULE masters.
  • Create/update hardening: Proper JSON tags key, body padding to clear min-post length, error logging, separate updates for body vs title/tags, dry-run supported.
  • Category handling: Category is used only at create; never changed on update.
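
A sketch of that title guard, using helper names from the earlier script in this topic (topic_title stands for the topic’s current visible title; error handling omitted):

# Only refresh the visible title if a human hasn't edited it since the last sync.
old_marker_title = extract_marker_title(old_raw)   # the title the sync last wrote
human_edited = (old_marker_title is None
                or topic_title.strip() != old_marker_title.strip())
if not human_edited and topic_title.strip() != auto_title.strip():
    update_topic_title_tags(s, topic_id, title=auto_title)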

Current script

#!/usr/bin/env python3
# Sync ICS -> Discourse topics (create/update by UID)
# Preserves human-edited titles; never moves categories on update.
# Requirements: requests, python-dateutil, icalendar
import os, sys, argparse, re, logging, hashlib
from datetime import datetime, date, timedelta
from dateutil.tz import gettz
from icalendar import Calendar
from urllib.parse import urlparse
import requests

log = logging.getLogger("ics2disc")
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

# --- Config from environment ---
BASE = os.environ.get("DISCOURSE_BASE_URL", "").rstrip("/")
API_KEY = os.environ.get("DISCOURSE_API_KEY")
API_USER = os.environ.get("DISCOURSE_API_USERNAME")
CATEGORY_ID = int(os.environ.get("DISCOURSE_CATEGORY_ID", "1"))
DEFAULT_TAGS = [t.strip() for t in os.environ.get("DEFAULT_TAGS", "").split(",") if t.strip()]
SITE_TZ = gettz(os.environ.get("SITE_TZ", "UTC"))
ICS_NAMESPACE = os.environ.get("ICS_NAMESPACE", "ics")

if not BASE or not API_KEY or not API_USER:
    missing = [k for k,v in [
        ("DISCOURSE_BASE_URL", BASE),
        ("DISCOURSE_API_KEY", API_KEY),
        ("DISCOURSE_API_USERNAME", API_USER)
    ] if not v]
    sys.exit(f"ERROR: Missing env: {', '.join(missing)}")

# --- Helpers ---
def _session():
    s = requests.Session()
    s.headers.update({
        "Api-Key": API_KEY,
        "Api-Username": API_USER,
        "Content-Type": "application/json"
    })
    return s

def _as_dt(v, tz):
    # accepts datetime/date/ical date or datetime
    if isinstance(v, datetime):
        if v.tzinfo:
            return v.astimezone(tz)
        return v.replace(tzinfo=tz)
    if isinstance(v, date):
        return datetime(v.year, v.month, v.day, tzinfo=tz)
    try:
        # icalendar may return date/datetime
        if hasattr(v, "dt"):
            return _as_dt(v.dt, tz)
    except Exception:
        pass
    return None

def human_range(start_dt, end_dt):
    if not start_dt or not end_dt:
        return ""
    same_day = start_dt.date() == end_dt.date()
    if same_day:
        return f"{start_dt.strftime('%a %d %b %Y, %H:%M')} – {end_dt.strftime('%H:%M')} ({start_dt.tzname()})"
    return f"{start_dt.strftime('%a %d %b %Y, %H:%M')} – {end_dt.strftime('%a %d %b %Y, %H:%M')} ({start_dt.tzname()})"

def extract_marker_title(raw):
    m = re.search(r"\[event\]\s*(.+?)\s*\[\/event\]", raw or "", re.I|re.S)
    return m.group(1).strip() if m else None

def build_body(vevent, tz):
    summary = (vevent.get("summary") or "").strip()
    desc = (vevent.get("description") or "").strip()
    loc = (vevent.get("location") or "").strip()
    start_dt = _as_dt(vevent.decoded("dtstart"), tz)
    end_dt = _as_dt(vevent.decoded("dtend"), tz) if vevent.get("dtend") else None
    when = human_range(start_dt, end_dt) if start_dt and end_dt else (start_dt.strftime("%a %d %b %Y, %H:%M %Z") if start_dt else "")
    parts = []
    parts.append(f"[event] {summary} [/event]")
    if when:
        parts.append(f"**When:** {when}")
    if loc:
        parts.append(f"**Where:** {loc}")
    if desc:
        parts.append("")
        parts.append(desc)
    raw = "\n".join(parts).strip()
    return raw, summary, start_dt

def make_uid_tag(namespace, uid):
    # compress UID to a short slug so tags stay within site length limits
    h = hashlib.sha1(uid.encode("utf-8")).hexdigest()[:10]
    # namespace-uid-<hash>
    base = f"{namespace}-uid-{h}"
    return base.lower()

def find_topic_by_uid_tag(s, uid_tag):
    """
    Look up an existing topic by its per-event UID tag.
    Prefer API JSON endpoints (avoid HTML routes).
    Return topic_id (int) or None.
    """
    # 1) Try the tag JSON endpoint (works once the tag exists)
    try:
        r = s.get(f"{BASE}/tag/{uid_tag}.json", timeout=30)
        if r.status_code == 404:
            log.debug("Tag %s not found via /tag JSON (404).", uid_tag)
        elif r.status_code == 403:
            log.debug("Forbidden on /tag JSON for %s (403) — will try search.json.", uid_tag)
        else:
            r.raise_for_status()
            data = r.json() or {}
            topics = ((data.get("topic_list") or {}).get("topics")) or []
            for t in topics:
                if uid_tag in (t.get("tags") or []):
                    log.info("Found existing topic %s via /tag JSON for %s.", t.get("id"), uid_tag)
                    return t.get("id")
    except Exception as e:
        log.debug("Tag JSON lookup failed for %s: %s", uid_tag, e)

    # 2) Fallback: Search API (works even if tag page is restricted)
    try:
        r = s.get(f"{BASE}/search.json", params={"q": f"tag:{uid_tag}"}, timeout=30)
        r.raise_for_status()
        data = r.json() or {}
        topics = data.get("topics") or []
        for t in topics:
            if uid_tag in (t.get("tags") or []):
                log.info("Found existing topic %s via search.json for %s.", t.get("id"), uid_tag)
                return t.get("id")
        log.info("No existing topic found for %s.", uid_tag)
    except Exception as e:
        log.warning("Search API lookup failed for %s: %s", uid_tag, e)

    return None

def get_first_post_raw(s, topic_id):
    """
    Return (first_post_id, raw) by fetching with include_raw=1; fallback to /posts/{id}.json.
    """
    r = s.get(f"{BASE}/t/{topic_id}.json", params={"include_raw": 1}, timeout=30)
    r.raise_for_status()
    data = r.json() or {}
    posts = ((data.get("post_stream") or {}).get("posts")) or []
    if posts:
        fp = posts[0]
        fp_id = fp.get("id")
        raw = fp.get("raw")
        if raw is not None:
            return fp_id, raw
        if fp_id:
            r2 = s.get(f"{BASE}/posts/{fp_id}.json", params={"include_raw": 1}, timeout=30)
            r2.raise_for_status()
            d2 = r2.json() or {}
            if "raw" in d2:
                return fp_id, d2["raw"]
    return None, None

def update_first_post(s, post_id, new_raw, reason=None):
    """
    Update existing post; optional edit_reason for clearer logs.
    """
    payload = {"raw": new_raw}
    if reason:
        payload["edit_reason"] = reason
    r = s.put(f"{BASE}/posts/{post_id}.json", json=payload, timeout=60)
    if r.status_code >= 400:
        log.error("Update post %s failed %s: %s", post_id, r.status_code, r.text)
    r.raise_for_status()
    return r.json()

def make_safe_title(summary, dtstart_dt=None):
    """
    Build a Discourse-friendly title from event summary + start time.
    Collapses repeats, adds time for entropy, enforces some diversity.
    """
    summary = (summary or "").strip()
    summary = re.sub(r'(.)\1{2,}', r'\1\1', summary)  # collapse AAAA -> AA
    when = dtstart_dt.strftime("%a %d %b %Y %H:%M") if dtstart_dt else ""
    title = f"{summary} — {when}".strip(" —")
    alnums = [c.lower() for c in title if c.isalnum()]
    if len(set(alnums)) < 6:
        title = (title + " — event").strip()
    return title[:120]

def create_topic(s, title, raw, category_id, tags, dtstart_dt=None):
    """
    Create a new topic. Pads body to satisfy site min post length.
    Retries once with sanitized title if validator complains.
    Returns (topic_id, first_post_id).
    """
    MIN_BODY = 40
    if raw is None:
        raw = ""
    if len(raw) < MIN_BODY:
        raw = (raw + "\n\n(autogenerated by ics2disc)").ljust(MIN_BODY + 1, " ")

    payload = {"title": title, "raw": raw, "category": category_id}
    if tags:
        payload["tags"] = tags

    r = s.post(f"{BASE}/posts.json", json=payload, timeout=60)
    if r.status_code == 422:
        try:
            j = r.json()
            errs = " ".join(j.get("errors") or [])
        except Exception:
            errs = r.text
        if "Title seems unclear" in errs or "title" in errs.lower():
            safe_title = make_safe_title(title, dtstart_dt)
            if safe_title != title:
                log.warning("Title rejected; retrying with sanitized title: %r", safe_title)
                payload["title"] = safe_title
                r = s.post(f"{BASE}/posts.json", json=payload, timeout=60)

    if r.status_code >= 400:
        log.error("Create failed %s: %s", r.status_code, r.text)
    r.raise_for_status()
    data = r.json()
    return data["topic_id"], data["id"]

def process_vevent(s, vevent, args, namespace):
    uid = str(vevent.get("uid")).strip()
    if not uid:
        log.warning("Skipping VEVENT with no UID")
        return

    fresh_body, summary, start_dt = build_body(vevent, SITE_TZ)

    # per-event tag from UID
    uid_tag = make_uid_tag(namespace, uid)
    tags = list(DEFAULT_TAGS) + [namespace, uid_tag]

    topic_id = find_topic_by_uid_tag(s, uid_tag)
    if topic_id:
        log.info(f"Found existing topic {topic_id} via /tag JSON for {uid_tag}.")
        # Fetch old raw safely
        first_post_id, old_raw = get_first_post_raw(s, topic_id)
        if not first_post_id:
            log.warning("Could not fetch first post raw for topic %s; defaulting to empty.", topic_id)
            old_raw = ""

        old_marker_title = extract_marker_title(old_raw)
        new_marker_title = extract_marker_title(fresh_body)
        # Titles are never rewritten here: only the first post body is updated,
        # so human edits to the visible title are preserved.
        if old_raw.strip() == fresh_body.strip():
            log.info(f"No content change for topic {topic_id}.")
        else:
            log.info(f"Updating topic #{topic_id} for UID {uid} …")
            update_first_post(s, first_post_id, fresh_body, reason="ICS sync update")
            log.info(f"Updated topic #{topic_id}")
    else:
        log.info(f"No existing topic found for {uid_tag}.")
        auto_title = summary or f"Event — {uid[:8]}"
        log.info(f"Creating new topic for UID {uid} …")
        created_topic_id, first_post_id = create_topic(
            s, auto_title, fresh_body, CATEGORY_ID, tags, dtstart_dt=start_dt
        )
        log.info(f"Created topic #{created_topic_id} (post {first_post_id})")

def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--ics-url", help="URL to .ics file")
    ap.add_argument("--ics-file", help="Path to local .ics file")
    ap.add_argument("--namespace", help="Tag namespace (defaults to ICS_NAMESPACE env)")
    ap.add_argument("--skip-errors", action="store_true", help="Continue on event errors")
    args = ap.parse_args()

    feed_namespace = (args.namespace or ICS_NAMESPACE or "ics").strip()
    if not (args.ics_url or args.ics_file):
        sys.exit("ERROR: Provide --ics-url or --ics-file")

    # fetch ICS
    if args.ics_url:
        url = args.ics_url
        log.info(f"Fetching ICS: {url}")
        r = requests.get(url, timeout=60)
        r.raise_for_status()
        data = r.content
    else:
        with open(args.ics_file, "rb") as f:
            data = f.read()

    log.info(f"Using namespace: {feed_namespace}")
    cal = Calendar.from_ical(data)
    s = _session()
    for comp in cal.walk("VEVENT"):
        try:
            process_vevent(s, comp, args, feed_namespace)
        except Exception as e:
            if args.skip_errors:
                log.error(f"Error on event UID={comp.get('uid')}: {e}")
                continue
            raise

if __name__ == "__main__":
    main()

2 Likes

Thanks for publishing the updated script! :tada:

Looking back at your earlier notes in #25, I see two mismatches that are now solved in your version:

  • I mentioned that ICS_NAMESPACE wasn’t supported — but the new script does include it, with a clear priority order (--namespace → ICS_NAMESPACE → derived from feed URL/filename).
  • I also had DEFAULT_TAGS in my .env, but the script actually expects DISCOURSE_DEFAULT_TAGS. That’s an easy fix in the environment file.

The other environment/cron steps are unchanged and still work fine; just remember to pass the feed as --ics-url "$ICS_SOURCE" (or --ics-file), since the script doesn’t implicitly read ICS_SOURCE by itself.

All in all, these are small adjustments, and with them the instructions line up perfectly with the new code. :rocket:

1 Like

Changes in this version of the ICS → Discourse sync script

Overview: What This Script Does and Who Should Use It
  • Purpose: Keep Discourse topics in sync with an external calendar (.ics feed). Each event in your calendar is either created (if new) or updated (if changed), based on its UID.
  • Who should use it: Discourse site admins who want a lightweight, automatable way to integrate and synchronize calendar events from external sources (Google Calendar, Outlook, university/club calendars, etc.).
  • Integration: Works well alongside the discourse-calendar plugin, but does not require it.
  • Deployment: Designed to run via cron or other task scheduler on any machine with Python 3 available.
How to Upgrade or Migrate from Previous Script Versions

To upgrade:

  • Overwrite your existing script file with this new version.
  • If you previously used DEFAULT_TAGS, switch to the new DISCOURSE_DEFAULT_TAGS variable (but the script remains backward compatible).
  • Check if your .env or deployment config uses an integer for DISCOURSE_CATEGORY_ID—the script now accepts both integer and string values.
  • If you want to enforce a tag length, set the TAG_MAX_LEN environment variable.

No breaking changes:
This version remains backwards compatible; the improved all-day event handling and tag-length caps only make the output cleaner and keep tags within site limits.

Here’s what’s new in the latest revision of the script, compared with the earlier one posted above:


  1. Environment variable for default tags
    Old code used DEFAULT_TAGS; new code prefers DISCOURSE_DEFAULT_TAGS but still falls back for compatibility.
# old
DEFAULT_TAGS = [t.strip() for t in os.environ.get("DEFAULT_TAGS", "").split(",") if t.strip()]
# new
_default_tags_env = os.environ.get("DISCOURSE_DEFAULT_TAGS", os.environ.get("DEFAULT_TAGS", ""))
DEFAULT_TAGS = [t.strip() for t in _default_tags_env.split(",") if t.strip()]

  2. Category ID handling

Old code forced int(), which could crash on unset/non-numeric input. New code leaves it as a string, which Discourse accepts.

# old
CATEGORY_ID = int(os.environ.get("DISCOURSE_CATEGORY_ID", "1"))
# new
CATEGORY_ID = os.environ.get("DISCOURSE_CATEGORY_ID", "1")  # keep as string; Discourse accepts string

  3. All-day event detection & formatting

Previously, all-day events were rendered as 00:00. New code detects DATE-style events or midnight–midnight spans and renders them cleanly.

# old (snippet inside build_body / human_range)
start_dt = _as_dt(vevent.decoded("dtstart"), tz)
end_dt = _as_dt(vevent.decoded("dtend"), tz) if vevent.get("dtend") else None
when = human_range(start_dt, end_dt) if start_dt and end_dt else ...
# new
start_dt = _as_dt(vevent.get("dtstart"), tz)
end_dt = _as_dt(vevent.get("dtend"), tz) if vevent.get("dtend") else None
all_day = _is_all_day(vevent, start_dt, end_dt)
when = human_range(start_dt, end_dt, all_day=all_day)
Making Events Render with the Discourse Calendar Plugin

If your site uses the discourse-calendar plugin, the [event] ... [/event] BBCode block will render as an interactive event and also appear in category/global calendars.

Sample output (the screenshot from the original post is omitted):
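
For a timed event, the raw first post produced by build_body looks roughly like this (event details invented for illustration):

[event] Maths Society AGM [/event]
**When:** Wed 15 Oct 2025, 18:00 – 19:30 (BST)
**Where:** Physics Building, Room A1

Talk followed by pizza. All welcome.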


  4. UID tag shortening with configurable max length

Old code always used the full namespace + 10-char hash, risking tag length overflow.

New code trims the namespace so the whole tag fits within the site’s maximum tag length, which is now configurable via an environment variable TAG_MAX_LEN (default 30).

# old
def make_uid_tag(namespace, uid):
    h = hashlib.sha1(uid.encode("utf-8")).hexdigest()[:10]
    base = f"{namespace}-uid-{h}"
    return base.lower()
# new
TAG_MAX_LEN = int(os.environ.get("TAG_MAX_LEN", "30"))  # match your site's max_tag_length (Discourse ships with 20)

def make_uid_tag(namespace, uid, max_len=TAG_MAX_LEN):
    h = hashlib.sha1(uid.encode("utf-8")).hexdigest()[:10]
    ns = _slugify(namespace)
    fixed = f"-uid-{h}"
    room_for_ns = max(1, max_len - len(fixed))
    ns = ns[:room_for_ns]
    return f"{ns}{fixed}"

  5. User-Agent header for API requests

Old code did not identify itself to Discourse. New code adds a lightweight User-Agent string.

# old
s.headers.update({
    "Api-Key": API_KEY,
    "Api-Username": API_USER,
    "Content-Type": "application/json"
})
# new
s.headers.update({
    "Api-Key": API_KEY,
    "Api-Username": API_USER,
    "Content-Type": "application/json",
    "User-Agent": "ics2disc/1.0 (+Discourse ICS sync)"
})

How the Script Knows Whether to Update or Create a Topic
  • Each calendar event (VEVENT) has a unique identifier (UID).
  • The script creates a tag based on this UID and searches for existing topics with that tag.
  • If it finds a topic, it updates the first post only (does not change category, title, or human edits).
  • If no topic exists for that UID, it creates a new topic in your configured category and with your configured tags.
  • The logic is safe to run repeatedly (idempotent)—event topics will always reflect the latest data from your .ics feed.
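
Condensed into code, that decision looks roughly like this (a sketch built from the helper names in the script below; error handling omitted):

def upsert_event(s, vevent, namespace):
    uid = str(vevent.get("uid")).strip()
    uid_tag = make_uid_tag(namespace, uid, TAG_MAX_LEN)
    raw, summary, start_dt = build_body(vevent, SITE_TZ)

    topic_id = find_topic_by_uid_tag(s, uid_tag)
    if topic_id:
        # Existing topic: update the first post only; title and category stay put.
        post_id, old_raw = get_first_post_raw(s, topic_id)
        if post_id and (old_raw or "").strip() != raw.strip():
            update_first_post(s, post_id, raw, reason="ICS sync update")
    else:
        # New event: create a topic in the configured category with all tags.
        tags = DEFAULT_TAGS + [namespace, uid_tag]
        create_topic(s, summary, raw, CATEGORY_ID, tags, dtstart_dt=start_dt)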

Best Practices: Security, API Key, and Permissions
  • Create a dedicated user for syncing (e.g., CalendarBot). Give it only “Create/Reply/See” permissions for the target category.
  • API Key: Go to your site’s /admin/api/keys page and create an API key scoped to your sync bot user. Copy it into your .env.
  • Ensure your bot user can use and create tags in the designated category (adjust tag groups if they are restricted).
  • Never post your API key publicly or commit .env files to public repositories.
Deployment Checklist
  • Python 3.7+ installed.
  • Run pip install requests icalendar python-dateutil.
  • .env file created and filled out as shown above.
  • User API Key created for a bot user with suitable permissions.
  • Test the script using python3 ./ics2disc.py --ics-url ... and check for errors in the terminal.
  • Set up a cron job only after a manual test run is successful.
Minimal Working Example: Environment and Cron

Sample .env contents:

DISCOURSE_BASE_URL=https://forum.example.com
DISCOURSE_API_KEY=your_api_key_goes_here
DISCOURSE_API_USERNAME=system
DISCOURSE_CATEGORY_ID=12
DISCOURSE_DEFAULT_TAGS=calendar,ics
SITE_TZ=Europe/London
TAG_MAX_LEN=30

Install dependencies:

pip install requests icalendar python-dateutil

Cron job to run every 30 minutes:

*/30 * * * * cd /path/to/ics2disc && /usr/bin/env bash -c 'set -a; source .env; set +a; ./ics2disc.py --ics-url "https://example.com/feed.ics"'

#!/usr/bin/env python3
"""
ICS -> Discourse topics (create/update by UID)

- Creates a new topic per VEVENT (keyed by UID tag).
- Updates the first post only (preserves human-edited titles; never moves categories).
- Tags each topic with:
    * your default tags (from env),
    * a namespace tag (slugified/truncated),
    * a per-event UID tag = "<namespace>-uid-<10hex>" trimmed to TAG_MAX_LEN.
- Renders an [event] ... [/event] marker with When/Where/Description.
- Handles all-day events cleanly (no fake "00:00" times).

Environment variables
---------------------
DISCOURSE_BASE_URL       e.g. https://forum.example.com
DISCOURSE_API_KEY        your API key
DISCOURSE_API_USERNAME   api username (e.g. system)
DISCOURSE_CATEGORY_ID    category id (string ok) used on CREATE only
DISCOURSE_DEFAULT_TAGS   comma-separated list (preferred)
DEFAULT_TAGS             fallback env name for legacy setups
SITE_TZ                  IANA tz (e.g. Europe/London) for display; default UTC
ICS_NAMESPACE            namespace prefix for tags; default "ics"
TAG_MAX_LEN              max tag length (site setting); default "30"

CLI
---
--ics-url URL         Fetch .ics from URL
--ics-file PATH       Read .ics from local file
--namespace STR       Override ICS_NAMESPACE for this run
--skip-errors         Continue on event errors instead of aborting
"""
import os, sys, argparse, re, logging, hashlib
from datetime import datetime, date, timedelta
from dateutil.tz import gettz
from icalendar import Calendar
import requests

log = logging.getLogger("ics2disc")
logging.basicConfig(level=logging.INFO, format="%(levelname)s: %(message)s")

# --- Config from environment ---
BASE = os.environ.get("DISCOURSE_BASE_URL", "").rstrip("/")
API_KEY = os.environ.get("DISCOURSE_API_KEY")
API_USER = os.environ.get("DISCOURSE_API_USERNAME")
CATEGORY_ID = os.environ.get("DISCOURSE_CATEGORY_ID", "1")  # keep as string; API accepts string
_default_tags_env = os.environ.get("DISCOURSE_DEFAULT_TAGS", os.environ.get("DEFAULT_TAGS", ""))
DEFAULT_TAGS = [t.strip() for t in _default_tags_env.split(",") if t.strip()]
SITE_TZ = gettz(os.environ.get("SITE_TZ", "UTC"))
ICS_NAMESPACE = os.environ.get("ICS_NAMESPACE", "ics")
TAG_MAX_LEN = int(os.environ.get("TAG_MAX_LEN", "30"))  # match your site's max_tag_length (Discourse ships with 20)

if not BASE or not API_KEY or not API_USER:
    missing = [k for k, v in [
        ("DISCOURSE_BASE_URL", BASE),
        ("DISCOURSE_API_KEY", API_KEY),
        ("DISCOURSE_API_USERNAME", API_USER)
    ] if not v]
    sys.exit(f"ERROR: Missing env: {', '.join(missing)}")

# --- HTTP session ---
def _session():
    s = requests.Session()
    s.headers.update({
        "Api-Key": API_KEY,
        "Api-Username": API_USER,
        "Content-Type": "application/json",
        "User-Agent": "ics2disc/1.0 (+Discourse ICS sync)"
    })
    return s

# --- Time helpers ---
def _as_dt(v, tz):
    """Accepts datetime/date/ical * and returns tz-aware datetime in tz."""
    if isinstance(v, datetime):
        return v.astimezone(tz) if v.tzinfo else v.replace(tzinfo=tz)
    if isinstance(v, date):
        return datetime(v.year, v.month, v.day, tzinfo=tz)
    try:
        if hasattr(v, "dt"):
            return _as_dt(v.dt, tz)
    except Exception:
        pass
    return None

def _is_all_day(vevent, start_dt, end_dt):
    """Treat DTSTART as DATE or midnight-to-midnight(+1) span as all-day."""
    try:
        raw = vevent.get("dtstart")
        if hasattr(raw, "dt") and isinstance(raw.dt, date) and not isinstance(raw.dt, datetime):
            return True
    except Exception:
        pass
    if not start_dt or not end_dt:
        return False
    dur = end_dt - start_dt
    return (start_dt.hour, start_dt.minute, start_dt.second) == (0, 0, 0) and \
           (end_dt.hour, end_dt.minute, end_dt.second) == (0, 0, 0) and \
           dur >= timedelta(days=1)

def human_range(start_dt, end_dt, all_day=False):
    if not start_dt:
        return ""
    tzname = start_dt.tzname() or ""
    if all_day:
        if end_dt:
            last_day = (end_dt - timedelta(days=1)).date()  # DTEND exclusive
            first_day = start_dt.date()
            if first_day == last_day:
                return f"All day {start_dt.strftime('%a %d %b %Y')} ({tzname})"
            return f"{first_day.strftime('%a %d %b %Y')} – {last_day.strftime('%a %d %b %Y')} (all day, {tzname})"
        return f"All day {start_dt.strftime('%a %d %b %Y')} ({tzname})"
    if end_dt:
        if start_dt.date() == end_dt.date():
            return f"{start_dt.strftime('%a %d %b %Y, %H:%M')} – {end_dt.strftime('%H:%M')} ({tzname})"
        return f"{start_dt.strftime('%a %d %b %Y, %H:%M')} – {end_dt.strftime('%a %d %b %Y, %H:%M')} ({tzname})"
    return f"{start_dt.strftime('%a %d %b %Y, %H:%M')} ({tzname})"

# --- Body builder ---
def extract_marker_title(raw):
    m = re.search(r"\[event\]\s*(.+?)\s*\[\/event\]", raw or "", re.I | re.S)
    return m.group(1).strip() if m else None

def build_body(vevent, tz):
    summary = (vevent.get("summary") or "").strip()
    desc = (vevent.get("description") or "").strip()
    loc = (vevent.get("location") or "").strip()
    start_dt = _as_dt(vevent.get("dtstart"), tz)
    end_dt = _as_dt(vevent.get("dtend"), tz) if vevent.get("dtend") else None
    all_day = _is_all_day(vevent, start_dt, end_dt)
    when = human_range(start_dt, end_dt, all_day=all_day)

    parts = [f"[event] {summary} [/event]"]
    if when:
        parts.append(f"**When:** {when}")
    if loc:
        parts.append(f"**Where:** {loc}")
    if desc:
        parts.append("")
        parts.append(desc)

    raw = "\n".join(parts).strip()
    return raw, summary, start_dt

# --- Tag helpers ---
def _slugify(s):
    s = s.lower()
    s = re.sub(r"[^a-z0-9\-]+", "-", s)
    s = re.sub(r"-{2,}", "-", s).strip("-")
    return s

def make_uid_tag(namespace, uid, max_len=30):
    """Build a per-event tag 'ns-uid-<10hex>' trimmed to max_len by shortening namespace only."""
    h = hashlib.sha1(uid.encode("utf-8")).hexdigest()[:10]
    ns = _slugify(namespace)
    fixed = f"-uid-{h}"
    room_for_ns = max(1, max_len - len(fixed))
    ns = ns[:room_for_ns]
    return f"{ns}{fixed}"

# --- Discourse lookups & edits ---
def find_topic_by_uid_tag(s, uid_tag):
    """
    Return topic_id (int) for an existing topic carrying uid_tag, else None.
    Try /tag/{tag}.json first; fall back to /search.json.
    """
    try:
        r = s.get(f"{BASE}/tag/{uid_tag}.json", timeout=30)
        if r.status_code == 404:
            log.debug("Tag %s not found via /tag JSON (404).", uid_tag)
        elif r.status_code == 403:
            log.debug("Forbidden on /tag JSON for %s (403) — will try search.json.", uid_tag)
        else:
            r.raise_for_status()
            data = r.json() or {}
            topics = ((data.get("topic_list") or {}).get("topics")) or []
            for t in topics:
                if uid_tag in (t.get("tags") or []):
                    log.info("Found existing topic %s via /tag JSON for %s.", t.get("id"), uid_tag)
                    return t.get("id")
    except Exception as e:
        log.debug("Tag JSON lookup failed for %s: %s", uid_tag, e)

    try:
        r = s.get(f"{BASE}/search.json", params={"q": f"tag:{uid_tag}"}, timeout=30)
        r.raise_for_status()
        data = r.json() or {}
        topics = data.get("topics") or []
        for t in topics:
            if uid_tag in (t.get("tags") or []):
                log.info("Found existing topic %s via search.json for %s.", t.get("id"), uid_tag)
                return t.get("id")
        log.info("No existing topic found for %s.", uid_tag)
    except Exception as e:
        log.warning("Search API lookup failed for %s: %s", uid_tag, e)

    return None

def get_first_post_raw(s, topic_id):
    """Return (first_post_id, raw) by fetching /t/{id}.json?include_raw=1; fallback to /posts/{id}.json."""
    r = s.get(f"{BASE}/t/{topic_id}.json", params={"include_raw": 1}, timeout=30)
    r.raise_for_status()
    data = r.json() or {}
    posts = ((data.get("post_stream") or {}).get("posts")) or []
    if posts:
        fp = posts[0]
        fp_id = fp.get("id")
        raw = fp.get("raw")
        if raw is not None:
            return fp_id, raw
        if fp_id:
            r2 = s.get(f"{BASE}/posts/{fp_id}.json", params={"include_raw": 1}, timeout=30)
            r2.raise_for_status()
            d2 = r2.json() or {}
            if "raw" in d2:
                return fp_id, d2["raw"]
    return None, None

def update_first_post(s, post_id, new_raw, reason=None):
    payload = {"raw": new_raw}
    if reason:
        payload["edit_reason"] = reason
    r = s.put(f"{BASE}/posts/{post_id}.json", json=payload, timeout=60)
    if r.status_code >= 400:
        log.error("Update post %s failed %s: %s", post_id, r.status_code, r.text)
    r.raise_for_status()
    return r.json()

def make_safe_title(summary, dtstart_dt):
    """Sanitize titles to avoid 'unclear' validator; keep entropy and cap length."""
    summary = (summary or "").strip()
    summary = re.sub(r'(.)\1{2,}', r'\1\1', summary)  # collapse very long repeats
    when = dtstart_dt.strftime("%a %d %b %Y %H:%M") if dtstart_dt else ""
    title = f"{summary} — {when}".strip(" —")
    alnums = [c.lower() for c in title if c.isalnum()]
    if len(set(alnums)) < 6:
        title = (title + " — event").strip()
    return title[:120]
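# Example: a spammy summary like "Meeting!!!!!!!" collapses to "Meeting!!", the
# start date is appended, and very low-entropy titles gain an "event" suffix so
# Discourse's title-entropy validator accepts them.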

def create_topic(s, title, raw, category_id, tags, dtstart_dt=None):
    """
    Create a new topic. Pads body to satisfy site min post length.
    Retries once with sanitized title if validator complains.
    Returns (topic_id, first_post_id).
    """
    MIN_BODY = 40
    raw = raw or ""
    if len(raw) < MIN_BODY:
        raw = (raw + "\n\n(autogenerated by ics2disc)").ljust(MIN_BODY + 1, " ")

    payload = {"title": title, "raw": raw, "category": category_id}
    if tags:
        payload["tags"] = tags

    r = s.post(f"{BASE}/posts.json", json=payload, timeout=60)
    if r.status_code == 422:
        try:
            j = r.json()
            errs = " ".join(j.get("errors") or [])
        except Exception:
            errs = r.text
        if "Title seems unclear" in errs or "title" in errs.lower():
            safe_title = make_safe_title(title, dtstart_dt)
            if safe_title != title:
                log.warning("Title rejected; retrying with sanitized title: %r", safe_title)
                payload["title"] = safe_title
                r = s.post(f"{BASE}/posts.json", json=payload, timeout=60)

    if r.status_code >= 400:
        log.error("Create failed %s: %s", r.status_code, r.text)
    r.raise_for_status()
    data = r.json()
    return data["topic_id"], data["id"]

# --- Main VEVENT handler ---
def process_vevent(s, vevent, args, namespace):
    uid = str(vevent.get("uid") or "").strip()
    if not uid:
        log.warning("Skipping VEVENT with no UID")
        return

    fresh_body, summary, start_dt = build_body(vevent, SITE_TZ)

    # Tags: defaults + namespace + per-event uid tag (both slugified/capped)
    uid_tag = make_uid_tag(namespace, uid, max_len=TAG_MAX_LEN)
    ns_tag = _slugify(namespace)[:TAG_MAX_LEN] if namespace else "ics"
    tags = list(DEFAULT_TAGS)
    if ns_tag and ns_tag not in tags:
        tags.append(ns_tag)
    if uid_tag not in tags:
        tags.append(uid_tag)

    topic_id = find_topic_by_uid_tag(s, uid_tag)
    if topic_id:
        log.info("Found existing topic %s via tag %s.", topic_id, uid_tag)
        first_post_id, old_raw = get_first_post_raw(s, topic_id)
        if not first_post_id:
            log.warning("Could not fetch first post raw for topic %s; defaulting to empty.", topic_id)
            old_raw = ""
        if (old_raw or "").strip() == fresh_body.strip():
            log.info("No content change for topic %s.", topic_id)
        else:
            log.info("Updating topic #%s for UID %s …", topic_id, uid)
            update_first_post(s, first_post_id, fresh_body, reason="ICS sync update")
            log.info("Updated topic #%s", topic_id)
    else:
        auto_title = (summary or "").strip() or f"Event — {uid[:8]}"
        log.info("Creating new topic for UID %s …", uid)
        created_topic_id, first_post_id = create_topic(
            s, auto_title, fresh_body, CATEGORY_ID, tags, dtstart_dt=start_dt
        )
        log.info("Created topic #%s (post %s)", created_topic_id, first_post_id)

# --- CLI entrypoint ---
def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--ics-url", help="URL to .ics file")
    ap.add_argument("--ics-file", help="Path to local .ics file")
    ap.add_argument("--namespace", help="Tag namespace (defaults to ICS_NAMESPACE env)")
    ap.add_argument("--skip-errors", action="store_true", help="Continue on event errors")
    args = ap.parse_args()

    feed_namespace = (args.namespace or ICS_NAMESPACE or "ics").strip()
    if not (args.ics_url or args.ics_file):
        sys.exit("ERROR: Provide --ics-url or --ics-file")

    # Fetch ICS
    if args.ics_url:
        url = args.ics_url
        log.info("Fetching ICS: %s", url)
        r = requests.get(url, timeout=60, headers={"User-Agent": "ics2disc/1.0 (+Discourse ICS sync)"})
        r.raise_for_status()
        data = r.content
    else:
        with open(args.ics_file, "rb") as f:
            data = f.read()

    log.info("Using namespace: %s", feed_namespace)
    cal = Calendar.from_ical(data)
    s = _session()

    for comp in cal.walk("VEVENT"):
        try:
            process_vevent(s, comp, args, feed_namespace)
        except Exception as e:
            if args.skip_errors:
                uid = str(comp.get("uid") or "").strip()
                log.error("Error on event UID=%s: %s", uid, e)
                continue
            raise

if __name__ == "__main__":
    main()
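
For a quick sanity check of the body builder, here is a minimal sketch with hypothetical event values (it assumes the icalendar import the script already uses, and SITE_TZ=UTC):

from datetime import datetime, timezone
from icalendar import Event

ev = Event()
ev.add("uid", "demo-123@example.org")       # hypothetical UID
ev.add("summary", "Monthly club meetup")
ev.add("location", "Community hall")
ev.add("dtstart", datetime(2024, 5, 1, 18, 0, tzinfo=timezone.utc))
ev.add("dtend", datetime(2024, 5, 1, 20, 0, tzinfo=timezone.utc))

raw, summary, start_dt = build_body(ev, SITE_TZ)
print(raw)
# [event] Monthly club meetup [/event]
# **When:** Wed 01 May 2024, 18:00 – 20:00 (UTC)
# **Where:** Community hall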
Gotchas and Compatibility Notes
  • Tag Groups: If your site restricts which tags are usable in a category (tag groups), make sure the bot user is allowed to create/use these tags.
  • Category Names vs IDs: Always use the category’s numeric ID (find it in the URL when viewing the category in Discourse’s admin, or use the lookup sketch after this list).
  • api_username Permissions: The script requires the API user (e.g. “CalendarBot”) to have Create/Reply/See permissions on the target category, plus permission to use the tags.
  • Tag length: If your Discourse site’s maximum tag length is customized, set TAG_MAX_LEN accordingly.
  • All-day events: The script now outputs proper date-only ranges so “all-day” events look correct in the Discourse event UI.
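
For the category ID note above, a minimal lookup sketch (assumes the API user can see the category; “Events” is a hypothetical name, and /categories.json returns the top-level category list by default):

# Sketch: resolve a category's numeric ID from its name.
r = _session().get(f"{BASE}/categories.json", timeout=30)
r.raise_for_status()
for c in r.json()["category_list"]["categories"]:
    if c["name"] == "Events":              # hypothetical category name
        print(c["id"], c["slug"])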
Common Troubleshooting Steps
  • No topics created: Check script output/logs for errors (permissions, missing variables).
  • API error responses: Double-check DISCOURSE_API_KEY, DISCOURSE_API_USERNAME, and the base URL (see the credential check after this list).
  • Category issues: Verify the ID matches your Discourse site. Remember: category slug is not the same as numeric ID.
  • Tagging problems: Make sure your bot can create the tags; see category tag settings.
  • Hosted customer support: If you’re on official hosting and stuck, email team@discourse.org.
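
Before digging further, a quick credential sanity check can save time (a sketch; /session/current.json reports which user the Api-Key/Api-Username pair authenticates as):

# Sketch: confirm the API credentials actually authenticate.
r = _session().get(f"{BASE}/session/current.json", timeout=30)
if r.ok:
    print("Authenticated as:", r.json()["current_user"]["username"])
else:
    print("Auth check failed:", r.status_code)  # 403 usually means a bad key or username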
FAQ and Support

Q: Can I run this on Discourse’s managed hosting?
A: Yes, you can use the API to create and update topics, but you cannot run the script on the server itself. You must use your own machine, cloud VM, etc.

Q: My topics aren’t updating—how do I debug?
A: See the troubleshooting section above, and check for errors in your log output.

Q: Who do I contact for help?
A: For script or API issues, consult or post in the Meta support topic. For Discourse-hosted support, email team@discourse.org.

Q: Are pull requests/contributions accepted?
A: Absolutely! Open a discussion or submit code feedback on Meta.

Thank you for your amazing work here with this much needed functionality!

Do you think it would be possible to roll the logic/functionality up into a Discourse plugin? With my limited understanding, I think this would mean converting it to Ruby etc., which is non-trivial.

This would hopefully

  1. make it accessible to non-dev sysadmins like me
  2. utilise the Discourse update mechanism
  3. be far more robust in the long term
  4. facilitate others contributing to the codebase efficiently
  5. ease the eventual assimilation into the official plugin

@Cathyy01 and I have been working on that 🙂


  • Most of the site settings are redundant, because the ics_feeds setting already includes the category ID the events should be posted into.

  • Unlike the Python scripts, this syncs a details pane for each event.

  • On my test instance, I had to change many of the settings in Admin → All site settings → Posting.

