Python script for exporting images from a Zendesk Community

tyler.lamparter · January 7, 2025, 00:45

Need to migrate images from a Zendesk Community to Discourse? Here's a Python script!

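Hi everyone,

I've been working on migrating a Zendesk Community to Discourse and ran into a frustrating problem: exporting images from Zendesk. The issue? Zendesk's CDN starts blocking access after a few images are retrieved directly, which makes bulk downloads very difficult.

After trying several approaches, I ended up creating a Python script (with the help of AI) that works around this limitation. The script uses Selenium to open each image URL in a browser, takes a screenshot of the image, and saves it locally. It isn't as clean as downloading the images directly, but it works reliably and the exported images are high quality.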
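
One detail of the screenshot approach that may be worth checking: the script crops the page screenshot using the image element's position and size. Selenium reports those values in CSS pixels, while on a high-DPI display the screenshot itself can be larger by the device pixel ratio, so the crop may land in the wrong place. The following is a minimal sketch of the arithmetic only, not part of the original script; `screenshot_bbox` and its `pixel_ratio` parameter are hypothetical names introduced here for illustration.

```python
def screenshot_bbox(location, size, pixel_ratio=1.0):
    """Convert an element's CSS-pixel rectangle into a screenshot crop box.

    location and size mirror the dicts Selenium returns for an element
    ({'x', 'y'} and {'width', 'height'}). pixel_ratio is an assumption
    supplied by the caller, for example the page's window.devicePixelRatio.
    """
    left = round(location['x'] * pixel_ratio)
    top = round(location['y'] * pixel_ratio)
    right = round((location['x'] + size['width']) * pixel_ratio)
    bottom = round((location['y'] + size['height']) * pixel_ratio)
    # Same (left, top, right, bottom) order that PIL's Image.crop expects
    return (left, top, right, bottom)
```

With a pixel ratio of 1.0 this produces the same box the script computes. In the script, the ratio could presumably be read with `driver.execute_script('return window.devicePixelRatio')` before cropping; I have not verified this against the author's setup.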

If you're facing a similar migration problem, I hope this script helps!

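What you'll need

Python: installed and ready to go.
ChromeDriver:
Download it from Chrome for Testing, unzip it, and update the driver path in the script.
CSV file:
- Create a single-column CSV file whose column header is URL.
- Populate it with the direct URLs of the images you want to export from Zendesk.
- Update the file path in the script.
Save location:
Update the folder path in the script to point to where you want the images saved.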
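
If your image URLs are already in a Python list (for example, pulled out of an export), the CSV described above could be produced with something like the sketch below. This is an illustration rather than part of the original post; `write_url_csv` is a hypothetical helper, and the only thing taken from the script is that it reads a column with the header `URL`.

```python
import csv


def write_url_csv(urls, csv_path):
    """Write image URLs to a single-column CSV with the header 'URL'.

    Returns the number of rows written.
    """
    seen = set()
    with open(csv_path, mode='w', newline='', encoding='utf-8') as file:
        writer = csv.DictWriter(file, fieldnames=['URL'])
        writer.writeheader()
        for url in urls:
            url = url.strip()
            # Skip blanks and duplicates so each image is fetched only once
            if url and url not in seen:
                seen.add(url)
                writer.writerow({'URL': url})
    return len(seen)
```

Calling `write_url_csv(my_urls, 'image_urls.csv')` would create a file the script can load, provided the path in the script is updated to match.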

Finally, you'll need to install a couple of Python libraries:

pip install selenium pillow

The script

Here is the Python script. Feel free to modify it to fit your setup:

import csv
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
from PIL import Image
import io
import re
import os

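# Function to extract the image ID (file name without extension) from the URL
def extract_image_id(url):
    # Ignore any query string or fragment before matching the file name
    path = re.split(r'[?#]', url, maxsplit=1)[0]
    match = re.search(r'/([^/]+)\.(png|jpg|jpeg|gif)$', path, re.IGNORECASE)
    if match:
        return match.group(1)
    # Fallback: every URL without a recognized extension gets this same name
    return 'image'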

# Function to download and save an image from a URL
def download_image(driver, image_url, download_folder):
    driver.get(image_url)
    
    # Wait for the image to load
    wait = WebDriverWait(driver, 20)
    img_element = wait.until(EC.presence_of_element_located((By.TAG_NAME, 'img')))

    # Get the image element's location and size
    location = img_element.location

    size = img_element.size

    # Take a screenshot of the entire page
    screenshot = driver.get_screenshot_as_png()
    
    # Convert screenshot to PIL Image
    screenshot_image = Image.open(io.BytesIO(screenshot))

    # Define the bounding box for the image (left, top, right, bottom)
    left = location['x']
    top = location['y']
    right = left + size['width']
    bottom = top + size['height']
    bbox = (left, top, right, bottom)

    # Crop the image to the bounding box
    cropped_image = screenshot_image.crop(bbox)

    # Extract the image ID from the URL
    image_id = extract_image_id(image_url)

    # Save the cropped image with the image ID as the filename in the download folder
    cropped_image.save(os.path.join(download_folder, f'{image_id}.png'))

# Function to load URLs from a CSV file
def load_urls_from_csv(csv_file):
    urls = []
    with open(csv_file, mode='r', newline='', encoding='utf-8') as file:
        reader = csv.DictReader(file)
        for row in reader:
            urls.append(row['URL'])  # The CSV has a single column with the header 'URL'
    return urls

# Set up ChromeDriver service
service = Service("C:\\Users\\tslam\\Downloads\\chromedriver-win64\\chromedriver-win64\\chromedriver.exe")
driver = webdriver.Chrome(service=service)

try:

    # Maximize browser window to full screen
    driver.maximize_window()
    
    # Load URLs from the CSV file
    csv_file = 'C:\\Users\\tslam\\Zendesk Migration\\image_urls.csv'
    image_urls = load_urls_from_csv(csv_file)
    
    # Define the download folder path
    download_folder = 'C:\\Users\\tslam\\Zendesk Migration\\downloads'
    
    # Ensure the download folder exists
    os.makedirs(download_folder, exist_ok=True)

    # Process each image URL
    for url in image_urls:
        download_image(driver, url, download_folder)

finally:
    driver.quit()
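
Two things to watch for when running the script on a long list: as written, the loop stops at the first URL that raises an error (for instance, a timeout while waiting for the `img` element), and two URLs that yield the same image ID will overwrite each other's file. Below is a rough sketch of one way to handle both, under the assumption that you wrap the existing `download_image` call; `unique_name` and `process_all` are hypothetical helpers, not part of the original script.

```python
def unique_name(image_id, used_names):
    """Return image_id, or image_id with a numeric suffix if already taken."""
    candidate = image_id
    counter = 1
    while candidate in used_names:
        counter += 1
        candidate = f'{image_id}-{counter}'
    used_names.add(candidate)
    return candidate


def process_all(urls, handler):
    """Call handler(url) for each URL and collect failures instead of stopping."""
    failures = []
    for url in urls:
        try:
            handler(url)
        except Exception as error:  # broad on purpose: record it and keep going
            failures.append((url, str(error)))
    return failures
```

In the script, the handler could be something like `lambda url: download_image(driver, url, download_folder)`, and the returned list tells you which URLs to retry. Using `unique_name` would require passing a shared set into `download_image` where the file name is built.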

cocococosti · January 7, 2025, 19:52

Great! This could even be used for other migrations that run into the same problem, not just Zendesk. Thank you so much for sharing.
