
[Post Automation] Posting to WordPress with Python, Part 2 [Code Walkthrough]

date 2021/1/5

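Hi, this is Juntech.
Last time, I posted an article using Python and the WordPress API.
This time, I'll walk through each piece of code that leads up to the finished version.

For the design side, such as file naming conventions, please see the previous article.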

1. POST the image files
2. POST the article
3. Bonus: improving quality
- 3.1. Factor repeated processing into functions
- 3.2. Run various checks

POST the image files

First up is POSTing the image files.

Get the image files from the article

We collect the image paths embedded in the article we wrote.

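import re
import sys

# File and directory paths
article_name = sys.argv[1]  # first command-line argument
single_dir = './single'  # directory where articles are stored
article_file_path = '{}/{}.md'.format(single_dir, article_name)  # path to the article

# Map each line index (0-based) to the image path found on that line
image_file_dict = {}
with open(article_file_path, mode='r', encoding='utf-8') as f:
    article = f.readlines()
    for i, line in enumerate(article):
        # Matches ![](path.jpg) or ![](path.png); paths containing ':' (URLs) are skipped
        image_file = re.findall(r'!\[\]\(([^\):]+\.(jpg|png))\)', line)
        if len(image_file) == 1:
            image_file_dict[i] = {'local': image_file[0][0]}
print(image_file_dict)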

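Save this and run it, and you get the index of each line that contains an image path
(0-based, since it comes from enumerate), together with the image path itself.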

python3 post-wp-article_1.py test-post
{
    14: {'local': '../single-image/test-1.png'},
    15: {'local': '../single-image/test-2.jpg'},
    16: {'local': '../single-image/test-3.png'}
}
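
To make the behavior of the regular expression easier to see, here is a minimal sketch that applies the same pattern to a few in-memory lines instead of a file. The helper name find_image_paths and the sample lines are made up for illustration; only the pattern itself comes from the code above.

```python
import re

# Same pattern as in the article: an image with empty alt text
# whose path ends in .jpg or .png and contains no ':' (so URLs are skipped).
IMAGE_PATTERN = re.compile(r'!\[\]\(([^\):]+\.(jpg|png))\)')


def find_image_paths(lines):
    """Return {line_index: path} for lines that contain exactly one image."""
    result = {}
    for i, line in enumerate(lines):
        matches = IMAGE_PATTERN.findall(line)
        if len(matches) == 1:
            # findall returns (full_path, extension) tuples; keep the path.
            result[i] = matches[0][0]
    return result


# Hypothetical article content used only for this demonstration.
sample_lines = [
    '# Test post\n',
    'Some text.\n',
    '![](../single-image/test-1.png)\n',
    '![](https://example.com/remote.png)\n',   # skipped: contains ':'
    '![alt](../single-image/named.png)\n',     # skipped: alt text is not empty
]

print(find_image_paths(sample_lines))
```

Only the line with an empty alt text and a local path is picked up, which is why the writing rules for the article matter for this script.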

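POST the image files embedded in the article

From here on we POST image files to WordPress.
We start with the images embedded in the article.
This code continues on from the image path extraction code in the previous section.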

### { image path extraction code }  ###
import os
import requests  # pip3 install requests

# Settings for calling the API
wp_authorization_string = os.environ['WP_AUTHORIZATION_STRING']
url_post_media = 'https://autohacks.net/wp-json/wp/v2/media'

# POST the images referenced in the article file
for image_file_index in image_file_dict:
    image_file_local = image_file_dict[image_file_index]["local"]
    image_file_path = '{}/{}'.format(single_dir, image_file_local)
    file_name = os.path.basename(image_file_path)
    with open(image_file_path, mode='rb') as f:
        headers = {
            'Authorization': 'Basic {}'.format(wp_authorization_string),
            'Content-Type': 'application/octet-stream',
            'Content-Disposition': 'attachment; filename="{}"'.format(file_name)
        }
        image_data = f.read()
        response = requests.post(url_post_media, headers=headers, data=image_data)
        print('Post Succeed: ' + image_file_path)
        response_json = response.json()
        media_source_path = response_json["source_url"]
        media_medium_path = response_json["media_details"]["sizes"]["medium"]["source_url"]
        image_file_dict[image_file_index]["source"] = media_source_path
        image_file_dict[image_file_index]["medium"] = media_medium_path
print(image_file_dict)

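Using the image_file_dict built in the previous section,
we run the POSTs in a for loop.

wp_authorization_string holds the string for Basic authentication.
Here I encoded the username and password for Basic authentication ahead of time
and stored the result in an environment variable.
If you'd rather generate it inside the program, the following code produces the same string.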

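import base64
user = '***'  # username
password = '***'  # password
# b64encode returns bytes, so decode to str before formatting it into the header
wp_authorization_string = base64.b64encode(
    '{}:{}'.format(user, password).encode('utf-8')).decode('ascii')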

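After POSTing an image,
we take the image path and the resized image path from the response
and store them in image_file_dict.

Save this and run it, and you can see that the dictionary from before
now also holds the image path on WordPress and the resized image path.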

python3 post-wp-article_2.py test-post
{
    14: {'local': '../single-image/test-1.png'},
    15: {'local': '../single-image/test-2.jpg'},
    16: {'local': '../single-image/test-3.png'}
}
Post Succeed: ./single/../single-image/test-1.png
Post Succeed: ./single/../single-image/test-2.jpg
Post Succeed: ./single/../single-image/test-3.png
{
    14: {
        'local': '../single-image/test-1.png',
        'source': 'https://autohacks.net/wp-content/uploads/2021/01/test-1-1.png',
        'medium': 'https://autohacks.net/wp-content/uploads/2021/01/test-1-1-300x300.png'
    },
    15: {
        'local': '../single-image/test-2.jpg',
        'source': 'https://autohacks.net/wp-content/uploads/2021/01/test-2.png',
        'medium': 'https://autohacks.net/wp-content/uploads/2021/01/test-2-300x235.png'
    },
    16: {
        'local': '../single-image/test-3.png',
        'source': 'https://autohacks.net/wp-content/uploads/2021/01/test-3.png',
        'medium': 'https://autohacks.net/wp-content/uploads/2021/01/test-3-300x225.png'
    }
}
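
The lookup response_json["media_details"]["sizes"]["medium"] used above assumes that WordPress generated a medium size for every upload. That may not hold, for example when the uploaded image is already small, in which case the lookup raises a KeyError. Below is a minimal sketch of a more defensive lookup that falls back to the original URL. The helper name extract_image_urls and the sample dictionaries are hypothetical; they only mimic the fields the script reads.

```python
def extract_image_urls(response_json):
    """Return the 'source' and 'medium' URLs from a media response dict.

    Falls back to the source URL when no medium size is present.
    """
    source_url = response_json["source_url"]
    sizes = response_json.get("media_details", {}).get("sizes", {})
    medium_url = sizes.get("medium", {}).get("source_url", source_url)
    return {"source": source_url, "medium": medium_url}


# Hypothetical responses, reduced to the fields used by the script.
with_medium = {
    "source_url": "https://example.com/uploads/test-1.png",
    "media_details": {"sizes": {"medium": {
        "source_url": "https://example.com/uploads/test-1-300x300.png"}}},
}
without_medium = {
    "source_url": "https://example.com/uploads/tiny.png",
    "media_details": {"sizes": {}},
}

print(extract_image_urls(with_medium))
print(extract_image_urls(without_medium))
```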

POST the eyecatch image

Next we POST the eyecatch (featured) image.
The code below runs on its own.

import glob  # standard library, no installation needed
import os
import sys
import requests  # pip3 install requests

# File and directory paths
article_name = sys.argv[1]
single_dir = './single'
article_file_path = '{}/{}.md'.format(single_dir, article_name)
eyecatch_dir = './eyecatch-image'

# Settings for calling the API
wp_authorization_string = os.environ['WP_AUTHORIZATION_STRING']
url_post_media = 'https://autohacks.net/wp-json/wp/v2/media'

# POST the eyecatch image
eyecatch_image_file_path = glob.glob('{}/{}'.format(eyecatch_dir, article_name + '.*'))[0]
eyecatch_image_file_name = os.path.basename(eyecatch_image_file_path)
eyecatch_image_file_id = ''
with open(eyecatch_image_file_path, mode='rb') as f:
    headers = {
        'Authorization': 'Basic {}'.format(wp_authorization_string),
        'Content-Type': 'application/octet-stream',
        'Content-Disposition': 'attachment; filename="{}"'.format(eyecatch_image_file_name)
    }
    image_data = f.read()
    response = requests.post(url_post_media, headers=headers, data=image_data)
    print('Post Succeed: ' + eyecatch_image_file_path)
    response_json = response.json()
    eyecatch_image_file_id = response_json["id"]
print("id: " + str(eyecatch_image_file_id))

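In this setup the eyecatch image has the same name as the article file,
so we know which image file to POST without touching the article file.
The extension isn't fixed, though,
so we use glob to find the image file placed in the specified directory.
(If several files share the name with different extensions, only the first match returned by glob is POSTed.)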
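
One thing to keep in mind is that glob.glob does not promise any particular order, so "the first match" can differ between environments. The following is a minimal sketch that sorts the matches to make the choice predictable and returns None when nothing matches. The helper name find_eyecatch and the temporary files are assumptions made for this example, not part of the original script.

```python
import glob
import os
import tempfile


def find_eyecatch(eyecatch_dir, article_name):
    """Return the alphabetically first '<article_name>.*' file, or None."""
    # Escape the fixed part so characters like '[' in a name are taken literally.
    pattern = glob.escape(os.path.join(eyecatch_dir, article_name)) + '.*'
    candidates = sorted(glob.glob(pattern))
    return candidates[0] if candidates else None


# Build a throwaway directory with a few empty files to try it out.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('test-post.png', 'test-post.jpg', 'other.png'):
        open(os.path.join(tmp, name), 'wb').close()
    found_name = os.path.basename(find_eyecatch(tmp, 'test-post'))
    missing = find_eyecatch(tmp, 'no-such-article')

print(found_name)
print(missing)
```

With sorting, test-post.jpg wins over test-post.png regardless of the order in which the file system lists them.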

After POSTing the image,
we take the image file's ID from the response; it is used later when POSTing the article.

Save this and run it,
and you get the ID of the POSTed image.

python3 post-wp-article_3.py test-post
Post Succeed: ./eyecatch-image/test-post.png
id: 685

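That completes the image POSTs.

POST the article

Now we POST the article, using the images POSTed and the values collected so far.
This code continues on from everything written up to this point.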

import json

url_post_single = 'https://autohacks.net/wp-json/wp/v2/posts'

# POST the article
with open(article_file_path, mode='r', encoding='utf-8') as f:
    article = f.readlines()
    # Replace each local image path in the article with the image paths on WordPress
    for image_file_index in image_file_dict:
        image_file_path_local = image_file_dict[image_file_index]["local"]
        image_file_path_medium = image_file_dict[image_file_index]["medium"]
        image_file_path_source = image_file_dict[image_file_index]["source"]
        text_before = "![]({})".format(image_file_path_local)
        text_after = "[![]({})]({})".format(image_file_path_medium, image_file_path_source)
        article[image_file_index] = article[image_file_index].replace(text_before, text_after)
    # POST the article
    headers = {
        'Authorization': 'Basic {}'.format(wp_authorization_string),
        'Content-Type': 'application/json',
    }
    # Take the title from the first line, dropping the heading marker and trailing newline
    title = article[0].replace('# ', '', 1).strip()
    # Delete the first line of the article (= the title line)
    del article[0]
    content = "".join(article)
    body = {
        "slug": article_name,
        "title": title,
        "content": content,
        "featured_media" : eyecatch_image_file_id,
        "status": "draft"
    }
    requests.post(url_post_single, headers=headers, data=json.dumps(body))
print('Post Succeed: ' + article_file_path)
print('Success!')

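Once the article file is read,
we first use the image_file_dict built while POSTing the image files
to rewrite the local image paths in the article to the image paths on WordPress in a for loop.

When the rewriting is done,
we take the first line of the article (which holds the title), delete it from the body,
and execute the POST.

The request includes the article file name, the title taken from the article,
and the id obtained when POSTing the eyecatch image.
The body text is packed in as read, and the request body is encoded with json before being sent.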
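
The rewrite and the request body can be tried out without touching WordPress. Below is a minimal sketch that runs the same steps on in-memory lines. The function name build_post_body, the example.com URLs, and the sample article are made up for illustration; the field names in the body are the ones used in the script above.

```python
import json


def build_post_body(lines, image_file_dict, article_name, featured_media_id):
    """Rewrite local image paths and build the request body as a dict."""
    lines = list(lines)  # work on a copy so the caller's list is untouched
    for index, paths in image_file_dict.items():
        before = '![]({})'.format(paths['local'])
        # Show the medium image, linked to the full-size one.
        after = '[![]({})]({})'.format(paths['medium'], paths['source'])
        lines[index] = lines[index].replace(before, after)
    title = lines[0].replace('# ', '', 1).strip()  # first line holds the title
    content = ''.join(lines[1:])                   # everything else is the body
    return {
        'slug': article_name,
        'title': title,
        'content': content,
        'featured_media': featured_media_id,
        'status': 'draft',
    }


# Hypothetical article and upload results.
sample_lines = ['# Test post\n', 'Hello.\n', '![](../single-image/test-1.png)\n']
sample_images = {
    2: {
        'local': '../single-image/test-1.png',
        'source': 'https://example.com/uploads/test-1.png',
        'medium': 'https://example.com/uploads/test-1-300x300.png',
    }
}

body = build_post_body(sample_lines, sample_images, 'test-post', 685)
print(json.dumps(body, indent=2))
```

Keeping this step as a pure function makes it easy to check the generated Markdown before anything is sent.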

Save this and run it, and everything from collecting the image files to POSTing the article happens automatically.

python3 post-wp-article_4.py test-post
{
    14: {'local': '../single-image/test-1.png'},
    15: {'local': '../single-image/test-2.jpg'},
    16: {'local': '../single-image/test-3.png'}
}
Post Succeed: ./single/../single-image/test-1.png
Post Succeed: ./single/../single-image/test-2.jpg
Post Succeed: ./single/../single-image/test-3.png
{
    14: {
        'local': '../single-image/test-1.png',
        'source': 'https://autohacks.net/wp-content/uploads/2021/01/test-1-10.png',
        'medium': 'https://autohacks.net/wp-content/uploads/2021/01/test-1-10-300x300.png'
    },
    15: {
        'local': '../single-image/test-2.jpg',
        'source': 'https://autohacks.net/wp-content/uploads/2021/01/test-2-10.png',
        'medium': 'https://autohacks.net/wp-content/uploads/2021/01/test-2-10-300x235.png'
    },
    16: {
        'local': '../single-image/test-3.png',
        'source': 'https://autohacks.net/wp-content/uploads/2021/01/test-3-10.png',
        'medium': 'https://autohacks.net/wp-content/uploads/2021/01/test-3-10-300x225.png'
    }
}
Post Succeed: ./eyecatch-image/test-post.png
id: 753
Post Succeed: ./single/test-post.md
Success!

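Bonus: improving quality

Last comes a bonus.
To raise the quality of the code, we implement a few functions and checks.

Factor repeated processing into functions

First, a function that prints an error message
and terminates the process abnormally when an error occurs.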

def exit_with_error(message):
    status_sys_error = 1
    message_error = 'Error! : {}'
    print(message_error.format(message))
    sys.exit(status_sys_error)

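When implementing the checks, we call this function to abort as soon as something is NG.
It exits with status 1 (abnormal termination),
which makes the script easier to handle when it is called from a shell or another external program.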
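
To see the exit status without actually ending the interpreter, note that sys.exit works by raising SystemExit, whose code attribute carries the status. The sketch below repeats the function from above and wraps the call in a small helper, run_and_capture_status, which is a hypothetical name introduced only for this demonstration.

```python
import sys


def exit_with_error(message):
    """Print an error message and exit with status 1 (same as above)."""
    status_sys_error = 1
    print('Error! : {}'.format(message))
    sys.exit(status_sys_error)


def run_and_capture_status(func, *args):
    """Call func and return the exit status it would have produced."""
    try:
        func(*args)
    except SystemExit as e:
        return e.code
    return 0  # func returned normally


status = run_and_capture_status(exit_with_error, 'article not found')
print('exit status:', status)
```

From a shell, the same status is what `$?` holds right after the script finishes.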

Next, a function that executes a POST and retries on failure.

import time
def post_with_retry(data_name,url,header_obj,data_obj):
    max_error_count = 3
    sleep_second = 3
    post_success_status = 201
    response_status_code = None
    response = None
    error_count = 0
    while response_status_code != post_success_status:
        response = requests.post(url, headers=header_obj, data=data_obj)
        response_status_code = response.status_code
        if response_status_code != post_success_status:
            print('Error! :' + response.json()["message"])
            if error_count == max_error_count:
                message = 'Failed post. url={}, data_name={}.'.format(url,data_name)
                exit_with_error(message)
            else:
                error_count += 1
                print('Retry : count=' + str(error_count))
                time.sleep(sleep_second)
        else:
            print('Post Succeed: ' + data_name)
            response_json = response.json()
            return response_json

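We read the status of the POST result
and print an error message whenever anything other than 201 (POST succeeded) comes back.
We then wait 3 seconds and retry, up to 3 times;
if the result is still NG after 3 retries, we terminate abnormally using the function we made earlier.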
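
The retry logic is easier to verify when the network call is passed in as a function, so a fake sender can stand in for requests.post. Here is a minimal sketch under that assumption. The name call_with_retry, its parameters, and the use of RuntimeError in place of exit_with_error are all choices made for this example, not the original implementation.

```python
import time


def call_with_retry(send, max_retries=3, sleep_seconds=3,
                    success_status=201, sleep=time.sleep):
    """Call send() until it reports success_status.

    send() must return a (status_code, payload) tuple.
    Makes one initial attempt plus up to max_retries retries,
    then raises RuntimeError if every attempt failed.
    """
    for attempt in range(max_retries + 1):
        status_code, payload = send()
        if status_code == success_status:
            return payload
        if attempt < max_retries:
            print('Retry : count={}'.format(attempt + 1))
            sleep(sleep_seconds)
    raise RuntimeError('Failed after {} retries.'.format(max_retries))


# Fake sender: fails twice, then succeeds. No network access is involved.
responses = iter([(500, None), (500, None), (201, {'id': 685})])
result = call_with_retry(lambda: next(responses), sleep=lambda seconds: None)
print(result)
```

In the real script, send could be a small lambda around requests.post that returns response.status_code and response.json().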

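Run various checks

To prevent errors from careless mistakes, such as a typo in the argument or a violation of the file naming rules,
we implement some checks.

First, the checks on the article file.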

import chardet

# Check that the article file exists
if not os.path.exists(article_file_path):
    message = '{} is not found.'.format(article_file_path)
    exit_with_error(message)

# Check the article file's character encoding (error unless UTF-8)
# chardet reports ASCII-only files as 'ascii', which is a subset of UTF-8, so accept that too
with open(article_file_path, mode='rb') as f:
    charset = chardet.detect(f.read())['encoding']
    if charset is None or charset.lower() not in ('utf-8', 'ascii'):
        message = 'charset of {} must be utf-8, but detected {}.'.format(
            article_file_path, charset)
        exit_with_error(message)

# Check that no line of the article file contains 2 or more image paths
# If all is well, build the dictionary of image paths
violation_list = []
image_file_dict = {}
with open(article_file_path, mode='r', encoding='utf-8') as f:
    article = f.readlines()
    for i, line in enumerate(article):
        image_file = re.findall(r'!\[\]\(([^\):]+\.(jpg|png))\)', line)
        if len(image_file) == 1:
            image_file_dict[i] = {'local': image_file[0][0]}
        elif len(image_file) > 1:
            violation_list.append(i)
if len(violation_list) > 0:
    for i in violation_list:
        line_num = i + 1
        print('line {} has 2 or more image files.'.format(line_num))
    message = 'md-article violate writing rule.'
    exit_with_error(message)

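We run three checks against the article file.

First, using os.path.exists,
we check that an article file with the name given as the argument actually exists.

Next, using chardet.detect,
we check the article file's character encoding.
If the encoding is not UTF-8,
later processing is likely to fail, so we catch it here.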
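
chardet guesses the encoding statistically, so its answer can vary with the content. If all you need is a yes/no answer to "is this valid UTF-8?", a strict decode with the standard library is an alternative worth knowing. The sketch below shows that swapped-in technique; the helper name is_utf8 is made up for this example and is not what the script above uses.

```python
def is_utf8(data):
    """Return True if the given bytes decode cleanly as UTF-8."""
    try:
        data.decode('utf-8')  # strict by default: raises on invalid bytes
    except UnicodeDecodeError:
        return False
    return True


text = 'こんにちは'
print(is_utf8(text.encode('utf-8')))      # UTF-8 bytes
print(is_utf8(text.encode('shift_jis')))  # Shift_JIS bytes
print(is_utf8(b'plain ascii'))            # ASCII is a subset of UTF-8
```

Unlike chardet, this cannot tell you which encoding the file actually uses, so the error message would be less informative.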

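Finally, we check that no line of the article file contains 2 or more image paths.
The dictionary holds only one path per line, so a line with 2 or more images would cause images to be missed.
For that reason the check runs alongside the collection of the image files embedded in the article.

That covers the checks on the article file.

All that remains is to implement the existence checks for the image files.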

# Check that the image files exist
non_existence_list = []
for image_file_index in image_file_dict:
    image_file_local = image_file_dict[image_file_index]["local"]
    image_file_path = '{}/{}'.format(single_dir, image_file_local)
    if os.path.exists(image_file_path) != True:
        non_existence_list.append(image_file_path)

if len(non_existence_list) > 0:
    for non_existence_file in non_existence_list:
        print('{} is not found.'.format(non_existence_file))
    message = 'some image files are not found.'
    exit_with_error(message)

# Check that the eyecatch image file exists
if len(glob.glob('{}/{}'.format(eyecatch_dir, article_name + '.*'))) == 0:
    message = 'eyecatch image file is not found.'
    exit_with_error(message)