读取 mysql binlog 开始和结束时间

mysql binlog 记录了所有可能涉及更新的操作,可以用来作为增量备份的一种选择。为了管理 binlog ,需要读取每个 binlog 文件的准确的开始和结束时间。用 mysqlbinlog 工具可以解析 binlog 文件,所以也可以通过分析输出结果来获取。但是 mysqlbinlog 只能顺序读取记录,如果只是分析开始时间还好,要分析结束时间,就必须等它把整个 binlog 处理完。在 binlog 文件体积大的时候,代价就大了些。好在 mysql 对 binlog 文件的格式是公开的,所以我们可以直接通过解析文件自己实现。

binlog 文件的格式在 http://forge.mysql.com/wiki/MySQL_Internals_Binary_Log 可以找到。每个 binlog 文件都有相同的开头:0xfe 0x62 0x69 0x6e 。也就是 0xfe 后面加上 bin 。之后,就是一个个事件数据。binlog 的事件类型有很多种,但每个 binlog 文件的第一个事件一定是格式描述事件(format description event),描述了 binlog 文件格式版本信息;最后一个时间一定是轮转事件(rotate event),记录了下一个 binlog 的文件名和事件开始偏移位置。每个事件都有一个一致的事件头,其中就有事件的时间戳、事件类型等。读取第一个事件和最后一个事件的信息就可以获取 binlog 文件的准确开始和结束时间了。

读取第一个事件 format description event 要容易一些,seek 跳过文件头,读取事件头就行了。读取最后一个事件的时间要稍麻烦些。因为事件的长度是不固定的。对于轮转事件来说,除了事件头以外,后面还有一个 64位整数的开始位置偏移量以及下一个 binlog 的文件名。长度不确定的部分就是最后的文件名部分。好在那个偏移量是一个固定的值:4(也就是跳过文件头),所以可以从后往前读取,用它来作为标记,检查是否读完了文件名。然后就可以跳过文件名和偏移量,读取最后一个事件的事件头了。

php 代码如下:

<?php
/**
 * read binlog info
 *
 * A mysql binlog file is begin with a head "\xfebin" and then log evnets. The
 * first event is a format description event, the last event is a rotate event.
 *
 * For more infomation about mysql binlog format, see http://forge.mysql.com/wiki/MySQL_Internals_Binary_Log
 */
class BinlogInfo {
    const EVENT_HEAD_SIZE = 19;
    const FORMAT_DESCRIPTION_EVENT_DATA_SIZE = 59;
    const BINLOG_HEAD = "\xfebin";
    const FORMAT_DESCRIPTION_EVENT = 15;
    const ROTATE_EVENT = 4;

    private $eventHeadPackStr = '';
    private $formatDescriptionEventDataPackStr = '';

    function __construct() {
        $this->eventHeadPackStr = $this->eventHeadPackStr();
        $this->formatDescriptionEventDataPackStr = $this->formatDescriptionEventDataPackStr();
    }

    protected function eventHeadPackStr() {
        $event_header_struct = array(
            'timestamp' => 'l',
            'type_code' => 'c',
            'server_id' => 'l',
            'event_length' => 'l',
            'next_position' => 'l',
            'flags' => 's',
        );
        return $this->toPackStr($event_header_struct);
    }

    protected function formatDescriptionEventDataPackStr() {
        $format_description_event_data_struct = array(
            'binlog_version' => 's',
            'server_version' => 'a50',
            'create_timestamp' => 'l',
            'head_length' => 'c'
        );
        return $this->toPackStr($format_description_event_data_struct);
    }

    protected function toPackStr($arr) {
        $ret = '';
        foreach ($arr as $k=>$v) {
            $ret.= '/'.$v.$k;
        }
        $ret = substr($ret, 1);
        return $ret;
    }

    /**
     * @param resource $file
     *
     * Mysql binlog file begin with a 4 bytes head: "\xfebin". 
     */
    protected function isBinlog($file) {
        rewind($file);
        $head = fread($file, strlen(self::BINLOG_HEAD));
        return $head == self::BINLOG_HEAD;
    }

    /**
     * @param resource $file
     *
     * Format description event is the first event of a binlog file
     */
    protected function readFormatDescriptionEvent($file) {
        fseek($file, strlen(self::BINLOG_HEAD), SEEK_SET);
        $head_str = fread($file, self::EVENT_HEAD_SIZE);
        $head = unpack($this->eventHeadPackStr, $head_str);
        if ($head['type_code'] != self::FORMAT_DESCRIPTION_EVENT) {
            return null;
        }
        $data_str= fread($file, self::FORMAT_DESCRIPTION_EVENT_DATA_SIZE);
        $data = unpack($this->formatDescriptionEventDataPackStr, $data_str);

        return array('head'=>$head, 'data'=>$data);
    }

    /**
     * @param resource $file
     * 
     * Rotate event is the last event of a binglog.
     * After event header, there is a 64bit int indicate the first event
     * position of next binlog file and next binlog file name without \0 at end.
     * The position is always be 4 (hex: 0400000000000000).
     * 
     */
    protected function readRotateEvent($file)
    {
        /**
         * Rotate event size is 19(head size) + 8(pos) + len(filename).
         * 100 bytes can contain a filename which length less than 73 bytes and
         * it is short than the length of format description event so filesize -
         * bufsize will never be negative.
         */
        $bufsize = 100; 
        $size_pos = 8;
        fseek($file, -$bufsize, SEEK_END);
        $buf = fread($file, $bufsize);
        $min_begin = strlen(self::BINLOG_HEAD) + self::EVENT_HEAD_SIZE + $size_pos;
        $ok = false;
        for ($i = $bufsize - 1; $i > $min_begin; $i--) {
            if ($buf[$i] == "\0") {
                $ok = true;
                break;
            }
        }
        if (!$ok) {
            return null;
        }
        $next_filename = substr($buf, $i + 1);

        $head_str = substr($buf, $i + 1 - $size_pos - self::EVENT_HEAD_SIZE, self::EVENT_HEAD_SIZE);
        $head = unpack($this->eventHeadPackStr, $head_str);
        if ($head['type_code'] != self::ROTATE_EVENT) {
            return null;
        }
        return array('head'=>$head, 'nextFile'=>$next_filename);
    }

    /**
     * @param string $path path to binlog file
     */
    function read($path) {
        $file = fopen($path, 'r');
        if (!$file) {
            return null;
        }
        if (!$this->isBinlog($file)) {
            fclose($file);
            return null;
        }

        $fde = $this->readFormatDescriptionEvent($file);
        $re = $this->readRotateEvent($file);
        fclose($file);
        return array(
            'beginAt' => $fde['head']['timestamp'],
            'endAt' => $re['head']['timestamp'],
            'nextFile' => $re['nextFile'],
            'serverVersion' => $fde['data']['server_version'],
        );
    }
}

3 thoughts on “读取 mysql binlog 开始和结束时间

  1. Pingback: 第四部 » 博客推荐8:神仙的仙居

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.