pt-query-digest Official Explanation

OUTPUT

The default --output is a query analysis report. The --[no]report option controls whether or not this report is printed. Sometimes you may want to parse all the queries but suppress the report, for example when using --review or --history.
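
For example, a run that parses the log and saves per-class metrics for --history without printing the report might look like the following sketch (the host is a placeholder; recent versions default the history table to percona_schema.query_history, but check your version's documentation):

pt-query-digest --no-report \
   --history h=localhost    \
   slow.log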

There is one paragraph for each class of query analyzed. A “class” of queries all have the same value for the --group-by attribute, which is fingerprint by default. (See “ATTRIBUTES”.) A fingerprint is an abstracted version of the query text with literals removed, whitespace collapsed, and so forth. The report is formatted so it’s easy to paste into emails without wrapping, and all non-query lines begin with a comment, so you can save it to a .sql file and open it in your favorite syntax-highlighting text editor. There is a response-time profile at the beginning.
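
As an illustration (the table and column names here are invented), two statements that differ only in their literals, such as

SELECT name FROM users WHERE id = 5
SELECT name FROM users WHERE id = 10

would both reduce to roughly this fingerprint:

select name from users where id = ?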

The output described here is controlled by --report-format. That option allows you to specify what to print and in what order. The default output in the default order is described here.
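
For instance, to print only the global header and the response-time profile while skipping the per-query paragraphs, something like the following should work (header and profile are two of the report parts; see the option's documentation for the full list):

pt-query-digest --report-format=header,profile slow.log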

The report, by default, begins with a paragraph about the entire analysis run. The information is very similar to what you’ll see for each class of queries in the log, but it doesn’t have some information that would be too expensive to keep globally for the analysis. It also has some statistics about the code’s execution itself, such as the CPU and memory usage, the local date and time of the run, and a list of input files read and parsed.

Following this is the response-time profile over the events. This is a highly summarized view of the unique events in the detailed query report that follows. It contains the following columns:

Column        Meaning
============  ==========================================================
Rank          The query's rank within the entire set of queries analyzed
Query ID      The query's fingerprint
Response time The total response time, and percentage of overall total
Calls         The number of times this query was executed
R/Call        The mean response time per execution
V/M           The Variance-to-mean ratio of response time
Item          The distilled query

A final line whose rank is shown as MISC contains aggregate statistics on the queries that were not included in the report, due to options such as --limit and --outliers. For details on the variance-to-mean ratio, please see http://en.wikipedia.org/wiki/Index_of_dispersion.
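
As a quick worked example of the variance-to-mean ratio: three executions taking 1s, 1s, and 4s have a mean of 2s and a (population) variance of 2, so V/M = 1; the higher the ratio, the more erratic the response times. The same arithmetic as a one-liner (purely illustrative; this is not how the tool computes it internally):

awk 'BEGIN { split("1 1 4", t); n = 3;
     for (i = 1; i <= n; i++) sum += t[i]; mean = sum / n;
     for (i = 1; i <= n; i++) ss += (t[i] - mean)^2;
     printf "V/M = %.2f\n", (ss / n) / mean }'   # prints V/M = 1.00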

Next, the detailed query report is printed. Each query appears in a paragraph. Here is a sample, slightly reformatted so ‘perldoc’ will not wrap lines in a terminal. The following will all be one paragraph, but we’ll break it up for commentary.

# Query 2: 0.01 QPS, 0.02x conc, ID 0xFDEA8D2993C9CAF3 at byte 160665

This line identifies the sequential number of the query in the sort order specified by --order-by. Then there’s the queries per second, and the approximate concurrency for this query (calculated as a function of the timespan and total Query_time). Next there’s a query ID. This ID is a hex version of the query’s checksum in the database, if you’re using --review. You can select the reviewed query’s details from the database with a query like SELECT .... WHERE checksum=0xFDEA8D2993C9CAF3.
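
For example, assuming the review table is the default percona_schema.query_review (your --review DSN may point at a different one), a minimal lookup would be:

SELECT * FROM percona_schema.query_review
WHERE checksum = 0xFDEA8D2993C9CAF3\G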

If you are investigating the report and want to print out every sample of a particular query, then the following --filter may be helpful:

pt-query-digest slow.log           \
   --no-report                     \
   --output slowlog                \
   --filter '$event->{fingerprint} \
        && make_checksum($event->{fingerprint}) eq "FDEA8D2993C9CAF3"'

Notice that you must remove the 0x prefix from the checksum.

Finally, in case you want to find a sample of the query in the log file, there’s the byte offset where you can look. (This is not always accurate, due to some anomalies in the slow log format, but it’s usually right.) The position refers to the worst sample, which we’ll see more about below.

Next is the table of metrics about this class of queries.

#           pct   total    min    max     avg     95%  stddev  median
# Count       0       2
# Exec time  13   1105s   552s   554s    553s    554s      2s    553s
# Lock time   0   216us   99us  117us   108us   117us    12us   108us
# Rows sent  20   6.26M  3.13M  3.13M   3.13M   3.13M   12.73   3.13M
# Rows exam   0   6.26M  3.13M  3.13M   3.13M   3.13M   12.73   3.13M

The first line is column headers for the table. The percentage is the percent of the total for the whole analysis run, and the total is the actual value of the specified metric. For example, in this case we can see that the query executed 2 times, and that its 1105s of execution time accounts for 13% of the total execution time of all queries in the file. The min, max and avg columns are self-explanatory. The 95% column shows the 95th percentile; 95% of the values are less than or equal to this value. The standard deviation shows you how tightly grouped the values are. The standard deviation and median are both calculated from the 95th percentile, discarding the extremely large values.

The stddev, median and 95th percentile statistics are approximate. Exact statistics require keeping every value seen, sorting, and doing some calculations on them. This uses a lot of memory. To avoid this, we keep 1000 buckets, each of them 5% bigger than the one before, ranging from .000001 up to a very big number. When we see a value we increment the bucket into which it falls. Thus we have fixed memory per class of queries. The drawback is the imprecision, which typically falls in the 5 percent range.
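
A minimal sketch of that bucketing (illustrative only, not the tool's actual code): with the smallest bucket starting at .000001 and each bucket 5% larger than the previous one, a value's bucket index is floor(log(value / .000001) / log(1.05)). For instance:

# Bucket index for a 0.25s query time, assuming base .000001 and 5% growth:
awk -v v=0.25 'BEGIN { print int(log(v / 0.000001) / log(1.05)) }'   # prints 254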

Next we have statistics on the users, databases and time range for the query.

# Users       1   user1
# Databases   2     db1(1), db2(1)
# Time range 2008-11-26 04:55:18 to 2008-11-27 00:15:15

The users and databases are shown as a count of distinct values, followed by the values. If there’s only one, it’s shown alone; if there are many, we show each of the most frequent ones, followed by the number of times it appears.

# Query_time distribution
#   1us
#  10us
# 100us
#   1ms
#  10ms  #####
# 100ms  ####################
#    1s  ##########
#  10s+

The execution times show a logarithmic chart of time clustering. Each query goes into one of the “buckets” and is counted up. The buckets are powers of ten. The first bucket is all values in the “single microsecond range” – that is, less than 10us. The second is “tens of microseconds,” which is from 10us up to (but not including) 100us; and so on. The charted attribute can be changed by specifying --report-histogram but is limited to time-based attributes.
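
For example, to chart lock time rather than query time, something like this should work (Lock_time being one of the time-based attributes in the slow log):

pt-query-digest --report-histogram=Lock_time slow.log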

# Tables
#    SHOW TABLE STATUS LIKE 'table1'\G
#    SHOW CREATE TABLE `table1`\G
# EXPLAIN
SELECT * FROM table1\G

This section is a convenience: if you’re trying to optimize the queries you see in the slow log, you probably want to examine the table structure and size. These are copy-and-paste-ready commands to do that.

Finally, we see a sample of the queries in this class of query. This is not a random sample. It is the query that performed the worst, according to the sort order given by --order-by. You will normally see a commented # EXPLAIN line just before it, so you can copy-paste the query to examine its EXPLAIN plan. But for non-SELECT queries that isn’t possible to do, so the tool tries to transform the query into a roughly equivalent SELECT query, and adds that below.
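
As an illustration of that transformation (the statement here is invented, and the exact rewriting may vary by version), an UPDATE such as

UPDATE tbl SET col = 5 WHERE id = 10

might be rewritten into the roughly equivalent

select col = 5 from tbl where id = 10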

If you want to find this sample event in the log, use the offset mentioned above, and something like the following:

tail -c +<offset> /path/to/file | head
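
For the sample Query 2 above, whose header reported byte 160665, that would be (the path is a placeholder):

tail -c +160665 /path/to/slow.log | head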

See also --report-format.

QUERY REVIEW

A query --review is the process of storing all the query fingerprints analyzed. This has several benefits:

  • You can add metadata to classes of queries, such as marking them for follow-up, adding notes to queries, or marking them with an issue ID for your issue tracking system.
  • You can refer to the stored values on subsequent runs so you’ll know whether you’ve seen a query before. This can help you cut down on duplicated work.
  • You can store historical data such as the row count, query times, and generally anything you can see in the report.

To use this feature, you run pt-query-digest with the --review option. It will store the fingerprints and other information into the table you specify. A first run might look like the following sketch (the host is a placeholder; recent versions default the review table to percona_schema.query_review and create it if needed, while older versions take a --create-review-table option):
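
pt-query-digest \
   --review h=localhost,D=percona_schema,t=query_review \
   slow.log

Next time you run it with the same option, it will do the following: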

  • It won’t show you queries you’ve already reviewed. A query is considered to be already reviewed if you’ve set a value for the reviewed_by column; a sketch of an UPDATE that does this appears after this list. (If you want to see queries you’ve already reviewed, use the --report-all option.)

  • Queries that you’ve reviewed, and don’t appear in the output, will cause gaps in the query number sequence in the first line of each paragraph. And the value you’ve specified for --limit will still be honored. So if you’ve reviewed all queries in the top 10 and you ask for the top 10, you won’t see anything in the output.

  • If you want to see the queries you’ve already reviewed, you can specify --report-all. Then you’ll see the normal analysis output, but you’ll also see the information from the review table, just below the execution time graph. For example,

    # Review information
    #      comments: really bad IN() subquery, fix soon!
    #    first_seen: 2008-12-01 11:48:57
    #   jira_ticket: 1933
    #     last_seen: 2008-12-18 11:49:07
    #      priority: high
    #   reviewed_by: xaprb
    #   reviewed_on: 2008-12-18 15:03:11
    

    This metadata is useful because, as you analyze your queries, you get your comments integrated right into the report.
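
To mark a query class as reviewed so that it stops appearing in subsequent runs, set its reviewed_by column, for example (assuming the default percona_schema.query_review table; the name and checksum here are taken from the samples above):

UPDATE percona_schema.query_review
   SET reviewed_by = 'xaprb', reviewed_on = NOW()
 WHERE checksum = 0xFDEA8D2993C9CAF3;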

