Replication lag

From Toolserver wiki

Replication lag, or replag, is the delay between a change (such as an edit) appearing on the Wikimedia servers and that change appearing in the Toolserver databases. This delay occurs because Toolserver tools do not access the Wikimedia databases directly; instead, they access copies of those databases, which are replicated in near real time. Each update to the live database is logged, and the Toolserver databases follow this log to apply the same updates.

Replication lag can be dramatically worsened by database crashes, expensive queries, and software or hardware issues.

Determining current lag

Although MySQL provides a way to report replication lag, it does not work correctly here. Instead, determine the lag from the timestamp of the most recent edit on a frequently edited wiki on the cluster.

You can view the Replication lag graphs tool to see current lag and trends.

You can also type @replag in the #wikimedia-toolserver IRC channel on freenode. The replag bot will output something like this:

<tsbot> jsmith: s1-sec-c: 13s [-0.01 s/s]; s2/s5-pri-c: 14m 44s [+0.00 s/s]; s3-rr: 60s [+0.00 s/s]; s3-user: 60s [+0.00 s/s]; s4-rr: 14m 44s [+0.00 s/s]; s4-user: 13s [-0.02 s/s]

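This shows the lag for each database server, followed in brackets by the rate at which that lag is changing, in seconds of lag per second. A negative rate means the server is catching up; a positive rate means it is falling further behind. (The sx-c entries are the copies of the Commons database on each server.)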
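A line in this format can be read by a short script. The Python sketch below is illustrative only: the function names are hypothetical, the pattern is inferred from the single sample line above, and support for d and h units is an assumption, since the sample shows only minutes and seconds.

```python
import re

# One entry looks like "s2/s5-pri-c: 14m 44s [+0.00 s/s]".
# The pattern is inferred from the sample output only.
ENTRY_RE = re.compile(
    r"(?P<name>[^\s:;]+):\s*"
    r"(?P<lag>(?:\d+[dhms]\s*)+)"
    r"\[(?P<rate>[+-]?\d+(?:\.\d+)?) s/s\]"
)

# Assumption: d and h may appear for long lags; the sample shows only m and s.
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def parse_duration(text):
    """Convert a duration such as '14m 44s' into a number of seconds."""
    parts = re.findall(r"(\d+)([dhms])", text)
    return sum(int(number) * UNIT_SECONDS[unit] for number, unit in parts)


def parse_replag(line):
    """Map each server name in a bot reply to (lag in seconds, rate in s/s)."""
    return {
        match.group("name"): (
            parse_duration(match.group("lag")),
            float(match.group("rate")),
        )
        for match in ENTRY_RE.finditer(line)
    }
```

Calling parse_replag on the sample reply gives one entry per server, so a tool could, for example, refuse to run while the lag for its server exceeds some threshold.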

The bot checks the most active wikis on each server to determine replication lag. (For example, if the last edit it sees to the English Wikipedia is 7 minutes old, it assumes a 7-minute lag on server 1.) The following databases are checked:

Server   Database
s1       enwiki_p
s2       itwiki_p
s3       eswiki_p
s4
s5       dewiki_p
s6       frwiki_p
s7
s*-c     commonswiki_p
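The estimate the bot makes (lag equals the age of the newest edit it can see) can be sketched in Python as follows. This is a minimal illustration, not the bot's actual code: the names are hypothetical, and it assumes the edit timestamp is a 14-digit UTC string in the YYYYMMDDHHMMSS form that MediaWiki uses. Servers with no database in the list above are left out of the mapping.

```python
from datetime import datetime, timezone

# Databases checked per server, copied from the list above.
# s4 and s7 have no database in that list, so they are omitted here.
PROBE_DATABASES = {
    "s1": "enwiki_p",
    "s2": "itwiki_p",
    "s3": "eswiki_p",
    "s5": "dewiki_p",
    "s6": "frwiki_p",
}


def lag_from_last_edit(last_edit, now=None):
    """Return the age in seconds of the newest replicated edit.

    last_edit is assumed to be a UTC timestamp such as '20120101120000'.
    """
    edited = datetime.strptime(last_edit, "%Y%m%d%H%M%S")
    edited = edited.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # Clamp at zero so small clock differences never yield a negative lag.
    return max(0, int((now - edited).total_seconds()))
```

With an edit stamped 20120101120000 and a current time of 12:07:00 UTC on the same day, the function returns 420, which matches the 7-minute example above. Note that this method overstates the lag whenever the checked wiki simply has not been edited recently, which is why frequently edited wikis are used.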

Determining lag by wiki

To determine how much lag is affecting a specific wiki's database, find out what server it is on using the wiki server assignments table, then use the above methods to find out how much lag is affecting that server.

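/* Run this against the recentchanges table of the most frequently updated wiki on the server. */
/* The result is the approximate replication lag in seconds. */
SELECT UNIX_TIMESTAMP() - UNIX_TIMESTAMP(MAX(rc_timestamp)) AS lag_seconds
FROM recentchanges;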
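A tool could run this query itself before relying on recent data. The Python sketch below assumes a connection object from any DB-API 2.0 compatible MySQL driver, already connected to the replicated wiki database; the driver, the connection details, and the function name measure_lag are not specified by this page and are hypothetical.

```python
# The same query as above, issued from a script.
LAG_QUERY = (
    "SELECT UNIX_TIMESTAMP() - UNIX_TIMESTAMP(MAX(rc_timestamp)) "
    "FROM recentchanges"
)


def measure_lag(connection):
    """Return the lag in seconds, or None if recentchanges has no rows.

    connection is assumed to be a DB-API 2.0 connection to the
    replicated database of the wiki being checked.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(LAG_QUERY)
        row = cursor.fetchone()
    finally:
        cursor.close()
    # MAX() over an empty table yields NULL, which drivers return as None.
    if row is None or row[0] is None:
        return None
    return int(row[0])
```

Returning None rather than 0 for an empty table keeps "no data" distinct from "no lag", so the caller can decide how to treat that case.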

See also

Administration