In this series on Oracle storage wait events, I have so far covered five events related to storage: “db File Sequential Read”, “db File Scattered Read”, “Direct Path Read”, “Direct Path Read/Write temp” and “Free Buffer Wait”. In this post, I will describe the log file sync wait event, which in many cases is caused by poor storage performance.
A user session issuing a commit command must wait until the LGWR (Log Writer) process writes the log entries associated with the user transaction to the log file on the disk. Oracle must commit the transaction’s entries to disk (because it is a persistent layer) before acknowledging the transaction commit. The log file sync wait event represents the time the session is waiting for the log buffers to be written to disk. For example, the following user transaction consists of Insert, Select and Update statements, and completes with a commit:
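As a minimal sketch of a transaction with this shape, the snippet below runs an Insert, a Select and an Update, then commits. The `orders` table and its columns are hypothetical, and Python's built-in `sqlite3` module is used only as a stand-in so the example runs anywhere; it is not Oracle and has no log file sync wait of its own.

```python
import sqlite3

# sqlite3 is only a stand-in so the sketch is runnable; it is not Oracle.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table, created up front so the transaction below is the focus.
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.commit()

# The transaction: Insert, Select, Update, then Commit.
cur.execute("INSERT INTO orders (id, status) VALUES (1, 'NEW')")
cur.execute("SELECT status FROM orders WHERE id = 1")
status_before = cur.fetchone()[0]
cur.execute("UPDATE orders SET status = 'SHIPPED' WHERE id = 1")
conn.commit()  # in Oracle, this is the point where the session waits on log file sync

# Read the row back after the commit.
cur.execute("SELECT status FROM orders WHERE id = 1")
status_after = cur.fetchone()[0]
```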
The Insert and Update statements modified some data, but the new blocks were not written to disk yet. When the session issues the commit statement, it is placed on hold and the LGWR flushes the corresponding log entries for the transaction to the log file. (NOTE: The modified data blocks are still in memory and have not been written to disk yet.) When the LGWR finishes flushing the log entries, the log file sync wait is over and the transaction is completed.
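The commit behavior described above can be illustrated with a toy model. This is only a sketch and not Oracle's actual implementation: the class `ToyInstance`, its attributes and the `checkpoint` method are hypothetical names chosen for illustration. It shows that a commit flushes the log entries while the modified data blocks stay in memory until they are written out later.

```python
class ToyInstance:
    """Toy model of commit-time log flushing; not Oracle's real implementation."""

    def __init__(self):
        self.log_buffer = []      # log entries still in memory
        self.redo_log_file = []   # log entries "on disk"
        self.dirty_blocks = {}    # modified data blocks still in memory
        self.data_files = {}      # data blocks "on disk"

    def modify(self, block_id, value):
        # A change goes to the log buffer and to the in-memory block only.
        self.log_buffer.append((block_id, value))
        self.dirty_blocks[block_id] = value

    def commit(self):
        # The session would wait here (log file sync) while the LGWR flushes.
        self.redo_log_file.extend(self.log_buffer)
        self.log_buffer.clear()
        # dirty_blocks is deliberately left untouched by the commit.

    def checkpoint(self):
        # Data blocks reach the data files later, independently of the commit.
        self.data_files.update(self.dirty_blocks)
        self.dirty_blocks.clear()


db = ToyInstance()
db.modify("block_7", "row inserted")
db.modify("block_9", "row updated")
db.commit()
```

After `db.commit()`, both log entries are in `redo_log_file`, while `data_files` is still empty and the two blocks remain in `dirty_blocks`.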
The reasons why Oracle uses such a technique have to do with performance, reliability and high availability. In terms of performance, the technique rests on the notion that sequential writes are faster than random writes (which is definitely true for mechanical disks). Oracle writes the log file sequentially, while data blocks are written randomly. In addition, the log file write size varies and is affected by the transaction size. In OLTP applications with small transactions, it is common to see log file writes as small as 512 bytes.
The following diagram illustrates an Oracle shadow process that is waiting for the LGWR process to write its entries to the disk:
Below is an example of an AWR report showing an application where the “log file sync” is the dominant wait event:
High “log file sync” waits can be observed when disk writes are slow (the LGWR takes a long time to write) or when the application commit rate is very high. To identify LGWR contention, examine the “log file parallel write” background wait event (256 ms latency in the example above, with 12,465 calls).
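The arithmetic behind this check can be sketched as follows. The function names are hypothetical, and the 10 ms threshold is an assumption chosen for illustration rather than an Oracle-defined limit; the only figures taken from the example are the 12,465 calls and the 256 ms average latency.

```python
def total_wait_seconds(wait_count, avg_wait_ms):
    """Total time spent in an event, from its call count and average latency."""
    return wait_count * avg_wait_ms / 1000.0


def likely_cause(parallel_write_avg_ms, slow_write_threshold_ms=10.0):
    """Rough triage: slow LGWR writes point at storage; otherwise look elsewhere.

    The default threshold is an illustrative assumption, not a documented value.
    """
    if parallel_write_avg_ms > slow_write_threshold_ms:
        return "slow redo writes (storage)"
    return "commit rate or other non-I/O cause"


# Figures from the "log file parallel write" line in the example above.
lgwr_total_s = total_wait_seconds(12465, 256)
cause = likely_cause(256)
```

With these inputs the LGWR spent roughly 3,191 seconds in redo writes, which points at the storage rather than at the commit rate alone.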
For many SSD solutions (other than the Kaminario K2), the log file writes are a challenging operation for several reasons:
- The log file entries are not aligned to 4 KB I/O. For many SSD vendors, this significantly affects the performance of writes.
- Many SSD solutions utilize RAID 5 or RAID 4 for high availability. Log writes with very small I/Os affect their performance and endurance.
  a. Each write actually translates to at least two writes (data + parity), which can cause an endurance problem for very active OLTP environments. This also affects the performance of the write operations, as each write is doubled.
  b. In RAID 4, the parity drive will probably observe many more writes than the data drives. This is an endurance issue.
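The points in the list above can be put into numbers with a small sketch. The function names are hypothetical, and the model is deliberately simplified: it assumes a 4 KB page size, exactly two device writes per small logical write (the minimum stated above), and small writes spread evenly over the data drives in RAID 4.

```python
def is_4k_aligned(offset_bytes, size_bytes, page_bytes=4096):
    """True only if the I/O starts on a page boundary and covers whole pages."""
    return offset_bytes % page_bytes == 0 and size_bytes % page_bytes == 0


def min_physical_writes(logical_writes, writes_per_logical=2):
    """Data + parity: at least two device writes per small logical write."""
    return logical_writes * writes_per_logical


def raid4_writes_per_drive(logical_writes, data_drives):
    """Writes seen by each data drive vs. the single parity drive.

    Assumes small writes are spread evenly over the data drives.
    """
    per_data_drive = logical_writes / data_drives
    parity_drive = logical_writes  # every small write also updates parity
    return per_data_drive, parity_drive


aligned_small = is_4k_aligned(0, 512)       # a 512-byte log write
aligned_page = is_4k_aligned(8192, 4096)    # a whole, page-aligned 4 KB write
physical = min_physical_writes(1000)
per_data, parity = raid4_writes_per_drive(1000, 4)
```

Under these assumptions, 1,000 small log writes become at least 2,000 device writes, and with four data drives the RAID 4 parity drive absorbs 1,000 writes while each data drive sees about 250.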
For these reasons, several SSD vendors recommend not placing the transaction logs on their arrays. Kaminario K2 does not have these limitations and is excellent storage for even the most demanding redo files. K2 has no performance penalty on non-4 KB I/Os and does not use a RAID 5 configuration. In addition, K2 is the only SSD array offering a real SSD hybrid solution by allowing placement of the transaction logs on DRAM LUs while the data tablespaces are kept on Flash media.