Monthly Archives: November 2009

Notes on ‘Expert Oracle’ — No.8: Transaction

1. Transaction: …

2. Transaction in Loop

  
We prefer "Transaction after Loop" to "Transaction in Loop". 

   a. "Transaction in Loop" => Too many seperated transactions => Not Atomic 

   b. "Transaction in Loop" commits frequently => Undo Log more easily erased, since transaction already commited => When "restart" happens, you may see "Snap Too old".

3.
Be generous with the size of undo space, otherwise a long transaction may fail.

4.
Autonomous Transaction — When it commits/rollback, the parent session doesn’t commit/back; When the parent session commits/rollbacks, its embedded Autonomous Transaction doesn’t commit/back

     

Notes on ‘Expert Oracle’ — No.7.2: Concurrency–Write Consistency

1. You have already known about "Read Consistency" — 如果数据在语句/事务开始时没变化,就返回这行数据;否则,从回滚段里读它的Snapshot. 这叫做"Consistent Read"

2. 与之相对的是
“Current Read” — 不去读回滚段,而是从当前数据文件里去读

   a. When does it happen? 在准备修改数据时进行Current Read

   b. 举例说明,

       Update t set y = 1 where x = 5
          i.首先会执行 "select .. from t where x = 5",这个过程使用的是"Consistent Read",即找到回滚段里满足 x = 5的记录
           ii.在准备执行 "set y = 1" 时会再检查一下数据文件中该条记录是否仍满足 "x = 5".这个过程使用的就是"Current Read"
        iii.如果这时该记录已经被另一个事务变成了"x = 6"了,那会怎么办?

3.
如果"current read"时发现数据已经变了(不再满足原来的where条件),则重启整条语句。

    a.重启后的执行步骤和原来一样,也是先用Consistent Read找到记录,再通过Current Read判断数据是否已经不再满足where条件

    b.唯一的不同是重新运行时会用 select … for update 找到数据并锁住

    c.如果重启后执行仍不成功,则再次重启,如此往复…

4.
重启的后果:

   a.
Before类型的触发器会执行两次!“试修改”时执行一次,正式修改时又执行一次!

   b.
重启可能会影响性能
如果一条语句在更新1000条记录被其它事务引发重启,该条语句又要从头开始运行,前面的1000条更新都白做了。

5.
Before类型的触发器本身也可能引发重启。如果这类触发器引用的列值与Where子句中的列值有交叠,那它在执行时也会做一下Current Read,然后与where子句中的条件进行比较,如果不一致就会引发重启。所以用这种触发器时要小心。

  

Notes on ‘Expert Oracle’ — No.7.1: Concurrency–Concurrency with Multi-version

1.
Oracle除了用锁实现并发控制,还利用了“多版本机制”
一条数据除了会记在表文件里,还会记在“回滚段”里。

2.
具体的并发控制策略

   a.ANSI标准定义的并发控制级别(从低到高)

      i.
Read Uncommited (可以读到脏数据)

     ii.
Read Commited   (不能读到脏数据)

     iii.
Repeatable Read (在同一个事务里两次读某条数据,读到的值一定是一样的。一个典型的反面例子是“丢失更新”)

     iv.
Serializable (不但可以重复读,而且还可以保证没有 幻读)

    

   b.Oracle支持哪些级别?

      i.Read Uncommited — 在Oracle中想脏读都读不到!

     ii.
Read Commited  — 支持,是Oracle的默认级别。Oralce不仅支持这个级别,
而且还有所增强:可以保证 “读一致性”(见下文)

     iii.
Repeatable Read — 不直接支持。需自己编程实现。

     iv.
Serializable — 支持。但是由于未使用锁机制,因此
不能真正地实现“串行”

3. 对各级别的支持的具体实现

  
基本原则是: 不用读锁,而是用多版本机制来支持。所谓的多版本,就是某个事务修改某行后,会把该行原来的值放到回滚段(Undo Segment)中,也就是说,除了表文件中的数据版本外,还有一个版本放在回滚段中。

   这里还必须引入一个概念:
读一致性(Read Consistency).

   举例来说,帐户A =100元, 帐户B = 100元,现在一个会话把50块钱从账户A转到帐户B,如果另一个会话在读取A 和 B时,要么读到 A=100, B=100,或者A=50, B=50,那么这个数据库就支持了读一致性。

      i.Read Uncommited — N/A

     ii.Read Commited  

        A.
Oracle如果发现某行数据是脏的(被另一个事务修改,但这个事务还没有提交),就会从回滚段中读出这个版本,即修改前的版本。其它的数据库没有Oracle这么强大,它们只会通过阻塞当前的读操作,直到那个修改事务释放了写锁。

          B.
Oracle在这里支持“语句”级别的读一致性。这里必须举例说明。    
  会话甲从10点整开始读取Sum(A,B),但转账会话乙也正要运行,具体步骤是:

          10:00 会话甲读取A = 100元
                                   10:05 会话乙插进来把A改成50元,把B改成150元,这时Oracle会在回滚段里设置(10:05, A = 100元,B = 100元);接着会话乙提交事务。
          10:10 会话甲读取B。但它发现B自它读取A时已经发生了变化,于是忽略表文件里的值,而是从会话乙的回滚段里去读本条语句开始后B的值,即10:00以后B的值, 读得10:05时,B= 100元
          所以最后,Sum(A,B) 是 200元,实现了“读一致性”。


        似乎平淡无奇。但如果换了其它的数据库,最后结果会是250元,因为会话甲会把B读成150元。             

     iii.Repeatable Read

        A.Oracle不提供直接支持。
其它某些数据库可能会支持Repeatable Read ,但它们是通过加读锁来支持的。也就是我在读时你不准改。Oracle认为这样做会严重影响并发能力。

         B.Oracle也可以通过自己加悲观锁来实现。如Select … for update

        C.当然,通过乐观锁也能实现。

     iv.Serializable 

        A.
也是通过多版本机制来实现的,不过在这里,读一致性从单条语句级别扩展到了整个事务级别。举例说明。

    
10:00 会话甲用select语句在读取A = 100元
         10:05 会话乙把A改成50元,并提交事务。这时Oracle会在回滚段里设置(10:05, A = 100元)
      10:10 会话再次用select语句读取A的值。它会发现A的值已经被改变了。于是它就从回滚段里读取本事务开始后A的值,也就是10:00以后的值,该值是100元。


 

         所以,Repeatable Read就实现了。幻读也是用同样的机制实现的。

         B. 此时Oracle还会通过乐观锁机制来防止丢失更新。

        C. Oracle中的Serializable不是真正的串行化。具体的例子可以看原书。

4. Oralce还支持一种特有的事务级别叫做"Read Only".
"Read Only"的隔离性与Serializable一致,但Read Only事务只能有读操作。

5. 回滚段的问题:
回滚段的大小是有限的。如果你一个事务长达24小时,而又想在事务结束前从回滚段中读取事务开始时的某个数据的值,那你可能读不到东西了,因为你要的数据已经被其它事务覆盖掉了。这时侯你会收到一个异常。

  
另一个问题是,从回滚段里读东西也是有一定代价的。在并发程度很高的生产环境里,一个报表查询可能会多次读取回滚段,所以执行速度会稍慢于你的测试环境,因为测试环境的并发通常会低一些。

Notes on ‘Expert Oracle’ — No.6.3: Locking — Locking Types

1.
DML Locks
  a.
TX Lock
     i. Preventing other transactions from updating the same row

    ii. Unlike other databases, it’s not costy for Oracle to have many Locks

  b.
TM Lock: Preventing others from modifying the table structure while I am updating its data

2.
DDL Locks
   a.
Exclusive DDL Lock: When I am DDLing something (create/drop/alter, etc.), you cannot DDL it or DML it.

   b.
Share DDL Lock: When I am DDLing somthing lightly(create index, etc.), you cannot DDL it, but you can DML it.

   c.
Breakable parse locks: ….

3.
Latch: To protect objects — Whatever, do use "BIND Variables".

4.
Manual Locking: …

Notes on ‘Expert Oracle’ — No.6.2: Locking — Blocking, Deadlock & Escalation

1.
Blocking: …

2.
Deadlock: …

3.
What will cause deadlock in Oracle?

   It may happen
when you try to delete a row in the parent table of which the child table has no indexes on foreign keys. Because in this case the child table will be locked as whole and it increase the chance of deadlock.

4.
Lock Escalation:

     When oracle locks some rows, it may also lightly-lock the whole table, for example, to avoid the table’s structure to be modifed.

   
Lock Escalation is not Lock Promotion. Lock Promotion only happens to other not-so-good happens who choose to promote the lock in order to reduce the number of locks, because locking is expensive for them.

Notes on ‘Expert Oracle’ — No.6.1: Locking — Optimistic & Pessimistic

1. The problem:
Lost Update

      Example: Increase Count

      Flow:

        a. I query the count. It is 5

        b. You increase the count immediately. It is 6 now.

        c. I don’t know it is 6 now. I still assume it is 5. So I increase it by set it as 6.

       Result:  We both try to increase it the count. But it is only increased by 1, not by 2.

2. Solution:
I should have locked the count before I submit my increased number.

     

3. Two Kinds of Locking.

  a.
Pessimistic Locking: Keep the row locked until I submit the update.

     i.Implmentation: select …
for update

    ii.Advantage: Very Simple

   iii.Drawback:
It requirse a keep-alive connection. When the transaction is time-consuming, the connection make live too long, which the concurrency may fall.

  b.Optimistic Locking: 
Don’t do the locking until the updating, and only update the row if the row has not been changed by others.

     i.Implementation:
How to check if the row is updated?

         a. Add an extra
column for versioning (Recommended)

         b.
Checksum of the row data (CPU-consuming)

         c.
ORA_ROWSCN (Oralce embedded versioning column). (Use it with ROWDEPENDENCIES)

    ii.Advantage: Locking period is short. Concurrency is good.

   iii.Drawback: Complicated. Retry is needed if the updating fails. 

 

Notes on ‘Expert Oracle’ — No.5.1: Oracle Processes — Server Processes

1.
Server Process

  a.Dedicated Server: one TCP/IP connection, one server process

  b.Shared Server: Serveral tcp/ip connections => one dispatcher => A pool of processes

2.Connection V.S. Session

  In terms of Domain Logic, the two are not ‘associated’ with each other, they only ‘depend’ each other.

   a. A
connection is
physical network connection, TCP/IP etc.

   b. A
session is a
logic connection.  

A connection can be reused by many sessions, and a session can also span several connections.(Think about Http Session)

 

Question1: Is the JDBC connection an Oralce Connection, or an Oracle Session ?

Question2: What’s their timeout strategies like for connections and sessions? 

3.Dedicated VS Shared

  
OLTP -> Dedicated

  
OLAP -> Shared. Don’t use a J2EE connection pool with this, otherwise there will be two pools, one in J2EE server, the other in DB.

           

An oracle process can be a real OS process, or an OS thread. It depends on what kind of OS you are using.

Notes on ‘Expert Oracle’ — No.4: Memory Areas

1.
SGA & PGA

  a. SGA: Shared Area for all the processes/threads

  b. PGA: The area for a single process/thread

  SGA ∩ PGA == Ø

2.
UGA: The area for the user 

  a. UGA ⊆ SGA if "Shared Server" is used.

  b. UGA ⊆ PGA if "Dedicated Server" is used.

3.Important components of SGA

  a.
Redo Buffer — extremely helpful for big/long transactions, since the redo log is involved for transactions.

  b.
Block Buffer Cache — Caching query results

  c.
Shared Pool

     i. Bascially for compiled SQLs. So, always use "Bind variables" in SQLs because this way there will be too much SQLs in this pool(Think otherwise).

     ii.LRU-Cache

  d.
Large Pool  — For large memory alloacation

4.
SQLs for SGA

  a. See pools

     select * from V$SGASTAT

  b. See pool size

     select sum(bytes)/1024/1024 mbytes from V$SGASTAT where pool = ‘shared pool’