Oracle Pro Tips, Trends & Technology eNewsletter Pinnacle Publishing http://www.pinnaclepublishing.com Issue 2.21 October 23, 2001 TABLE OF CONTENTS 1) Pop Quiz 2) Solution for Previous Pop Quiz 3) Oracle Tip: Oracle9i Installation 4) Answer To This Week's Pop Quiz --------------------------------------------------------- 1) POP QUIZ (answer at the end) --------------------------------------------------------- Where can you download an evaluation copy of Oracle9i? --------------------------------------------------------- 2) SOLUTION FOR PREVIOUS POP QUIZ --------------------------------------------------------- Two weeks ago, we sent out a pop quiz on a common business problem. We wanted to produce a derived table, which merges the data from the three tables into 1. Sample Data select * from t1; PID SD ED X CBY ---------- --------- --------- ---------- ---------- 1 28-SEP-01 08-OCT-01 12 ann 1 09-OCT-01 23-JUN-04 20 joe 2 28-SEP-01 03-OCT-01 12 ann 2 04-OCT-01 23-JUN-04 20 joe select * from t2; PID SD ED Y CBY ---------- --------- --------- - ---------- 1 27-SEP-01 03-OCT-01 a frank 1 04-OCT-01 06-OCT-01 b mike 1 07-OCT-01 23-JUN-04 c joe 2 28-SEP-01 03-OCT-01 a frank 2 04-OCT-01 06-OCT-01 b mike 2 07-OCT-01 23-JUN-04 c joe select * from t3; PID SD ED Z CBY ---------- --------- --------- ---------- ---------- 1 01-OCT-01 23-JUN-04 10 frank 2 28-SEP-01 03-OCT-01 10 lucy 2 04-OCT-01 23-JUN-04 30 dan Desired Results select * from d; PID SD ED X Y Z CBY ---------- --------- --------- ---------- - ---------- ---------- 1 01-OCT-01 03-OCT-01 12 a 10 frank 1 04-OCT-01 06-OCT-01 12 b 10 mike 1 07-OCT-01 08-OCT-01 12 c 10 joe 1 09-OCT-01 23-JUN-04 20 c 10 joe 2 28-SEP-01 03-OCT-01 12 a 10 ann 2 04-OCT-01 06-OCT-01 20 b 30 joe 2 07-OCT-01 23-JUN-04 20 c 30 joe Solution #1 The easiest solution is a "brute force" method -- that is, a PL/SQL procedure, which uses a cursor to select from the first table and inside the cursor query the other two tables using current date as part of the input. Although, conceptually, this is easy, performance of this approach will be considerably slow. Solution #2 Another solution is to use a number of incremental steps that do inserts and updates. This solution will also be slow. Solution #3 A third solution is to come up with an algorithm, which minimizes I/O. In summary, it performs a sort-merge join using PL/SQL. It defines a cursor per table and orders the rows by pid and start date. Then the code synchronizes the rows from the three cursors by using the least and greatest functions on dates to control which cursor(s) to fetch next. There are several ways to code the details, but the idea is clear. This solution is easy to understand and is relatively efficient because there are only three full table scans. The solution below should be enhanced to include bulk insert, which will improve performance by 30 percent. Solution #4 The fourth alternative is a similar algorithm to the above which does a UNION ALL of the three tables so that the data is already sorted by pid and start date across all three tables. This arguable and slightly simpler algorithm will yield comparable performance to the above. Keep in mind that although this is only one cursor, it will return three records for the same start date for some people. The important point so far is that both of these last two solutions are faster to develop, easier to maintain, and much more efficient. However, they demand that we take the time to come up with a structured design. If we jump into development, we will definitely end up with a "brute force" solution. Solution #5 The next alternative is to solve the entire problem with a simple SQL statement: select t1.pid person_id ,greatest(t1.sd,t2.sd,t3.sd) start_date ,least(t1.ed,t2.ed,t3.ed) end_date ,t1.x ,t2.y ,t3.z ,decode(greatest(t1.sd,t2.sd,t3.sd),t1.sd,t1.cby ,t2.sd,t2.cby ,t3.sd,t3.cby) changed_by from t1 ,t2 ,t3 where t2.pid = t1.pid and t3.pid = t1.pid and greatest(t1.sd, t2.sd, t3.sd) <= least(t1.ed, t2.ed, t3.ed); -- the last condition can also be re-written as (this is more intuitive): and t1.sd <= t2.ed and t1.ed >= t2.sd and t1.sd <= t3.ed and t1.ed >= t3.sd and t2.sd <= t3.ed and t2.ed >= t3.sd This solution is obviously easy to develop and maintain, but it also performs significantly better than the best PL/SQL solution (three to four times faster). In general, SQL performs better than PL/SQL (for instance, context switches). Most people would not think of this solution because they think programmatically (one row at a time vs. in sets). It would be unfair if I didn't mention that my colleague, Andrew Okimi (mailto:aokimi@procaseconsulting.com), came up with this solution in less than five minutes! Also, many thanks to Hartlieb Wild and Jeffrey Coltrain, among others, for responding. --------------------------------------------------------- 3) ORACLE TIP: Oracle9i Installation --------------------------------------------------------- Have you installed the Oracle9i server yet? If you've not and are able to do so, we've a few tips to share. My colleague Lev Moltyaner (mailto:lmoltyaner@procaseconsulting.com) has come up with a checklist for Oracle9i installation. I'll discuss the pre-installation planning in this article, and examine the actual installation and post-installation steps in subsequent articles. Let me know if this is what you do as well. A. Oracle9i Pre-Installation Activities * Determine the operating system: Unix vs. Windows * Determine the size of hard disks Operating system and Oracle software will require space Estimate the total size of all database files (should be a rough estimate rounded up to the nearest G bytes) Consider a need for multiple database copies (Dev, Test, and Prod) Allocate space for online backups, exports, and archive logs * Determine the number of CPUs Consider Oracle license cost (based on number of CPUs and their speed) * Determine amount of memory Consider a need for multiple concurrent databases * We ran a test install on Windows 2000. The Windows software occupied 1.3 G bytes; Oracle 9i software used 1.7 G; and the Oracle seed database used 1 G for a total of 4 G * The standard database files required for any database take up almost 1G System (330 M), undo (200 M), temporary (40 M), and redo (300 M) files While installing Oracle, it's always useful to refer to Oracle's documentation. Documentation is available in HTML and PDF formats. For the test install, it took 10 minutes to install the documentation from CD to disk. There are over 120 books in the documentation library. It's best to use the Master Index or a specific book index to search for a specific topic. The following books should be read by anyone planning to work with Oracle: - Oracle9i Database Concepts - Oracle9i SQL Reference - PL/SQL User's Guide and Reference The following reference books will often be required: - SQL*Plus User's Guide and Reference - SQL*Plus Quick Reference - Oracle9i Database Utilities - Oracle9i Database Error Messages The following books are highly recommended for developers: - Oracle9i Application Developer's Guide - Fundamentals - Oracle9i Application Developer's Guide - Large Objects (LOBs) The following books are highly recommended for DBAs - Oracle9i Database Administrator's Guide - Backup and Recovery Concepts There following books will also serve as good references: - Oracle9i Database Performance Guide and Reference - Oracle9i Database New Features - Oracle9i Supplied PL/SQL Packages and Types Reference - Oracle9i Database Reference - Oracle9i Database Performance Methods In the next article, we'll discuss actual Oracle9i installation and the post-installation steps. --------------------------------------------------------- 4) ANSWER TO THIS WEEK'S POP QUIZ --------------------------------------------------------- You can download from Oracle's technet Web site: http://otn.oracle.com/software/products/oracle9i/content.html --------------------------------------------------------- Well, that's it for this week. I welcome your feedback, input, tips, suggestions, Web sites, and other Oracle- related news. If you send me something, please let me know whether I can use your name with your comments. I apologize in advance if I don't respond personally to each of your questions or suggestions, but I'll get to as many as I can in the eNewsletter if not personally. Garry Chan, Editor Database Architect mailto:GChan@ProcaseConsulting.com This eNewsletter is brought to you compliments of Pinnacle Publishing, Inc. Copyright(c) 2001 http://www.pinnaclepublishing.com All rights reserved.