This is one of those books that are perhaps nice to have IN ADDITION to something better. Btw, do NOT overestimate the "for smarties" part in the title: the book is not all that advanced. It's more like an extensive cookbook with a lot of personal opinion thrown in, and the opinion is not always consistent. For example, in one place he inveighs against the evils of using sequential numbers as primary keys -- 'cause a table is not a sequence, you see, we're talking about sets here, which, by definition, are unordered. OK, fine. Ten pages later he blasts the GUID type -- why? Because it's not inherently sequential and it's hard to spot the gaps in the sequences. But hey, why do we care about gaps? All we care about is that the field values be unique, which they are, gaps or no gaps. Seems like GUIDs should be perfect from the set-theoretical point of view, but no, he doesn't like them -- precisely because they lack the very property whose presence he bemoaned a few pages back in the IDENTITY type. It's like he wrote these two passages ten years apart, and forgot what he'd said in one when writing the other.
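To make the "gaps or no gaps" point concrete, here's a minimal sketch of my own (not from the book), assuming Python's built-in sqlite3 and uuid modules; the table names `by_number` and `by_guid` and the helper `rejects_duplicate` are made up for illustration. It shows that a key column stays unique whether it comes from a counter with a hole in it or from random GUIDs.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one keyed by an auto-assigned integer, one by a GUID string.
conn.execute("CREATE TABLE by_number (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE by_guid (id TEXT PRIMARY KEY, name TEXT)")

# Sequential keys: deleting a row leaves a gap, yet every remaining key is unique.
conn.executemany("INSERT INTO by_number (name) VALUES (?)", [("a",), ("b",), ("c",)])
conn.execute("DELETE FROM by_number WHERE id = 2")
number_keys = [row[0] for row in conn.execute("SELECT id FROM by_number ORDER BY id")]

# GUID keys: no order and no notion of a gap; the key constraint still guards uniqueness.
for name in ("a", "b", "c"):
    conn.execute("INSERT INTO by_guid VALUES (?, ?)", (str(uuid.uuid4()), name))
guid_keys = [row[0] for row in conn.execute("SELECT id FROM by_guid")]


def rejects_duplicate(table, key):
    """Return True if inserting an already-used key into `table` is refused."""
    try:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (key, "dup"))
    except sqlite3.IntegrityError:
        return True
    return False


print(number_keys)                                # the gap at 2 is visible
print(rejects_duplicate("by_number", 1))          # duplicate integer key refused
print(rejects_duplicate("by_guid", guid_keys[0])) # duplicate GUID refused
```

Either way the primary key constraint does the job; whether the values happen to be consecutive never enters into it.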
The content (or rather the intent behind it) is very good. There's a logical progression from the overall-schema things, to tables, and so on, including such esoterica as hierarchies and graphs. Those chapters are good not only, or even not so much, because of the topics themselves, but because nice recent SQL features like CTEs are used a lot in the sample code, thus demonstrating their non-trivial use.
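For readers who haven't met CTEs on hierarchies yet, here's a minimal sketch of the general idea (my own, not the book's code), assuming Python's built-in sqlite3 module and a SQLite build with `WITH RECURSIVE` support; the `org` table, its rows, and the `subtree` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical adjacency-list table: each row points at its boss (NULL for the root).
conn.execute("CREATE TABLE org (emp TEXT PRIMARY KEY, boss TEXT REFERENCES org(emp))")
conn.executemany(
    "INSERT INTO org VALUES (?, ?)",
    [("ceo", None), ("cfo", "ceo"), ("cto", "ceo"), ("dev1", "cto"), ("dev2", "cto")],
)

# Recursive CTE: start at the chosen root, then repeatedly join in direct reports.
SUBORDINATES = """
WITH RECURSIVE subs(emp, depth) AS (
    SELECT emp, 0 FROM org WHERE emp = ?
    UNION ALL
    SELECT o.emp, s.depth + 1
    FROM org AS o JOIN subs AS s ON o.boss = s.emp
)
SELECT emp, depth FROM subs ORDER BY depth, emp
"""


def subtree(root):
    """Return (employee, depth) pairs for `root` and everyone below it."""
    return conn.execute(SUBORDINATES, (root,)).fetchall()


print(subtree("cto"))
print(subtree("ceo"))
```

The same pattern (anchor row plus a self-join on the CTE) carries over to walking paths in a graph, which is presumably why the two topics sit so close together.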
The downsides: the main flaw in Celko's writing is that whatever he writes reads like a two-page journal article, by which I mean it's all a perfunctorily dashed-off collection of tidbits. The overall structure is very tenuous: for example, he starts the hierarchies chapter by saying they're a kind of graph, but the chapter on graphs proper comes afterwards. Wouldn't it make more sense to switch their order in the book, then? Some chapters are borrowed from other writers: for example, the chapter on temporal databases is taken, or rather squeezed, out of Snodgrass's book (which I happened to be reading in parallel and thus was able to notice that). I'm not hinting at plagiarism here: I'm sure Snodgrass was aware of this borrowing and had OK'ed it. But first, it would be nice to mention the fact of borrowing (I think), and second, and most important, when you compress a book into a chapter, you gotta do it very carefully so as to keep the material connected, coherent, and clear. This is not the case here (go for the original: it's good, and can be downloaded for free; google the name).
In general, Celko's writing, while not abhorrent, is mostly (though not everywhere) very sloppy; everything reads like a first draft never touched again by either the author or an editor. There's a fair number of typos, and a lot of unclear, careless pages that make you struggle for meaning (not always successfully). Mangled French again: Joseph, if there's an accent over the last 'e' in the masculine form of a participle, it's gotta be an accent aigu, not an accent grave (the feminine simply adds an extra 'e' w/o changing anything else); thus it's 'née', not 'nèe'. Maybe it's a typo, but it's consistently repeated throughout the book (and actually present in his other books!). While we're here: "Borland (née Inprise)" is actually the other way around: Inprise (née Borland).
There's no question that the author is a smart guy, but writing isn't his forte and he should be less casual about it -- and then, he should also insist that his publisher provide good editorial oversight. I think MK is a good publisher, and I think they ought to be able to do a better job helping their writers achieve readability.
Bottom line: I don't regret having this book; it's friendly and chatty (in a good sense); the inherently dry material is livened up a bit by a sprinkling of curious trivia; it's been somewhat enlightening on the first read, and repeatedly useful as a reference afterwards. Otoh, it's written sloppily, and I feel that, inasmuch as I benefited from it, I did so only because I happen to have enough foundation to compensate for its flaws on my own. I'm not looking for this kind of effort when reading technical books, though, so three stars. Could be more, but for that the book needs to be aggressively edited and restructured. It's the third edition, btw: it would seem that there's been plenty of time to do just that.
All in all, I recommend it, but only half-heartedly: as long as the prospective reader understands that this is not a terribly advanced book, and that the reading won't be easy. The book's OK, but not on par with what you get from writers like Gray or Date.
Author(s): Joe Celko
Series: The Morgan Kaufmann Series in Data Management Systems
Edition: 3
Publisher: Morgan Kaufmann
Year: 2005
Language: English
Pages: 840
Contents
1.1 What Changed in Ten Years
1.2 What Is New in This Edition
1.3 Corrections and Additions
Database Design
1.1 Schema and Table Creation
1.1.1 CREATE SCHEMA Statement
1.1.2 Manipulating Tables
1.1.3 Column Constraints
1.1.4 UNIQUE Constraints versus UNIQUE Indexes
1.1.5 Nested UNIQUE Constraints
1.1.6 Overlapping Keys
1.1.8 Using VIEWs for Schema Level Constraints
1.1.9 Using PRIMARY KEYs and ASSERTIONs for Constraints
1.1.10 Avoiding Attribute Splitting
1.1.11 Modeling Class Hierarchies in DDL
1.2 Generating Unique Sequential Numbers for Keys
1.2.1 IDENTITY Columns
1.2.3 Sequential Numbering in Pure SQL
1.2.4 GUIDs
1.2.6 Unique Value Generators
1.2.7 Preallocated Values
1.2.8 Random Order Values
1.3 A Remark on Duplicate Rows
1.4.1 Schema Tables
1.4.3 CREATE DOMAIN Statement
1.4.4 CREATE TRIGGER Statement
1.4.6 DECLARE CURSOR Statement
Normalization
2.2 First Normal Form (1NF)
2.2.1 Note on Repeated Groups
2.3 Second Normal Form (2NF)
2.4 Third Normal Form (3NF)
2.5 Elementary Key Normal Form (EKNF)
2.6 Boyce-Codd Normal Form (BCNF)
2.7 Fourth Normal Form (4NF)
2.8 Fifth Normal Form (5NF)
2.9 Domain-Key Normal Form (DKNF)
2.10 Practical Hints for Normalization
2.11 Key Types
2.11.3 Exposed Physical Locators
2.11.4 Practical Hints for Denormalization
2.11.5 Row Sorting
3.1 Numeric Types
3.1.1 BIT, BYTE, and BOOLEAN Data Types
3.2.1 Rounding and Truncating
3.2.2 CAST() Function
3.3 Four-Function Arithmetic
3.4 Arithmetic and NULLs
3.5.1 NULLIF() Function
3.5.2 COALESCE() Function
3.6.1 Number Theory Operators
3.6.3 Scaling Functions
3.6.4 Converting Numbers to Words
4.1 Notes on Calendar Standards
4.2 SQL Temporal Data Types
4.2.2 Date Format Standards
4.2.3 Handling Timestamps
4.2.4 Handling Times
4.3 Queries with Date Arithmetic
4.4.1 Temporal Duplicates
4.4.2 Temporal Databases
4.4.3 Temporal Projection and Selection
4.4.4 Temporal Joins
4.4.5 Modifying Valid-Time State Tables
4.4.6 Current Modification
4.4.7 Sequenced Modification
4.4.8 Nonsequenced Modification
4.4.9 Transaction-Time State Tables
4.4.10 Maintaining the Audit Log
4.4.11 Querying the Audit Log
4.4.13 Bitemporal Tables
4.4.14 Temporal Support in Standard SQL
Character Data Types in SQL
5.1.1 Problems of String Equality
5.1.2 Problems of String Ordering
5.2 Standard String Functions
5.3 Common Vendor Extensions
5.3.1 Phonetic Matching
5.4 Cutter Tables
NULLs: Missing Data in SQL
6.2 Missing Values in Columns
6.3 Context and Missing Values
6.5 NULLs and Logic
6.5.1 NULLS in Subquery Predicates
6.7 Functions and NULLs
6.8 NULLs and Host Languages
6.9 Design Advice for NULLs
6.9.1 Avoiding NULLs from the Host Programs
6.10 A Note on Multiple NULL Values
7.1 Distance Functions
7.2 Storing an IP Address in SQL
7.2.2 One INTEGER Column
7.3 Currency and Other Unit Conversions
7.4 Social Security Numbers
7.5 Rational Numbers
8.1 DELETE FROM Statement
8.1.2 The WHERE Clause
8.1.4 Deleting within the Same Table
8.1.5 Deleting in Multiple Tables without Referential Integrity
8.2.1 INSERT INTO Clause
8.2.2 The Nature of Inserts
8.3.1 The UPDATE Clause
8.3.2 The WHERE Clause
8.3.3 The SET Clause
8.3.4 Updating with a Second Table
8.3.5 Using the CASE Expression in UPDATEs
8.4 A Note on Flaws in a Common Vendor Extension
8.5 MERGE Statement
Comparison or Theta Operators
9.1 Converting Data Types
9.2 Row Comparisons in SQL
10.1 IS NULL Predicate
10.2 IS [NOT]{TRUE | FALSE | UNKNOWN} Predicate
10.3 IS [NOT] NORMALIZED Predicate
11.1 The CASE Expression
11.1.1 The COALESCE() and NULLIF() Functions
11.1.2 CASE Expressions with GROUP BY
11.1.3 CASE, CHECK() Clauses and Logical Implication
11.1.4 Subquery Expressions and Constants
11.2 Rozenshtein Characteristic Functions
LIKE Predicate
12.1 Tricks with Patterns
12.4 Avoiding the LIKE Predicate with a Join
12.5 CASE Expressions and LIKE Predicates
12.6 SIMILAR TO Predicates
12.7.1 String Character Content
12.7.3 Creating an Index on a String
13.1 The BETWEEN Predicate
13.1.3 Programming Tips
13.2.1 Time Periods and OVERLAPS Predicate
The [NOT] IN() Predicate
14.1 Optimizing the IN() Predicate
14.2 Replacing ORs with the IN() Predicate
14.3 NULLs and the IN() Predicate
14.4 IN() Predicate and Referential Constraints
14.5 IN() Predicate and Scalar Queries
EXISTS() Predicate
15.1 EXISTS and NULLs
15.2 EXISTS and INNER JOINs
15.3 NOT EXISTS and OUTER JOINs
15.4 EXISTS() and Quantifier
15.5 EXISTS() and Referential Constraints
15.6 EXISTS and Three-Valued Logic
Quantified Subquery Predicates
16.1 Scalar Subquery Comparisons
16.2 Quantifiers and Missing Data
16.3 The ALL Predicate and Extrema Functions
16.4 The UNIQUE Predicate
17.1.1 One-Level SELECT Statement
17.1.2 Correlated Subqueries in a SELECT Statement
17.1.3 SELECT Statement Syntax
17.1.4 The ORDER BY Clause
17.2 OUTER JOINs
17.2.1 Syntax for OUTER JOINs
17.2.2 NULLs and OUTER JOINs
17.2.3 NATURAL versus Searched OUTER JOINs
17.2.4 Self OUTER JOINs
17.2.5 Two or More OUTER JOINs
17.2.6 OUTER JOINs and Aggregate Functions
17.2.7 FULL OUTER JOIN
17.2.8 WHERE Clause OUTER JOIN Operators
17.3 Old versus New JOIN Syntax
17.4 Scope of Derived Table Names
17.5 JOINs by Function Calls
17.6 The UNION JOIN
17.7 Packing Joins
17.8 Dr. Codd’s T-Join
17.8.1 The Croatian Solution
17.8.3 The Colombian Solution
VIEWs, Derived Tables, Materialized Tables, and Temporary Tables
18.1 VIEWs in Queries
18.2 Updatable and Read-Only VIEWs
18.3.3 Translated Columns
18.3.4 Grouped VIEWs
18.3.5 UNIONed VIEWs
18.3.7 Nested VIEWs
18.4.2 VIEW Materialization
18.4.3 In-Line Text Expansion
18.4.4 Pointer Structures
18.5 WITH CHECK OPTION Clause
18.5.1 WITH CHECK OPTION as CHECK() Clause
18.6 Dropping VIEWs
18.7 TEMPORARY TABLE Declarations
18.8 Hints on Using VIEWs and TEMPORARY TABLEs
18.8.2 Using TEMPORARY TABLEs
18.8.3 Flattening a Table with a VIEW
18.9.1 Derived Tables in the FROM clause
18.10 Derived Tables in the WITH Clause
19.1 Coverings and Partitions
19.1.1 Partitioning by Ranges
19.1.2 Partition by Functions
19.1.3 Partition by Sequences
19.2 Relational Division
19.2.1 Division with a Remainder
19.2.2 Exact Division
19.2.4 Todd’s Division
19.2.6 Division with Set Operators
19.3 Romley’s Division
19.4 Boolean Expressions in an RDBMS
19.5 FIFO and LIFO Subsets
20.1 GROUP BY Clause
20.2 GROUP BY and HAVING
20.2.1 Group Characteristics and the HAVING Clause
20.3 Multiple Aggregation Levels
20.3.1 Grouped VIEWs for Multiple Aggregation Levels
20.3.2 Subquery Expressions for Multiple Aggregation Levels
20.3.3 CASE Expressions for Multiple Aggregation Levels
20.4 Grouping on Computed Columns
20.5 Grouping into Pairs
20.6 Sorting and GROUP BY
Aggregate Functions
21.1 COUNT() Functions
21.2 SUM() Functions
21.3 AVG() Functions
21.3.1 Averages with Empty Groups
21.3.2 Averages across Columns
21.4.1 Simple Extrema Functions
21.4.2 Generalized Extrema Functions
21.4.3 Multiple Criteria Extrema Functions
21.4.4 GREATEST() and LEAST() Functions
21.5 The LIST() Aggregate Function
21.5.1 The LIST() Function with a Procedure
21.5.2 The LIST() Function by Crosstabs
21.6 The PRD() Aggregate Function
21.6.1 PRD() Function by Expressions
21.6.2 The PRD() Aggregate Function by Logarithms
21.7 Bitwise Aggregate Functions
21.7.1 Bitwise OR Aggregate Function
21.7.2 Bitwise AND Aggregate Function
22.1 The Sequence Table
22.1.1 Enumerating a List
22.1.2 Mapping a Sequence into a Cycle
22.1.3 Replacing an Iterative Loop
22.2 Lookup Auxiliary Tables
22.2.2 Multiple Translation Auxiliary Tables
22.2.3 Multiple Parameter Auxiliary Tables
22.2.4 Range Auxiliary Tables
22.2.5 Hierarchical Auxiliary Tables
22.2.6 One True Lookup Table
22.3 Auxiliary Function Tables
22.3.1 Inverse Functions with Auxiliary Tables
22.3.2 Interpolation with Auxiliary Function Tables
22.4 Global Constants Tables
Statistics in SQL
23.1 The Mode
23.3 The Median
23.3.1 Date’s First Median
23.3.2 Celko’s First Median
23.3.4 Murchison’s Median
23.3.5 Celko’s Second Median
23.3.6 Vaughan’s Median with VIEWs
23.3.7 Median with Characteristic Function
23.3.8 Celko’s Third Median
23.3.9 Ken Henderson’s Median
23.4 Variance and Standard Deviation
23.6 Cumulative Statistics
23.6.1 Running Totals
23.6.2 Running Differences
23.6.3 Cumulative Percentages
23.6.4 Rankings and Related Statistics
23.6.5 Quintiles and Related Statistics
23.7 Cross Tabulations
23.7.1 Crosstabs by Cross Join
23.7.2 Crosstabs by Outer Joins
23.7.3 Crosstabs by Subquery
23.8 Harmonic Mean and Geometric Mean
23.9.1 Covariance
23.9.2 Pearson’s r
23.9.3 NULLs in Multivariable Descriptive Statistics
Regions, Runs, Gaps, Sequences, and Series
24.1 Finding Subregions of Size (n)
24.2 Numbering Regions
24.3 Finding Regions of Maximum Size
24.5 Run and Sequence Queries
24.5.1 Filling in Sequence Numbers
24.6 Summation of a Series
24.7 Swapping and Sliding Values in a List
24.9 Folding a List of Numbers
24.10 Coverings
Arrays in SQL
25.1 Arrays via Named Columns
25.2 Arrays via Subscript Columns
25.3 Matrix Operations in SQL
25.3.2 Matrix Addition
25.3.3 Matrix Multiplication
25.4 Flattening a Table into an Array
25.5 Comparing Arrays in Table Format
Set Operations
26.1 UNION and UNION ALL
26.1.1 Order of Execution
26.1.3 UNION of Columns from the Same Table
26.2 INTERSECT and EXCEPT
26.2.1 INTERSECT and EXCEPT without NULLs and Duplicates
26.2.2 INTERSECT and EXCEPT with NULLs and Duplicates
26.3 A Note on ALL and SELECT DISTINCT
26.4 Equality and Proper Subsets
27.1 Every nth Item in a Table
27.2 Picking Random Rows from a Table
27.3.1 Proper Subset Operators
27.3.2 Table Equality
27.4 Picking a Representative Subset
Trees and Hierarchies in SQL
28.1 Adjacency List Model
28.1.1 Complex Constraints
28.1.2 Procedural Traversal for Queries
28.2 The Path Enumeration Model
28.2.1 Finding Subtrees and Nodes
28.2.3 Deleting Nodes and Subtrees
28.3 Nested Set Model of Hierarchies
28.3.1 The Counting Property
28.3.2 The Containment Property
28.3.3 Subordinates
28.3.5 Deleting Nodes and Subtrees
28.3.6 Converting Adjacency List to Nested Set Model
28.4 Other Models for Trees and Hierarchies
Temporal Queries
29.1 Temporal Math
29.2 Personal Calendars
29.3.1 Gaps in a Time Series
29.3.2 Continuous Time Periods
29.3.3 Missing Times in Contiguous Events
29.3.4 Locating Dates
29.3.5 Temporal Starting and Ending Points
29.3.6 Average Wait Times
29.4 Julian Dates
29.5 Date and Time Extraction Functions
29.6 Other Temporal Functions
29.7 Weeks
29.7.1 Sorting by Weekday Names
29.8 Modeling Time in Tables
29.8.1 Using Duration Pairs
29.9 Calendar Auxiliary Table
29.10.1 The Zeros
29.10.2 Leap Year
29.10.3 The Millennium
29.10.4 Weird Dates in Legacy Data
29.10.5 The Aftermath
Graphs in SQL
30.1.1 All Nodes in the Graph
30.1.3 Reachable Nodes
30.1.5 Indegree and Outdegree
30.1.6 Source, Sink, Isolated, and Internal Nodes
30.2 Paths in a Graph
30.2.2 Shortest Path
30.2.3 Paths by Iteration
30.2.4 Listing the Paths
30.3 Acyclic Graphs as Nested Sets
30.4 Paths with CTE
30.4.1 Nonacyclic Graphs
30.5 Adjacency Matrix Model
30.6 Points inside Polygons
OLAP in SQL
31.1 Star Schema
31.2.2 Row Numbering
31.2.3 GROUPING Operators
31.2.4 The Window Clause
31.2.5 OLAP Examples of SQL
31.2.6 Enterprise-Wide Dimensional Layer
31.3 A Bit of History
32.1 Sessions
32.2.1 Atomicity
32.2.3 Isolation
32.3.1 The Five Phenomena
32.3.2 The Isolation Levels
32.4 Pessimistic Concurrency Control
32.5 SNAPSHOT Isolation: Optimistic Concurrency
32.6 Logical Concurrency Control
32.7 Deadlock and Livelocks
Optimizing SQL
33.1.2 Indexed Access
33.2 Expressions and Unnested Queries
33.2.1 Use Simple Expressions
33.3 Give Extra Join Information in Queries
33.4 Index Tables Carefully
33.5 Watch the IN Predicate
33.6 Avoid UNIONs
33.7 Prefer Joins over Nested Queries
33.9 Avoid Sorting
33.10 Avoid CROSS JOINs
33.11 Learn to Use Indexes Carefully
33.12 Order Indexes Carefully
33.13 Know Your Optimizer
33.14 Recompile Static SQL after Schema Changes
33.15 Temporary Tables Are Sometimes Handy
33.16 Update Statistics
References
Index
About the Author