Monday, June 30, 2008

New Database Synchronization Tool Helps Sql Developers and Dbas Keep Databases in Sync

By Perpetuum Software Team

The new version of Database Restyle - Library includes a software component and a visual wizard. The component is integrated directly into your application, so you can run database synchronization from within the application itself. This is a clear competitive advantage: you don't need to install additional software on the client side or incur the extra cost of doing so.

The visual tool lets you save a database snapshot, compare two databases, and synchronize them. It works as a wizard: a convenient, intuitive interface walks you through these tasks in a few mouse clicks. You specify the connections to the databases and compare them; the product then builds a compare list of every object that needs to be modified, added, or removed and generates the synchronization script. All you do is click the buttons. The wizard presents schema differences visually, so you can grasp them at a glance. It's all about saving time: you don't have to figure out how the product works internally, you just follow the prompts from one step to the next. No coding, no manual comparison, no hand-written scripts!

As a result, all generated synchronization scripts follow a uniform style and conform to corporate standards. Database developers and DBAs no longer waste time writing scripts by hand only to discover later that they are unacceptable for some reason. Database Restyle – Library is a real lifesaver, letting developers focus on optimizing the production database instead of spending all their time making sure it is up to date.

Automatic script generation and synchronization doesn't mean you give up control of the process. Developers and DBAs can intervene at any stage of database synchronization and adjust virtually any aspect of the work. Using the Database Restyle component, they can redefine procedures, add extra entities, execute additional scripts, skip certain objects during synchronization, and so on. Using the wizard, they can select the objects to be synchronized and review the generated scripts.

Both the visual and non-visual parts of Database Restyle - Library are easy to use: the component exposes only a few methods for retrieving and synchronizing database schemas, and the wizard requires no knowledge of how the product works internally.

Using the product, you will be able to execute database synchronization in two ways: directly from your application and via a stand-alone tool. Get both features in a single product!

Adding Database Restyle – Library to a project significantly increases its ROI: the product pays for itself within 2-4 of your developers' billing hours. Quite a good deal, isn't it?

Sunday, June 29, 2008

Fix and Recover Corrupted DBF Databases Automatically



By Afonin Oleg

Heavy-load workstations and busy environments are home to most database applications. A system crash or a power failure can easily damage an open database, causing loss of data and corrupted database structures. Once the problem is fixed and the workstation rebooted, the corrupted database will fail to open.

What would you do when pressured to get the database up and running right away? Reach for a backup, only to discover it's a week old? Or attempt to recover the database yourself, risking irreparable damage to what's still left?

Don't panic! If your database uses DBF files, your chances of getting back all of the original data are extremely high if you use the right tool. Even if you know nothing about the internal structure of DBF files, you can still get the database fixed easily and automatically.

DBF Repair tool by www.dbf2002.com/dbf-recovery/ recovers damaged and corrupted DBF files completely automatically. You can be a professional database manager or a complete novice - DBF Recovery will repair the broken database regardless of your experience. With DBF Recovery, you simply select the corrupted file, and the program does the rest completely automatically.

Why pay for DBF Recovery instead of just using one of the numerous recovery tools that claim to repair your database in a matter of minutes? While DBF Recovery may not be as fast or as cheap as its freeware competitors, it does a much better job of correctly repairing corrupted database structures and data records. Unlike many free database repair tools, DBF Recovery does not limit its operation to just the headers. The newly developed recovery engine automatically detects the exact file format and the database product that created a DBF file, then thoroughly analyzes and repairs the structure of the database as well as all data records, resulting in the most comprehensive, highest-quality recovery.

Have more corruption in supporting files? DBF Recovery fixes memo-files such as DBT and FPT in addition to DBF.

While new database administrators will certainly appreciate the unprecedented level of automation and the ease of use provided by DBF Recovery, expert users will enjoy the advanced features that provide even more automation to the database recovery process. Comprehensive command line parameters and batch mode support allow using DBF Recovery to process multiple databases automatically, or to fix certain database files on Windows startup.

Novice database users and advanced system administrators alike will appreciate the time savings provided by DBF Recovery. Supporting all DBF databases, including dBase III and IV, FoxPro, and Visual FoxPro, DBF Recovery is the perfect choice for a concerned database administrator. Download a free evaluation version of DBF Recovery and rescue your DBF databases after corruption!

Friday, June 20, 2008

Introductory Transact-SQL (part 10 - last)

By Addison Wesley

Summary

This concludes Introductory Transact-SQL. You should now be able to create a database, build tables, and populate those tables with data. You should also be familiar with the basic syntax required for querying tables and for making rudimentary changes to them. Be sure you have a good grasp of basic Transact-SQL before proceeding with the rest of the book.
About the Author

Ken Henderson, a nationally recognized consultant and leading DBMS practitioner, consults on high-end client/server projects for such customers as the U.S. Air Force, the U.S. Navy, H&R Block, Travelers Insurance, J.P. Morgan, the CIA, Owens-Corning, and CNA Insurance. He is the author of five previous books on client/server and DBMS development, a frequent magazine contributor to such publications as Software Development Magazine and DBMS Magazine, and a speaker at technical conferences.

Introductory Transact-SQL (part 9)

By Addison Wesley

Column Aliases

You might have noticed that some of the earlier queries in this chapter use logical column names for aggregate functions such as COUNT() and SUM(). Labels such as these are known as column aliases and make the query and its result set more readable. As with joins, Transact-SQL provides two separate syntaxes for establishing column aliases: legacy or classical and ANSI standard. In the classical syntax, the column alias immediately precedes the column and the two are separated with an equal sign, like so:

SELECT TodaysDate=GETDATE()

ANSI syntax, by contrast, places a column alias immediately to the right of its corresponding column and optionally separates the two with the AS keyword, like so:

SELECT GETDATE() AS TodaysDate

or

SELECT GETDATE() TodaysDate

Unlike joins, the column alias syntax you choose won't affect query result sets. This is largely a matter of preference, though it's always advisable to use the ANSI syntax when you can if for no other reason than compatibility with other products.

You can use column aliases for any item in a result set, not just aggregate functions. For example, the following example substitutes the column alias LName for the LastName column in the result set:

SELECT customers.LastName AS LName, COUNT(*) AS NumberWithName
FROM customers
GROUP BY customers.LastName

Note, however, that you cannot use column aliases in other parts of the query except in the ORDER BY clause. In the WHERE, GROUP BY, and HAVING clauses, you must use the actual column name or value. In addition to supporting column aliases, ORDER BY supports a variation on this in which you can specify a sort column by its ordinal position in the SELECT list, like so:

SELECT FirstName, LastName
FROM customers
ORDER BY 2

This syntax has been deprecated and is less clear than simply using a column name or alias.
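
A clearer alternative is to sort on the alias itself. For example, the earlier alias query could be ordered by LName like this (a minimal sketch reusing the customers table):

SELECT customers.LastName AS LName, COUNT(*) AS NumberWithName
FROM customers
GROUP BY customers.LastName
ORDER BY LName
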
Table Aliases

Similar to column aliases, you can use table aliases to avoid having to refer to a table's full name. You specify table aliases in the FROM clause of queries. Place the alias to the right of the actual table name (optionally separated with the AS keyword), as illustrated here:

SELECT c.LastName, COUNT(*) AS NumberWithName
FROM customers AS c
GROUP BY c.LastName

Notice that the alias can be used in the field list of the SELECT list before it is even syntactically defined. This is possible because a query's references to database objects are resolved before the query is executed.
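
Table aliases really pay off in joins, where the full table names would otherwise have to be repeated in every ON clause. Here's a minimal sketch using the same customers and orders tables:

SELECT c.LastName, o.Amount
FROM customers AS c JOIN orders AS o
ON (c.CustomerNumber=o.CustomerNumber)
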
Managing Transactions

Transaction management is really outside the scope of introductory T-SQL. Nevertheless, transactions are at the heart of database applications development and a basic understanding of them is key to writing good SQL (see Chapter 14, "Transactions," for in-depth coverage of transactions).

The term transaction refers to a group of changes to a database. Transactions provide for change atomicity—which means that either all the changes within the group occur or none of them do. SQL Server applications use transactions to ensure data integrity and to avoid leaving the database in an interim state if an operation fails.

The COMMIT command writes a transaction permanently to disk (technically speaking, if nested transactions are present, this is true only of the outermost COMMIT, but that's an advanced topic). Think of it as a database save command. ROLLBACK, by contrast, throws away the changes a transaction would have made to the database; it functions like a database undo command. Both of these commands affect only the changes made since the last COMMIT; you cannot roll back changes that have already been committed.

Unless the IMPLICIT_TRANSACTIONS session variable has been enabled, you must explicitly start a transaction in order to commit or roll it back. Transactions can be nested, and you can check the current nesting level by querying the @@TRANCOUNT automatic variable, like so:

SELECT @@TRANCOUNT AS TranNestingLevel

Here's an example of some Transact-SQL code that uses transactions to undo changes to the database:

BEGIN TRAN
DELETE customers
GO
ROLLBACK
SELECT * FROM customers

CustomerNumber LastName FirstName StreetAddress City State Zip
-------------- -------- --------- --------------- ----- ----- -----
1 Doe John 123 Joshua Tree Plano TX 75025
2 Doe Jane 123 Joshua Tree Plano TX 75025
3 Citizen John 57 Riverside Reo CA 90120

As you can see, ROLLBACK reverses the row removals carried out by the DELETE statement.
Caution
Be sure to match BEGIN TRAN with either COMMIT or ROLLBACK. Orphaned transactions can cause serious performance and management problems on the server.
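
For completeness, here's a minimal sketch (not from the original text; the new Zip value is purely illustrative) that commits its change and then confirms the nesting level has dropped back to zero:

BEGIN TRAN
UPDATE customers
SET Zip='75026'
WHERE CustomerNumber=1
COMMIT
SELECT @@TRANCOUNT AS TranNestingLevel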

Introductory Transact-SQL (part 8)

By Addison Wesley

Grouping Data

Since SQL is a set-oriented query language, statements that group or summarize data are its bread and butter. In conjunction with aggregate functions, they are the means by which the real work of SQL queries is performed. Developers familiar with DBMS products that lean more toward single-record handling find this peculiar because they are accustomed to working with data one row at a time. Generating summary information by looping through a table is a common technique in older database products—but not in SQL Server. A single SQL statement can perform tasks that used to require an entire COBOL program to complete. This magic is performed using SELECT's GROUP BY clause and Transact-SQL aggregate functions. Here's an example:

SELECT customers.CustomerNumber,
SUM(orders.Amount) AS TotalOrders
FROM customers JOIN orders
ON customers.CustomerNumber=orders.CustomerNumber
GROUP BY customers.CustomerNumber

This query returns a list of all customers and the total amount of each customer's orders.

How do you know which fields to include in the GROUP BY clause? You must include all the items in the SELECT statement's column list that are not aggregate functions or absolute values. Take the following SELECT statement:

-- Bad SQL - don't do this
SELECT customers.CustomerNumber, customers.LastName,
SUM(orders.Amount) AS TotalOrders
FROM customers JOIN orders
ON customers.CustomerNumber=orders.CustomerNumber
GROUP BY customers.CustomerNumber

This query won't execute because it's missing a column in the GROUP BY clause. Instead, it should read:

GROUP BY customers.CustomerNumber, customers.LastName

Note that the addition of the LastName column doesn't really affect the results since CustomerNumber is a unique key. That is, including LastName as a GROUP BY column won't cause any additional grouping levels to be produced since there is only one LastName for each CustomerNumber.
HAVING

The HAVING clause is used to limit the rows returned by a SELECT with GROUP BY. Its relationship to GROUP BY is similar to the relationship between the WHERE clause and the SELECT itself. Like the WHERE clause, it restricts the rows returned by a SELECT statement. Unlike WHERE, it operates on the rows in the result set rather than the rows in the query's tables. Here's the previous query modified to include a HAVING clause:

SELECT customers.CustomerNumber,
customers.LastName, SUM(orders.Amount)
AS TotalOrders
FROM customers JOIN orders
ON customers.CustomerNumber=orders.CustomerNumber
GROUP BY customers.CustomerNumber, customers.LastName
HAVING SUM(orders.Amount) > 700

CustomerNumber LastName TotalOrders
-------------- -------- -----------
3 Citizen 86753.09
1 Doe 802.35

There is often a better way of qualifying a query than by using a HAVING clause. In general, HAVING is less efficient than WHERE because it qualifies the result set after it's been organized into groups; WHERE does so beforehand. Here's an example that improperly uses the HAVING clause:

-- Bad SQL - don't do this
SELECT customers.LastName, COUNT(*) AS NumberWithName
FROM customers
GROUP BY customers.LastName
HAVING customers.LastName<>'Citizen'

Properly written, this query's filter criteria should be in its WHERE clause, like so:

SELECT customers.LastName, COUNT(*) AS NumberWithName
FROM customers
WHERE customers.LastName<> 'Citizen'
GROUP BY customers.LastName

In fact, SQL Server recognizes this type of HAVING misuse and translates HAVING into WHERE during query execution. Regardless of whether SQL Server catches errors like these, it's always better to write optimal code in the first place.
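
When you need both kinds of filtering, use the two clauses together: WHERE to remove rows before grouping, HAVING to remove groups afterward. Here's a minimal sketch against the same customers table:

SELECT customers.LastName, COUNT(*) AS NumberWithName
FROM customers
WHERE customers.State='TX'
GROUP BY customers.LastName
HAVING COUNT(*) > 1
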
Ordering Data

The ORDER BY clause is used to order the rows returned by a query. It follows the WHERE and GROUP BY clauses (if they exist) and sorts the result set just prior to returning it. Here's an example:

SELECT LastName, State
FROM customers
ORDER BY State

Here's another example:

SELECT FirstName, LastName
FROM customers
ORDER BY LastName DESC

Note the use of the DESC keyword to sort the rows in descending order. If not directed otherwise, ORDER BY always sorts in ascending order.
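
ORDER BY can also sort on several columns at once, each with its own direction, as in this sketch:

SELECT FirstName, LastName, State
FROM customers
ORDER BY State ASC, LastName DESC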

Introductory Transact-SQL (part 7)

By Addison Wesley

Other Types of Joins

Thus far, we've explored only left joins—both inner and outer. There are a few others that are worth mentioning as well. Transact-SQL also supports RIGHT OUTER JOINs, CROSS JOINs, and FULL OUTER JOINs.

A RIGHT OUTER JOIN isn't really that different from a LEFT OUTER JOIN. In fact, it's really just a LEFT OUTER JOIN with the tables reversed. It's very easy to restate a LEFT OUTER JOIN as a RIGHT OUTER JOIN. Here's the earlier LEFT OUTER JOIN query restated:

SELECT customers.CustomerNumber+2, orders.Amount
FROM orders RIGHT OUTER JOIN customers ON
(customers.CustomerNumber+2=orders.CustomerNumber)

Amount
------ --------
3 86753.09
4 NULL
5 NULL

A RIGHT JOIN returns the columns in the first table as NULLs when the join condition fails. Since you decide which table is the first and which is the second, whether you use a LEFT JOIN or a RIGHT JOIN is largely a matter of preference.
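
Restated as a LEFT OUTER JOIN, the same query simply swaps the table order and returns an identical result set:

SELECT customers.CustomerNumber+2, orders.Amount
FROM customers LEFT OUTER JOIN orders ON
(customers.CustomerNumber+2=orders.CustomerNumber)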

A CROSS JOIN, by contrast, is an intentional Cartesian product. The size of a Cartesian product is the number of rows in one table multiplied by those in the other. So for two tables with three rows each, their CROSS JOIN or Cartesian product would consist of nine rows. By definition, CROSS JOINs don't need or support the use of the ON clause that other joins require. Here's a CROSS JOIN of the customers and orders tables:

SELECT customers.CustomerNumber, orders.Amount
FROM orders CROSS JOIN customers

CustomerNumber Amount
-------------- --------
1 123.45
1 678.90
1 86753.09
1 678.90
2 123.45
2 678.90
2 86753.09
2 678.90
3 123.45
3 678.90
3 86753.09
3 678.90

(12 row(s) affected)

A FULL OUTER JOIN returns rows from both tables regardless of whether the join condition succeeds. When a join column in the first table fails to find a match in the second, the values from the second table are returned as NULL, just as they are with a LEFT OUTER JOIN. When the join column in the second table fails to find a matching value in the first table, columns in the first table are returned as NULL, as they are in a RIGHT OUTER JOIN. You can think of a FULL OUTER JOIN as the combination of a LEFT JOIN and a RIGHT JOIN. Here's the earlier LEFT OUTER JOIN restated as a FULL OUTER JOIN:

SELECT customers.CustomerNumber+2, orders.Amount
FROM customers FULL OUTER JOIN orders ON
(customers.CustomerNumber+2=orders.CustomerNumber)

Amount
------ --------
3 86753.09
4 NULL
5 NULL
NULL 123.45
NULL 678.90
NULL 678.90

Subqueries

A SELECT statement that's enclosed in parentheses and embedded within another query (usually in its WHERE clause) is called a subquery. A subquery is normally used to return a list of items that is then compared against a column in the main query. Here's an example:

SELECT * FROM customers
WHERE CustomerNumber IN (SELECT CustomerNumber FROM orders)

Of course, you could accomplish the same thing with an inner join. In fact, the SQL Server optimizer turns this query into an inner join internally. However, you get the idea—a subquery returns an item or set of items that you may then use to filter a query or return a column value.
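
Here's the same filter rewritten as an inner join, a sketch of what the optimizer effectively does; DISTINCT is needed because a customer with several orders would otherwise appear more than once:

SELECT DISTINCT customers.*
FROM customers JOIN orders
ON (customers.CustomerNumber=orders.CustomerNumber)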

Introductory Transact-SQL (part 6)

By Addison Wesley

Joins
A query that can access all the data it needs in a single table is a pretty rare one. John Donne said "no man is an island," and, in relational databases, no table is, either. Usually, a query will have to go to two or more tables to find all the information it requires. This is the way of things with relational databases. Data is intentionally spread out to keep it as modular as possible. There are lots of good reasons for this modularization (formally known as normalization) that I won't go into here, but one of its downsides is that what might be a single conceptual entity (an invoice, for example) is often split into multiple physical entities when constructed in a relational database.

Dealing with this fragmentation is where joins come in. A join consolidates the data in two tables into a single result set. The tables aren't actually merged; they just appear to be in the rows returned by the query. Multiple joins can consolidate multiple tables—it's quite common to see joins that are multiple levels deep involving scads of tables.

A join between two tables is established by linking a column or columns in one table with those in another (CROSS JOINs are an exception, but more on them later). The expression used to join the two tables constitutes the join condition or join criterion. When the join is successful, data in the second table is combined with the first to form a composite result set—a set of rows containing data from both tables. In short, the two tables have a baby, albeit an evanescent one.

There are two basic types of joins, inner joins and outer joins. The key difference between them is that outer joins include rows in the result set even when the join condition isn't met, while an inner join doesn't. How is this? What data ends up in the result set when the join condition fails? When the join criteria in an outer join aren't met, columns in the first table are returned normally, but columns from the second table are returned with no value—as NULLs. This is handy for finding missing values and broken links between tables.

There are two families of syntax for constructing joins—legacy and ANSI/ISO SQL-92 compliant. The legacy syntax dates back to SQL Server's days as a joint venture between Sybase and Microsoft. It's more succinct than the ANSI syntax and looks like this:

SELECT customers.CustomerNumber, orders.Amount
FROM customers, orders
WHERE customers.CustomerNumber=orders.CustomerNumber

CustomerNumber Amount
-------------- --------
1 123.45
2 678.90
3 86753.09
1 678.90

Note the use of the WHERE clause to join the customers and orders tables together. This is an inner join. If an order doesn't exist for a given customer, that customer is omitted completely from the list. Here's the ANSI version of the same query:

SELECT customers.CustomerNumber, orders.Amount
FROM customers JOIN orders
ON (customers.CustomerNumber=orders.CustomerNumber)

This one's a bit loquacious, but the end result is the same: customers and orders are merged using their respective CustomerNumber columns.

As I mentioned earlier, it's common for queries to construct multilevel joins. Here's an example of a multilevel join that uses the legacy syntax:

SELECT customers.CustomerNumber, orders.Amount,
items.Description
FROM customers, orders, items
WHERE customers.CustomerNumber=orders.CustomerNumber
AND orders.ItemNumber=items.ItemNumber

CustomerNumber Amount Description
-------------- -------- -----------
1 123.45 WIDGET A
2 678.90 WIDGET B
3 86753.09 WIDGET C
1 678.90 WIDGET B

This query joins the composite of the customers table and the orders table with the items table. Note that the exact ordering of the WHERE clause is unimportant. In order to allow servers to fully optimize queries, SQL requires that the ordering of the predicates in a WHERE clause must not affect the result set. They must be associative—the query must return the same result regardless of the order in which they're processed.

As with the two-table join, the ANSI syntax for multitable inner joins is similar to the legacy syntax. Here's the ANSI syntax for the multitable join above:

SELECT customers.CustomerNumber, orders.Amount, items.Description
FROM customers JOIN orders
ON (customers.CustomerNumber=orders.CustomerNumber)
JOIN items ON (orders.ItemNumber=items.ItemNumber)

Again, it's a bit wordier, but it performs the same function.
Outer Joins

Thus far, there hasn't been a stark contrast between the ANSI and legacy join syntaxes. Though not syntactically identical, they seem to be functionally equivalent.

This all changes with outer joins. The ANSI outer join syntax addresses ambiguities inherent in using the WHERE clause—whose terms are by definition associative—to perform table joins. Here's an example of the legacy syntax that contains such ambiguities:

-- Bad SQL - Don't run
SELECT customers.CustomerNumber, orders.Amount, items.Description
FROM customers, orders, items
WHERE customers.CustomerNumber*=orders.CustomerNumber
AND orders.ItemNumber*=items.ItemNumber

Don't bother trying to run this—SQL Server won't allow it. Why? Because WHERE clause terms are required to be associative, but these aren't. If customers and orders are joined first, those rows where a customer exists but has no orders will be impossible to join with the items table since their ItemNumber column will be NULL. On the other hand, if orders and items are joined first, the result set will include ITEM records it likely would have otherwise missed. So the order of the terms in the WHERE clause is significant when constructing multilevel joins using the legacy syntax.

It's precisely because of this ambiguity—whether the ordering of WHERE clause predicates is significant—that the SQL-92 standard moved join construction to the FROM clause. Here's the above query rewritten using valid ANSI join syntax:

SELECT customers.CustomerNumber, orders.Amount, items.Description
FROM customers LEFT OUTER JOIN orders ON
(customers.CustomerNumber=orders.CustomerNumber)
LEFT OUTER JOIN items ON (orders.ItemNumber=items.ItemNumber)

CustomerNumber Amount Description
-------------- -------- -----------
1 123.45 WIDGET A
1 678.90 WIDGET B
2 678.90 WIDGET B
3 86753.09 WIDGET C

Here, the ambiguities are gone, and it's clear that the query is first supposed to join the customers and orders tables, then join the result with the items table. (Note that the OUTER keyword is optional.)

To understand how this shortcoming in the legacy syntax can affect query results, consider the following query. We'll set it up initially so that the outer join works as expected:

SELECT customers.CustomerNumber, orders.Amount
FROM customers, orders
WHERE customers.CustomerNumber*=orders.CustomerNumber
AND orders.Amount>600

CustomerNumber Amount
-------------- --------
1 678.90
2 678.90
3 86753.09

Since every row in customers finds a match in orders, the problem isn't obvious. Now let's change the query so that there are a few mismatches between the tables, like so:

SELECT customers.CustomerNumber+2, orders.Amount
FROM customers, orders
WHERE customers.CustomerNumber+2*=orders.CustomerNumber
AND orders.Amount>600

This version simply adds 2 to CustomerNumber to ensure that at least a few of the joins will fail and the columns in orders will be returned as NULLs. Here's the result set:

CustomerNumber Amount
-------------- --------
3 86753.09
4 NULL
5 NULL

See the problem? Those last two rows shouldn't be there. Amount is NULL in those rows (because there are no orders for customers 4 and 5), and whether it exceeds $600 is unknown. The query is supposed to return only those rows whose Amount column is known to exceed $600, but that's not the case. Here's the ANSI version of the same query:

SELECT customers.CustomerNumber+2, orders.Amount
FROM customers LEFT OUTER JOIN orders ON
(customers.CustomerNumber+2=orders.CustomerNumber)
WHERE orders.Amount>600

CustomerNumber Amount
-------------- --------
3 86753.09

The SQL-92 syntax correctly omits the rows with a NULL Amount. The reason the legacy query fails here is that the predicates in its WHERE clause are evaluated together. When Amount is checked against the >600 predicate, it has not yet been returned as NULL, so it's erroneously included in the result set. By the time it's set to NULL, it's already in the result set, effectively negating the >600 predicate.

Though the inner join syntax you choose is largely a matter of preference, you should still use the SQL-92 syntax whenever possible. It's hard enough keeping up with a single way of joining tables, let alone two different ways. And, as we've seen, there are some real problems with the legacy outer join syntax. Moreover, Microsoft strongly recommends the use of the ANSI syntax and has publicly stated that the legacy outer join syntax will be dropped in a future release of the product. Jumping on the ANSI/ISO bandwagon also makes sense from another perspective: interoperability. Given the way in which the DBMS world—like the real world—is shrinking, it's not unusual for an application to communicate with or rely upon more than one vendor's DBMS. Heterogeneous joins, passthrough queries, and vendor-to-vendor replication are now commonplace. Knowing this, it makes sense to abandon proprietary syntax elements in favor of those that play well with others.

Introductory Transact-SQL (part 5)

By Addison Wesley

Converting Data Types

Converting data between types is equally simple. You can use either the CAST() or CONVERT() function to convert one data type to another, but CAST() is the SQL-92-compliant method. Here's a SELECT that converts the Amount column in the orders table to a character string:

SELECT CAST(Amount AS varchar) FROM orders

--------
123.45
678.90
86753.09
678.90

Here's an example that illustrates how to convert a datetime value to a character string using a specific format:

SELECT CONVERT(char(8), GETDATE(),112)
--------
19690720

This example highlights one situation in which CONVERT() offers superior functionality to CAST(). CONVERT() supports a style parameter (the third argument above) that specifies the exact format to use when converting a datetime value to a character string. You can find the table of supported styles in the Books Online, but styles 102 and 112 are probably the most common.
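
For instance, here's a quick sketch (not from the original text) contrasting the two: style 102 formats the date as yyyy.mm.dd, while style 112 yields yyyymmdd:

SELECT CONVERT(char(10), GETDATE(), 102) AS AnsiStyle,
CONVERT(char(8), GETDATE(), 112) AS IsoStyle
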
CASE

In the examples throughout this book, you'll find copious use of the CASE function. CASE has two basic forms. In the simpler form, you specify result values for each member of a series of expressions that are compared to a determinant or key expression, like so:

SELECT CASE sex
WHEN 0 THEN 'Unknown'
WHEN 1 THEN 'Male'
WHEN 2 THEN 'Female'
ELSE 'Not applicable'
END

In the more complex form, known as a "searched" CASE, you specify individual result values for multiple, possibly distinct, logical expressions, like this:

SELECT CASE
WHEN DATEDIFF(dd,RentDueDate,GETDATE())>15 THEN Deposit
WHEN DATEDIFF(dd,RentDueDate,GETDATE())>5 THEN DailyPenalty*
DATEDIFF(dd,RentDueDate,GETDATE())
ELSE 0
END

A searched CASE is similar to an embedded IF...ELSE, with each WHEN performing the function of a new ELSE clause.

Personally, I've never liked the CASE syntax. I like the idea of a CASE function, but I find the syntax unwieldy. It behaves like a function in that it can be nested within other expressions, but syntactically, it looks more like a flow-control statement. In some languages, "CASE" is a flow-control keyword that's analogous to the C/C++ switch statement. In Transact-SQL, CASE is used similarly to an inline or "immediate" IF—it returns a value based on if-then-else logic. Frankly, I think it would make a lot more sense for the syntax to read something like this:

CASE(sex, 0, 'Unknown', 1, 'Male', 2, 'Female', 'Unknown')

or

CASE(DATEDIFF(dd,RentDueDate,GETDATE())>15, Deposit,
DATEDIFF(dd,RentDueDate,GETDATE())>5, DailyPenalty*
DATEDIFF(dd,RentDueDate,GETDATE()),0)

This is the way that the Oracle DECODE() function works. It's more compact and much easier to look at than the cumbersome ANSI CASE syntax.
Aggregate Columns

Aggregate columns consist of special functions that perform some calculation on a set of data. Examples of aggregates include the COUNT(), SUM(), AVG(), MIN(), STDEV(), VAR(), and MAX() functions. They're best understood by example. Here's a command that returns the total number of customer records on file:

SELECT COUNT(*) FROM customers

Here's one that returns the dollar amount of the largest order on file:

SELECT MAX(Amount) FROM orders

And here's one that returns the total dollar amount of all orders:

SELECT SUM(Amount) FROM orders

Aggregate functions are often used in tandem with SELECT's GROUP BY clause (covered below) to produce grouped or partitioned aggregates. They can be employed in other uses as well (e.g., to "hide" normally invalid syntax), as the chapters on statistical computations illustrate.
Filtering Data

You use the SQL WHERE clause to qualify the data a SELECT statement returns. It can also be used to limit the rows affected by an UPDATE or DELETE statement. Here are some queries that use WHERE to filter the data they return:

SELECT UPPER(LastName), FirstName
FROM customers
WHERE State='TX'

FirstName
--- ---------
DOE John
DOE Jane

The following code restricts the customers returned to those whose address contains the word "Joshua."

SELECT LastName, FirstName, StreetAddress FROM customers
WHERE StreetAddress LIKE '%Joshua%'

LastName FirstName StreetAddress
-------- --------- ---------------
Doe John 123 Joshua Tree
Doe Jane 123 Joshua Tree

Note the use of "%" as a wildcard. The SQL wildcard % (percent sign) matches zero or more instances of any character, while _ (underscore) matches exactly one.
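
For example, this sketch against the customers table uses the underscore to match last names whose second and third letters are "oe," whatever the first letter happens to be:

SELECT LastName, FirstName
FROM customers
WHERE LastName LIKE '_oe%'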

Here's a query that returns the orders exceeding $500:

SELECT OrderNumber, OrderDate, Amount
FROM orders
WHERE Amount > 500

OrderNumber OrderDate Amount
----------- ----------------------- --------
102 1992-02-27 00:00:00.000 678.90
103 1995-05-20 00:00:00.000 86753.09
104 1997-11-21 00:00:00.000 678.90

The following example uses the BETWEEN operator to return orders occurring between October 1990 and May 1995, inclusive. I've included the time with the second of the two dates because, without it, the time would default to midnight (SQL Server datetime columns always store both the date and time; an omitted time defaults to midnight), making the query noninclusive: it would return only orders placed up through the first millisecond of May 31.

SELECT OrderNumber, OrderDate, Amount FROM orders
WHERE OrderDate BETWEEN '10/01/90' AND '05/31/95 23:59:59.999'

OrderNumber OrderDate Amount
----------- ----------------------- --------
101 1990-10-18 00:00:00.000 123.45
102 1992-02-27 00:00:00.000 678.90
103 1995-05-20 00:00:00.000 86753.09

Introductory Transact-SQL (part 4)

By Addison Wesley

Querying Data

The SELECT command is used to query tables and views for data. You specify what you want via a SELECT statement, and the server "serves" it to you via a result set—a collection of rows containing the data you requested. SELECT is the Swiss Army knife of basic SQL. It can join tables, retrieve data you request, assign local variables, and even create other tables. It's a fair guess that you'll use the SELECT statement more than any other single command in Transact-SQL.

We'll begin exploring SELECT by listing the contents of the tables you just built. Execute

SELECT * FROM tablename

in Query Analyzer, replacing tablename with the name of each of the three tables. You should find that the customers and items tables have three rows each, while orders has four.

SELECT * FROM customers

(Results abridged)

CustomerNumber LastName FirstName StreetAddress
-------------- -------- --------- -------------
1 Doe John 123 Joshua Tree
2 Doe Jane 123 Joshua Tree
3 Citizen John 57 Riverside

SELECT * FROM orders

OrderNumber OrderDate CustomerNumber ItemNumber Amount
----------- ----------------------- -------------- ---------- --------
101 1990-10-18 00:00:00.000 1 1001 123.45
102 1992-02-27 00:00:00.000 2 1002 678.90
103 1995-05-20 00:00:00.000 3 1003 86753.09
104 1997-11-21 00:00:00.000 1 1002 678.90

SELECT * FROM items

ItemNumber Description Price
---------- ----------- --------
1001 WIDGET A 123.45
1002 WIDGET B 678.90
1003 WIDGET C 86753.09

Column Lists

SELECT * returns all the columns in a table. To return a subset of a table's columns, use a comma-delimited field list, like so:

SELECT CustomerNumber, LastName, State FROM customers

CustomerNumber LastName State
-------------- -------- -----
1 Doe TX
2 Doe TX
3 Citizen CA

A SELECT's column can include column references, local variables, absolute values, functions, and expressions involving any combinations of these elements.
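
For instance, a single column list can mix a plain column, a string constant, and an expression, as in this minimal sketch against the customers table:

SELECT LastName, 'CUSTOMER' AS RecordType, RTRIM(City)+', '+State AS CityState
FROM customers
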
SELECTing Variables and Expressions

Unlike most SQL dialects, the FROM clause is optional in Transact-SQL when not querying database objects. You can issue SELECT statements that return variables (automatic or local), functions, constants, and computations without using a FROM clause. For example,

SELECT GETDATE()

returns the system date on the computer hosting SQL Server, and

SELECT CAST(10+1 AS CHAR(2))+'/'+CAST(POWER(2,5)-5
AS CHAR(2))+'/19'+CAST(30+31
AS CHAR(2))

returns a simple string. Unlike Oracle and many other DBMSs, SQL Server doesn't force the inclusion of a FROM clause if it makes no sense to do so. Here's an example that returns an automatic variable:

SELECT @@VERSION

And here's one that returns the current user name:

SELECT SUSER_SNAME()

@@VERSION is an automatic variable that's predefined by SQL Server and read-only. The SQL Server Books Online now refers to these variables as functions, but they aren't functions in the true sense of the word—they're predefined constants or automatic variables (e.g., they can be used as parameters to stored procedures, but true functions cannot). I like variable better than constant because the values they return can change throughout a session—they aren't really constant, they're just read-only as far as the user is concerned. You'll see the term automatic variable used throughout this book.
Functions
Functions can be used to modify a column value in transit. Transact-SQL provides a bevy of functions that can be roughly divided into six major groups: string functions, numeric functions, date functions, aggregate functions, system functions, and meta-data functions. Here's a Transact-SQL function in action:

SELECT UPPER(LastName), FirstName
FROM customers

FirstName
-------------- ---------
DOE John
DOE Jane
CITIZEN John

Here, the UPPER() function is used to uppercase the LastName column as it's returned in the result set. This affects only the result set—the underlying data is unchanged.
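
The same idea applies to the other function groups. Here's a quick sketch (not from the original text) that applies a date function and a numeric function to columns of the orders table:

SELECT DATEPART(yy, OrderDate) AS OrderYear, -- date function
ROUND(Amount, 0) AS RoundedAmount            -- numeric function
FROM orders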

Introductory Transact-SQL (part 3)

By Addison Wesley

Inserting Data

Use the Transact-SQL INSERT statement to add data to a table, one row at a time. Let's explore this by adding some test data to the customers table. Enter the following SQL commands to add three rows to customers:

INSERT INTO customers
VALUES(1,'Doe','John','123 Joshua Tree','Plano','TX','75025')
INSERT INTO customers
VALUES(2,'Doe','Jane','123 Joshua Tree','Plano','TX','75025')
INSERT INTO customers
VALUES(3,'Citizen','John','57 Riverside','Reo','CA','90120')

Now, add four rows to the orders table using the same syntax:

INSERT INTO orders
VALUES(101,'10/18/90',1,1001,123.45)

INSERT INTO orders
VALUES(102,'02/27/92',2,1002,678.90)

INSERT INTO orders
VALUES(103,'05/20/95',3,1003,86753.09)

INSERT INTO orders
VALUES(104,'11/21/97',1,1002,678.90)

Finally, insert three rows into the items table like so:

INSERT INTO items
VALUES(1001,'WIDGET A',123.45)

INSERT INTO items
VALUES(1002,'WIDGET B',678.90)

INSERT INTO items
VALUES(1003,'WIDGET C',86753.09)

Notice that none of these INSERTs specifies a list of fields, only a list of values. The INSERT command defaults to inserting a value for all columns in order, though you could have specified a column list for each INSERT using syntax like this:

INSERT INTO items (ItemNumber, Price)
VALUES(1001,123.45)

Also note that it's unnecessary to follow the table's column order in a column list; however, the order of values you supply must match the order of the column list. Here's an example:

INSERT INTO items (Price, ItemNumber)
VALUES(123.45, 1001)

One final note: The INTO keyword is optional in Transact-SQL. This deviates from the ANSI SQL standard and from most other SQL dialects. The syntax below is equivalent to the previous query:

INSERT items (Price, ItemNumber)
VALUES(123.45, 1001)

Updating Data

Most people eventually want to change the data they've loaded into a database. The SQL UPDATE command is the means by which this happens. Here's an example:

UPDATE customers
SET Zip='86753-0900'
WHERE City='Reo'

Depending on the data, the WHERE clause in this query might limit the UPDATE to a single row or to many rows. You can update all the rows in a table by omitting the WHERE clause:

UPDATE customers
SET State='CA'

You can also update a column using columns in the same table, including the column itself, like so:

UPDATE orders
SET Amount=Amount+(Amount*.07)

Transact-SQL provides a nice extension to the UPDATE command, a FROM clause, that allows you to update the values in one table with those from another. Here's an example:

UPDATE o
SET Amount=Price
FROM orders o JOIN items i ON (o.ItemNumber=i.ItemNumber)

Deleting Data

The SQL DELETE command is used to remove data from tables. To delete all the rows in a table at once, use this syntax:

DELETE FROM customers

Similarly to INSERT, the FROM keyword is optional. Like UPDATE, DELETE can optionally include a WHERE clause to qualify the rows it removes. Here's an example:

DELETE FROM customers
WHERE LastName<>'Doe'

SQL Server provides a quicker, more brute-force command for quickly emptying a table. It's similar to the dBASE ZAP command and looks like this:

TRUNCATE TABLE customers

TRUNCATE TABLE empties a table without logging row deletions in the transaction log. It can't be used with tables referenced by FOREIGN KEY constraints, and it invalidates the transaction log for the entire database. Once the transaction log has been invalidated, it can't be backed up until the next full database backup. TRUNCATE TABLE also circumvents the triggers defined on a table, so DELETE triggers don't fire, even though, technically speaking, rows are being deleted from the table. (See Chapter 4, "DDL Insights," for more information.)

Introductory Transact-SQL (part 2)

By Addison Wesley

Creating a Database

You might already have a database in which you can create some temporary tables for the purpose of working through the examples in this book. If you don't, creating one is easy enough. In Transact-SQL, you create databases using the CREATE DATABASE command. The complete syntax can be quite complex, but here's the simplest form:

CREATE DATABASE GG_TS

Run this command in Query Analyzer to create a scratch database for working through the examples in this book. Behind the scenes, SQL Server creates two operating system files to house the new database: GG_TS.MDF and GG_TS_Log.LDF. Data resides in the first file; transaction log information lives in the second. A database's transaction log is the area where the server first carries out changes made to the data. Once those changes succeed, they're applied atomically—in one piece—to the actual data. It's advantageous for both recoverability and performance to separate user data from transaction log data, so SQL Server defaults to working this way. If you don't specifically indicate a transaction log location (as in the example above), SQL Server selects one for you (the default location is the data directory that was selected during installation).

Notice that we didn't specify a size for the database or for either of the files. Our new database is set up so that it automatically expands as data is inserted into it. Again, this is SQL Server's default mode of operation. This one feature alone—database files that automatically expand as needed—greatly reduces the database administrator's (DBA's) workload by alleviating the need to monitor databases constantly to ensure that they don't run out of space. A full transaction log prevents additional changes to the database, and a full data segment prevents additional data from being inserted.
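
If you do want explicit control over file placement and initial size, the fuller form of CREATE DATABASE accepts file specifications. Here's a hedged sketch; the paths and sizes below are illustrative assumptions, not defaults:

CREATE DATABASE GG_TS
ON (NAME = GG_TS_Data, FILENAME = 'C:\MSSQL\DATA\GG_TS.MDF', SIZE = 10MB)
LOG ON (NAME = GG_TS_Log, FILENAME = 'C:\MSSQL\DATA\GG_TS_Log.LDF', SIZE = 5MB)
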
Creating Tables
Once the database is created, you're ready to begin adding objects to it. Let's begin by creating some tables using SQL's CREATE TABLE statement. To ensure that those tables are created in the new database, be sure to change the current database focus to GG_TS before issuing any of these commands. You can do this two ways: You can execute a USE command—USE GG_TS—in the query edit window prior to executing any other commands, or (assuming you're using Query Analyzer) you can select the new database from the DB: combo-box on the edit window's toolbar (select from this list if your new database is not visible at first). The DB: combo-box reflects the currently selected database, so be sure it points to GG_TS before proceeding.

Execute the following command to create the customers table:

USE GG_TS -- Change the current database context to GG_TS
GO
CREATE TABLE customers
(
CustomerNumber int NOT NULL,
LastName char(30) NOT NULL,
FirstName char(30) NOT NULL,
StreetAddress char(30) NOT NULL,
City char(20) NOT NULL,
State char(2) NOT NULL,
Zip char(10) NOT NULL
)

Once the customers table is built, create the orders table using similar syntax:

CREATE TABLE orders
(
OrderNumber int NOT NULL,
OrderDate datetime NOT NULL,
CustomerNumber int NOT NULL,
ItemNumber int NOT NULL,
Amount numeric(9,2) NOT NULL
)

Most SQL concepts can be demonstrated using three or fewer tables, so we'll create a third table. Create the items table using this command:

CREATE TABLE items
(
ItemNumber int NOT NULL,
Description char(30) NOT NULL,
Price numeric(9,2) NOT NULL
)

These commands are fairly self-explanatory. The only element that might look a little strange if you're new to SQL Server is the NOT NULL specification. The SQL NULL keyword is a special syntax token that's used to represent unknown or nonexistent values. It is not the same as zero for integers or blanks for character string columns. NULL indicates that a value is not known or completely missing from the column—that it's not there at all. The difference between NULL and zero is the difference between having a zero account balance and not having an account at all. (See Chapter 3, "Missing Values," for more information on NULLs.) The NULL/NOT NULL specification is used to control whether a column can store SQL's NULL token. This is formally referred to as column nullability. It dictates whether the column can be truly empty. So, you could read NULL/NOT NULL as NOT REQUIRED/ REQUIRED, respectively. If a field can't contain NULL, it can't be truly empty and is therefore required to have some other value.
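
For example, the following sketch (the contacts table is hypothetical and not part of the book's sample schema) declares one required column and one optional, nullable column:

CREATE TABLE contacts
(
ContactNumber int NOT NULL, -- required: every row must supply a value
Phone char(12) NULL         -- optional: NULL here means "not known"
)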

Note that you don't have to specify column nullability when you create a table—SQL Server will supply a default setting if it's omitted. The rules governing default column nullability go like this:

* If you explicitly specify either NULL or NOT NULL, it will be used (if valid—see below).
* If a column is based on a user-defined data type, that data type's nullability specification is used.
* If a column has only one nullability option, that option is used. Timestamp columns always require values, and bit columns can require them as well, depending on the server compatibility setting (specified via the sp_dbcmptlevel system stored procedure).
* If the session setting ANSI_NULL_DFLT_ON is set to true (it defaults to the setting specified in the database), column nullability defaults to true. ANSI SQL specifies that columns are nullable by default. Connecting to SQL Server via ODBC or OLEDB (which is the normal way applications connect) sets ANSI_NULL_DFLT_ON to true by default, though this can be changed in ODBC data sources or by the calling application.
* If the database setting ANSI null default is set to true (it defaults to false), column nullability is set to true.
* If none of these conditions specifies an ANSI NULL setting, column nullability defaults to false so that columns don't allow NULL values.

Thursday, June 19, 2008

Introductory Transact-SQL (part 1)

By Addison Wesley

The single biggest challenge to learning SQL programming is unlearning procedural programming.—Joe Celko

SQL is the lingua franca of the database world. Most modern DBMSs use some type of SQL dialect as their primary query language, including SQL Server. You can use SQL to create or destroy objects on the database server such as tables and to do things with those objects, such as put data into them or query them for that data. No single vendor owns SQL, and each is free to tailor the language to better satisfy its own customer base. Despite this latitude, there is a multilateral agreement against which each implementation is measured. It's commonly referred to as the ANSI/ISO SQL standard and is governed by the National Committee on Information Technology Standards (NCITS H2). This standard is actually several standards—each named after the year in which it was adopted. Each standard builds on the ones before it, introducing new features, refining language syntax, and so on. The 1992 version of the standard—commonly referred to as SQL-92—is probably the most popular of these and is definitely the most widely adopted by DBMS vendors. As with other languages, vendor implementations of SQL are rated according to their level of compliance with the ANSI/ISO standard. Most vendors are compliant with at least the entry-level SQL-92 specification, though some go further.

Transact-SQL is Microsoft SQL Server's implementation of the language. It is largely SQL-92 compliant, so if you're familiar with another vendor's flavor of SQL, you'll probably feel right at home with Transact-SQL. Since helping you to become fluent in Transact-SQL is the primary focus of this book and an important step in becoming a skilled SQL Server practitioner, it's instructive to begin with a brief tour of language fundamentals.

Much of the difficulty typically associated with learning SQL is due to the way it's presented in books and courseware. Frequently, the would-be SQL practitioner is forced to run a gauntlet of syntax sinkholes and query quicksand while lugging a ten-volume set on database design and performance tuning on her back. It's easy to get disoriented in such a situation, to become inundated with nonessential information—to get bogged down in the details. Add to this the obligatory dose of relational database theory, and the SQL neophyte is ready to leave summer camp early.

As with the rest of this book, this chapter attempts to keep things simple. It takes you through the process of creating tables, adding data to them, and querying those tables, one step at a time. This chapter focuses exclusively on the practical details of getting real work done with SQL—it illuminates the bare necessities of Transact-SQL as quickly and as concisely as possible.
Note
In this chapter, I assume you have little or no prior knowledge of Transact-SQL. If you already have a basic working knowledge of the language, you can safely skip to the next chapter.

Like most computer languages, Transact-SQL is best learned by experience. The view from the trenches is usually better than the one from the tower.

Choosing a SQL Editor

The first step on the road to Transact-SQL fluency is to pick a SQL entry and editing tool. You'll use this facility to enter SQL commands, execute them, and view their results. The tool you pick will be your constant companion throughout the rest of this book, so choose wisely.

The Query Analyzer tool that's included with SQL Server is a respectable SQL entry facility. It's certainly capable of allowing you to work through the examples in this book. Those familiar with previous versions of SQL Server will remember this tool as ISQL/W. The new version resembles its predecessor in many ways but sports a slightly more modern interface. The name change reflects the fact that the new version is more than a mere SQL entry facility. In addition to basic query entry and execution facilities, it provides a wealth of analysis and tuning info (see Chapter 16, "Transact-SQL Performance Tuning," for more information).

The first order of business when you start Query Analyzer is to connect to the server, so make sure your server is running. Enter your username and password when prompted (if your server is newly installed, username sa defaults to an empty password) and select your server name. If Query Analyzer and SQL Server are running on the same machine, you can use "." (a period—with no quotes) or (local) (don't forget the parentheses) for the server name. The user interface of the tool is self-explanatory: You key T-SQL queries into the top pane of the window and view results in the bottom one.

The databases currently defined on your server are displayed in a combo-box on each window's toolbar. You can select one from the list to make it the active database for the queries you run in that window. Pressing Ctrl-E, F5, or Alt-X runs your query, while Ctrl-F5 checks it for syntax errors.
Hot Tip
If you execute a query while a selection is active in the edit window, Query Analyzer will execute the selection rather than the entire query. This is handy for executing queries in steps and for quickly executing another command without opening a new window.

One of the features sorely missed in Query Analyzer is the Alt-F1 object help facility. In ISQL/W, you could select an object name in the edit window and press Alt-F1 to get help on it. For tables and views, this presented an abbreviated sp_help report. It was quite handy and saved many a trip to a new query window merely to list an object's columns.

If you're a command-line devotee, you may prefer the OSQL utility to Query Analyzer. OSQL is an ODBC-based command-line utility that ships with SQL Server. Like Query Analyzer, OSQL can be used to enter Transact-SQL statements and stored procedures to execute. Once you've entered a query, hit return to drop to a new line, then type GO and hit return again to run it (GO must be leftmost on the line). To exit OSQL, type EXIT and hit return.

OSQL has a wealth of command-line and runtime options that are too lengthy to go into here. See the SQL Books Online for more info.

A third option is to use the Sequin SQL editor included on the CD with this book. Sequin sports many of Query Analyzer's facilities without abandoning the worthwhile features of its predecessors. It has the advantage of being able to query any server or DBMS for which an ODBC provider exists. This means, for example, that you can query Access, Oracle, and SQL Server using just one tool.

Tuesday, June 17, 2008

A Database Driven Reporting Application (2)

By Chi-Wei Wang

As shown in Figure 1, the application first needs to determine whether there is a report that needs to be generated and, if so, what type of report it is. In this scenario, the reporting application is scheduled to run daily and will check the ScheduledReport table, as shown in Figure 3.



Figure 3: ScheduledReport Table

Based on the ScheduledDate field, the application knows which reports need to be generated and their types. Using the ReportTypeID field, the application finds the corresponding row in the ReportType table, shown in Figure 4.



Figure 4: ReportType table

From the ReportType table, the application knows the location of the report's .RPT file by using the ReportRPTFilePath field, which may be a path to a local folder or a network drive. In addition, the table offers a location to store other report type-specific information. For example:

* ExportFileFormat: The file format in which all reports of a particular type should be exported.
* SharePointDocLib/SharePointFolder: SharePoint document library or folder that all reports of a particular type should be uploaded to.
* Description: Description of the report type that can be tied into a user interface, such as a combo box where the user can select which report to generate.

Note that such fields could be treated as defaults that can be overridden by corresponding fields added to the ScheduledReport table.

The application now has to determine what parameters are required by the report and their corresponding values. As shown in the code sample later in this article, the application needs to know both the parameter's name and the value when loading parameters into the report. Using the ReportTypeID, the application can determine the report's parameters and type from the ReportTypeParameter table, shown in Figure 5.



Figure 5: ReportTypeParameter

The ReportTypeParameter table, as shown in Figure 5, contains extra columns that would be useful if the application had a user interface for getting user input, such as a DataGridView.

* MinValue and MaxValue: Minimum and maximum values to be used with ParameterType for input validation.
* DataSourceTable: For a user interface, a database table from which potential parameter values could be used to populate a combo box.

In the scenario of scheduled reports, the parameter values are already available and are stored in the ScheduledReportParameter table shown in Figure 6.



Figure 6: ScheduledReportParameter Table

The application has a collection of parameters from the ReportTypeParameter table that can be mapped to values from the ScheduledReportParameter table by using the ScheduledReportID and ParameterID columns.
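
One way to pull a report's parameter names and values together is a join across the two tables. The following sketch assumes parameter name and value columns (ParameterName and ParameterValue) that are not spelled out in the figures, so treat those names as placeholders:

SELECT rtp.ParameterName, srp.ParameterValue
FROM ReportTypeParameter rtp
JOIN ScheduledReportParameter srp
ON (srp.ParameterID = rtp.ParameterID)
WHERE srp.ScheduledReportID = @ScheduledReportID -- the scheduled report being generated
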
Loading Reports in C#

The application now has all the report information necessary to generate the report. As described in Figure 1, the application then loads the information into the report to generate the report, as shown in the following code sample.

using System.Collections.Generic;
using CrystalDecisions.CrystalReports.Engine;
using CrystalDecisions.Shared;
...
// Crystal Report document
ReportDocument reportDoc = new ReportDocument();
// Path to the Crystal Report .RPT file
String reportPath;
// Output file path for the exported report
String fileName;
// Collection of report parameter names and values
Dictionary<String, Object> reportParameters =
    new Dictionary<String, Object>();
...
// Determine whether the application should generate a report:
// 1. If this is checking for scheduled reports, check the
//    ScheduledReport table.
// 2. If this application has a user interface, wait for user
//    input.
...
// Populate reportPath, fileName, and the reportParameters collection
// based on the information provided by the database and/or the user.
...
// reportPath is the file path to the Crystal .rpt file.
reportDoc.Load(reportPath);

// Pass each parameter name and value from the parameter collection
// into the report.
foreach (KeyValuePair<String, Object> paramEntry in reportParameters)
{
    // SetParameterValue(string, object)
    reportDoc.SetParameterValue(paramEntry.Key, paramEntry.Value);
}

// Export the report in PDF format.
reportDoc.ExportToDisk(ExportFormatType.PortableDocFormat, fileName);

Conclusion

This article shows how report-related knowledge can be shifted from an application's code to a database. The application itself can remain unchanged: updates to the database alone enable it to handle new and modified reports.

Additional customization and enhancements can be handled by extending the database tables to hold additional information or to support new functionality such as emailing reports to a set of users. Although the article describes a reporting application that checks scheduled reports, the design also can be integrated with a user interface to allow the user to select a report to generate and then input the desired parameter values.
About the Author

Chi-Wei Wang is a senior software consultant for Crowe Chizek and Company LLC, focusing on Windows application development, Crystal Reports, SQL Server, and SharePoint. Crowe Chizek and Company LLC is a Microsoft Gold Certified Partner.

Chi-Wei graduated with a Master's Degree in Computer Engineering from the University of Illinois at Urbana-Champaign.

A Database Driven Reporting Application (1)

By. Chi-Wei Wang

An organization's reporting needs often change, requiring updates to existing reports as well as the addition of new ones. Having to update and re-deploy software each time reports are updated or added can quickly become time consuming. Rather than writing custom code for each type of report within the application, the report-specific information can be shifted outside of the core application. This allows an application to adapt to changing and new reports while minimizing the need for code updates; instead, updates are made to the database.

This article gives an overview of how an application can retrieve report-related information from a database to generate reports. Crystal Reports is used for the reports because it is integrated with Visual Studio.
Overview

To minimize code changes, the application should not have any custom code tied to any specific report. All report-specific knowledge can be kept outside of the application. Even though such information can be stored in configuration files or flat files, leveraging a database has many advantages.

A database offers a convenient way to organize and store related information and can be accessed easily by applications. It can serve as a central location for storing all reporting information as well as offering secure access. In addition, it allows multiple applications to access the data and request reports through the same reporting application.

The basic process used by the reporting application is shown in Figure 1. Once the application finds a report to generate, it needs to load the report's .RPT file, and pass in the appropriate parameter values to the report. Afterwards, depending on the application requirements, the application might display the report to a user, upload it to a SharePoint site, or even email it to users.



Figure 1: Report Generation Process

Database Report Information

The sample application that will be discussed is a scheduled report generator used to generate reports scheduled for specific dates. The pending report requests are maintained in the database by other applications through which users schedule reports to be generated. The sample application requires the database tables shown in Figure 2.



Figure 2: Database Diagram

* ScheduledReport: A table populated with scheduled requests for reports to be generated on certain dates. Populated by another application.
* ScheduledReportParameter: A table populated with the parameter values required by each scheduled report. Populated by another application.
* ReportType: A table populated with all the possible types of reports and related information.
* ReportTypeParameter: A table populated with the parameters required by each report type and related information.
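
To make the diagram more concrete, here is a minimal DDL sketch for two of these tables. It uses only the column names mentioned in these two articles; the data types, keys, and IDENTITY columns are assumptions for illustration only.

-- Illustrative sketch; types and constraints are assumptions.
CREATE TABLE ReportType (
    ReportTypeID      INT IDENTITY(1,1) PRIMARY KEY,
    Description       VARCHAR(255),
    ReportRPTFilePath VARCHAR(500) NOT NULL,
    ExportFileFormat  VARCHAR(50)
);

CREATE TABLE ScheduledReport (
    ScheduledReportID INT IDENTITY(1,1) PRIMARY KEY,
    ReportTypeID      INT NOT NULL REFERENCES ReportType (ReportTypeID),
    ScheduledDate     DATETIME NOT NULL
);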

Sunday, June 15, 2008

Pros and Cons of MySQL Table Types

By. John W. Horn PhD

Of all the positive things that MySQL brings to the table, probably the most overlooked is its support for multiple table types. This facet is overlooked as a feature and, more importantly, overlooked at design time.

MySQL has six distinct table types.

* MyISAM
* MERGE
* ISAM
* HEAP
* InnoDB
* BDB or BerkeleyDB Tables

Finding a transaction table that's just my type

Two of these table types currently support transactions. Transactions give the user greater control when working with data. You would use syntax similar to the following for a manual transaction.

START TRANSACTION;

SELECT @A:=SUM(salary) FROM table1 WHERE type=1;

UPDATE table2 SET summary=@A WHERE type=1;

COMMIT;

Of the two commonly used transaction table types, the first is BerkeleyDB (BDB) transactional tables, provided by SleepyCat (www.sleepycat.com). To use BDB tables, use a binary with BDB support or configure the source with the --with-berkeley-db option. If you don't want to use BDB tables, start the MySQL server with the --skip-bdb option. This will save a lot of memory, since the BDB library won't be loaded; however, you won't be able to use BDB tables. BDB is not used nearly as much as our second alternative, InnoDB. InnoDB features row-level locking, consistent non-locking reads in SELECTs, and a common tablespace for all tables.

InnoDB tables are made by Innobase Oy (www.innodb.com) and are distributed under the GPL as well as commercially. If you need commercial support or commercial licenses for your application and cost is a concern, not using InnoDB will save you about 20-50% on licenses and support contracts. If data integrity is a concern, InnoDB provides MySQL with a transactional storage engine and crash-recovery capabilities. InnoDB has been designed for maximum performance when processing large data volumes, and its CPU efficiency is probably not matched by any other disk-based relational database engine. There are other transactional table types out there (such as Gemini), but they do not seem to be used any more than BDB. So, in a nutshell, most users prefer the speed and features of InnoDB.
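The table type is chosen per table when it is created (or changed later with ALTER TABLE). As a small illustration, the two tables from the transaction example above could be created as InnoDB tables; the column definitions here are made up for the example, and newer MySQL versions also accept ENGINE= in place of TYPE=.

-- Illustrative only: column definitions are assumptions.
CREATE TABLE table1 (
    type   INT NOT NULL,
    salary DECIMAL(10,2) NOT NULL
) TYPE=InnoDB;

CREATE TABLE table2 (
    type    INT NOT NULL,
    summary DECIMAL(12,2)
) TYPE=InnoDB;

-- An existing MyISAM table can be converted in place:
ALTER TABLE table1 TYPE=InnoDB;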
A Database is no fun when you are locked out

The default table type for MySQL is MyISAM. It uses table-level locking, which means that during an UPDATE, nobody can access any other record of the same table. BDB uses page-level locking: during an UPDATE, nobody can access any other record residing in the same database page of that table until the locking transaction issues a COMMIT.

InnoDB, however, uses row-level locking. Row-level locking ensures that during an UPDATE, nobody can access that particular row until the locking transaction issues a COMMIT. Any of the above table types will probably be fine for a web server, but can cause unnecessary issues in a LAN application.
Special circumstances call for special tools (or tables)

MERGE tables are a collection of identical MyISAM tables that can be used as one. You can only SELECT, DELETE, and UPDATE from the collection of tables; if you DROP the MERGE table, you are only dropping the MERGE specification. There are several reasons to use MERGE tables:

* More speed. You can split a big read-only table and put the different table parts on different disks.
* More efficient searches. If you know exactly what you are looking for, you can search in just one of the split tables for some queries and use a MERGE table for others.
* More efficient repairs. It's easier to repair the individual files that are mapped to a MERGE file than to repair one really big file.
* A way around the operating system's file-size limit, since MyISAM (and therefore MERGE) tables are represented as individual files on the hard drive.
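As a small sketch (the log tables and their columns are made up for illustration), a MERGE table over two identical MyISAM tables is declared with a UNION clause:

-- Illustrative only: two identical MyISAM tables combined by a MERGE table.
CREATE TABLE log_2007 (id INT NOT NULL, msg VARCHAR(255)) TYPE=MyISAM;
CREATE TABLE log_2008 (id INT NOT NULL, msg VARCHAR(255)) TYPE=MyISAM;

CREATE TABLE log_all (
    id  INT NOT NULL,
    msg VARCHAR(255)
) TYPE=MERGE UNION=(log_2007, log_2008);

-- Queries against log_all read from both underlying tables.
SELECT COUNT(*) FROM log_all;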

Some of the disadvantages of using MERGE tables are:

* You can only use identical MyISAM tables for a MERGE table.
* REPLACE doesn't work.
* Key reads are slower.

Also, you can't do DROP TABLE, ALTER TABLE, DELETE FROM table_name without a WHERE clause, REPAIR TABLE, TRUNCATE TABLE, OPTIMIZE TABLE, or ANALYZE TABLE on any of the tables that are mapped by a MERGE table that is "open". If you do, the MERGE table may still refer to the original table and you will get unexpected results. The easiest way to get around this deficiency is to issue the FLUSH TABLES command, ensuring that no MERGE tables remain "open".

Well, that should make you think twice about using MERGE tables. ISAM tables will disappear in MySQL version 5.0, so it wouldn't be a good idea to use them. Last but not least is the HEAP table type. HEAP tables use hashed indexes and are stored in memory. This makes them very fast, but if MySQL crashes you will lose all data stored in them. They are very useful for temporary tables. HEAP sounds cool, but I don't think the performance justifies the risk.
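For comparison, here is a sketch of a HEAP table used as a fast, in-memory lookup table; the table and column names are hypothetical.

-- Illustrative only: all rows are lost if the server restarts, so a HEAP
-- table should hold only data that can be reloaded.
CREATE TABLE session_cache (
    session_id CHAR(32) NOT NULL,
    user_id    INT NOT NULL,
    PRIMARY KEY (session_id)
) TYPE=HEAP;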
The Lowdown on MySQL Table Types

Most people use MyISAM if they need speed and InnoDB for data integrity. You can use more than one, or any combination, of these table types in your database. Remember to assess the needs of your application before building it. Even though MyISAM is faster than InnoDB in the MySQL world, InnoDB is fast compared to any database engine. With InnoDB you get transactions, speed, and integrity: three features not usually used in the same sentence. Most of my customers want as much speed as they can get, but at the end of the day, good data integrity lets them sleep at night.

Sunday, June 8, 2008

SQL Server 2008 (Katmai) Reporting Services Performance: Two Steps Forward, One Step Back?

by David Leibowitz

In this article, David Leibowitz compares the performance of the prerelease of Microsoft's SQL Server 2008 Reporting Services to SSRS 2005. While it is clear performance improvements have been made for report automation, developers might have differing opinions on the new Report Designer.

One of the largest changes in SQL Server 2008 Reporting Services (SSRS) is that it no longer relies on IIS. The change was driven mostly by feedback from end users about security (most do not like to keep web hosts active on their database servers); the added benefits include reduced memory requirements and faster service spool-up time.

Testing the Performance of Requests
The tests used for this article were designed to time the performance of requests against the Reporting Services web service (ReportServer2005.asmx). Report rendering and data retrieval were not tested.

The performance tests were conducted using virtual servers running both SQL 2005 and SQL 2008 (CTP4). The test harness was built in VS 2008 and run from the host machine, which was then connected to each Virtual Hard Disk (VHD) (via a web service) when needed. See the sidebar for the specifications and applications.

Both virtual machines (VMs) were running in parallel for the duration of each test. Web service functionality was evaluated using three separate tests:

1. Cold Server Spool Up
2. Warm Refresh
3. Warm Server Spool Up

A simple ASPX test harness running on the host laptop OS made three connections to ReportServer2005.asmx:

* Accessed the web service on the tested VHD via URL
* Opened a connection to read a report definition from the web service
* Looped through the list of parameters in that report

Author's Note: SQL Server Katmai CTP4 currently uses ReportServer2005.asmx; a “2008” version is not currently available, though it works against Katmai DBs.

In Visual Web Developer 2008 Express Edition (a free download from Microsoft), the author created a new project and a proxy to the Reporting Services 2005 web service, and named it RS2005. Listing 1 shows the code for default.aspx.

Each test was provided three trials. If an anomaly occurred, several more tests were run to minimize the effects of running all the tests on a laptop. At one point, the SQL 2005 VHD timed out after 90 seconds on a cold boot test, but after several successful trials it did not seem fair to keep that test in. The tests were repeated until the results were consistent; you could eventually predict the length of time with fair accuracy for each test, as there was not much discrepancy.

Wednesday, June 4, 2008

Raking MySQL Over Rails

By W. Jason Gilmore

I'm a programmer, not a database administrator; I prefer to live in a world of code rather than table schemas and confusing SQL queries. This mindset is one of the reasons I find the Rails framework such an appealing solution; Rails effectively implements an approach to database integration that plays to the developer's strengths. This approach is most notable in Rails' Active Record implementation, in which the object-relational mapping mechanism allows you to treat database tables like objects.

Yet inevitably, most developers are going to be forced to interact with the database outside of the world of models, controllers, and views. But, you may not be aware of other Rails features at your disposal for performing database-related tasks. For instance, did you know you can create and manage table schemas, populate tables with seed data, and back up your databases from within the confines of your Rails project by using the Rake build tool, which greatly simplifies otherwise complicated tasks such as the aforementioned? In this tutorial, I'll introduce you to just a few of the significant boosts in productivity Rake has to offer you by showing you how to carry out these important, yet often mind-numbing, tasks.
Managing Table Schemas with Migrations

Thanks to Active Record, interacting with a MySQL table via a Rails controller is done as easily as modifying the database.yml file, creating a model, and creating an appropriately named database table. For example, if you create a model named contact, the table name should be contacts. From there, you can begin populating and interacting with the table as easily as instantiating the contact model class and accessing its attributes like any other object. Newcomers to Rails find this feature alone a particularly compelling reason to adopt the framework. But, did you know you can actually manage the table schemas in Rails as well?

This feature is known as Migrations and, like Active Record, is bundled with the standard Rails distribution. Migrations give you the ability to programmatically manage table schemas, allowing you to create, modify, and drop tables and columns using a very simple syntax. What's more, Migrations offer built-in version control, enabling you to roll schema versions forward and backward with a simple command.

In fact, migrations are such a key part of Rails that a migration file is created automatically when you create a new model. If you create the aforementioned contact model, navigate to the project's db directory and you'll see a file named 001_create_contacts.rb that looks like this:

class CreateContacts < ActiveRecord::Migration
  def self.up
    create_table :contacts do |t|
    end
  end

  def self.down
    drop_table :contacts
  end
end

To create the table schema, you can modify this file to look like this:

class CreateContacts < ActiveRecord::Migration
  def self.up
    create_table :contacts do |t|
      t.column :name, :string, :null => false
      t.column :email, :string
      t.column :phone, :string, :limit => 10, :null => false
    end
  end

  def self.down
    drop_table :contacts
  end
end

Now, apply the migration by running the following Rake command from within your project directory:

%>rake db:migrate

If you log in to your MySQL server, you'll see that the contacts table has been created! Want to undo your changes? Roll the migration back to the previous version by using the VERSION option. This will execute the self.down method as defined in the migration file:

%>rake db:migrate VERSION=0

Check the database again and you'll see the table has been dropped. What's more, as your schema evolves, you'll naturally create additional migrations to add new tables, or modify column datatypes or constraints, for instance. You can create a new migration file by executing the following command from within your project directory:

%>ruby script/generate migration your_desired_migration_name

Notice how each migration file is prepended with a version number? You can use that number to roll the schema backwards or forwards as far as you please!

An earlier installment of this ongoing series introduced migrations, so if you're not yet familiar with this great feature be sure to check it out.
Use Fixtures to Populate Tables

When creating a new application, chances are you already have at least some data on hand and need to populate the application database. For instance, you might require customers to identify their state of residence when checking out, for billing and shipping purposes. Because state names and abbreviations might be used in multiple places within your application, it makes sense to create a state model and corresponding table. But you certainly wouldn't want to populate this table manually; not only would that be tedious, it would be error prone.

Instead, you can take advantage of fixtures, a Rails feature created with the intent of simplifying the population of sample data into the test database for use during the ongoing testing of your application. However, they also provide a great way to populate tables with seed data such as state names and abbreviations you'll use in all phases of the application, including production. All you need to do is store the data in a format recognized by fixtures, of which there are two: YAML and comma-separated. Because the latter is the more widely-recognized of the two, I'll show you how to create a comma-separated fixture file containing state names and abbreviations and then use Rake to populate the table with this data. I'll presume you've created the state model and migration file, and now want to populate the table.

To begin, keep in mind that you always want to conform to Rails conventions; therefore, these sorts of fixtures shouldn't reside in the project's test/fixtures directory because that data should be used solely for testing purposes. Therefore, consider instead creating a directory named seed (for seed data) within the project's db directory.

Next, create the states.csv file, and store it within the newly created db/seed directory. A shortened version of the file is presented here:

id, name, abbreviation
1, Alabama, AL
2, Alaska, AK
3, Arizona, AZ
4, Arkansas, AR
5, California, CA
6, Colorado, CO

Next, you'll need to create a Rake task file. The process and syntax behind doing so is worthy of an article unto itself, so just trust me on this. I'd been using a homebrewed task for some time, but recently came across a much more succinct solution created by Jeffrey Alan Hardy, which I've slightly modified to account for my preferred use of comma-separated seed files. Paste the following code into a file named seeder.rake and save it to your Rails project's lib/tasks directory:

namespace :db do
  desc "Load seed fixtures (from db/seed) into the current environment's database."
  task :seed => :environment do
    require 'active_record/fixtures'
    Dir.glob(RAILS_ROOT + '/db/seed/*.csv').each do |file|
      Fixtures.create_fixtures('db/seed', File.basename(file, '.*'))
    end
  end
end

To populate the states table, execute the following command from within your project directory:

%>rake db:seed

Check the states table, and you'll see it's been populated!
Migrating Data Between Databases

Ask five developers what the superior database solution is, and you're sure to get five different answers. Like the endless vi/emacs and Linux/Windows arguments, there's never a shortage of opinion when it comes to database adoption. However, reality occasionally can take precedence over preference, and you may find yourself in a position where the client has decided to make a last minute switch to MySQL after you've been developing the application for weeks using PostgreSQL. Due to a variety of inconsistencies among various database solutions, it just isn't as easy to migrate data as one might think.

Noted Rails developer Tobias Lütke encountered a similar problem, and created a great Rake task for dumping database data into YAML format, and repopulating any database supported by Rails Migrations (at the time of writing, MySQL, PostgreSQL, SQLite, SQL Server, Sybase, and Oracle). I won't reproduce the task here because it's rather lengthy. Instead, download it from here and place it into your project's lib/tasks directory.

Next, run the following Rake command to retrieve the data in YAML format.

%>rake db:backup:write

All of the tables found in the current environment's database have been backed up to db/backup! Now, all you need to do is update your database.yml file to point to your new database solution, and then run the task's corresponding restore command to load the YAML dump into the new database.

Keep in mind that anything found in that new database will be deleted before the new data is populated!
Conclusion

Rake is an amazing tool capable of doing so much more than what was demonstrated in this tutorial. Be sure to check out the Rake homepage, and this great tutorial for more information about this powerful tool.
About the Author

W. Jason Gilmore is co-founder of IT Enlightenment. He's the author of several books, including the best-selling "Beginning PHP and MySQL 5: Novice to Professional, Second Edition" (Apress, 2006. 913pp.). Jason loves receiving email, so don't hesitate to write him at wjATwjgilmore.com.

Exposing a Database as a Web Service

By Deepal Jayasinghe

Introduction

When you look back at the computer industry, you can clearly identify different technologies dominating different time periods. In any given period, databases have had a very high priority in the industry, from small-scale to large-scale businesses. Web Services are becoming today's technology, and everyone is in the process of moving their applications into the SOA or Web Services world. Doing so brings a number of advantages, although you have to do a considerable amount of work to get everything working. Given those advantages (such as accessibility, security, extensibility, and so forth), companies are trying to expose their applications as Web Services, or to give a Web Service interface to their applications. Exposing a database through a Web Service interface is therefore also becoming a hot topic, and DataServices is a very good example of that.

There are a number of approaches the industry has employed to expose databases as Web Services. The DataService approach can be considered one of the better ones, and a number of different DataServices solutions exist; the WSO2 DataService solution, built on Axis2, is a good example.
An Approach for Exposing a Database as a Web Service

However, this article does not examine any of the DataServices approaches; rather, you will learn a very simple way of exposing a database as a Web Service using Axis2. You can think of it as exposing a database using an Axis2 POJO service. A good working knowledge of Axis2 will be an added advantage in understanding this approach; the References section has links to the recommended articles. If you follow them, you are in good shape.
Creating a Database

To expose a database as a Web Service, you first need to have the database, so create a very simple database with one table to store personal information. The table will have four fields to store ID, name, address, and age. This sample application is based on a MySQL database, but you can do the exact same thing with any given database.

Run the following database script to create the database table. (First, create a DB schema called "dbsample" and then create the table inside that.)

CREATE TABLE PERSON (ID INTEGER NOT NULL,
NAME VARCHAR (100) NOT NULL,
ADDRESS VARCHAR (500),
AGE INTEGER,
PRIMARY KEY (ID));

Inserting Sample Data

By running the following script, you can populate the database with a set of data.

INSERT INTO PERSON (ID, NAME, ADDRESS, AGE)
VALUES (100, 'Deepal Jayasinghe', 'No 59, Flower Road, Colombo, Sri Lanka', 29);
INSERT INTO PERSON (ID, NAME, ADDRESS, AGE)
VALUES (101, 'Franck', 'San Jose, CA', 30);
INSERT INTO PERSON (ID, NAME, ADDRESS, AGE)
VALUES (102, 'Samisa Abeysinghe', 'Colombo, Sri Lanka', 34);

You can add some more data if you want. You can either insert data by running SQL or you can just insert the data by using the MySQL Query Browser.

In the rest of this article, you will learn different ways of exposing the database as a Web service:

* List all the people in the DB
* List the names of all the people in the DB
* List the name and age of a given person in the DB
* Insert a person into the DB

Now, it is time to write your POJO class to perform the functionality you want. Before you do this, you need to address a few questions:

* Where will you create the database connection?
* Where will you store the database connection?
* How will you close the database connection?

Service Life Cycle Management and Database Connection Handling

To answer all those questions, Axis2 provides ServiceLifeCycle management support, so the correct and best approach is to use it. First, you write the life cycle management class and create and store the database connection there. When Axis2 starts up (at the time of service deployment), the ServiceLifeCycle class will be invoked; when the system goes down, the ServiceLifeCycle class will be invoked again. So, you are going to create the DB connection at service deployment time and store it inside the ConfigurationContext object.

You can download the source code for the ServiceLifeCycle implementation class from the Download section (DBSampleServiceLifeCycle.java).
Creating POJO Classes

You have now written code to open the DB connection and store it in the ConfigurationContext, so it's time to write the POJO classes. In this case, you will write a JavaBean to represent a Person with four fields (id, name, address, and age). When listing all the people in the DB, you just create an array of Person objects and return it (Person.java).

When listing the names of all the persons in the DB, you just return a String array. When getting the name and age for a given person, you create a new JavaBean to represent those two fields and return it (NameAge.java). In the case of inserting a person into the DB, you write a method in the POJO class (PersonDBService.java) that takes four parameters.
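The JDBC details are left to the downloadable source, but the SQL behind each operation on the PERSON table is straightforward; roughly, the implementation might issue statements like these (parameter markers shown as ?):

-- listAllPeople: every column of every person
SELECT ID, NAME, ADDRESS, AGE FROM PERSON;

-- listPeopleNames: names only
SELECT NAME FROM PERSON;

-- getNameAge: name and age for one ID
SELECT NAME, AGE FROM PERSON WHERE ID = ?;

-- insertPerson: add a new row
INSERT INTO PERSON (ID, NAME, ADDRESS, AGE) VALUES (?, ?, ?, ?);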

Your service implementation class, which does all four operations described above, is shown below.

package dbsample;

import org.apache.axis2.context.MessageContext;

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;

public class PersonDBService {

public Person[] listAllPeople() {
// implementation logic
}

public String[] listPeopleNames() {
// implementation logic
}

public NameAge getNameAge(int id) {
// implementation logic
}

public void insertPerson(int id,
String name,
String address,
int age) {
// implementation logic
}

}

You can download the source code for all the classes from the Download section.

The service description file, or the services.xml for your service, will be a very simple one. You can download that from the Download section. The services.xml file will look like the following:

class="dbsample.DBSampleServiceLifeCycle">
Exposing a DB as a Web Service

mep="http://www.w3.org/2004/08/wsdl/in-only"
class="org.apache.axis2.rpc.receivers.
RPCInOnlyMessageReceiver"/>
mep="http://www.w3.org/2004/08/wsdl/in-out"
class="org.apache.axis2.rpc.receivers.
RPCMessageReceiver"/>


dbsample.PersonDBService



Deploying the Service

Before you deploy the service, you need to create a service archive file from your compiled classes and services.xml. You can use any available tools, or you can just create a zip file containing the compiled code and services.xml (with services.xml placed under a META-INF directory) and rename it dbsample.aar. I have created a service archive file from the compiled code, so you can download that and just try it out.

Because you need the DB connection jar, first copy the mysql-connector jar file into the classpath, or into /webapps/axis2/WEB-INF/lib. Next, copy your service archive file into the /webapps/axis2/WEB-INF/services directory. Then start Tomcat (or your application server).

Now, type http://localhost:8080/axis2/services/DBSampleService?wsdl in your browser (the port may vary depending on the application server configuration); you will then see the WSDL file for your sample DB service. This is simply an indicator that your service is up and running. If you do not get the WSDL file, something has gone wrong with your database or database driver.
Invoking the Service

In this sample, you are not going to write a Java client to invoke the service; you can try that on your own if you want. Instead, you will focus on invoking the service in a REST manner, using just the browser.
List all the People in the DB

To see all the people in the database, you can invoke the "listAllPeople" method of your service. Just type the following in the browser and see what you get. (This is how you invoke the listAllPeople method in the REST manner.)

http://localhost:8080/axis2/services/DBSampleService/listAllPeople

You will get something like the following, which is simply all the people in the DB.




<ns:listAllPeopleResponse xmlns:ns="http://dbsample">
   <ns:return>
      <address>No 59, Flower Road, Colombo, Sri Lanka</address>
      <age>29</age>
      <id>100</id>
      <name>Deepal</name>
   </ns:return>
   <ns:return>
      <address>San Jose, CA</address>
      <age>30</age>
      <id>101</id>
      <name>Franck</name>
   </ns:return>
   <ns:return>
      <address>Colombo, Sri Lanka</address>
      <age>34</age>
      <id>102</id>
      <name>Samisa Abeysinghe</name>
   </ns:return>
</ns:listAllPeopleResponse>



List the Names of All the People in the DB

Getting the list of people's names is almost identical to the method invocation above. To get all the names, just type the following in the browser and see what you get:

http://localhost:8080/axis2/services/DBSampleService/listPeopleNames

Then, you will see something like the following, which is simply all the names of the people in the DB.


<ns:listPeopleNamesResponse xmlns:ns="http://dbsample">
   <ns:return>Deepal</ns:return>
   <ns:return>Franck</ns:return>
   <ns:return>Samisa Abeysinghe</ns:return>
</ns:listPeopleNamesResponse>


Getting the Name and Age for a Given Person

Now, you will get the name and age of a given person: you supply the person's ID, and the Web Service returns the name and age of the person represented by that ID. Invoking the service is just a matter of typing the following in the browser:

http://localhost:8080/axis2/services/DBSampleService/getNameAge?id=100

Then, you get the following output:



<ns:getNameAgeResponse xmlns:ns="http://dbsample">
   <ns:return>
      <age>29</age>
      <name>Deepal</name>
   </ns:return>
</ns:getNameAgeResponse>



Inserting a New Person Object

Inserting a person object also can be done using a REST call. You need to pass the id, name, address, and the age to add a new person into the table. You can pass them as URL query parameters, as you can see below.

http://localhost:8080/axis2/services/DBSampleService/insertPerson?id=130&name=Peter&address=No 5, Colombo, Sri Lanka&age=56

This will call the service and insert a new row into the database. To make sure the new data is in the database, just list all the people again; you will see the new person in the DB as well.
Conclusion

You now have a very good understanding of how to expose a database as a Web Service using Axis2. The sample was a very simple one, but you can build more complex examples using this approach. If you have a good understanding of databases and Axis2, you can do very cool stuff. Trying out the sample and creating your own will help you understand the concept very clearly.
Downloads

* The ServiceLifeCycle implementation class
* Person.java
* NameAge.java
* PersonDBService.java
* services.xml
* dbsample.aar
* All the files accompanying this article

References

* Axis2 deployment architecture: www.developer.com/open/article.php/3557741
* Six ways of deployment: www-128.ibm.com/developerworks/library/ws-axis2soap/index.html
* Axis2 POJO: http://www.developer.com/java/other/article.php/10936_3726461_3