Coldfusion Database Basics

This isn't about how to write queries, or stored procedures. It's also not about relational databases function or how to properly index. It is about being kind to your database and more efficiently communicating with it.

One of the great things about CFML is how easy it is to communicate with the database and how incredibly easy it is to write queries. We don't need to concatenate a huge string to create a query with variables we just need to surround it with #'s. I remember how painful it was to create a query in ASP and how I rejoiced in the ease of CFML.

Stored Procedures

We all know that farming out the heavy lifting to the database is the right thing to do. Typically the DB is a beefy machine with lots of horsepower and ram so why do we keep doing so much work in CF? One theory is that we are web developers not database experts, we can write simple queries and some complex ones too but stop at stored procedures and user defined functions etc.

Well I am going to tell you again that creating stored procedures for complex database work is ALWAYS going to be more efficient than doing the same work in CF.

Aside from learning how to create stored procedures in your favorite database there are some things you can do to make your work with DB's better.

CFQUERYPARAM

We all know that Adobe recommends using the cfqueryparam to help prevent sql injection attacks but there is an even better reason to use the tag, performance. When you submit a query to your database it runs it through an optimizer and generally stores the result in a system table, when the same query is run again the already optimized query is retrieved from the system table so you save the optimizer time. By using parameters in your query via the cfqueryparam tag the database can more easily reuse the query. This is especially important when it's a frequently used query.

Don't do in CF what you can do in the DB

If you need to do some math, totals or otherwise it's best to do it with SQL. We all know the basic SQL aggregate functions the most popular being sum() and count(), max()

So what I am about to show you is something that you would never consider doing EVER!


<cffunction name="getTotal" access="public" returntype="numeric">
<cfargument name="year" type="numeric" required="true"
<cfquery name="getData" datasource="#request.dsn#">
select * from mytable
where year = #arguments.year#
</cfquery>

<cfset total=0>
<cfloop query="getData">
<cfset total= total + getData.amount>
</cfloop>

<cfreturn total/>
</cffunction>

The example might be a bit over the top, no one would ever consider doing something so inefficient but when we reach the end of our comfort zone in SQL we start getting inventive and thats when things generally start to fall apart.

The who? no the Where

The WHERE clause is your friend, in most cases he is all that stands between you getting the right results in a timely manner.

If you are writing a query with an inline view be sure to restrict your results there too, this will greatly speed up your query.

Here is also where you would use the cfqueryparam, and don't forget that you can still use the cfqueryparam with a list of values, it's just not for a single value.

Query of Queries and In-memory Queries

Query of Queries is a great tool but it's use, like in-memory queries, should be, in my opinion, limited. When you use QoQ the results, like that of the initial query are stored using the servers memory therefore heavy usage will also have a negative affect on your severs memory.

QoQ can be used to easily get column totals on a returned query instead of assigning a column value to a variable you can simply use QoQ to get the sums.

Insert Into Select...

I can't tell you how often I see a select query followed immediately with a loop that loops over the results of the query and then immediately inserts that data into another table.


<cfquery name="getdata" datasource="#request.dsn#">
select col1,col2,...,coln
from sometable
where dataCreated between .. and ..
</cfquery>

<cfloop query="getdata">
<cfquery name="insertHistory" datasource="#request.dsn#">
insert into sometable_history(col1,col2,...,coln)
values(#col1#,#col2#,..,#coln#)
</cfquery>
</cfloop>

The above code is bad for a number of reasons least of which is the network traffic as it chugs through the returned result set. A better way to tackle this is the use of an "Insert into select" statement, an even better approach would be to create an update trigger on the table that automatically inserts the old record into your audit table every time a record in the table is updated. Now if your Database System doesn't support triggers then that's another problem altogether.

Stored prcoedures

I mentioned above that I wasn't going to talk about how to write stored procedures and I am not. What I am going to say to this topic is to learn how to write them...correctly. Like in everything there is a right way and a wrong way to writing stored procedures and I encourage you to learn the right way for your database (in sqlserver try to avoid exec() with dynamic sql for example). Once you have learned the right way Coldfusion provides a great mechanism to communicate with your new found stored procedure writing genius.

CFStoredproc.

The CFStoredproc tag, in conjunction with cfprocparam and cfprocresult, allows you to call stored procedures and get results back from the stored procedure. The cfprocparam tag lets you pass variables to the stored procedure as well as get single results back. If your stored procedure returns a recordset then you need to use the cfprocresult tag.

Conclusion

In the end be kind to your database and your database will be kind to you. You database server is typically going to be a brute he likes to lift heavy things so let him do the heavy lifting he is better and faster at it.

Learn your Database (especially if you don't have a DBA on staff to do all the dbwork for you) find out about indexes, stored procedures, inline views and built-in functions, the more work you do on the database the better. In my opinion CF should only be used for simple selects, updates and deletes, anything more complex should be left to the database where it belongs. Lastly please for the love of God don't use CFML to compile statistics!

Happy Coding....

7 Comments to "Coldfusion Database Basics"- Add Yours
todd sharp's Gravatar You can avoid QofQ for summing columns too by using arraySum().

http://www.bennadel.com/blog/167-Calling-Array-Fun...

:)
# Posted By todd sharp | 7/16/08 9:44 PM
Peter Bell's Gravatar Some earlier thoughts/comments on the subject:

http://www.pbell.com/index.cfm/2008/3/21/Moving-Bu...
# Posted By Peter Bell | 7/16/08 10:40 PM
Tariq Ahmed's Gravatar A few cents for consideration (IMHO):

o It's easier to horizontally scale out expendable commodity web servers, than it is to do that with database servers (which is usually a single point of failure).

o Because of that, there's a higher ROI in focusing on maintainability than performance. The performance can be recuperated through web server scalability (especially with all the nifty advancements in virtualization).

o S/W development is extremely expensive, while commodity web servers are extremely cheap.

o Commodity web servers are extremely cheap compared to beefier DB servers.

o Because of that, put the load on the web servers vs. DB servers.

In an abstract simplistic view... if RAM were unlimited, and power never went out... there'd be no need for a DB server. They exist just to persist data, so in my **humble opinion** I'd minimize the amt of logic that is used on the DB tier.

Though I've heard strong technical arguments either way - so I'd boil it down to what's the best business decision.
# Posted By Tariq Ahmed | 7/16/08 3:31 AM
Gary Gilbert's Gravatar @Tariq,

I have heard the "throw more servers at it" argument lots of times before. Hardware is cheaper than manpower etc. But cheaper isn't better and software that is properly written scales better than poorly written code.

A report that takes 15 minutes to put together on the webserver (poorly written queries, heavy usage of in-memory query objects) is going to take 15 minutes no matter how many servers you have in your cluster. But by rewriting the component that generates the report to correctly use a database server (and of course properly written queries) you can shave 14.9 minutes off the time.

Now what would you rather have a report that takes 15minutes to run or one that takes a few seconds?

The point I am/was trying to make here is that paying attention to not only what your write but how you write it is more worth than hacking something quickly together that just works.

In the end the poorly written code will always cost more than the extremely expensive programmer time it would cost to do the job right the first time.
# Posted By Gary Gilbert | 7/17/08 5:39 PM
Peter Bell's Gravatar @Gary,

I disagree with the generalization that it's always better to "do the job right". Doing the job right isn't about optimizing performance - it is about having acceptable performance balanced with the time to market, cost of development, maintainability and all of the other factors. The balance of those factors depends heavily on the use case.

There is a reason why there is so much literature out there complaining about "premature optimization". That doesn't mean you should write badly performing code all other things being equal. But all other things seldom are equal and over-engineering a solution is just as bad an underengineering it.
# Posted By Peter Bell | 7/17/08 5:51 PM
Gary Gilbert's Gravatar @Peter,

I certainly did not intend to imply that a solution be over-engineered as I agree with you that it can be as bad or worse as under-engineering. I also agree that in a lot of cases there will be performance trade-offs.

But I believe those performance trade-off should not be a result of uninformed or poorly planned coding practices. As I stated in my previous comment about the report taking 15min; if the developer knew SQL better he would not have chosen to do so much work using the in memory query objects and therefore would have farmed the work out to the database server where it belonged.

This is not about premature optimization or over-engineering a solution its about a knowledge gap about what to use when and how. Knowing that spending a little extra time working on a query (or learning more sql) instead of cobbling something together with in-memory query objects is going to give you a 90% performance increase is something I think we can all agree should be done.
# Posted By Gary Gilbert | 7/17/08 6:11 PM
Ben Forta's Gravatar Agreed! I even wrote something very similar a while back:

http://www.forta.com/blog/index.cfm/2006/12/1/Cold...

--- Ben
# Posted By Ben Forta | 7/17/08 1:04 AM

Powered By Railo

Subscribe

Subscribe via RSS
Follow garyrgilbert on Twitter Follow me on Twitter
Or, Receive daily updates via email.

Tags

adobe air ajax apple cf community cfml coldfusion examples ext flash flex google javascript jquery max2007 max2008 misc open source programming railo software technology ui

Recent Entries

No recent entries.

Blogroll

An Architect's View
CFSilence
Rey Bango
TalkingTree

Wish List

My Amazon.com Wish List