A Question of Speed
I have been doing some code review lately and am seeing quite a few cases that look something like this:
<cftransaction>
<cfloop from="1" to="#arrayLen(aData)#" index="i">
<cfquery ....>
<!--- insert query --->
</cfquery>
</cfloop>
<cftransaction>
You have a structure with some information in it and you are looping over the structure and adding the data from the structure into the database. No bid deal right?
Well you probably won't be surprised to find out that having a loop around the cfquery tag forces the Coldfusion server to execute that query N number of times, where N is the size of your array.
So what can you do?
You could do something like this:
<cftransaction>
<cfquery ....>
<cfloop from="1" to="#arrayLen(aData)#" index="i">
<!--- insert query --->
</cfloop>
</cfquery>
<cftransaction>
This may actually be an oversimplification as you will have to consult your Database documentation to determine the right syntax for allowing multiple inserts in one query.
What this does is instead of executing the query N number of times you send over one big query all in one go and let the database chew on it. But does this actually result in better performance?
The Test
I decided to do a simple test set up two queries, one with a loop on the outside and one with the loop in the inside. Before starting the loop I do a getTickCount and then at the end another getTickCount and simply determine the millisecond difference between the two queries.
The code I put together looks like this:
<!--- change the number of itemsToinsert here --->
<cfset itemsToInsert =100>
<cfset tickFirstRun = gettickcount()>
<cftransaction>
<cfquery name="insert1test" datasource="insertTest">
insert into inserttest(col1,col2,col3,col4,col5,col6,col7) values
<cfloop from="1" to="#itemsToInsert#" index="i">
("col1_#i#","col2_#i#","col3_#i#","col4_#i#","col5_#i#","col6_#i#","col7_#i#")
<cfif i neq itemsToInsert>,</cfif>
</cfloop>
</cfquery>
</cftransaction>
<cfset tick2FirstRun = getTickCount()>
<!---second test --->
<cfset tickSecondRun = gettickcount()>
<cftransaction>
<cfloop from="1" to="#itemsToInsert#" index="i">
<cfquery name="insert2test" datasource="insertTest2">
insert into inserttest(col1,col2,col3,col4,col5,col6,col7)
values ("col1_#i#","col2_#i#","col3_#i#","col4_#i#","col5_#i#","col6_#i#","col7_#i#")
</cfquery>
</cfloop>
</cftransaction>
<cfset tick2SecondRun = getTickCount()>
Loop Inside <cfoutput> elapsedtime: #(tick2FirstRun - tickfirstrun)#</cfoutput>
<Br>Loop Outside <cfoutput>elapsedtime: #(tick2SecondRun - ticksecondrun)#</cfoutput>
The Results
The results are probably not the most scientifically accurate although I think they give a pretty good indication of the performance differences between the two approaches.| # Inserts | Loop Inside (ms) | Loop Outside (ms) |
| 100 | 16 | 109 |
| 1000 | 94 | 1047 |
| 10000 | 2572 | 9350 |
| 100000 | 312594 | 60859 |
Results Discussed
What happened at 100,000 records?Firstly I bumped up against the maximum packet size for MySQL, after adjusting the max allowed packet size the query executed but it executed slower than the loop on the outside of the cfquery tag. This issue may be database specific but I think that beyond a certain number of insert statements in a single run all database systems will have an issue digesting the shear volume.
I used 10,000 and 100,000 just to see what would happen. I would never recommend even trying to process that many records using Coldfusion.
Most, if not all, databases come with special bulk import routines. If you need to massage the data before importing it into your database you can use a temporary table to temporarily store your raw data in. Then once imported manipulate the data in the temporary table and use a Select Into or similar database command to move your data from the temporary table to it's end destination.
For smaller number of inserts, 10000 and under, placing the cfloop inside the cfquery tag seems to be on average 75% faster than having the cfloop surrounding the cfquery.
Thats a huge savings if you have an application that makes use of the cfloop on the outside of the cfquery tag, especially if the application is a high traffic/high volume site.

Subscribe
Subscribe via RSS
Follow me on Twitter
Or, Receive daily updates via email.
Tags
adobe air ajax apple cf community cfml coldfusion examples ext flash flex google javascript max2007 max2008 misc open source programming railo software technology ui
Recent Entries
No recent entries.
Blogroll
An Architect's View
CFSilence
Rey Bango
TalkingTree

http://zacster.blogspot.com/2007/04/trouble-with-c...
for oracle
insert into foo( x, y,z)
select 'foo', 5, 'bar'
from dual
union all
select 'xxx', 4, 'asdf'
from dual
union all
....
for sql server you can remove the from dual and would work fine.
NOTE: MS Sql server and Oracle have a limit that each support for unions.
If the date you are instering also exists in reference table you can make a insert that pulls the data from those tables.
insert into cart (customer_id, product_id)
select #customer#, product_id
from ref_product
where product_id in (#listofproducts#)
Both of these methods also provide much improved performance for inserts.