Home Contact

PD Versus-inspired Logophilduba.com

Adventures in Web Application Develompent by Phil Duba

Recent Entries

Popular Entries

Top Commenters

  • Nathan Mische (11)
  • CFdevfusion (6)
  • Peter Bell (4)
  • Rey Bango (3)
  • Terrence Ryan (3)
  • Scott (2)
  • Sean Corfield (2)
  • Jim Priest (2)
  • David (2)
  • Scott Stroz (2)

Slideshows

Pool Surprises...

Sponsored Links

Text Link Ads

Improving Performance: Concatenating Large Strings

Posted On October 20, 2006 7:12 AM By Phil in ColdFusion,Performance

One of the more common programming tasks nowadays is to write audit log files, place large amounts of tracking within a database, or write large CSV files for export. The most common response when a question is asked on how to reduce the processing time of these transactions is to use the Java StringBuffer method. There is, however, a fairly straightforward method within ColdFusion itself that is just as efficient as using Java. This method is what I like to call the ArrayAppend/ArrayToList method. Let's look at a piece of code where a large CSV file is being created off of a very significatn report.
In the previous post in this series, I was referencing a User Group meeting/presentation/attendance list. I will continue to use the same example here. From the previous entry, the following query was being executed:

<CFQUERY datasource="myCFUG" name="qryMeetings_base">
SELECT Mtg.meetingId, Mtg.Title AS Meeting, Mtg.Description AS MeetingDescription, Mtg.MeetingDate
Pres.presentationId, Pres.Title AS Presentation, Pres.Description AS PresentationDescription,
File.fileId, File.Filename, File.FileType,
Speak.FirstName AS SpeakerFirstName, Speak.LastName AS SpeakerLastName,
Attend.FirstName AS AttendeeFirstName, Attend.LastName AS AttendeeLastName
FROM Meeting Mtg INNER JOIN MeetingAttendees MtgAtt ON Mtg.MeetingId = MtgAtt.MeetingId
INNER JOIN Member Attend ON MtgAtt.MemberId = Attend.MemberId
OUTER JOIN Presentation Pres ON Mtg.MeetingId = Pres.MeetingId
OUTER JOIN File ON Pres.PresentationId = File.FileId
INNER JOIN Member Speak ON Pres.MemberId = Speak.MemberId
</CFQUERY>
<CFQUERY dbtype="query" name="qryMeetings">
SELECT MeetingId, Meeting, MeetingDescription, MeetingDate
PresentationId, Presentation, PresentationDescription,
fileId, Filename, FileType,
SpeakerFirstName, SpeakerLastName,
AttendeeFirstName, AttendeeLastName
FROM   qryMeetings_base
ORDER BY MeetingDate, Presentation, Filename, AttendeeLastName, AttendeeFirstName
</CFQUERY>
Now, let's say we want to output this as a report that builds a CSV file as the report is being displayed to the user so that if the user desires, that CSV file could be emailed or made readily available to them. The code below is one way many developers execute this type of functionality: < cflock name="CSVReport" timeout="20" type="exclusive" throwontimeout="true">
The main problem with this method is that there could be thousands of writes to the file. This is usually a first pass at solving this problem. Once this page comes to a crawl, the next phase is to usually try building one large string to use, replacing the CFLOCK and CFFILE tags in the previous example as follows:
<cfset CSVFileName="MeetingReport_#DateFormat(Now(),'mmddyyyy_hhnnss')#.csv">
<cfset strCSVReport = "" />
<table>
<tr>
<!-- Column Headings here -->
</tr>
<cfset strCSVReport = strCSVReport & qryMeetings.columnList >
</cflock>
<cfoutput query="qryMeetings">
<tr>
<!-- Data Element output items here -->
<cfset strCSVReport = strCSVReport & "#qryMeetings.Meeting#,#qryMeetings.Description#, ..." />
</tr>
</cfoutput>
</table>
<cflock name="CSVReport" timeout="20" type="exclusive" throwontimeout="true">
<cffile action="write" file="#GetTempDirectory#\#CSVFileName#" output="#strCSVReport#" >
</cflock>
This is closer to optimal as it has eliminated the many calls to write the file, however, it is eating up memory on each concatenation as ColdFusion creates a new variable instance each time a string is concatenated to itself. This is where we implement the ArrayAppend/ArrayToList methods, replacing the concatenation as follows:
<cfset CSVFileName="MeetingReport_#DateFormat(Now(),'mmddyyyy_hhnnss')#.csv">
<cfset arrCSVReport = ArrayNew(1) />
<table>
<tr>
<!-- Column Headings here -->
</tr>
<cfset ArrayAppend(arrCSVReport,qryMeetings.columnList) >
</cflock>
<cfoutput query="qryMeetings">
<tr>
<!-- Data Element output items here -->
<cfset ArrayAppend(arrCSVReport,"#qryMeetings.Meeting#,#qryMeetings.Description#, ...") />
</tr>
</cfoutput>
</table>
<cflock name="CSVReport" timeout="20" type="exclusive" throwontimeout="true">
<cffile action="write" file="#GetTempDirectory#\#CSVFileName#" output="#ArrayToList(arrCSVReport,'#Chr(10)#'#" >
</cflock>
We used this method in a number of our reports at work as well as in import/export logging and it reduced their processing time by almost 75%. I ran metrics, which I don't have access to anymore, where a report of 65,000 records processed the same using the ArrayAppend/ArrayToList method as it did using Java classes for the string building. Overall, if you are building large strings through concatenation, take a look at using this method, the processing savings are tremendous the large the data set gets.

Related Blog Entries

Comments

Post Your Comments

Captcha

If you subscribe, any new posts to this thread will be sent to your email address.