Wednesday, October 24, 2012

Sheesh - Am I still comparing ORM and Stored Procs?

I have worked with both the traditional DB/SP setup and ORM solutions a wee bit myself, and believe I have understanding enough to make a few comments.
The most common arguments in favour of stored procs (as claimed by a strong proponent) are the following:

Unlike ORM-generated [ad-hoc] SQL queries, SPs represent well-planned, well-tuned, controlled SQL that adapts well to business change; generated SQL, in contrast, is very brittle. Tiny changes to the database have a severe impact on the application.


Really? Yes, really. So true. But are SPs the way to fix that problem? Let's examine.
A bank account needs a new field, "National_Unique_ID_Number", to be stored. OK, change the SP and there is no impact on the rest of the application. Oops, that's not true: the business layer, the UI, the error handling, all the calling code, everything needs to be updated. And guess what: because the SP is written in a different language in a different IDE, there is no "compiler-driven" way to even assess the impact. Bugs proliferate, costing $$$. In an ORM-backed approach, you add a new property and mark it as mandatory, you change method signatures, and boom, the IDE starts painting your code red. You fix the compilation problems by adjusting all those business-layer, UI and error-handling pieces (hopefully driven down through the abstraction provided by interfaces), and mostly the pieces of your app will start talking to each other again.
Another example: the field "DOB", which was optional for an account holder, is to be made mandatory for new accounts per updated regulatory requirements. All of the arguments above hold equally true for this case as well.
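The "ripple surfaces immediately" point can be sketched in a few lines. This is a minimal illustration, not any particular ORM's API: the `Account` entity, its fields and values are all hypothetical. In a statically typed ORM language the break shows up at compile time; in this Python dataclass sketch it shows up loudly the moment an outdated call site runs, which is the same property, just later.

```python
from dataclasses import dataclass

# Hypothetical ORM-style entity. Adding the mandatory
# national_unique_id_number field means every call site that constructs
# an Account without it now fails immediately, instead of silently
# drifting out of sync with a changed stored procedure.
@dataclass
class Account:
    holder_name: str
    balance: float
    national_unique_id_number: str  # the new mandatory field

# An old call site breaks loudly until it is updated:
try:
    Account(holder_name="Ada", balance=100.0)  # missing the new field
except TypeError as e:
    print("caught:", e)

# The updated call site works fine:
acct = Account("Ada", 100.0, "NID-12345")
print(acct.national_unique_id_number)
```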

Security – aka SQL injection attacks. Stored procedures allow for better data protection by controlling how the data is accessed. By granting a database login EXECUTE permissions on stored procedures only, you can restrict the application to a limited set of actions. Additionally, stored procedures are a counter-measure to dangerous SQL injection attacks, a vulnerability to which applications using embedded SQL are more exposed.


That view is so orthogonal to a holistic approach to application security (securing methods/functions with declarative controls, using frameworks like ACEGI) that arguing against it feels like a bottomless task. If one does indeed want to control who-sees-what at the DB layer, my view is: use views. Pun not intended.
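On the injection point specifically: the protection comes from parameterization, not from stored procedures per se, and ORMs parameterize everything they emit. A minimal sketch using Python's stdlib `sqlite3` (the `accounts` table and values are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, holder TEXT)")
conn.execute("INSERT INTO accounts VALUES (1, 'alice'), (2, 'bob')")

malicious = "alice' OR '1'='1"

# Vulnerable: string concatenation lets the input rewrite the query.
rows_bad = conn.execute(
    "SELECT id FROM accounts WHERE holder = '" + malicious + "'"
).fetchall()

# Safe: a bound parameter is treated purely as data, never as SQL.
rows_good = conn.execute(
    "SELECT id FROM accounts WHERE holder = ?", (malicious,)
).fetchall()

print(rows_bad)   # [(1,), (2,)] -- the injection returned every account
print(rows_good)  # []           -- no holder is literally named that
```

The second query is exactly what an ORM generates; no SP was needed to close the hole.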

Serious security pundits (a club most DBAs belong to, by the way) have also heard of another technique called "role-based security". Such roles are defined at the DB layer. Say you have two roles: the average user, and the administrator who configures the application. Define the two roles in the DB, place the users in the right role, define the rights per role on the tables and views, and off you go: fine-grained security that works, without a single stored procedure in sight. A new user is added? You just add them to a role and they have the rights they should have.
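As a sketch of that setup, with hypothetical table, view, role and user names; the exact syntax for placing a user in a role varies by engine (SQL Server, for instance, uses sp_addrolemember / ALTER ROLE ... ADD MEMBER):

```sql
-- Two roles, rights granted on tables and views; no stored procedures needed.
CREATE ROLE app_user;
CREATE ROLE app_admin;

GRANT SELECT ON account_summary_view TO app_user;       -- who-sees-what via a view
GRANT SELECT, INSERT, UPDATE ON accounts TO app_admin;  -- configuration rights

-- A new user is just dropped into the right role:
GRANT app_user TO jane;
```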

Performance – the holy grail of application programming: stored procedures are pre-compiled and therefore bypass the cost of creating an execution plan, whereas ad-hoc SQL statements created and issued by ORMs pay this cost every time. SPs can loop through and filter data at the DB layer much better than data-access layers sitting far away in the application server can. SPs can smartly batch common work [select-update-select-insert-update] together, reducing transport-layer round trips by the thousands.


No, not all of the above arguments are false. Yep, at least one of them is true. Let's examine.
Pre-compiled SPs – truth or myth? "SQL Server 2000 and SQL Server version 7.0 incorporate a number of changes to statement processing that extend many of the performance benefits of stored procedures to all SQL statements. SQL Server 2000 and SQL Server 7.0 do not save a partially compiled plan for stored procedures when they are created. A stored procedure is compiled at execution time, like any other Transact-SQL statement. SQL Server 2000 and SQL Server 7.0 retain execution plans for all SQL statements in the procedure cache, not just stored procedure execution plans. The database engine uses an efficient algorithm for comparing new Transact-SQL statements with the Transact-SQL statements of existing execution plans. If the database engine determines that a new Transact-SQL statement matches the Transact-SQL statement of an existing execution plan, it reuses the plan. This reduces the relative performance benefit of precompiling stored procedures by extending execution plan reuse to all SQL statements." – I did not make that up; that is straight from the SQL Server documentation.
Caching of the execution plan happens for any query, not just the ones fired from SPs. It's a democracy out there.
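The reuse mechanism keys on the statement text: keep the SQL identical across calls and only vary the bound parameters, and the cached plan is hit every time, SP or no SP. As an analogy that you can actually run, Python's stdlib `sqlite3` keeps a per-connection cache of compiled statements that works the same way (the `cached_statements` argument is real; the table is hypothetical):

```python
import sqlite3

# sqlite3 caches compiled statements per connection, keyed by the SQL
# text -- analogous to SQL Server's procedure cache, which reuses plans
# for any matching parameterized statement, stored proc or not.
conn = sqlite3.connect(":memory:", cached_statements=128)
conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")

# The SQL text is identical on every iteration (only the bound
# parameters differ), so the statement is compiled once and reused:
for i in range(1000):
    conn.execute("INSERT INTO t VALUES (?, ?)", (i, "row"))

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(count)  # 1000
```

This is exactly the shape of SQL a decent ORM emits, which is why it shares in the plan-cache benefit.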

Loops over resultsets vis-à-vis loops inside SPs. SPs are sure to win? Not so fast. SPs do their looping through data sets in T-SQL. That requires a cursor, and a cursor typically creates a temp table in tempdb, which can hurt if tempdb is full and has to be re-allocated. In a large application, heavy looping will hit that condition sooner or later. Also, T-SQL is a set-based language (not a procedural one), which means its advantage over procedural code (multiple separate SQL statements feeding the result of one into another) shows when it works on very large sets of data. That said, if you are looping a lot over resultsets, it means one of two things: either you did not understand what set-based means, or whatever you are doing cannot be done in a set-based batch of statements. Need I mention that all mature ORMs let you write lambda-based batch statements specifically designed to reduce or eliminate looping through resultsets? For the second case, the things that just cannot be done in a set-based batch, SPs may have a real advantage, since choosing another procedural language may not be a realistic option within the scope of a project.
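To make "set-based" concrete, here is a minimal sketch (hypothetical `accounts` table, again using stdlib `sqlite3`) contrasting the row-by-row pattern with a single set-based statement that produces the same result:

```python
import sqlite3

def seed():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (id INTEGER, balance REAL)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                     [(i, 100.0) for i in range(5)])
    return conn

# Cursor-style: pull every row out, compute client-side, write back one
# UPDATE per row. One round trip per row -- the pattern that hurts.
conn = seed()
rows = conn.execute("SELECT id, balance FROM accounts").fetchall()
for acct_id, balance in rows:
    conn.execute("UPDATE accounts SET balance = ? WHERE id = ?",
                 (balance * 1.05, acct_id))
looped = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]

# Set-based: one statement; the engine applies it to the whole set.
conn = seed()
conn.execute("UPDATE accounts SET balance = balance * 1.05")
set_based = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()[0]

print(looped, set_based)  # identical totals either way
```

The set-based form is what you should reach for first, whether you express it in raw SQL or through an ORM's batch-update facility.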

Now some performance detail. Let's say we have an application with 100 tables, with an average of 10 fields in each of them. This requires 100 'update' SPs, each updating 10 fields on average (because you cannot have optional update statements in SPs). One cannot create SPs for every combination of fields either; with 10 fields per table that is roughly 2^10 variants per table, over a hundred thousand SPs across the application, and a road to madness. You can have optional parameters in the SP itself, but that is a performance killer anyway. SPs open up a whole new problem front in terms of maintenance, because they become a sort of API set which can't be extended. They violate the single most important design talisman: "open for extension, closed for modification". Every time a business need changes, the existing SP cannot safely be modified (existing callers depend on it); a new one has to be written. This is not a minor problem, especially when the system is large and has been maintained for years. Microsoft also believes this and advises accordingly.
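The way out of the combinatorial trap is what ORMs do under the hood: generate one parameterized UPDATE covering only the fields that actually changed. A minimal sketch of the idea (the `build_update` helper, table and field names are all hypothetical, and the table/column identifiers must come from the application's own metadata, never from user input, to stay injection-safe):

```python
import sqlite3

def build_update(table, key, changed):
    """Build a parameterized UPDATE covering only the changed fields --
    roughly what an ORM emits, instead of one SP per field combination.
    Identifiers must come from trusted metadata, not user input."""
    cols = sorted(changed)
    sql = "UPDATE {} SET {} WHERE id = ?".format(
        table, ", ".join(c + " = ?" for c in cols))
    params = [changed[c] for c in cols] + [key]
    return sql, params

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER, holder TEXT, dob TEXT)")
conn.execute("INSERT INTO accounts VALUES (1, 'alice', NULL)")

# Only one of the ten-odd fields changed, so only that one is updated:
sql, params = build_update("accounts", 1, {"dob": "1980-01-01"})
print(sql)  # UPDATE accounts SET dob = ? WHERE id = ?
conn.execute(sql, params)
print(conn.execute("SELECT dob FROM accounts WHERE id = 1").fetchone()[0])
```

One generator, every field combination covered, and thanks to plan-cache matching on the statement text, each distinct combination still gets its plan reused on subsequent calls.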