Common Report Performance Problems
Ahhh performance. This topic can really be a can of worms! I will try to touch on a few very common problems here and maybe follow up with more details depending on what anyone might like to see.
I continue to see this pop up over and over in almost every different type of solution, so I thought I would try to just do a quick write-up on some of the very common and often easily solvable performance problems that developers and customers run into with reports in BusinessObjects.
My first and most-important rule of thumb: the results fetched from the data provider should almost always match what is actually being displayed on the report to the end users wherever it is possible!
This means that if you are looking at something like monthly sales totals for the last 12 months, then your database query should most likely be fetching only 12 rows of data. I do understand that sometimes other items are involved (such as the need for merging or doing cross-tabs, pulling in additional information for drill-downs, etc), but that is usually the exception from what I have seen. And, even in those exceptions, you can still apply this same principle: try to only fetch the rows from the database to achieve whatever near-level drill-downs might be necessary.
Why should we do this? Because we want to let the database do database work.
Don’t pull an entire year’s worth of order sales into your report and then sum the value for each order to only display total sales by month. Databases have these nifty functions that will do all of this aggregation for us, and almost always much more efficiently than any of our reporting tools will! I’m looking at you, SUM(), MIN(), MAX(), AVG(), and all of your other friends. These functions should be included as part of the Select statement for objects in your universe (assuming you are using a Relational Database on a Universe!).
You need to turn this query:
SELECT saleid, saledate, salemonth, region, amount FROM sales …
SELECT salemonth, SUM(amount) FROM sales … GROUP BY salemonth…
And while we are at it, You Should Really Learn SQL if you really want to be able to get a good handle on this stuff!
The principle behind this idea is by far the most important concept to always keep in mind when dealing with report performance issues. Always try to only pull in the data that is needed on the report (both rows and columns), and get rid of everything else!
Here is a quick list just off the top of my head for general practices to try and keep in mind. Some of it might be common sense and I am sure that I am probably missing a few hundred other possible ideas here! I will number them not for order of importance or sequence, but only so that they can be referred to later if needed.
- Try to limit the amount of times that a large table is accessed to only once in your queries if at all possible. Some of this can be done on the report side, but a lot of it might have to go back to a Universe or even the data model itself.
- Remove any objects or fields from the data providers which you will not need. For Webi and BEx data sources, you can also enable the query stripping option to help with this, though ultimately it is nicer and cleaner if you actually remove the object from the query entirely.
- Limit the number of queries or data providers you are retrieving on a given report if possible. If you are already pulling more than a few queries, then you might need to re-think your solution design.
- For Web Intelligence in particular, you can use the Data mode tab to view some basic performance information for each data provider within your document (as well as some other nice basic data mining/exploration functionalities!). If you have more than one query, this will help you to identify which ones may be taking more time than others to help you narrow your focus.
- In general, the more simple the SQL query looks, the better your chances are of getting a good outcome (or being able to tune or tweak the performance to achieve this!). If you are seeing a lot of inline views / SQL from things like derived tables, usage of database views which themselves may have complicated SQL, or a lot of complicated joins based on non-standard join conditions, this might be one of the first places to investigate your problem. A lot of this will go back to Universe design and possibly the data model itself.
- For most relational databases, you can copy the SQL of any problem queries and then use various trace and explain plan tools to help estimate and trace cost and execution statistics of the query directly on the database. This is an entire topic in and of itself, but once you can learn some of these techniques then they will become invaluable to you. Plus, the lessons that you learn in troubleshooting will help you to design better solutions in the future, too!
- For other data sources such as SAP BW, there are additional tracing tools which can be used in the SAP GUI.
- With SAP BW queries, avoid usage of InfoObjects with large hierarchies as these will significantly slow down query parse and execution times in tools such as Web Intelligence. This is because it actually reads through each level of the hierarchy in order to display it on the Query panel and make various drill and filter options available. It can be pretty scary to watch if you are tracing it in BW while it is happening. 😉 You may need to create a custom version of the InfoObject and/or change your data model a little bit to accommodate this in some cases.
- Don’t go crazy creating functions or variables. For example, if you have something more than 50 in a given document then there might be something you need to reconsider!
- Try to limit the amount of objects which are rendered on the layout of the reports themselves if you have a lot going on on the screen. Consider breaking things up into multiple tabs or documents entirely.
- Also, try to limit the number of tabs that you might have on any given document. If you have more than 10 or so, then you probably need to think about creating additional documents and splitting things up a bit!
- If users always need to filter the data when it is viewed, then consider using a query-level prompt to further reduce the amount of data that is retrieved into the report (why select data for all departments when the user only wants to see one?).
- If you have prompts with List of Values enabled, please don’t use them for highly unique values unless your data set is tiny! By this, I mean that you should generally turn off LOVs for things like date/time stamps, unique system ID numbers, any kind of numerical measurement that you need to be represented as a dimension, etc. If you have more than a few hundred unique values, then the field should likely not have a list of values available!
- For SAP Dashboards (Xcelsius), it does not do well when a lot of data is embedded into the internal spreadsheet (and/or loaded from data queries). Also try to avoid using formulas in the spreadsheet as much as possible if you can!
- Still on SAP Dashboards, if you are using BI Web Services / QaaWs on Webi report blocks, try to break up your webi documents into as focused of solutions as possible so that each individual web service query is not pulling in more data than is needed each time. For example, if you have one block to pull in header information, then another block to pull in detail information based on a filter selection, don’t pull in header and detail information into a single Webi document with multiple queries on the document and blocks exposed as web services. Instead, have one Webi document with a query on only the header data, and then a second Webi document with a query on only the detail data. Otherwise, the server would need to process and download both the header and detail data for BOTH web service queries!
- For Crystal Reports, try to avoid creating any subreports in the Detail/Body section unless you really know what you are doing and want to do this with a very TINY number of rows (as in, probably less than 10!). When you do this, it will execute the subreport’s database query for every single row of the main report’s query. Solutions that might have this would probably be better designed by having the subreports data somehow added to the original report and then create some sort of grouping to achieve whatever kind of layout is necessary.