Tips

Run SQL scripts from a folder using sqlcmd

This is a very short post about a very useful tip. I came across it on Stack Overflow and it certainly deserves a post. Many thanks to the answerer; I believe this is his blog, which I already follow and would highly encourage you to as well if you are interested in SQL Server.

Very often I need to execute all the SQL scripts in a folder. There are tools out there to do this, but I wanted something very simple, like a batch command. So let's say you want to execute all the SQL scripts from the folder 'D:\Scripts' on the server 'localhost' against the database 'playbox'. Here are the steps:

1. Open command prompt.

2. Navigate to the scripts folder.

3. Type the following command.

for %f in (*.sql) do sqlcmd /S localhost /d playbox /E /i "%f"

What this does is iterate through all the .sql files in the folder and execute each one using sqlcmd. /S (server), /d (database), /E (trusted connection) and /i (input file) are sqlcmd parameters. The beauty of this solution is that you don't have to install anything. Wrap a few of these statements in a batch file and you can execute scripts from different folders in one go. Neat!
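As a sketch, the batch file version might look like this (the folder names are made up; note that inside a .bat file the loop variable needs a double percent sign):

```bat
@echo off
REM Run every .sql script from a few hypothetical folders in one go.
REM Inside a batch file the loop variable is %%f rather than %f.
for %%f in (D:\Scripts\Tables\*.sql)     do sqlcmd /S localhost /d playbox /E /i "%%f"
for %%f in (D:\Scripts\Views\*.sql)      do sqlcmd /S localhost /d playbox /E /i "%%f"
for %%f in (D:\Scripts\Procedures\*.sql) do sqlcmd /S localhost /d playbox /E /i "%%f"
```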


Hide a specific attribute hierarchy in role playing dimension – SSAS multidimensional

A role-playing dimension is a dimension that plays different roles in a fact table. In SSAS multidimensional, a role-playing dimension can be used across different cubes and measure groups. The date dimension is the most common example: a date might appear in various forms such as SaleDate, ReceivedDate, OrderDate, ShipDate etc. Creating a separate dimension for each of these would not be a good approach. Instead, a single date dimension is created and is attached to the various measure groups using different keys. In fact, if you have defined relationships between multiple fact table columns and the same dimension key, SSAS multidimensional will add the role-playing dimensions automatically, as shown below.

AdventureWorksRolePlayingDimension

In certain situations, however, not all the attributes of a role-playing dimension are relevant to all measure groups or cubes. Let us take the example of an Organisation dimension. This dimension has four attribute hierarchies, namely OrganisationKey, Organisation Name, Customer Organisation Reference, and Provider Organisation Reference. Let us also assume a fictional fact table with columns CustomerOrganisationKey, ProviderOrganisationKey and TransactionAmount. This fact table references the Organisation dimension twice, i.e. CustomerOrganisationKey –> OrganisationKey and ProviderOrganisationKey –> OrganisationKey. This means the Organisation dimension is a role-playing dimension. The table relationships and role playing in SSAS are shown below.
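As a sketch, the tables behind this example might look something like the following (the DDL is hypothetical; only the column names come from the example above):

```sql
CREATE TABLE DimOrganisation (
    OrganisationKey               int PRIMARY KEY,
    OrganisationName              varchar(100),
    CustomerOrganisationReference varchar(50),
    ProviderOrganisationReference varchar(50)
);

-- The fact table references the same dimension key twice,
-- which is what makes DimOrganisation a role-playing dimension.
CREATE TABLE FactTransaction (
    CustomerOrganisationKey int REFERENCES DimOrganisation (OrganisationKey),
    ProviderOrganisationKey int REFERENCES DimOrganisation (OrganisationKey),
    TransactionAmount       money
);
```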

Table Relationships and Role Playing

Now, when browsing the Customer Organisation dimension, the hierarchy Provider Organisation Reference doesn't make much sense. At the same time, when browsing the Provider Organisation dimension, Customer Organisation Reference doesn't convey any meaning. There is no apparent way of hiding the non-relevant hierarchy in each dimension: if you hide a hierarchy in the parent Organisation dimension, it will be hidden in both the role-playing dimensions. This is where I was stuck. As mentioned in one of my posts, there are two places where the 'AttributeHierarchyVisible' property can be set. For a role-playing dimension, you can set it for each individual role-playing dimension from the Cube Structure tab. So go to the Cube Structure tab and you should see the Dimensions panel below the Measures panel. This shows each dimension and its attributes separately, and here you can hide, or even enable and disable, attribute hierarchies for each role-playing dimension. A picture is worth a thousand words, so here's a screenshot.
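Under the covers, this sets the AttributeHierarchyVisible element on the cube dimension attribute. The ASSL fragment below is only an illustration (the IDs follow the example above):

```xml
<!-- Cube dimension: the Customer Organisation role of the Organisation dimension -->
<Dimension>
  <ID>Customer Organisation</ID>
  <Name>Customer Organisation</Name>
  <DimensionID>Organisation</DimensionID>
  <Attributes>
    <Attribute>
      <AttributeID>Provider Organisation Reference</AttributeID>
      <AttributeHierarchyVisible>false</AttributeHierarchyVisible>
    </Attribute>
  </Attributes>
</Dimension>
```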

Hiding Attribute Hierarchy in Role Playing Dimension

Changing database ID in Analysis Services

Every object in an SSAS database, such as a cube, a measure group, a measure or a dimension, has an ID which is generated automatically when the object is created. The ID, once generated, cannot be changed from the Properties window of the object; it is read-only. When an object is created, by default, the object name and its ID are the same; however, over a period of time, objects get renamed, leaving the ID with the old value. In the majority of situations this does not create a problem, but there are a few cases when it can be problematic. Now some people might say I have OCD (Obsessive Compulsive Disorder), but I like my object names and IDs to be consistent.

For most objects, the ID can be changed by opening the object code (right click –> View Code) and changing the value in the <ID> tag. A word of warning though: be extremely careful when changing IDs, as it can create quite an issue in your project. For example, if you change the ID of a dimension, you have to make the same change in the cube XML, else you will lose the relationship.

Anyway, that's all fine and dandy. But how would you change the database ID? If you rename the database, the ID does not change, and there is no View Code option for the project. As it turns out, the ID lives in the *.database file of the project. In Solution Explorer, click on Show All Files. You will see a your_project_name.database file. Right click on it and then View Code. You will find the <ID> tag which contains the ID of the database.
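For illustration, the relevant part of the .database file looks something like this (the names are made up):

```xml
<Database xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <ID>Playbox DW</ID>   <!-- the old ID left behind after a rename -->
  <Name>Playbox</Name>  <!-- the current database name -->
  <!-- ... -->
</Database>
```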

Are you setting ‘Slice’ property in SSAS partitions?

Are you setting the 'Slice' property on your SSAS partitions? If not, you should. Here's what happened to me. We have a fairly large SSAS cube. Initially this cube had only two partitions, current data and historic data. The performance was degrading, so we partitioned the cube by month, since the majority of the queries are by month and below. This also helped us improve processing performance, because we can now process only those partitions for which data has changed, two or three at the most. On the downside, there is a large number of partitions in the cube; a fact we decided we could live with.

So I created the partitions and got the cube processed. I wrote a simple MDX query to observe how the cube responds in Profiler. Here is the query (slightly changed for anonymity).

SELECT
[Measures].[Measure Dummy] ON COLUMNS
FROM [Playbox]
WHERE [Period].[YearMonthDaySettlementPeriod].[Calendar Year].&[2010].&[4]

And here's what I was seeing in Profiler.

Without Slice, all the partitions are scanned

Hang on! I am querying only April 2010 data; why are there scans on all the other partitions? This defeated the whole purpose of partitioning by month, because queries were scanning all the partitions. As mentioned earlier, we have a large number of partitions, and this is bound to have an impact. Even after creating the partitions, I was still not getting the benefit I wanted.

Now on to the crux of the matter. As it turns out (after spending hours on Google and MSDN), the automatic slice detection in SSAS 2008 is not so effective. Hence, even though the partitions are clearly one per month, SSAS still has to scan all of them to get the data. There are a number of blogs which elaborate on the issue. I will just point to this, this and this.

So I set the Slice property on each partition, ran the same query and, guess what, it hits only the required partition.
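The Slice property itself is just an MDX expression on the partition. In the partition's ASSL it ends up as something like this (the partition name is made up; the member is the one from the query above, with the ampersands XML-escaped):

```xml
<Partition>
  <ID>Fact Playbox 2010 04</ID>
  <Name>Fact Playbox 2010 04</Name>
  <Slice>[Period].[YearMonthDaySettlementPeriod].[Calendar Year].&amp;[2010].&amp;[4]</Slice>
</Partition>
```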

With Slice defined, only required partition is scanned

The performance benefit was quite apparent as well. I selected a very common real-life query and noted the query timings using the AS Performance Workbench, which I mentioned in my previous blog post. As you can see in the images below, the response time improved dramatically.

Query Duration without Slice

Query duration with Slice

So the bottom line is: always define the slice when you create an SSAS partition. This has to go on my list of things to look for when you are building an SSAS cube.

Export file names from a folder

Let's say you want to export the names of all the files and folders within a folder. Doing it one at a time would be pretty time-consuming. This little trick comes in handy in such situations. Open up a command prompt and navigate to the folder from which you want to export the file names.

Let me introduce you to the '>>' operator. What this guy does is append the output of the DOS command you are executing to the specified file. This is very handy if you are executing SSIS packages through a batch command: put '>>' at the end of every command, specify the file name, and all the output messages will be transferred to the file. In the following examples, the file is called 'files.txt' and will be created wherever your command prompt location is. You can always specify a different location.
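For example, a hypothetical batch command that runs an SSIS package and captures its messages might look like this (the package and log paths are made up):

```bat
REM Append dtexec's console output to a log file.
dtexec /F "D:\Packages\LoadSales.dtsx" >> D:\Logs\LoadSales.log
```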

1. To get just the names of files and folders 

 dir /B  >> files.txt

2. To get the full path of files and folders 

 dir /B /S >>files.txt

3. To get only specific file types, e.g. .csv

dir /B /S *.csv >> files.txt

The same pattern works for any extension:

dir /B /S *.txt >> files.txt

dir /B /S *.xls >> files.txt

Some things to look for when you are building your next SSAS cube

There are a number of things which you should be aware of when you are building your SSAS cube. Here I am listing some of the seemingly trivial things that have been particularly painful to me. I am hoping that this will be a living post and that I will add more as and when I find them.

 1. Create views in the DW and then build your DSV from them

The biggest benefit is that it adds an abstraction layer between the DW and SSAS. Metadata columns from the DW, such as StartDate, EndDate, Source, DateAdded, ModifiedBy etc., which are not required in the cube, can easily be filtered out. Another benefit is that any changes in the DW which you don't want to propagate to the cube can be absorbed in the views; for example, minor data type changes, new columns, new relationships etc. Make sure that the views are simple SELECT statements over the DW tables with minimal joins.
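A minimal sketch of such a view, assuming a hypothetical DimCustomer table:

```sql
-- The view exposes only the columns the cube needs;
-- metadata columns like StartDate, EndDate, Source and DateAdded are left out.
CREATE VIEW dbo.vwDimCustomer
AS
SELECT CustomerKey,
       CustomerName,
       CustomerCategory
FROM   dbo.DimCustomer;
```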

2. Avoid NULLs

This is more of a DW thing than an SSAS one. SSAS behaves slightly weirdly when it finds a NULL, especially when the column contains both NULL and an empty string (''). SSAS converts NULLs to blanks or zeros, and chances are that you will get a Duplicate Key error. You can avoid NULLs in the views as mentioned above. If you do have NULLs, then consider setting the NullProcessing property to UnknownMember.
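One way to do this in the view layer, again with hypothetical names, is to collapse NULLs and empty strings into a single known value so they cannot produce two different 'blank' members:

```sql
SELECT CustomerKey,
       -- NULLIF turns a trimmed empty string into NULL; ISNULL then maps
       -- both cases to the same 'Unknown' member.
       ISNULL(NULLIF(LTRIM(RTRIM(City)), ''), 'Unknown') AS City
FROM   dbo.DimCustomer;
```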

3. Religiously define and maintain relationships between tables in DSV

This is very important, since SSAS will be able to help you more actively once you define relationships. Take snow-flaking, for example: when building a snow-flaked dimension, SSAS can tell you which other tables from the DSV are related to it, to help you out.

4. Beware of the datatypes and trimming

I have had my fair share of trouble because of these. If the datatype of a column changes in the DSV, the change is not reflected automatically in the dimension attribute. This is especially troublesome for the varchar, char and nvarchar datatypes, because sooner or later you will start getting the error "The size specified for a binding was too small, resulting in one or more column values being truncated." The Dimension Data Type Discrepancy Check from BIDS Helper is a lifesaver in this situation.

5. Use folders to categorise dimension attributes

For dimensions with a large number of attributes, use the AttributeHierarchyDisplayFolder property to display them in specific folders. This will make browsing the cube in Excel simpler for end users. The same holds true if you have a large number of measures in a measure group.

6. Keep the attributes in a dimension to the bare minimum

Too many attributes in too many dimensions will create too much confusion. Give users what they absolutely need and what will be most useful to them. Disable or hide the hierarchies which are not needed.

7. Create partitions, or at least plan for them (Update: 03 August 2012)

This may not be an immediate concern, but thinking about partitioning at the beginning will help you in the longer run. Partitioning can dramatically reduce cube processing time and improve query performance.

8. Partition slices are important, don't ignore them! (Update: 19 December 2012)

Without slices, chances are that your MDX query is scanning all the partitions. There are many blogs out there which confess to this, and recently I had a very similar experience. This post from Mosha is a good indication of the importance of setting slices.