Since last time I wrote a blog on PRINCE 2 exam (check here) I noticed a lot of traffic on that post from Google. So I think I will share some more experience on job related skills.
Today I want to talk about – how to write good SQL. First of all I am not data admin, nor have I majored in any IT subject. I learned it on my job and mastered it in a couple of months – it’s definitely a skill easy to acquire and master!
A couple of tips:
- Start with a small base. When I need to query against tens of millions of records with joins, highly likely my query won’t finish. A good trick is to create some temp table with less records, and start join from that temp table as my base. For example I’ve here a big contact history table here – in order to proceed with further queries, I need to first define my base.
- Once I create my temp table, I always use collect stats and create index to the table. This is just a small step but it will improve the performance so much. The index should be the key which can be used for further joins. Here is more details on collect status http://www.teradatawiki.net/2014/03/collect-statistics-in-teradata.html
- On a side note – a good way to create temp table is create volatile table (as above). The benefit is that I don’t need to worry about drop tables; once I close the session volatile tables will be dropped automatically.
- When I define selection restrictions, I follow below rules: whenever I can use =, I won’t use LIKE; whenever I can refer to another table, I don’t use IN. So in my above example, EVENT_ID = ‘STR_30_A’ would perform better than EVENT_ID LIKE ‘%STR_30_A%’.
- When I use inner join, I always define my criteria with ‘WHERE’. For example, in my below example I could have used ‘AND’, but as a best practice WHERE works better.
- Always as a best practice, avoid create duplicates in your table. You can create a flat table with more columns and normalised key, which will perform WAYYYYY better than a table with less columns and duplicated key. One way to transpose table is to use CASE WHEN statement as below. Of course you can also alter table and update values.
Those are all I want to write so far. More than welcomed to comment and add!
Tivamoo’s blog update schedule:
- Tue – Japanese Notes – て form!
- Fri – (optional) random ideas – Pyjama fashion show?
- Sun – weekly awesomeness in Australia – Sculpture by the sea!