Database performance tuning is, in general, something of an art. There are many discussions of this topic, such as adjusting postgresql.conf parameters and rewriting notoriously slow statements like COUNT() and MIN().
I found that it is also vitally important to understand the data itself. It intrigued me at first that Postgres tended to choose seqscan and hashjoin for several of my SQL statements even though I had indexes on the tables involved. By using the EXPLAIN facility, I found that the SQL optimizer had the wrong idea of the actual cost of the different plans. For example, the default value of random_page_cost is 4, which means retrieving a random page is assumed to be 4 times as expensive as retrieving a page sequentially. That is generally true. It means that if you operate on 1 million rows out of 20 million total, assuming each page holds 10 rows, a sequential scan is cheaper (2 million pages read, cost = 2 million) than an index scan (1 million lookups at a cost of 4 each, cost = 4 million), because each index lookup is assumed to incur a random page retrieval.
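As a sketch of how to see this yourself (the table and column names here are hypothetical, and the plan output is abbreviated), EXPLAIN shows the planner's estimated costs for the plan it chose:

```sql
-- Hypothetical table; EXPLAIN prints the chosen plan with its
-- estimated startup and total costs, without running the query.
EXPLAIN SELECT * FROM events WHERE event_id BETWEEN 1000000 AND 2000000;

-- With the default random_page_cost = 4, the planner may report
-- something like:
--   Seq Scan on events  (cost=0.00..2000000.00 rows=1000000 ...)
-- because it estimates the alternative index scan at roughly one
-- random page fetch (cost 4) per matching row.
```

EXPLAIN ANALYZE additionally executes the query and reports actual times, which makes it easy to spot when the estimated costs diverge from reality.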
But for our data set, this is not the case, because the data has very high locality (rows with close index ids generally stay close to each other on disk). Most index lookups don't actually result in a random page read, because the page is likely already in the memory cache. If the memory is big enough, the effective random_page_cost may even be less than 1!
In our case, we dynamically set several options, such as enable_seqscan, enable_hashjoin, and random_page_cost, at runtime, which helped the SQL optimizer choose the right plan.
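A minimal sketch of what such per-session tuning looks like (the specific values here are illustrative, not the ones we used):

```sql
-- SET affects only the current session, so these overrides do not
-- disturb other workloads; values are illustrative.
SET random_page_cost = 1.5;   -- data is highly cached, random reads are cheap
SET enable_seqscan = off;     -- steer the planner away from seqscan
SET enable_hashjoin = off;    -- steer the planner away from hashjoin

-- ... run the queries that benefit from index scans ...

-- Restore the defaults for the rest of the session.
RESET random_page_cost;
RESET enable_seqscan;
RESET enable_hashjoin;
```

Note that enable_seqscan = off does not forbid sequential scans outright; it only makes them look very expensive to the planner, so adjusting random_page_cost alone is often the gentler fix.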