<h2>Popular Choices</h2>
<h3>Cloud Services</h3>
<ul>
  <li><strong>Google BigQuery</strong>: based on Google's Dremel, battle-tested inside Google for years.</li>
  <li><strong>Amazon Redshift</strong>: based on ParAccel technology (columnar, originally derived from PostgreSQL).</li>
  <li><strong>Microsoft Azure SQL Data Warehouse</strong> (since renamed Azure Synapse Analytics)</li>
  <li><strong>Snowflake</strong></li>
</ul>
<h3>Hadoop Ecosystem</h3>
<p><strong>Hive</strong> can be used as a data warehouse to store huge amounts of data. However, other compute engines such as <strong>Presto</strong> are often used on top of it to accelerate queries.</p>
<h3>Traditional</h3>
<ul>
  <li>Teradata</li>
</ul>
<h2>Data Sources</h2>
<ul>
  <li>logging and messaging systems (e.g. Kafka)</li>
  <li>dumping data from a DB (e.g. MySQL) into the DW (e.g. Hive): dump to a staging table first, then copy to the target table</li>
</ul>
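<p>The staging-table pattern above can be sketched as follows. This is a minimal illustration only: it uses Python's built-in <code>sqlite3</code> as a stand-in for the warehouse, and the table names <code>orders</code> and <code>orders_staging</code> and the helper <code>load_via_staging</code> are hypothetical, not taken from any particular system.</p>

```python
import sqlite3


def load_via_staging(conn, rows):
    """Load rows into a staging table, then copy them into the target table.

    Hypothetical helper: sqlite3 stands in for the real warehouse here.
    Returns the number of rows in the target table after the load.
    """
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, amount REAL)"
    )
    # Start every load from an empty staging table.
    cur.execute("DROP TABLE IF EXISTS orders_staging")
    cur.execute("CREATE TABLE orders_staging (id INTEGER, amount REAL)")
    # Step 1: dump the extracted rows into staging.
    cur.executemany(
        "INSERT INTO orders_staging (id, amount) VALUES (?, ?)", rows
    )
    # Step 2: copy from staging into the target; existing ids are overwritten.
    cur.execute(
        "INSERT OR REPLACE INTO orders (id, amount) "
        "SELECT id, amount FROM orders_staging"
    )
    cur.execute("DROP TABLE orders_staging")
    conn.commit()
    return cur.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


conn = sqlite3.connect(":memory:")
count = load_via_staging(conn, [(1, 9.5), (2, 20.0)])
```

<p>Landing the data in staging first means a failed or partial dump never touches the target table; only the final copy step changes what readers of the target see.</p>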
<h2>Dimension Table vs Fact Table</h2>
<ul>
  <li><strong>fact tables</strong>: hold business facts (measures) such as transactions or page visits; their foreign keys reference the primary keys of the dimension tables.</li>
  <li><strong>dimension tables</strong>: hold descriptive attributes such as name, age, or location. They are used to (1) constrain or filter queries and (2) label query result sets.</li>
</ul>
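<p>To make the relationship concrete, here is a small sketch of one fact table joined to one dimension table. It again uses Python's built-in <code>sqlite3</code>; the tables <code>fact_sales</code> and <code>dim_customer</code> and their sample rows are made up for illustration.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: descriptive attributes, one row per customer.
CREATE TABLE dim_customer (
    customer_id INTEGER PRIMARY KEY,
    name TEXT,
    location TEXT
);
-- Fact table: one row per sale, with a foreign key into the dimension.
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES dim_customer(customer_id),
    amount REAL
);
INSERT INTO dim_customer VALUES (1, 'Alice', 'Berlin'), (2, 'Bob', 'Paris');
INSERT INTO fact_sales VALUES (10, 1, 30.0), (11, 1, 20.0), (12, 2, 5.0);
""")

# The dimension filters the query (location) and labels the result (name);
# the fact table supplies the measure that is aggregated (amount).
totals = conn.execute("""
    SELECT d.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_customer AS d ON d.customer_id = f.customer_id
    WHERE d.location = 'Berlin'
    GROUP BY d.name
""").fetchall()
```

<p>Here <code>totals</code> holds one row per customer in Berlin, with the customer's name next to the summed sale amounts.</p>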
<h2>Slowly Changing Dimensions (SCD)</h2>
<p><a href="https://en.wikipedia.org/wiki/Slowly_changing_dimension">https://en.wikipedia.org/wiki/Slowly_changing_dimension</a></p>
<ul>
  <li>relatively static data such as geolocation, customer, or product</li>
  <li>changes slowly but unpredictably, with no regular schedule</li>
</ul>
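<p>One common way to handle such changes, described in the Wikipedia article linked above as "Type 2", is to keep history by closing the current row and adding a new one. The sketch below is a plain-Python illustration under that assumption; the function <code>apply_scd2</code> and the row layout (<code>key</code>, <code>attrs</code>, <code>start_date</code>, <code>end_date</code>) are hypothetical.</p>

```python
from datetime import date


def apply_scd2(history, key, new_attrs, today):
    """Apply a Type 2 change to an in-memory dimension (hypothetical layout).

    Each row is a dict with key, attrs, start_date and end_date;
    end_date is None for the row that is currently valid.
    """
    current = None
    for row in history:
        if row["key"] == key and row["end_date"] is None:
            current = row
            break
    if current is not None:
        if current["attrs"] == new_attrs:
            return history  # nothing changed, keep the current row open
        current["end_date"] = today  # close the old version
    # Add the new version as the currently valid row.
    history.append(
        {"key": key, "attrs": dict(new_attrs), "start_date": today, "end_date": None}
    )
    return history


history = []
apply_scd2(history, "cust-1", {"city": "Berlin"}, date(2020, 1, 1))
apply_scd2(history, "cust-1", {"city": "Paris"}, date(2021, 6, 1))
apply_scd2(history, "cust-1", {"city": "Paris"}, date(2022, 1, 1))  # no change
```

<p>After these calls the dimension keeps both the Berlin and the Paris version of the customer, so facts recorded before the move can still be matched to the attributes that were valid at the time.</p>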

