The Core Difference: How Data Is Stored
Imagine a table with 1 million commission transactions: ID, commission_amount, status, created_date, salesperson_id, tier_level. In Oracle, the data on disk looks like this:
Row 1: ID=1, amount=500.00, status='APPROVED', date='2026-01-01', salesperson_id=42, tier=3
Row 2: ID=2, amount=1200.50, status='APPROVED', date='2026-01-01', salesperson_id=43, tier=2
Row 3: ID=3, amount=300.00, status='REJECTED', date='2026-01-02', salesperson_id=42, tier=3
...
Row 1,000,000: ID=1000000, amount=750.25, status='APPROVED', date='2026-04-08', salesperson_id=55, tier=1
Everything about row 1 is stored together as a unit. To read row 1, the engine loads one disk block (typically 8KB) and gets all of its columns at once. Fetching a handful of complete rows is therefore cheap.
In SAP HANA (columnar), the same data looks like this:
ID Column: [1, 2, 3, ..., 1000000]
amount Column: [500.00, 1200.50, 300.00, ..., 750.25]
status Column: ['APPROVED', 'APPROVED', 'REJECTED', ..., 'APPROVED']
created_date Column: ['2026-01-01', '2026-01-01', '2026-01-02', ..., '2026-04-08']
salesperson_id Column: [42, 43, 42, ..., 55]
tier Column: [3, 2, 3, ..., 1]
Every value for one column is stored contiguously. When you aggregate commission amounts, you read only the amount column. You don't touch the other columns at all.
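The two layouts can be sketched in a few lines of Python. This is purely illustrative (real engines store raw bytes in blocks, not Python lists), but it shows why an aggregation never has to touch the other columns:

```python
# Illustrative sketch: the same three transactions in row-oriented
# and column-oriented layouts (real engines store bytes, not lists).
rows = [
    {"id": 1, "amount": 500.00,  "status": "APPROVED", "salesperson_id": 42},
    {"id": 2, "amount": 1200.50, "status": "APPROVED", "salesperson_id": 43},
    {"id": 3, "amount": 300.00,  "status": "REJECTED", "salesperson_id": 42},
]

# Column layout: one contiguous list per column.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# Row store: fetching one row yields every column of that row together.
print(rows[0])

# Column store: summing amounts reads only the amount list.
print(sum(columns["amount"]))   # 2000.5, other columns never touched
```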
Why This Matters: The Performance Implications
Aggregation Queries (The HANA Win)
In Callidus Commissions, you might run a query like this:
SELECT salesperson_id, tier,
       SUM(commission_amount) AS total_commission,
       COUNT(*) AS transaction_count
FROM cs_transactions
WHERE created_date BETWEEN '2026-01-01' AND '2026-04-08'
GROUP BY salesperson_id, tier;
In Oracle (row-based), even though the query needs only 4 of the 6 columns, the engine must load entire rows. Barring a suitable index on created_date, it reads the whole table from disk into the buffer cache, filters by date, aggregates, and returns results. That is fast for a small table, but on millions of rows you waste I/O reading columns you never use.
In SAP HANA (columnar), the query engine reads only the 4 columns it needs: salesperson_id, tier, commission_amount, and created_date. All 1 million dates are stored contiguously, so they're loaded with minimal I/O. The amount values are compressed (HANA applies dictionary encoding per column), so the data fits in fewer bytes. The aggregation is done entirely in-memory. Result: 10-100x faster depending on the query.
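A hedged Python sketch of how a columnar engine can evaluate this GROUP BY while touching only the four needed columns (the four lists stand in for column stores; the other two columns simply never appear):

```python
# Sketch: columnar GROUP BY salesperson_id, tier with SUM and COUNT.
# In a real column store, the columns not listed here would never be
# read from memory or disk at all.
from collections import defaultdict

salesperson_id = [42, 43, 42, 55]
tier           = [3, 2, 3, 1]
amount         = [500.00, 1200.50, 300.00, 750.25]
created_date   = ["2026-01-01", "2026-01-01", "2026-01-02", "2026-04-08"]

totals = defaultdict(lambda: [0.0, 0])     # (salesperson, tier) -> [sum, count]
for i, d in enumerate(created_date):       # scan the date column once
    if "2026-01-01" <= d <= "2026-04-08":  # WHERE created_date BETWEEN ...
        key = (salesperson_id[i], tier[i])
        totals[key][0] += amount[i]
        totals[key][1] += 1

print(dict(totals))
# (42, 3) -> [800.0, 2], (43, 2) -> [1200.5, 1], (55, 1) -> [750.25, 1]
```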
Point Lookups (Roughly the Same)
If you query a single transaction by ID:
SELECT * FROM cs_transactions WHERE id = 12345;
In Oracle, one B-tree index lookup finds the row. One disk block is read. Done.
In HANA, the query engine locates the value 12345 in the ID column, then uses that row position to gather the matching values from the other columns. Logically that's a bit more work, but HANA's in-memory architecture makes it nearly as fast, and no separate index on ID is needed: the dictionary-encoded column (plus the inverted index that backs a primary key) plays that role.
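The positional lookup can be sketched like this (a simplification: HANA actually searches the dictionary-encoded value vector, and a primary key adds an inverted index, but the position-based reassembly is the core idea):

```python
# Sketch: point lookup in a column store. Find the position of the ID,
# then use that same position in every other column to rebuild the row.
id_col     = [12343, 12344, 12345, 12346]
amount_col = [90.0, 15.5, 640.0, 72.25]
status_col = ["APPROVED", "PENDING", "APPROVED", "REJECTED"]

pos = id_col.index(12345)          # scan (or inverted index) on the ID column
row = {"id": id_col[pos],
       "amount": amount_col[pos],
       "status": status_col[pos]}
print(row)   # {'id': 12345, 'amount': 640.0, 'status': 'APPROVED'}
```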
Oracle Database Architecture: Deep Dive
Storage Model
- Heap tables: Rows are stored in any order on disk (no inherent structure).
- Buffer cache: Recently used blocks are cached in RAM. If you query the same data twice, the second query hits the cache.
- B-tree indexes: Balanced tree structure that maps column values to row addresses. Used for point lookups and range scans.
- Block-based I/O: The minimum unit of I/O is a disk block (typically 8KB). You can't read a single column value without reading its entire block, which carries the row's other columns and neighboring rows along with it.
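As a rough stand-in for a B-tree, a sorted list of (key, block address) pairs searched with binary search shows the shape of an Oracle index lookup. The block addresses here are invented for illustration:

```python
# Sketch: an index maps column values to row addresses; a point lookup
# is a logarithmic search followed by a single block read.
import bisect

# Sorted (id, block_address) pairs, standing in for B-tree leaf entries.
index = [(1, "blk-07"), (12345, "blk-98"), (500000, "blk-12"), (1000000, "blk-55")]
keys = [k for k, _ in index]

i = bisect.bisect_left(keys, 12345)   # O(log n), like descending the tree
assert keys[i] == 12345               # key found in the "leaf"
block = index[i][1]                   # one block read fetches the whole row
print(block)                          # blk-98
```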
What This Means Operationally
- Index tuning is critical: A missing index on a WHERE clause column means a full table scan, which is slow.
- Multi-column indexes matter: An index on (salesperson_id, tier) often outperforms separate single-column indexes, depending on the columns' cardinality and the query's selectivity.
- Index maintenance has a cost: Every INSERT, UPDATE, or DELETE touches the index. For write-heavy workloads, indexes slow you down.
- DBA work is indexing work: Performance tuning is largely about creating and dropping the right indexes. Query plans are analyzed; indexes are redesigned.
SAP HANA Database Architecture: Deep Dive
Storage Model
- Columnar storage: All values of a column are stored contiguously in a compressed format. Different columns can use different compression algorithms (dictionary encoding, run-length encoding, etc.).
- In-memory primary: Data resides in RAM by default; disk is used for persistence (periodic savepoints plus a redo log written at commit). Queries never wait on disk I/O, which is why they are so much faster.
- Dictionary compression: For a column with values like 'APPROVED', 'REJECTED', 'PENDING' (only 3 unique values across 1 million rows), HANA stores a dictionary [0='APPROVED', 1='REJECTED', 2='PENDING'] and the column itself as tiny integer codes [0, 0, 1, 0, 2, ...]. One million ~8-byte strings (~8MB) shrink to roughly 1MB of one-byte codes (about 250KB if the codes are bit-packed) plus the 3-entry dictionary.
- No traditional indexes: The columnar structure itself provides the indexing benefit. Lookups in a column don't require a separate B-tree.
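The dictionary encoding described above can be sketched directly; the size arithmetic is approximate, and the bit-packing is only computed, not actually performed:

```python
# Sketch: dictionary-encode a low-cardinality status column.
statuses = ["APPROVED", "APPROVED", "REJECTED", "APPROVED", "PENDING"] * 200_000

dictionary = sorted(set(statuses))             # ['APPROVED', 'PENDING', 'REJECTED']
code = {v: i for i, v in enumerate(dictionary)}
encoded = [code[s] for s in statuses]          # tiny integer codes, not strings

# ~8MB of raw strings vs ~250KB of 2-bit codes (3 values fit in 2 bits).
raw_bytes    = sum(len(s) for s in statuses)
packed_bytes = len(encoded) * 2 // 8
print(raw_bytes, packed_bytes)
```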
What This Means Operationally
- Little index tuning: You rarely create indexes to make queries fast; the columnar structure does most of that work for you.
- Memory is the constraint: You need enough RAM to hold the entire dataset. For Callidus Commissions (typically 100GB-500GB of transaction history), that means substantial hardware.
- Compression is free: Column-level compression happens automatically. Your "500GB of uncompressed data" might only take 50GB in HANA.
- DBA work is different: Instead of index strategies, you think about column store vs. row store (when to use each), memory allocation, and compression ratios. Partitioning strategies focus on time-based partitions for analytics.
- Write performance trade-off: Compressed columnar storage is slower for massive write workloads (millions of inserts per second); HANA softens this by buffering inserts in a row-oriented delta store and merging it into the main column store periodically. For ICM (typical load: 1,000-10,000 transactions per second) writes are not a bottleneck, and reads are so fast that the application feels faster overall.
Comparison: Oracle vs HANA at a Glance
| Dimension | Oracle Database | SAP HANA Database |
|---|---|---|
| Storage model | Row-based (heap tables) | Column-based (columnar) |
| Primary location | Disk (buffer cache in RAM) | In-memory (disk for persistence) |
| Compression | Optional per table (minimal) | Automatic per column (aggressive) |
| Typical compression ratio | 2:1 to 5:1 (when enabled) | 5:1 to 10:1 (varies by data) |
| Indexing | B-tree indexes required | No traditional indexes needed |
| Best for | OLTP (point queries, small writes) | OLAP (aggregations, scans) |
| Query for 1 row by ID | ~10ms (index lookup) | ~10ms (columnar scan) |
| SUM 1M commission amounts | ~5000ms (full table scan) | ~50ms (columnar, in-memory) |
| Hardware requirement | Standard disk + modest RAM | Huge RAM (data must fit in memory) |
What It Means for Callidus → SAP Migration
The Good News
- Report queries run 100x faster: Commission roll-ups, tier summaries, and payout reports that took 5 minutes now take 3 seconds.
- Real-time dashboards are feasible: Sales dashboards and commission tracking can refresh every 5 minutes instead of overnight batch jobs.
- Ad-hoc analysis becomes practical: Business users can run complex ad-hoc queries without waiting for a scheduled report.
The Bad News
- Index tuning doesn't translate: Your Oracle performance tuning expertise (B-tree index strategies, index cardinality, multi-column indexes) doesn't apply. You need to learn HANA's memory management and columnar compression instead.
- Memory is expensive: A 200GB Oracle database ran fine on a 64GB server because only the hot data had to sit in RAM. In HANA, the entire (compressed) dataset must fit in memory, and SAP's sizing guidance calls for roughly twice the compressed data size in RAM. Expect a substantial hardware cost increase.
- Procedural code needs rewriting: Cursor loops (iterating through rows one at a time) are slow in HANA. Set-based operations (bulk inserts and updates) are much faster. Rewriting PL/SQL packages in SQLScript requires a mindset shift.
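The mindset shift can be sketched in Python, where a per-row loop stands in for a cursor fetch and a single declarative pass stands in for a set-based statement (the data is invented for illustration):

```python
# Sketch: row-at-a-time vs set-based thinking.
transactions = [{"amount": 500.0,  "status": "PENDING"},
                {"amount": 1200.5, "status": "PENDING"},
                {"amount": 300.0,  "status": "REJECTED"}]

# Cursor-loop style: one "fetch" and one "update" per row.
for t in transactions:
    if t["status"] == "PENDING":
        t["status"] = "APPROVED"

# Set-based style: one declarative expression over the whole set,
# the shape that columnar engines execute efficiently.
approved_total = sum(t["amount"] for t in transactions
                     if t["status"] == "APPROVED")
print(approved_total)   # 1700.5
```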
The Callidus → SAP Acquisition Story
Callidus Software was founded in 1997 and built on Oracle Database. By 2015-2017 the market was moving toward in-memory analytics, and SAP was pushing SAP HANA hard as the future. Callidus was being left behind: its Oracle backend couldn't deliver real-time commission dashboards or handle the scale of modern enterprises.
SAP acquired Callidus in 2018 for $2.4B, primarily to consolidate the ICM market. The strategic play: migrate Callidus customers to SAP SuccessFactors (which runs on HANA) and retire the Callidus platform, giving SAP a dominant position in compensation management.
For customers, the migration is mandatory (Callidus end-of-life is 2027). For your team, that means learning HANA's architecture, SQL syntax, and procedural patterns — which is exactly what this learning path covers.
Your Toolkit Going Forward
As you move from Oracle to HANA, your toolkit changes:
| Skill | Oracle Focus | HANA Focus |
|---|---|---|
| Query optimization | Index design, execution plans, CBO (Cost-Based Optimizer) | Column selection, compression ratios, memory footprint |
| Procedural code | PL/SQL packages, cursor loops, bulk collect, exception handling | SQL Script procedures, table variables, set-based operations, exit handlers |
| Reporting | Complex SQL queries with many joins and subqueries | Aggregation-heavy queries (GROUP BY, windowing functions) with fewer joins |
| Batch processing | Nightly ETL batches using PL/SQL packages | HANA data provisioning or scheduled procedures |
| DBA work | Index management, tablespace allocation, archiving | Memory management, table placement (row store vs. column store), backup strategy |
One More Thing: Row Store vs Column Store in HANA
HANA supports both row and column storage. By default, tables are column-stored (fast analytics). But some transactional tables are row-stored to optimize insert/update performance. When you migrate from Oracle, you'll work with both:
- Column store (default): Commission transactions, payout history, audit logs. Any table you aggregate over.
- Row store: Active work tables, staging tables, temporary configurations. Tables with high insert/update rates.
The SQL syntax is identical for both, but the performance characteristics differ. This is covered in depth in lesson 04 (Procedural Code), but it's good to know that HANA gives you the choice.
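Why high-insert tables favor the row store can also be sketched: appending to a row layout is one operation on one structure, while a columnar layout must touch every column list (and, in a real engine, its compression structures too):

```python
# Sketch: cost shape of a single insert in the two layouts.
row_store = []
column_store = {"id": [], "amount": [], "status": []}

new_row = {"id": 7, "amount": 99.0, "status": "PENDING"}

# Row store: one append keeps the whole record together.
row_store.append(new_row)

# Column store: one append per column (real engines soften this with
# a row-oriented delta buffer merged into the columns later).
for key, value in new_row.items():
    column_store[key].append(value)

print(len(row_store), [len(v) for v in column_store.values()])  # 1 [1, 1, 1]
```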