Although at a conceptual level tables may be viewed as a sparse set of rows, they are physically stored by column family. A new column qualifier (column_family:column_qualifier) can be added to an existing column family at any time.
Table 5.2. ColumnFamily anchor
| Row Key | Time Stamp | Column Family anchor |
|---|---|---|
| "com.cnn.www" | t9 | anchor:cnnsi.com = "CNN" |
| "com.cnn.www" | t8 | anchor:my.look.ca = "CNN.com" |
Table 5.3. ColumnFamily contents
| Row Key | Time Stamp | Column Family contents |
|---|---|---|
| "com.cnn.www" | t6 | contents:html = "<html>..." |
| "com.cnn.www" | t5 | contents:html = "<html>..." |
| "com.cnn.www" | t3 | contents:html = "<html>..." |
The empty cells shown in the conceptual view are not stored at all. Thus a request for the value of the contents:html column at timestamp t8 would return no value. Similarly, a request for an anchor:my.look.ca value at timestamp t9 would return no value. However, if no timestamp is supplied, the most recent value for a particular column is returned. Given multiple versions, the most recent is also the first one found, since timestamps are stored in descending order. Thus a request for the values of all columns in the row com.cnn.www with no timestamp specified would return the value of contents:html from timestamp t6, the value of anchor:cnnsi.com from timestamp t9, and the value of anchor:my.look.ca from timestamp t8.
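The lookup rules above can be illustrated with a small in-memory model. This is only a sketch of the behavior described in this section, not Apache HBase code and not the HBase client API: the class `PhysicalViewSketch` and its methods are hypothetical names, and the timestamps t3 through t9 are assumed to be the plain numbers 3 through 9. Cells are grouped by column family, absent cells are simply not present, and each column keeps its versions sorted by descending timestamp.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical in-memory model of the physical view; not part of HBase.
public class PhysicalViewSketch {
    // column family -> row key -> column -> (timestamp -> value, newest first)
    private final Map<String, Map<String, Map<String, NavigableMap<Long, String>>>> stores =
            new TreeMap<>();

    // The family is the part of "family:qualifier" before the colon.
    private static String familyOf(String column) {
        return column.substring(0, column.indexOf(':'));
    }

    // Stores one cell version. New qualifiers are created on first use.
    public void put(String row, String column, long timestamp, String value) {
        stores.computeIfAbsent(familyOf(column), f -> new TreeMap<>())
              .computeIfAbsent(row, r -> new TreeMap<>())
              .computeIfAbsent(column, c -> new TreeMap<>(Comparator.<Long>reverseOrder()))
              .put(timestamp, value);
    }

    // Returns the stored versions of a column, or an empty map if none exist.
    private NavigableMap<Long, String> versions(String row, String column) {
        return stores.getOrDefault(familyOf(column), Map.of())
                     .getOrDefault(row, Map.of())
                     .getOrDefault(column, new TreeMap<>());
    }

    // Exact-timestamp request: empty cells are not stored, so nothing is returned.
    public Optional<String> getAt(String row, String column, long timestamp) {
        return Optional.ofNullable(versions(row, column).get(timestamp));
    }

    // Request without a timestamp: the first entry is the most recent version,
    // because versions are kept in descending timestamp order.
    public Optional<Map.Entry<Long, String>> getLatest(String row, String column) {
        return Optional.ofNullable(versions(row, column).firstEntry());
    }

    // Loads the cells from Tables 5.2 and 5.3 (t3..t9 assumed to be 3..9).
    public static PhysicalViewSketch example() {
        PhysicalViewSketch table = new PhysicalViewSketch();
        String row = "com.cnn.www";
        table.put(row, "anchor:cnnsi.com", 9L, "CNN");
        table.put(row, "anchor:my.look.ca", 8L, "CNN.com");
        table.put(row, "contents:html", 6L, "<html>...");
        table.put(row, "contents:html", 5L, "<html>...");
        table.put(row, "contents:html", 3L, "<html>...");
        return table;
    }

    public static void main(String[] args) {
        PhysicalViewSketch table = example();
        String row = "com.cnn.www";
        System.out.println(table.getAt(row, "contents:html", 8L));
        System.out.println(table.getAt(row, "anchor:my.look.ca", 9L));
        for (String column : new String[] {"contents:html", "anchor:cnnsi.com", "anchor:my.look.ca"}) {
            System.out.println(column + " -> " + table.getLatest(row, column));
        }
    }
}
```

In this model the two exact-timestamp requests come back empty, while the requests without a timestamp resolve to the versions at t6, t9, and t8 respectively, matching the description above.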
For more information about the internals of how Apache HBase stores data, see Section 9.7, “Regions”.