In-Memory Computing Planet® Blogs and Events

IMC Planet presents in-memory computing blogs and events from around the world. Read the latest in-memory computing news here. Submit your RSS feed or post your upcoming events and help the in-memory computing community stay up to date on the latest developments.

Dec
19
2011
Posted by In-Memory Computing Blog
Analytics on historical data
All enterprise data has a lifespan: depending on the application, a datum may be expected to change in the future frequently, rarely, or never. In financial accounting, for example, all data that is not from the current year, plus all open items from the previous year, can be considered 'historical data', since it may no longer be changed.

In HANA, historical data is instantly available for analytical processing from solid-state disk (SSD) drives. Only active data is required to reside in-memory permanently…
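As an illustration only (not HANA's actual interface), here is a minimal Python sketch of analytics spanning both storage tiers; the store and column names are made up for the example:

```python
# Minimal sketch of tiered reads: active rows live in main memory, historical
# rows on SSD-backed storage. Names and structures are illustrative only.
in_memory_store = [
    {"year": 2011, "account": "4711", "amount": 120.0},
]
ssd_store = [                      # stands in for an SSD-resident partition
    {"year": 2009, "account": "4711", "amount": 80.0},
]

def total_per_account(rows):
    totals = {}
    for r in rows:
        totals[r["account"]] = totals.get(r["account"], 0.0) + r["amount"]
    return totals

# Analytics can span both tiers transparently; only the active tier must
# reside in memory permanently.
print(total_per_account(in_memory_store + ssd_store))  # {'4711': 200.0}
```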
 
Nov
25
2011
Posted by In-Memory Computing Blog
By default, all data is stored in-memory to achieve high-speed data access. However, not all data is read or updated frequently enough to need to reside in-memory, and keeping it there increases the required amount of main memory unnecessarily. This so-called historical or passive data can be stored in a dedicated passive data store based on less expensive storage media, such as SSDs or hard disks, which still provide sufficient performance for occasional accesses at lower cost. The dynamic transition from active to passive data is supported by the database, based on custom rules defined as per…
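A minimal sketch of such a rule-based transition, assuming each record carries a last-modified timestamp; the rule, threshold, and store names are illustrative only:

```python
import time

# Illustrative aging rule: records untouched for a year become passive.
PASSIVE_AFTER_DAYS = 365

def age_records(active_store, passive_store, now=None):
    """Move records that match the custom rule from the in-memory (active)
    store to the cheaper passive store (e.g. SSD-backed)."""
    now = now or time.time()
    cutoff = now - PASSIVE_AFTER_DAYS * 86400
    still_active = []
    for record in active_store:
        if record["last_modified"] < cutoff:
            passive_store.append(record)   # demote to passive storage
        else:
            still_active.append(record)
    active_store[:] = still_active
```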
 
Nov
22
2011
Posted by In-Memory Computing Blog
Insert-only or append-only describes how data is managed when inserting new data. The principal idea of insert-only is that changes to existing data are handled by appending new tuples to the data store. In other words, the database does not allow applications to perform updates or deletions on physically stored tuples of data.

This design approach allows the introduction of a specific write-optimized data store for fast writing and a read-optimized data store for fast reading. Traditional database systems support four operations for data manipulation, i.e. inserting new data, selecting data,…
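To make the principle concrete, here is a minimal Python sketch of an insert-only store in which updates and deletes only ever append new tuples; the field names and versioning scheme are illustrative, not the database's actual design:

```python
# Minimal sketch of an insert-only store: updates and deletes append new
# tuples instead of modifying rows in place.
class InsertOnlyStore:
    def __init__(self):
        self._rows = []          # physical storage is append-only
        self._next_version = 0

    def insert(self, key, value):
        self._rows.append({"key": key, "value": value,
                           "version": self._next_version, "deleted": False})
        self._next_version += 1

    def update(self, key, value):
        # An "update" is just another insert with a newer version.
        self.insert(key, value)

    def delete(self, key):
        # A "delete" appends a tombstone; nothing is physically removed.
        self._rows.append({"key": key, "value": None,
                           "version": self._next_version, "deleted": True})
        self._next_version += 1

    def read(self, key):
        # The visible value is the latest version that is not a tombstone.
        latest = None
        for row in self._rows:
            if row["key"] == key:
                latest = row
        return None if latest is None or latest["deleted"] else latest["value"]
```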
 
Nov
19
2011
Posted by In-Memory Computing Blog
High-performance in-memory computing will change how enterprises work. Currently, enterprise data is split into two databases for performance reasons. Usually, disk-based row-oriented database systems are used for operational data, and column-oriented databases are used for analytics (e.g. “sum of all sales in China grouped by product”). While analytical databases are often kept in-memory, they are also combined with disk-based storage media.

Transactional data and analytical data are not stored in the same database: analytical data resides in separate data warehouses, to which it is replicated in batch jobs. In consequence, flexible real-time reporting is not possible…
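As a toy illustration of why column-oriented storage suits such analytical queries, the following Python sketch evaluates the example query from above over plain lists standing in for columns; the table and column names are made up:

```python
from collections import defaultdict

# Column-oriented layout for "sum of all sales in China grouped by product".
country = ["China", "Germany", "China", "China"]
product = ["A",     "A",       "B",     "A"]
amount  = [10.0,    5.0,       7.5,     2.5]

totals = defaultdict(float)
for i, c in enumerate(country):   # only the columns the query needs are scanned
    if c == "China":
        totals[product[i]] += amount[i]

print(dict(totals))  # {'A': 12.5, 'B': 7.5}
```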
 
Nov
16
2011
Posted by In-Memory Computing Blog
For a long time, the amount of main memory available on large server systems was not sufficient to hold the complete transactional data set of large enterprise applications.

Today, the situation has changed. Modern servers provide multiple terabytes of main memory and allow keeping the complete transactional data set in memory. This eliminates multiple I/O layers and simplifies database design, allowing for high throughput of transactional and analytical queries.

Please also see our podcast on this technology concept…
 
Nov
15
2011
Posted by In-Memory Computing Blog
MapReduce is a programming model for parallelizing work on large amounts of data. It took the data analysis world by storm because it dramatically reduces the development overhead of parallelizing such tasks. With MapReduce, the developer only needs to implement a map and a reduce function, while the execution engine transparently parallelizes the processing of these functions across the available resources.
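A minimal word-count sketch of the model, with a simple local driver standing in for the distributed execution engine:

```python
from collections import defaultdict

# The developer supplies only map and reduce; the driver below plays the
# role of the execution engine and would normally run in parallel.
def map_fn(document):
    for word in document.split():
        yield word, 1

def reduce_fn(word, counts):
    return word, sum(counts)

def run(documents):
    groups = defaultdict(list)
    for doc in documents:                  # map phase
        for key, value in map_fn(doc):
            groups[key].append(value)
    return dict(reduce_fn(k, v) for k, v in groups.items())  # reduce phase

print(run(["in memory computing", "in memory data management"]))
```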

 
Nov
11
2011
Posted by In-Memory Computing Blog
The In-Memory Database allows switching between transactional inserts and a bulk load mode. The latter can be used to insert large sets of data without the transactional overhead and thus enables significant speed-ups when setting up systems or restoring previously collected data. Furthermore, scenarios with extremely high load are supported by buffering events and bulk-inserting them in chunks. This comes at the cost of reducing real-time capabilities to near-real-time capabilities, with an offset determined by the defined buffering period, which is often acceptable for business scenarios.
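A minimal sketch of the buffering idea, where `bulk_insert` stands in for whatever bulk load interface the database exposes and the flush interval represents the buffering period:

```python
import time

# Buffer incoming events and hand them to the bulk loader in chunks,
# trading real-time visibility for throughput.
class BufferedLoader:
    def __init__(self, bulk_insert, flush_interval_s=5.0):
        self._bulk_insert = bulk_insert
        self._flush_interval_s = flush_interval_s
        self._buffer = []
        self._last_flush = time.time()

    def add(self, event):
        self._buffer.append(event)
        if time.time() - self._last_flush >= self._flush_interval_s:
            self.flush()

    def flush(self):
        if self._buffer:
            self._bulk_insert(self._buffer)  # one bulk load instead of many inserts
            self._buffer = []
        self._last_flush = time.time()

loader = BufferedLoader(bulk_insert=lambda rows: print(f"bulk insert of {len(rows)} rows"))
```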

 
Nov
09
2011
Posted by In-Memory Computing Blog
We distinguish between two partitioning approaches, vertical and horizontal partitioning; a combination of both approaches is also possible.

Vertical partitioning refers to rearranging individual database columns. It is achieved by splitting the columns of a database table into two or more column sets. Each column set can be distributed across individual database servers. Vertical partitioning can also be used to build up database columns with different orderings to achieve better search performance while guaranteeing high availability of the data. The key to successful vertical partitioning is a thorough understanding of the application’s data access patterns. Attributes that are…
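A small sketch of both approaches on a single illustrative sales row; the column names, the hot/cold split, and the hash-based row placement are assumptions made for the example:

```python
row = {"id": 1, "customer": "ACME", "country": "China", "amount": 10.0, "note": "..."}

# Vertical partitioning: split the columns into two sets that can live on
# different servers (frequently accessed attributes vs. rarely accessed ones).
hot_columns  = {k: row[k] for k in ("id", "country", "amount")}
cold_columns = {k: row[k] for k in ("id", "customer", "note")}

# Horizontal partitioning: assign whole rows to partitions, e.g. by hashing
# the primary key across a number of servers.
NUM_SERVERS = 4
partition = hash(row["id"]) % NUM_SERVERS
```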
 
Nov
08
2011
Posted by In-Memory Computing Blog
In application development, layers refer to levels of abstraction. Each application layer encapsulates specific logic and offers certain functionality. Although abstraction helps to reduce complexity, it also introduces obstacles. These result from various aspects, e.g. (a) functionality is hidden within a layer, and (b) each layer offers a variety of functionality while only a small subset is in use.

From the data’s perspective, layers are problematic since data is marshaled and unmarshaled to transform it into each layer-specific format. As a result, identical data is kept in…
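A tiny illustration of the effect, with three made-up layers each holding the same record in its own format:

```python
import json

# The same record, kept once per layer in a layer-specific format.
db_row = (1, "ACME", 10.0)                                # persistence layer
domain_object = {"id": db_row[0], "customer": db_row[1],  # business logic layer
                 "amount": db_row[2]}
wire_format = json.dumps(domain_object)                   # presentation/transport layer

# Three copies of identical data, plus the CPU cost of marshaling between them.
```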
 
Nov
03
2011
Posted by In-Memory Computing Blog
Parallel execution is key to achieving sub-second response times for queries processing large sets of data. The independence of tuples within columns enables easy partitioning and therefore supports parallel processing. We leverage this fact by partitioning database tasks on large data sets into as many jobs as there are threads available on a given node. In this way, maximal utilization of any supported hardware can be achieved.

Please also see our podcast on this technology concept.
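A minimal sketch of the idea, splitting a column into as many chunks as there are available workers and combining the partial aggregates; Python threads are used here purely for illustration, while a real engine runs on native threads:

```python
from concurrent.futures import ThreadPoolExecutor
import os

def parallel_sum(column):
    # Partition the task into as many jobs as there are available workers.
    workers = os.cpu_count() or 1
    chunk = (len(column) + workers - 1) // workers
    parts = [column[i:i + chunk] for i in range(0, len(column), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, parts))   # combine the partial aggregates

print(parallel_sum(list(range(1_000_000))))  # 499999500000
```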
 
Oct
31
2011
Posted by In-Memory Computing Blog
To achieve the highest level of operational efficiency, the data of multiple customers can be consolidated onto a single HANA server. Such consolidation is key when HANA is provisioned in an on-demand setting, a service SAP plans to provide in the future. Multi-tenancy makes HANA accessible to smaller customers at lower cost as a benefit of this consolidation.
Already today…
 
Oct
27
2011
Posted by In-Memory Computing Blog
Compression describes the process of reducing the amount of storage needed to represent a given set of information. Typically, a compression algorithm exploits redundancy in the available information to reduce the required amount of storage. The biggest difference between compression algorithms is the amount of time required to compress and decompress a certain piece, or all, of the information. More complex compression algorithms sort and analyze the input data to achieve the highest compression ratio. For in-memory databases, compression is applied to reduce the amount of data that is transferred along the memory channels…
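As an illustration, here is a sketch of two light-weight schemes commonly used in column stores, dictionary encoding and run-length encoding; this is a generic example, not a description of any particular product's implementation:

```python
def dictionary_encode(column):
    # Store each distinct value once and replace values by small integer codes.
    dictionary = sorted(set(column))
    index = {value: i for i, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in column]

def run_length_encode(values):
    # Collapse runs of identical values into (value, run length) pairs.
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs

column = ["China", "China", "China", "Germany", "Germany", "China"]
dictionary, codes = dictionary_encode(column)   # strings stored once
print(dictionary, run_length_encode(codes))     # ['China', 'Germany'] [[0, 3], [1, 2], [0, 1]]
```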
 
Oct
24
2011
Posted by In-Memory Computing Blog
In contrast to hardware development until the early 2000s, today's processing power no longer scales in terms of processing speed, but in the degree of parallelism. Modern system architectures provide server boards with up to eight separate CPUs, where each CPU has up to twelve separate cores. This tremendous amount of processing power should be exploited as much as possible to achieve the highest possible throughput for transactional and analytical applications. For modern enterprise applications, it becomes imperative to reduce the amount of sequential work and develop the application in a way…
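One way to see why sequential work becomes the bottleneck is Amdahl's law, sketched here for the 8 × 12 = 96 cores mentioned above; the parallel fractions are made-up examples:

```python
# Amdahl's law: speedup = 1 / ((1 - p) + p / n) for parallel fraction p on n cores.
def amdahl_speedup(parallel_fraction, cores):
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)

cores = 8 * 12
for p in (0.50, 0.90, 0.99):
    print(f"parallel fraction {p:.0%}: speedup {amdahl_speedup(p, cores):.1f}x")
# 50% -> ~2.0x, 90% -> ~9.1x, 99% -> ~49.2x on 96 cores
```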
 
Feb
09
2011
Posted by In-Memory Computing Blog
Cloud services in combination with high-performance in-memory computing will change how enterprises work. Currently, most data is stored in silos of slow, disk-based, row-oriented database systems. Moreover, transactional data is not stored in the same database as analytical data, but in separate data warehouses, to which it is replicated in batch jobs. Consequently, instant real-time analytics are not possible, and company leaders often have to make decisions in a very short time frame based on insufficient information.

This is about to change. In the last decade, hardware architectures have evolved dramatically. Multi-core architectures and the availability of large amounts of main memory at low cost are about to enable new breakthroughs in the software industry. It has become possible to store the data sets of whole companies entirely in main memory, which offers performance that is orders of magnitude faster than disk. Traditional disks are one of the remaining mechanical…
 
Feb
07
2011
Posted by In-Memory Computing Blog
In-memory data management technology in combination with highly parallel processing has a tremendous impact on business applications, for example by making all enterprise data instantly available for analytical needs. Starting in 2006 and guided by Hasso Plattner, we, a team of researchers under the supervision of Alexander Zeier at the Hasso Plattner Institute, analyzed and evaluated how business applications are developed and used.

I am [...] very excited about the potential that the in-memory database technology offers to my business.