Apache Kudu 1.0.0 has been released, adding a number of new features. Some of the changes are listed below:
Removal of multiversion concurrency control (MVCC) history is now supported. This allows Kudu to reclaim disk space, where previously Kudu would keep a full history of all changes made to a given table since the beginning of time.
Most of Kudu’s command line tools have been consolidated under a new top-level "kudu" tool. This reduces the number of large binaries distributed with Kudu and also includes much-improved help output.
Administrative tools including "kudu cluster ksck" now support running against multi-master Kudu clusters.
The C++ client API now supports writing data in AUTO_FLUSH_BACKGROUND mode. This can provide higher throughput for ingest workloads.
The release also includes further bug fixes, optimizations, and improvements; the full list of changes is in the release notes.
Download:
About Apache Kudu
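There were two ways to respond to the trends identified earlier: keep updating the existing Hadoop tools, or design and build a new component. The goals were:
Strong performance for both data scans and random access, simplifying the complex hybrid architectures that users otherwise have to maintain;
High CPU efficiency, to get the most out of modern processors;
High I/O performance, to take full advantage of modern persistent storage media;
Support for updating data in place, avoiding extra data processing and data movement.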
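To meet these goals, we first built prototypes on top of existing open source projects, but eventually concluded that major changes at the architectural level were required, and that those changes were significant enough to justify developing an entirely new data storage system. Development began three years ago, and we can now finally share the result of those years of effort: Kudu, a new data storage system.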