Apache Kudu 0.10.0 Released: A Storage System for Hadoop

Apache Kudu 0.10.0 has been released.

About Apache Kudu

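There were two ways to respond to the trends identified earlier: keep updating the existing Hadoop tools, or design and build a new component. The goals were:

High performance for both data scans and random access, simplifying users' complex hybrid architectures;
High CPU efficiency, to make the most of modern processors;
High I/O performance, to take full advantage of modern persistent storage media;
Support for updating data in place, avoiding extra data processing and data movement.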

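To reach these goals, we first built prototypes on top of existing open source projects, but eventually concluded that major changes were needed at the architectural level, changes large enough to justify developing an entirely new data storage system. Development began three years ago, and we can now finally share the result of those years of effort: Kudu, a new data storage system.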

The changes in this release are as follows:

Incompatible changes and deprecated APIs in 0.10.0

Gerrit #3737 The Java client has been repackaged under org.apache.kudu instead of org.kududb. Import statements for Kudu classes must be modified in order to compile against 0.10.0. Wire compatibility is maintained.
Gerrit #3055 The Java client’s synchronous API methods now throw KuduException instead of Exception. Existing code that catches Exception should still compile, but introspection of an exception’s message may be impacted. This change was made to allow thrown exceptions to be queried more easily using KuduException.getStatus and calling one of Status’s methods. For example, an operation that tries to delete a table that doesn’t exist would return a Status that returns true when queried on isNotFound().
The Java client’s KuduTable.getTabletsLocations set of methods is now deprecated. Additionally, they now take an exclusive end partition key instead of an inclusive key. Applications are encouraged to use the scan tokens API instead of these methods in the future.
The C++ API for specifying split points on range-partitioned tables has been improved to make it easier for callers to properly manage the ownership of the provided rows.
The TableCreator::split_rows API took a vector<const KuduPartialRow*>, which made it very difficult for the calling application to do proper error handling with cleanup when setting the fields of the KuduPartialRow. This API has now been deprecated and replaced by a new method, TableCreator::add_range_split, which allows easier use of smart pointers for safe memory management.
The Java client’s internal buffering has been reworked. Previously, the number of buffered write operations was constrained on a per-tablet-server basis. Now, the configured maximum buffer size constrains the total number of buffered operations across all tablet servers in the cluster. This provides a more consistent bound on the memory usage of the client regardless of the size of the cluster to which it is writing.
This change can negatively affect the write performance of Java clients which rely on buffered writes. Consider using the setMutationBufferSpace API to increase a session’s maximum buffer size if write performance seems to be degraded after upgrading to Kudu 0.10.0.
The "remote bootstrap" process used to copy a tablet replica from one host to another has been renamed to "Tablet Copy". This resulted in the renaming of several RPC metrics. Any users previously explicitly fetching or monitoring metrics related to Remote Bootstrap should update their scripts to reflect the new names.
The SparkSQL datasource for Kudu no longer supports mode Overwrite. Users should use the new KuduContext.upsertRows method instead. Additionally, inserts using the datasource are now upserts by default. The older behavior can be restored by setting the operation parameter to insert.
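
The Java client buffering change described above is easier to judge with numbers. The following is a minimal Python model, not Kudu client code: the function names and the limit of 1000 operations are hypothetical, and it only compares the upper bound on buffered operations under the old per-tablet-server rule and the new whole-session rule.

```python
# Illustrative model only; none of these names belong to the Kudu client API.

def max_buffered_ops_before(limit_per_server: int, tablet_servers: int) -> int:
    """Pre-0.10.0 model: the limit applied to each tablet server separately,
    so the worst case grows with the size of the cluster."""
    return limit_per_server * tablet_servers


def max_buffered_ops_after(session_limit: int) -> int:
    """0.10.0 model: one limit bounds the operations buffered for all
    tablet servers together, regardless of cluster size."""
    return session_limit


if __name__ == "__main__":
    limit = 1000  # hypothetical configured buffer size, in operations
    for servers in (1, 10, 100):
        before = max_buffered_ops_before(limit, servers)
        after = max_buffered_ops_after(limit)
        print(f"{servers} tablet servers: before <= {before}, after <= {after}")
```

Under this model, a client writing to many tablet servers can buffer far fewer operations after the upgrade unless the session's buffer size is raised, which is consistent with the advice above to increase it if write performance degrades.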

New features

Users may now manually manage the partitioning of a range-partitioned table. When a table is created, the user may specify a set of range partitions that do not cover the entire available key space. A user may add or drop range partitions to existing tables.
This feature can be particularly helpful with time series workloads in which new partitions can be created on an hourly or daily basis. Old partitions may be efficiently dropped if the application does not need to retain historical data past a certain point.
This feature is considered experimental for the 0.10 release. More details of the new feature can be found in the accompanying blog post.
Support for running Kudu clusters with multiple masters has been stabilized. Users may start a cluster with three or five masters to provide fault tolerance despite a failure of one or two masters, respectively.
Note that certain tools (e.g. ksck) are still lacking complete support for multiple masters. These deficiencies will be addressed in a following release.
Kudu now supports the ability to reserve a certain amount of free disk space in each of its configured data directories. If a directory’s free disk space drops to less than the configured minimum, Kudu will stop writing to that directory until space becomes available. If no space is available in any configured directory, Kudu will abort.
This feature may be configured using the fs_data_dirs_reserved_bytes and fs_wal_dir_reserved_bytes flags.
The Spark integration’s KuduContext now supports four new methods for writing to Kudu tables: insertRows, upsertRows, updateRows, and deleteRows. These are now the preferred way to write to Kudu tables from Spark.
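
The time series use of manually managed range partitions mentioned above can be sketched as a rolling window. This is a toy Python model rather than Kudu client code: the class, its method names, the daily granularity, and the half-open [lower, upper) bounds are all assumptions made for illustration.

```python
from datetime import date, timedelta


class RangePartitionedTable:
    """Toy model of a table whose range partitions need not cover the
    whole key space. Hypothetical; not a Kudu API."""

    def __init__(self):
        self.partitions = []  # sorted list of (lower, upper) date pairs

    def add_range_partition(self, lower, upper):
        if lower >= upper:
            raise ValueError("lower bound must be before upper bound")
        for lo, hi in self.partitions:
            if lower < hi and lo < upper:
                raise ValueError("range overlaps an existing partition")
        self.partitions.append((lower, upper))
        self.partitions.sort()

    def drop_range_partition(self, lower, upper):
        self.partitions.remove((lower, upper))

    def covers(self, key):
        # A key is writable only if some partition contains it.
        return any(lo <= key < hi for lo, hi in self.partitions)


def roll_daily(table, today, retention_days):
    """Add today's partition if missing, then drop every partition that
    ends at or before the retention cutoff."""
    if not table.covers(today):
        table.add_range_partition(today, today + timedelta(days=1))
    cutoff = today - timedelta(days=retention_days)
    for lo, hi in list(table.partitions):
        if hi <= cutoff:
            table.drop_range_partition(lo, hi)
```

Calling roll_daily once per day keeps a fixed number of recent partitions. Keys outside every partition are simply not covered, which is the situation the release notes describe when the partitions do not span the entire key space.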

Full release notes: http://kudu.apache.org/releases/0.10.0/docs/release_notes.html

Download:

Kudu 0.10.0 source tarball (SHA1, MD5, Signature)
