Bigtable: Difference between revisions


Revision as of 04:47, 23 January 2012 (Monday)

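BigTable is a compressed, high-performance, highly scalable data storage system built on the Google File System (GFS). It is used to store large-scale structured data and is well suited to cloud computing.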

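BigTable has been in development since 2004^[1] and is now used by many Google applications. MapReduce jobs, for example, often read and write their data through BigTable,^[2] and other users include Google Reader^[3], Google Maps^[4], Google Book Search, "My Search History", Google Earth, Blogger.com, Google Code hosting, Orkut^[4], YouTube^[5], and Gmail^[6]. Google built its own large-scale database chiefly for performance, along with the ability to scale by adding machines and the flexibility of designing its own data model^[7].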

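BigTable is not a traditional relational database and does not support SQL constructs such as JOIN. It is closer to today's table-oriented NoSQL stores, and its strengths are scalability and performance. Each value in a BigTable table is indexed by a row key, a column key, and a timestamp. When web pages are stored, the row key is the URL with its hostname components reversed, so www.google.com becomes com.google.www. BigTable holds a large number of tables, and each table is split into tablets. Each tablet is roughly 100-200 MB, and each machine serves about 100 tablets. Table data is stored in SSTables, which are immutable: once written, a stored file is not modified. Tables are also compressed, with compression applied either at the table level or at the system level. To locate data, the client holds a pointer to the META0 tablet, and the META0 tablet stores the location records of all META1 tablets.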
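The data model described above can be illustrated with a small in-memory sketch in Python. This is only a toy under stated assumptions, not BigTable's actual API: the names `reverse_hostname`, `TinyTable`, `put`, `get_latest`, and `rows_with_prefix`, as well as the column name `contents:`, are hypothetical and chosen for illustration. It shows how a reversed hostname serves as a row key and how each value is addressed by (row key, column key, timestamp).

```python
from typing import Dict, List, Optional, Tuple


def reverse_hostname(host: str) -> str:
    """Reverse the dot-separated parts of a hostname for use as a row key."""
    return ".".join(reversed(host.split(".")))


class TinyTable:
    """Toy stand-in for a table: maps (row, column, timestamp) to a value.

    Hypothetical illustration only; a real BigTable is distributed,
    persistent, and split into tablets.
    """

    def __init__(self) -> None:
        self._cells: Dict[Tuple[str, str, int], bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        # Every write creates (or overwrites) one timestamped version of a cell.
        self._cells[(row, column, timestamp)] = value

    def get_latest(self, row: str, column: str) -> Optional[bytes]:
        # Collect all versions of the cell and return the newest one.
        versions = [
            (ts, value)
            for (r, c, ts), value in self._cells.items()
            if r == row and c == column
        ]
        if not versions:
            return None
        return max(versions, key=lambda pair: pair[0])[1]

    def rows_with_prefix(self, prefix: str) -> List[str]:
        # Reversed hostnames put pages of one domain next to each other
        # when row keys are sorted, so a prefix scan finds them together.
        return sorted({r for (r, _, _) in self._cells if r.startswith(prefix)})


if __name__ == "__main__":
    table = TinyTable()
    key = reverse_hostname("www.google.com")
    table.put(key, "contents:", 1, b"<html>old</html>")
    table.put(key, "contents:", 2, b"<html>new</html>")
    table.put(reverse_hostname("maps.google.com"), "contents:", 1, b"<html>maps</html>")
    print(key)
    print(table.get_latest(key, "contents:"))
    print(table.rows_with_prefix("com.google."))
```

Reversing the hostname means that row keys from the same domain share a common prefix, which is why the prefix scan in the sketch returns the google.com pages as one contiguous group.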

Notes

[1] "First an overview. BigTable has been in development since early 2004 and has been in active use for about eight months (about February 2005)." Google's BigTable, http://andrewhitchcock.org/?post=214
[2] "Bigtable can be used with MapReduce, a framework for running large-scale parallel computations developed at Google. We have written a set of wrappers that allow a Bigtable to be used both as an input source and as an output target for MapReduce jobs". pg 3 of "Bigtable: A Distributed Storage System for Structured Data", 2006
[3] "Reader is using Google's BigTable in order to create a haven for what is likely to be a massive trove of items." Official Google Reader blog, http://googlereader.blogspot.com/2005/10/google-reader-two-weeks.html
[4] "There are currently around 100 cells for services such as Print, Search History, Maps, and Orkut." Google's BigTable, http://andrewhitchcock.org/?post=214
[5] "Their new solution for thumbnails is to use Google’s BigTable, which provides high performance for a large number of rows, fault tolerance, caching, etc. This is a nice (and rare?) example of actual synergy in an acquisition." YouTube Scalability Talk, http://kylecordes.com/2007/07/12/youtube-scalability/
[6] "How Entities and Indexes are Stored - Google App Engine - Google Code", http://code.google.com/intl/pl/appengine/articles/storage_breakdown.html#anc-background
[7] "We have described Bigtable, a distributed system for storing structured data at Google....Our users like the performance and high availability provided by the Bigtable implementation, and that they can scale the capacity of their clusters by simply adding more machines to the system as their resource demands change over time...Finally, we have found that there are significant advantages to building our own storage solution at Google. We have gotten a substantial amount of flexibility from designing our own data model for Bigtable." from the Conclusion of "Bigtable: A Distributed Storage System for Structured Data", 2006

External links

Bigtable: A Distributed Storage System for Structured Data -(official paper; PDF)
BigTable: A Distributed Structured Storage System (video)
- more video
- Google's BigTable -(notes on the official presentation)
"How Google Works"
Is the Relational Database Doomed ?


Revision as of 04:43, 23 January 2012 (Monday) by Ellery, no edit summary. Revision as of 04:47, 23 January 2012 (Monday) by Ellery, edit summary: added or adjusted internal links.
