The Internet Archive Center at Bibliotheca Alexandrina Expands Its Hardware Capacities

Alexandria: Following an agreement with the San Francisco team, the Bibliotheca Alexandrina (BA) Internet Archive center has incorporated the second generation of web archiving machines, the Petabox. The Petabox is a machine designed to safely store and process one petabyte (a million gigabytes) of data. It features low power consumption, a high density of 64 terabytes per rack, support for multiple operating systems, easy maintenance, and software to automate mirroring. Its inexpensive design will make it easier to grow the collection and keep pace with the latest technology.

The transition to the Petabox system is progressing quickly. The hard disks needed for two petabytes (two million gigabytes) have been purchased: 5,000 hard disks of 400 gigabytes each, of which 3,700 were assembled and loaded with data by the San Francisco team. The final batch of equipment reached BA last January. The machines assembled in San Francisco accommodate 1.5 petabytes of data and currently hold the web collection for 1996-2006, 25,000 digitized books acquired through the Open Content Alliance (OCA) consortium, and 25,000 digitized books from the Library’s Digital Assets Repository (DAR). Machines for the new collection will be designed and manufactured locally at BA. BA software engineers have successfully completed the migration of the data and the testing of the new machines. The new installation features major improvements in imaging, disk layout, and the file system. Enhancements to the system are also being researched, particularly in the areas of cluster management and security.
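The capacity figures above can be checked with a little arithmetic. The following is a minimal sketch, assuming decimal units (one petabyte equals a million gigabytes, as stated in this release) and counting raw disk capacity only, with no allowance for formatting or redundancy overhead. The function and variable names are illustrative and do not come from BA or the Internet Archive.

```python
# Decimal storage units, matching the release's "a million gigabytes" per petabyte.
GB_PER_PB = 1_000_000


def capacity_pb(disk_count: int, disk_size_gb: int) -> float:
    """Return the raw capacity, in petabytes, of a set of identical disks."""
    return disk_count * disk_size_gb / GB_PER_PB


# Figures quoted in the release: 5,000 disks purchased, 3,700 assembled, 400 GB each.
purchased_pb = capacity_pb(5_000, 400)
assembled_pb = capacity_pb(3_700, 400)
remaining_disks = 5_000 - 3_700

print(f"Purchased disks: {purchased_pb:.2f} PB raw")
print(f"Assembled disks: {assembled_pb:.2f} PB raw")
print(f"Disks not yet assembled: {remaining_disks}")
```

Under these assumptions the purchased disks come to 2 petabytes, and the 3,700 assembled disks come to 1.48 petabytes, which is consistent with the roughly 1.5 petabytes quoted for the machines assembled in San Francisco.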

The Internet Archive center is currently being refurbished to meet the air conditioning, weight, power, and network requirements of the new hardware.

For a detailed description of the BA Internet Archive Center and other projects by ISIS, please visit


© Bibliotheca Alexandrina