File System for Big Data
A file system for big data must provide a range of capabilities. Two prominent examples are the General Parallel File System (GPFS) and the Hadoop Distributed File System (HDFS). Above all, such a system needs to exploit parallelism and scale to very large quantities of data; reliability and good input/output (I/O) read and write performance are essential as well. GPFS and other systems of that kind, such as Lustre, leverage parallel I/O to achieve very high read and write throughput. HDFS, on the other hand, splits files into large blocks and keeps several replicas of each block on different nodes of the cluster. The idea is to keep the data close to the computing site, moving the computation to the data and avoiding data transfer over the network. Both systems have their advantages and disadvantages, and the type of big data application usually determines which system to use.
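To make the HDFS idea of block splitting and replication concrete, the following is a minimal Python sketch. It is an illustration only, not the actual HDFS implementation: the block size, node names, and round-robin placement are simplified assumptions (real HDFS uses 128 MB blocks by default, a replication factor of 3, and rack-aware replica placement).

```python
import itertools

BLOCK_SIZE = 4      # bytes per block; tiny for illustration (HDFS default is 128 MB)
REPLICATION = 3     # HDFS's default replication factor
NODES = ["node1", "node2", "node3", "node4", "node5"]  # hypothetical cluster nodes

def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split raw data into fixed-size blocks, as HDFS does with large files."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_replicas(blocks, nodes=NODES, replication=REPLICATION):
    """Assign each block to `replication` distinct nodes.

    Simplified round-robin placement; real HDFS chooses replica locations
    with rack awareness to balance reliability and network traffic.
    """
    placement = {}
    start_cycle = itertools.cycle(range(len(nodes)))
    for idx, _ in enumerate(blocks):
        start = next(start_cycle)
        placement[idx] = [nodes[(start + r) % len(nodes)] for r in range(replication)]
    return placement

data = b"parallel file systems for big data"
blocks = split_into_blocks(data)
placement = place_replicas(blocks)
for idx, replica_nodes in placement.items():
    print(f"block {idx}: {replica_nodes}")
```

A scheduler in this model would then run each compute task on one of the nodes that already holds the needed block, which is exactly the data-locality principle described above.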
More Information About File Systems
There is a nice and short video that introduces some important capabilities of such file systems: