Comparison of distributed file systems

In computing, a distributed file system (DFS) or network file system is any file system that allows access from multiple hosts to files shared via a computer network. This makes it possible for multiple users on multiple machines to share files and storage resources.

Distributed file systems differ in their performance, mutability of content, handling of concurrent writes, handling of permanent or temporary loss of nodes or storage, and their policy of storing content.

Locally managed

| Client | Written in | License | Access API | High availability | Shards | Efficient Redundancy | Redundancy Granularity | Initial release year | Memory requirements (GB) |
|---|---|---|---|---|---|---|---|---|---|
| Alluxio (Virtual Distributed File System) | Java | Apache License 2.0 | HDFS, FUSE, HTTP/REST, S3 | hot standby | No | Replication[1] | File[2] | 2013 | |
| Ceph | C++ | LGPL | librados (C, C++, Python, Ruby), S3, Swift, FUSE | Yes | Yes | Pluggable erasure codes[3] | Pool[4] | 2010 | 1 per TB of storage |
| Coda | C | GPL | C | Yes | Yes | Replication | Volume[5] | 1987 | |
| GlusterFS | C | GPLv3 | libglusterfs, FUSE, NFS, SMB, Swift, libgfapi | mirror | Yes | Reed-Solomon[6] | Volume[7] | 2005 | |
| HDFS | Java | Apache License 2.0 | Java and C client, HTTP, FUSE[8] | transparent master failover | No | Reed-Solomon[9] | File[10] | 2005 | |
| IPFS | Go | Apache 2.0 or MIT | HTTP gateway, FUSE, Go client, Javascript client, command line tool | Yes | with IPFS Cluster | Replication[11] | Block[12] | 2015[13] | |
| LizardFS[14] | C++ | GPLv3 | POSIX, FUSE, NFS-Ganesha, Ceph FSAL (via libcephfs) | master | No | Reed-Solomon[15] | File[16] | 2013 | |
| Lustre | C | GPLv2 | POSIX, NFS-Ganesha, NFS, SMB | Yes | Yes | No redundancy[17][18] | No redundancy[19][20] | 2003 | |
| MinIO | Go | AGPL3.0 | AWS S3 API, FTP, SFTP | Yes | Yes | Reed-Solomon[21] | Object[22] | 2014 | |
| MooseFS | C | GPLv2 | POSIX, FUSE | master | No | Replication[23] | File[24] | 2008 | |
| OpenAFS | C | IBM Public License | Virtual file system, Installable File System | | | Replication | Volume[25] | 2000[26] | |
| OpenIO[27] | C | AGPLv3 / LGPLv3 | Native (Python, C, Java), HTTP/REST, S3, Swift, FUSE (POSIX, NFS, SMB, FTP) | Yes | | Pluggable erasure codes[28] | Object[29] | 2015 | 0.5 |
| Quantcast File System | C | Apache License 2.0 | C++ client, FUSE (C++ server: MetaServer and ChunkServer are both in C++) | master | No | Reed-Solomon[30] | File[31] | 2012 | |
| RozoFS | C, Python | GPLv2 | FUSE, SMB, NFS, key/value | Yes | | Mojette[32] | Volume[33] | 2011[34] | |
| Tahoe-LAFS | Python | GNU GPL[35] | HTTP (browser or CLI), SFTP, FTP, FUSE via SSHFS, pyfilesystem | | | Reed-Solomon[36] | File[37] | 2007 | |
| XtreemFS | Java, C++ | BSD License | libxtreemfs (Java, C++), FUSE | | | Replication[38] | File[39] | 2009 | |

| Client | Written in | License | Access API |
|---|---|---|---|
| BeeGFS | C / C++ | FRAUNHOFER FS (FhGFS) EULA,[40] GPLv2 client | POSIX |
| Cloudian | C++ | Proprietary | AWS S3, NFS, SMB/CIFS, Rest API |
| ObjectiveFS[41] | C | Proprietary | POSIX, FUSE |
| Spectrum Scale (GPFS) | C, C++ | Proprietary | POSIX, NFS, SMB, Swift, S3, HDFS |
| MapR-FS | C, C++ | Proprietary | POSIX, NFS, FUSE, S3, HDFS, CLI |
| Isilon OneFS | C/C++ | Proprietary | POSIX, NFS, SMB/CIFS, HDFS, HTTP, FTP, SWIFT Object, CLI, Rest API |
| Qumulo | C/C++ | Proprietary | POSIX, NFS, SMB/CIFS, CLI, S3, Rest API |
| Scality | C | Proprietary | FUSE, NFS, REST, AWS S3 |
| VaultFS | C/C++ | Proprietary | POSIX, NFS, SMB/CIFS, CLI, S3, Rest API |
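
The "Efficient Redundancy" column distinguishes plain replication from erasure codes such as Reed-Solomon. As a rough illustration of the trade-off, the sketch below compares the raw storage consumed per byte of user data under each scheme. The function names and the example parameters (3 copies, 6 data plus 3 parity shards) are assumptions chosen for illustration; they are not taken from any of the systems listed, and real deployments add metadata and other overheads that are ignored here.

```python
def replication_overhead(copies: int) -> float:
    """Raw bytes stored per byte of user data when keeping `copies` full copies."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return float(copies)


def erasure_overhead(data_shards: int, parity_shards: int) -> float:
    """Raw bytes stored per byte of user data for a k+m erasure code.

    The data is split into `data_shards` pieces and `parity_shards`
    extra pieces are computed, so (k + m) / k bytes are stored per byte.
    """
    if data_shards < 1 or parity_shards < 0:
        raise ValueError("need data_shards >= 1 and parity_shards >= 0")
    return (data_shards + parity_shards) / data_shards


# Illustrative parameters only (assumed, not from the table above).
print(replication_overhead(3))   # 3.0: survives the loss of 2 copies
print(erasure_overhead(6, 3))    # 1.5: survives the loss of any 3 shards
```

Under these assumed parameters the erasure-coded layout tolerates more lost pieces while storing half as much raw data, at the cost of extra computation and more nodes involved in each read or repair.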

Remote access

| Name | Run by | Access API |
|---|---|---|
| Amazon S3 | Amazon.com | HTTP (REST/SOAP) |
| Google Cloud Storage | Google | HTTP (REST) |
| SWIFT (part of OpenStack) | Rackspace, Hewlett-Packard, others | HTTP (REST) |
| Microsoft Azure | Microsoft | HTTP (REST) |
| IBM Cloud Object Storage | IBM (formerly Cleversafe)[42] | HTTP (REST) |
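
All of the services above expose objects over HTTP, so reading an object generally amounts to an HTTP GET on a URL that names a container and a key. The sketch below builds such a request without sending it. The endpoint `https://storage.example.com`, the bucket and key names, and the helper `object_get_request` are hypothetical; each real service has its own URL layout, headers, and authentication scheme, none of which are modeled here.

```python
from urllib.parse import quote
from urllib.request import Request


def object_get_request(endpoint: str, bucket: str, key: str) -> Request:
    """Build (but do not send) a GET request for one object.

    Assumes a simple <endpoint>/<bucket>/<key> layout; real services
    differ in URL structure and require signed or authenticated requests.
    """
    url = f"{endpoint.rstrip('/')}/{quote(bucket)}/{quote(key)}"
    return Request(url, method="GET")


# Hypothetical endpoint, bucket, and key, for illustration only.
req = object_get_request(
    "https://storage.example.com", "demo-bucket", "reports/2024 summary.txt"
)
print(req.get_method(), req.full_url)
# GET https://storage.example.com/demo-bucket/reports/2024%20summary.txt
```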

Comparison

Researchers have published a functional and experimental analysis of several distributed file systems, including HDFS, Ceph, Gluster, Lustre, and an old (1.6.x) version of MooseFS. The document dates from 2013, and much of its information is now outdated (for example, MooseFS had no high availability for its metadata server at that time).[43]

Cloud-based remote distributed storage services from major vendors have different APIs and different consistency models.[44]

References

  1. "Caching: Managing Data Replication in Alluxio".
  2. "Caching: Managing Data Replication in Alluxio".
  3. "Erasure Code Profiles".
  4. "Pools".
  5. Satyanarayanan, Mahadev; Kistler, James J.; Kumar, Puneet; Okasaki, Maria E.; Siegel, Ellen H.; Steere, David C. "Coda: A Highly Available File System for a Distributed Workstation Environment" (PDF).
  6. "Erasure coding implementation". GitHub. 2 November 2021.
  7. "Setting up GlusterFS Volumes".
  8. "MountableHDFS".
  9. "HDFS-7285 Erasure Coding Support inside HDFS".
  10. "Apache Hadoop: setrep".
  11. Erasure coding plan: "Reed-Solomon layer over IPFS #196". GitHub., "Erasure Coding Layer #6". GitHub.
  12. "CLI Commands: ipfs bitswap wantlist".
  13. "Why The Internet Needs IPFS Before It's Too Late". 4 October 2015.
  14. "Is LizardFS development still alive?". GitHub.
  15. "Configuring Replication Modes".
  16. "Configuring Replication Modes: Set and show the goal of a file/directory".
  17. "Lustre Operations Manual: What a Lustre File System Is (and What It Isn't)".
  18. Reed-Solomon in progress: "LU-10911 FLR2: Erasure coding".
  19. "Lustre Operations Manual: Lustre Features".
  20. File-level redundancy plan: "File Level Redundancy Solution Architecture".
  21. "MinIO Erasure Code Quickstart Guide".
  22. "MinIO Storage Class Quickstart Guide". GitHub.
  23. Only available in the proprietary version 4.x "[feature] erasure-coding #8". GitHub.
  24. "mfsgoal(1)".
  25. "Replicating Volumes (Creating Read-only Volumes)".
  26. "OpenAFS".
  27. "OpenIO SDS Documentation". docs.openio.io.
  28. "Erasure Coding".
  29. "Declare Storage Policies".
  30. "The Quantcast File System" (PDF).
  31. "qfs/src/cc/tools/cptoqfs_main.cc". GitHub. 8 December 2021.
  32. "About RozoFS: Mojette Transform".
  33. "Setting up RozoFS: Exportd Configuration File".
  34. "Initial commit". GitHub.
  35. "About Tahoe-LAFS". GitHub. 24 February 2022.
  36. "zfec -- a fast C implementation of Reed-Solomon erasure coding". GitHub. 24 February 2022.
  37. "Tahoe-LAFS Architecture: File Encoding".
  38. "Under the Hood: File Replication".
  39. "Quickstart: Replicate A File".
  40. "FRAUNHOFER FS (FhGFS) END USER LICENSE AGREEMENT". Fraunhofer Society. 2012-02-22.
  41. "ObjectiveFS official website".
  42. "IBM Plans to Acquire Cleversafe for Object Storage in Cloud". www-03.ibm.com. 2015-10-05. Archived from the original on October 8, 2015. Retrieved 2019-05-06.
  43. Séguin, Cyril; Depardon, Benjamin; Le Mahec, Gaël. "Analysis of Six Distributed File Systems" (PDF). HAL.
  44. "Data Consistency Models of Public Cloud Storage Services: Amazon S3, Google Cloud Storage and Windows Azure Storage". SysTutorials. 4 February 2014. Retrieved 19 June 2017.