Virtual Desktop Infra

Post by **lls** » Sat 1 Feb 2014 23:20

If we host the VHD files on a DFS replica, does it replicate the whole file (several gigabytes), or can replication be configured to send only the changes made to the file? (I know some backup systems do this.)

Post by **dsebire** » Sun 2 Feb 2014 14:13

It replicates everything, which is exactly why I have said at least three times that DFSR should be avoided.

Post by **lls** » Sun 2 Feb 2014 14:21

Yes, I understood that, but I was simply wondering whether DFS-R replication could be handled any other way than by copying all of the data.

Post by **dsebire** » Sun 2 Feb 2014 14:25

Then it would not be called DFSR any more!
It is not configurable.

Rsync does it, but then you are no longer in real time.
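To make the "changes only" idea concrete, here is a minimal Python sketch of a block-wise delta between two versions of a disk image. It is an illustration under stated assumptions, not how rsync or DFSR are implemented: `BLOCK_SIZE`, `block_digests` and `changed_blocks` are hypothetical names, and blocks are compared at fixed offsets only, whereas rsync uses a rolling checksum so that it can also match data that has moved within the file.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen for illustration


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Return the SHA-256 digest of each fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old: bytes, new: bytes, block_size: int = BLOCK_SIZE) -> list[int]:
    """Return the indexes of blocks in new that differ from, or do not exist in, old."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        i for i, digest in enumerate(new_digests)
        if i >= len(old_digests) or digest != old_digests[i]
    ]


# Simulate a 1 MiB disk image in which a single byte is modified.
old_image = bytes(1024 * 1024)
new_image = bytearray(old_image)
new_image[5 * BLOCK_SIZE + 10] = 0xFF

to_send = changed_blocks(old_image, bytes(new_image))
print(to_send)  # [5]
```

Only block 5 (4 KiB) would need to cross the wire here, instead of the full 1 MiB image.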

Post by **augur1** » Sun 2 Feb 2014 14:34

Yes, I understood that, but I was simply wondering whether DFS-R replication could be handled any other way than by copying all of the data.

Another scenario: ZFS dedup (in real time), plus the ability to send the deduplicated data to another storage system (ZFS send).

What to dedup: Files, blocks, or bytes?

Data can be deduplicated at the level of files, blocks, or bytes.

**File-level** assigns a hash signature to an entire file. File-level dedup has the lowest overhead when the natural granularity of data duplication is whole files, but it also has significant limitations: any change to any block in the file requires recomputing the checksum of the whole file, which means that if even one block changes, any space savings is lost because the two versions of the file are no longer identical. This is fine when the expected workload is something like JPEG or MPEG files, but is completely ineffective when managing things like virtual machine images, which are mostly identical but differ in a few blocks.
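The limitation described above can be sketched in a few lines of Python. This is only an illustration: `file_signature`, `file_level_dedup` and the `.vhd` file names are hypothetical, and SHA-256 is assumed as the signature function.

```python
import hashlib


def file_signature(content: bytes) -> str:
    """Hash signature of an entire file."""
    return hashlib.sha256(content).hexdigest()


def file_level_dedup(files: dict[str, bytes]) -> dict[str, list[str]]:
    """Group file names by whole-file signature; each group is stored once."""
    groups: dict[str, list[str]] = {}
    for name, content in files.items():
        groups.setdefault(file_signature(content), []).append(name)
    return groups


base = b"guest-os" * 100_000  # stand-in for a VM image
files = {
    "vm1.vhd": base,
    "vm1-copy.vhd": base,           # exact duplicate: deduplicated
    "vm2.vhd": base[:-1] + b"X",    # one byte differs: stored again in full
}

groups = file_level_dedup(files)
print(len(groups))  # 2
```

The exact copy costs nothing extra, but the image that differs by a single byte gets a different signature and is stored in full a second time.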

**Block-level** dedup has somewhat higher overhead than file-level dedup when whole files are duplicated, but unlike file-level dedup, it handles block-level data such as virtual machine images extremely well. Most of a VM image is duplicated data -- namely, a copy of the guest operating system -- but some blocks are unique to each VM. With block-level dedup, only the blocks that are unique to each VM consume additional storage space. All other blocks are shared.
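As a rough sketch of the block-level case, the toy store below keeps each distinct block once, keyed by its SHA-256 digest. It is not how ZFS implements dedup internally: `BlockStore`, `make_block` and the 4 KiB `BLOCK_SIZE` are assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen for illustration


class BlockStore:
    """Toy block store that keeps each distinct block only once."""

    def __init__(self, block_size: int = BLOCK_SIZE) -> None:
        self.block_size = block_size
        self.blocks: dict[bytes, bytes] = {}   # digest -> block contents
        self.refcount: dict[bytes, int] = {}   # digest -> number of references

    def write(self, data: bytes) -> list[bytes]:
        """Store data and return the ordered list of block digests describing it."""
        keys = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            key = hashlib.sha256(block).digest()
            if key not in self.blocks:
                self.blocks[key] = block       # new block: consumes space
            self.refcount[key] = self.refcount.get(key, 0) + 1
            keys.append(key)
        return keys

    def read(self, keys: list[bytes]) -> bytes:
        """Reassemble data from its list of block digests."""
        return b"".join(self.blocks[k] for k in keys)


def make_block(n: int) -> bytes:
    """Build a distinct 4 KiB block from an integer (test data only)."""
    return n.to_bytes(4, "big") * (BLOCK_SIZE // 4)


guest_os = b"".join(make_block(n) for n in range(100))  # 100 blocks shared by both VMs
vm1 = guest_os + make_block(1001)                       # one block unique to VM 1
vm2 = guest_os + make_block(1002)                       # one block unique to VM 2

store = BlockStore()
recipe1 = store.write(vm1)
recipe2 = store.write(vm2)

logical_blocks = len(recipe1) + len(recipe2)
physical_blocks = len(store.blocks)
print(logical_blocks, physical_blocks)  # 202 102
```

The two images reference 202 blocks in total, but only 102 are actually stored: the 100 guest OS blocks are shared and each VM adds one block of its own.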

**Byte-level** dedup is in principle the most general, but it is also the most costly because the dedup code must compute 'anchor points' to determine where the regions of duplicated vs. unique data begin and end. Nevertheless, this approach is ideal for certain mail servers, in which an attachment may appear many times but not necessarily be block-aligned in each user's inbox. This type of deduplication is generally best left to the application (e.g. Exchange server), because the application understands the data it's managing and can easily eliminate duplicates internally rather than relying on the storage system to find them after the fact.
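One common way to compute such anchor points is content-defined chunking: a cut is placed wherever a rolling value over the last few bytes hits a target, so the cut positions follow the content rather than the offset. The sketch below is a deliberately simple illustration, not what any particular mail server does. `WINDOW`, `DIVISOR` and the function names are hypothetical, and a plain rolling sum stands in for the stronger rolling hashes and minimum/maximum chunk sizes that real systems typically use.

```python
import random

WINDOW = 16    # bytes in the rolling window (hypothetical value)
DIVISOR = 64   # average chunk size is roughly DIVISOR bytes (hypothetical value)


def content_defined_chunks(data: bytes) -> list[bytes]:
    """Cut data after every position where the rolling-window sum hits a target value."""
    chunks, start, rolling = [], 0, 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte that left the window
        # An anchor point depends only on the last WINDOW bytes, not on the offset.
        if i >= WINDOW - 1 and rolling % DIVISOR == DIVISOR - 1:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])      # trailing remainder
    return chunks


def fixed_chunks(data: bytes, size: int = 64) -> list[bytes]:
    """Cut data into fixed-size blocks, for comparison."""
    return [data[i:i + size] for i in range(0, len(data), size)]


rng = random.Random(0)
attachment = bytes(rng.getrandbits(8) for _ in range(64 * 1024))
shifted = b"hdr" + attachment  # same attachment, 3 bytes further into the mailbox

original_chunks = content_defined_chunks(attachment)
shifted_chunks = content_defined_chunks(shifted)

shared_cdc = len(set(original_chunks) & set(shifted_chunks))
shared_fixed = len(set(fixed_chunks(attachment)) & set(fixed_chunks(shifted)))
print(len(original_chunks), shared_cdc, shared_fixed)
```

Because every cut depends only on the preceding `WINDOW` bytes, every chunk after the first one is found again in the shifted copy, whereas fixed-size blocks no longer line up once the data moves by three bytes.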

**ZFS provides block-level deduplication** because this is the finest granularity that makes sense for a general-purpose storage system. Block-level dedup also maps naturally to ZFS's 256-bit block checksums, which provide unique block signatures for all blocks in a storage pool as long as the checksum function is cryptographically strong (e.g. SHA256).