How to Use tar and gzip for Archiving and Compression

Author: Linux Learning Lab

What Are tar and gzip?
#

tar (tape archive) bundles multiple files and directories into a single file while preserving permissions, ownership, and directory structure. It does not compress by itself.

gzip compresses a single file to reduce its size. It doesn’t handle multiple files or directories on its own.

Together they’re the standard approach on Linux: tar bundles the files, gzip compresses the bundle. The result is the familiar .tar.gz (or .tgz) file.
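
The two-step relationship can be made concrete with a small sketch. The names below (demo/, notes.txt) are made up for illustration, and everything runs in a throwaway temporary directory.

```shell
# Work in a scratch directory so nothing real is touched
workdir=$(mktemp -d)
cd "$workdir"
mkdir demo
echo "hello" > demo/notes.txt

# Step 1: tar bundles the directory into one uncompressed file
tar -cf demo.tar demo/

# Step 2: gzip compresses that single file, replacing demo.tar with demo.tar.gz
gzip demo.tar

# The result is an ordinary .tar.gz that tar can read back
tar -tzf demo.tar.gz
```

Running the two tools separately like this is mainly useful for understanding what a .tar.gz is; in day-to-day use a single tar invocation does both steps.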

tar Basics
#

Create an archive
#

tar -cf archive.tar file1.txt file2.txt directory/
Flag  Meaning
-c    Create a new archive
-f    Specify the archive filename (when flags are bundled, -f must come last so the filename follows it directly)

Extract an archive
#

tar -xf archive.tar
Flag  Meaning
-x    Extract
-f    Specify the archive to extract

List contents without extracting
#

tar -tf archive.tar

Extract to a specific directory
#

tar -xf archive.tar -C /opt/myapp/
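
One detail worth knowing: -C changes into the target directory before extracting, so that directory has to exist already. A minimal round trip, using hypothetical names (project/, restore/) inside a temporary directory:

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p project/docs
echo "v1" > project/docs/readme.txt

# Create the archive, then list it without extracting
tar -cf archive.tar project/
tar -tf archive.tar

# -C does not create the target directory, so make it first
mkdir -p restore
tar -xf archive.tar -C restore/

# The extracted copy should match the original tree
diff -r project restore/project && echo "round trip OK"
```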

Verbose output
#

Add -v to any command to see each file being processed:

tar -cvf archive.tar ./project/
tar -xvf archive.tar

gzip Basics
#

Compress a file
#

gzip file.txt
# Produces file.txt.gz (original is removed)

Decompress a file
#

gunzip file.txt.gz
# Restores file.txt (compressed file is removed)

# Or equivalently
gzip -d file.txt.gz

Keep the original file
#

gzip -k file.txt
# Produces file.txt.gz AND keeps file.txt

Check compression ratio
#

gzip -l file.txt.gz
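
The gzip commands above can be tied together in one short sketch. The -k flag needs gzip 1.6 or newer; writing to standard output with -c keeps the original on any version, so that form is used here. The file name sample.txt is made up.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Repetitive text compresses well, which makes the ratio easy to see
awk 'BEGIN { for (i = 0; i < 5000; i++) print "the quick brown fox" }' > sample.txt

# Keep the original by sending the compressed stream to stdout
gzip -c sample.txt > sample.txt.gz

# Show compressed size, uncompressed size, and ratio
gzip -l sample.txt.gz

# Decompress to stdout and compare with the original, leaving both files in place
gzip -dc sample.txt.gz | cmp - sample.txt && echo "contents identical"
```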

Combining tar and gzip
#

tar can invoke gzip directly with the -z flag, so you don’t need to run them separately.

Create a compressed archive
#

tar -czf archive.tar.gz ./project/
Flag  Meaning
-c    Create
-z    Compress with gzip
-f    Output filename

Extract a compressed archive
#

tar -xzf archive.tar.gz

List contents of a compressed archive
#

tar -tzf archive.tar.gz

Other Compression Options
#

tar supports multiple compression algorithms via different flags:

Flag    Algorithm  Extension      Notes
-z      gzip       .tar.gz, .tgz  Fast, universally available
-j      bzip2      .tar.bz2       Better compression, slower
-J      xz         .tar.xz        Best compression, slowest
--zstd  zstandard  .tar.zst       Fast compression with good ratios

# bzip2
tar -cjf archive.tar.bz2 ./project/

# xz
tar -cJf archive.tar.xz ./project/

# zstandard
tar --zstd -cf archive.tar.zst ./project/
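
Which flag wins depends on the data, so it is worth measuring on your own files. The sketch below builds the same archive with each compressor that happens to be installed and lists the resulting sizes. The sample data and file names are made up, and no particular ordering of sizes is guaranteed.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir project
awk 'BEGIN { for (i = 0; i < 20000; i++) print "line", i, "of sample data" }' > project/data.txt

# gzip support is assumed; the others are tried only if the tool is installed
tar -czf archive.tar.gz project/
if command -v bzip2 >/dev/null 2>&1; then
    tar -cjf archive.tar.bz2 project/
fi
if command -v xz >/dev/null 2>&1; then
    tar -cJf archive.tar.xz project/
fi
# --zstd is not available in every tar, so pipe through the zstd tool instead
if command -v zstd >/dev/null 2>&1; then
    tar -cf - project/ | zstd -q -c > archive.tar.zst
fi

# Compare the results
ls -l archive.tar.*
```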

Practical Examples
#

Back up a directory
#

tar -czf backup-$(date +%F).tar.gz ./myapp/
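
A dated backup is more useful when the script also confirms that the archive is readable. A minimal sketch, assuming a hypothetical source directory myapp/ and a hypothetical destination directory backups/:

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p myapp backups
echo "setting=1" > myapp/config.ini

# %F expands to YYYY-MM-DD, so the file names sort chronologically
backup="backups/backup-$(date +%F).tar.gz"

# Create the archive, then test it before trusting it
if tar -czf "$backup" myapp/ && gzip -t "$backup"; then
    echo "backup written: $backup"
else
    echo "backup failed: $backup" >&2
    rm -f "$backup"
fi
```

Removing a failed archive avoids leaving a partial file that looks like a valid backup.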

Extract a single file from an archive
#

tar -xzf archive.tar.gz path/to/specific/file.txt

The path must match the member name exactly as tar -tzf prints it, including any leading ./ prefix.
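
Because the name has to match exactly, one safe approach is to read it from the listing instead of typing it. The sketch below uses made-up files (app.conf, notes.txt) and a hypothetical output directory out/:

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p project/conf
echo "port=8080" > project/conf/app.conf
echo "notes" > project/notes.txt
tar -czf archive.tar.gz ./project/

# Look up the exact member name as stored in the archive
member=$(tar -tzf archive.tar.gz | grep 'app\.conf$')
echo "member name: $member"

# Extract only that member into a separate directory
mkdir out
tar -xzf archive.tar.gz -C out "$member"
ls -R out
```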

Exclude files or directories
#

tar -czf project.tar.gz --exclude='*.log' --exclude='node_modules' ./project/
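
It is worth listing the archive afterwards to confirm the patterns matched what you intended. In this sketch the project layout is invented, and --exclude is placed before the path, which is the safer ordering across tar implementations.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p project/src project/node_modules/lib
echo "code" > project/src/main.js
echo "dep" > project/node_modules/lib/index.js
echo "noise" > project/debug.log

# Exclusions come before the path they should apply to
tar -czf project.tar.gz --exclude='*.log' --exclude='node_modules' ./project/

# Confirm what actually went in
tar -tzf project.tar.gz
```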

Archive and preserve permissions (as root)
#

sudo tar -czpf full-backup.tar.gz /etc/

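tar always records permissions and ownership when it creates an archive, so -p (--preserve-permissions) has no effect at this step. It matters on extraction, where it restores the stored permissions exactly instead of applying your umask; that is already the default when extracting as root.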

Stream an archive over SSH
#

# Send a directory to a remote host without creating a local file
tar -czf - ./project/ | ssh user@server 'tar -xzf - -C /opt/'
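
The same pipeline can be tried locally before pointing it at a real server. In this sketch a subshell stands in for the remote host, and the directory names (project/, dest/) are made up. Writing -f - on both sides makes the use of standard input and output explicit instead of relying on tar's built-in default archive.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p project dest
echo "payload" > project/file.txt

# Left side writes the compressed archive to stdout;
# right side (the stand-in for ssh) reads it from stdin and unpacks it
tar -czf - ./project/ | (cd dest && tar -xzf -)

ls -R dest
```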

Create an archive with a top-level directory
#

# Puts everything under a project-v1.0/ prefix
tar -czf release.tar.gz --transform='s,^\./,project-v1.0/,' ./src/ ./README.md
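
Since --transform is a GNU tar option, a script that relies on it may want to check for GNU tar first. A small sketch with invented files (src/main.c, README.md) that also lists the result to confirm the prefix:

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir src
echo "int main(void) { return 0; }" > src/main.c
echo "readme" > README.md

# Only run the demo when the installed tar identifies itself as GNU tar
if tar --version 2>/dev/null | grep -q 'GNU tar'; then
    tar -czf release.tar.gz --transform='s,^\./,project-v1.0/,' ./src/ ./README.md
    # Every member name should now start with project-v1.0/
    tar -tzf release.tar.gz
else
    echo "GNU tar not found; skipping --transform demo" >&2
fi
```

The expression only rewrites names that begin with ./, which is why the inputs are given as ./src/ and ./README.md.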

Split a large archive into parts
#

tar -czf - ./large-dir/ | split -b 100M - backup-part-

# Reassemble and extract
cat backup-part-* | tar -xzf -
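
The split and reassemble steps can be rehearsed at a small scale. This sketch uses 100k parts and a made-up 300 KB file so that several pieces are produced quickly; the directory names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir large-dir restored

# Random bytes do not compress, so the archive stays big enough to split
head -c 300000 /dev/urandom > large-dir/blob.bin

# Small parts for the demo; use something like 100M for real archives
tar -czf - ./large-dir/ | split -b 100k - backup-part-
ls -l backup-part-*

# The glob expands in sorted order (aa, ab, ac, ...), matching the order split wrote
cat backup-part-* | tar -xzf - -C restored
cmp large-dir/blob.bin restored/large-dir/blob.bin && echo "reassembled OK"
```

Every part is needed for extraction, so it is worth checking the part count after copying them anywhere.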

Common File Extensions
#

Extension          What it is
.tar               Uncompressed tar archive
.tar.gz or .tgz    gzip-compressed tar archive
.tar.bz2 or .tbz2  bzip2-compressed tar archive
.tar.xz or .txz    xz-compressed tar archive
.tar.zst           zstandard-compressed tar archive
.gz                gzip-compressed single file

Verifying Archive Integrity
#

# Test that a gzip file is valid
gzip -t archive.tar.gz

# Verify tar contents are readable
tar -tzf archive.tar.gz > /dev/null
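
Both checks can be wrapped in a small function for use in scripts. In this sketch, verify_archive is a hypothetical helper name, and a truncated copy of a valid archive simulates an interrupted download.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir data
awk 'BEGIN { for (i = 0; i < 5000; i++) print "record", i }' > data/records.txt
tar -czf good.tar.gz data/

# Simulate an interrupted download by keeping only the first 200 bytes
head -c 200 good.tar.gz > truncated.tar.gz

# Hypothetical helper: succeeds only if the gzip stream and the tar listing are both readable
verify_archive() {
    gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

for f in good.tar.gz truncated.tar.gz; do
    if verify_archive "$f"; then
        echo "$f: OK"
    else
        echo "$f: corrupt or incomplete"
    fi
done
```

These checks confirm the archive is structurally readable; they do not prove its contents match the source files.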

Troubleshooting
#

“tar: Removing leading ‘/’ from member names”
#

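When it creates an archive, tar strips the leading / from member names so that extraction cannot accidentally overwrite system files. This is intentional and safe: the members are simply stored as relative paths.

To restore the files to their original absolute locations, extract relative to / (dangerous, because existing files are overwritten):

tar -xzf archive.tar.gz -C /

--absolute-names (-P) is only needed on extraction when the archive itself was created with -P and still contains leading slashes.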
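
The stripping is easy to observe directly. This sketch archives a made-up file by its absolute path in a temporary directory and then lists the stored name:

```shell
workdir=$(mktemp -d)
echo "data" > "$workdir/file.txt"

# Archive using an absolute path; tar prints its notice about the leading slash on stderr
tar -czf "$workdir/abs.tar.gz" "$workdir/file.txt"

# The stored member name is relative: it no longer begins with a slash
tar -tzf "$workdir/abs.tar.gz"
```

Extracting this archive in some other directory recreates the full path underneath that directory, not at the filesystem root.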

“gzip: stdin: not in gzip format”
#

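The file isn’t actually gzip-compressed, whatever its extension says. Check what it is:

file archive.tar.gz

Then extract with the matching flag (-j for bzip2, -J for xz), or with plain tar -xf archive.tar.gz, since GNU tar detects the compression automatically when reading an archive file.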
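
Where the file utility is not installed, the first two bytes of the file give the same answer for the common formats. The sketch below is an illustration only: detect_format is a hypothetical helper, and it recognizes just the four compressors discussed in this article.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo "hello" > note.txt
tar -cf plain.tar note.txt
tar -czf real.tar.gz note.txt
cp plain.tar mislabeled.tar.gz   # uncompressed tar with a misleading extension

# Hypothetical helper: read the first two bytes as hex and map them to a format
detect_format() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    case "$magic" in
        1f8b) echo "gzip" ;;
        425a) echo "bzip2" ;;
        fd37) echo "xz" ;;
        28b5) echo "zstd" ;;
        *)    echo "not a recognized compressed format" ;;
    esac
}

detect_format real.tar.gz
detect_format mislabeled.tar.gz
```

A file that reports no recognized compression may still be a plain tar archive, in which case tar -xf without a compression flag should read it.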

Permission errors during extraction
#

# Extract as yourself, ignoring stored ownership
tar -xzf archive.tar.gz --no-same-owner

Best Practices
#

  • Use tar -czf as your default for creating archives: gzip is fast and universal
  • Use xz or zstd when archive size matters more than speed
  • Always list contents (tar -tzf) before extracting archives from untrusted sources
  • Include a top-level directory in archives so extraction doesn’t scatter files into the current directory
  • Use --exclude to leave out build artifacts, logs, and dependency directories
  • Add a date stamp to backup filenames: backup-$(date +%F).tar.gz
  • For large transfers, pipe tar over SSH instead of creating intermediate files