tar

format of tar archives 

File Format


DESCRIPTION

This document describes the format of archives read and written by the tar utility, and by pax using the -xtar option. PTC MKS Toolkit's tar utility supports both the older UNIX-compatible TAR formats and the new USTAR format defined in the POSIX (IEEE P1003.1) standard. The new USTAR format allows more information to be stored and supports longer file path names.

A tar archive, in either format, consists of one or more blocks, which are used to represent member files. Each block is 512 bytes long; the -b option to tar can indicate how many of these blocks are read and/or written at once.

Each member file consists of a header block (as described later in this page) followed by 0 or more blocks containing the file contents. The end of the archive is indicated by two blocks filled with binary zeros. Unused space in the header is left as binary zeros.
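The block layout described above can be sketched in a few lines. This is a minimal illustration, not part of any tar implementation: the names BLOCK_SIZE, member_block_count, and is_end_marker are hypothetical, and the sketch assumes that the final data block of a member is padded out to a full 512 bytes.

```python
BLOCK_SIZE = 512  # every tar block is 512 bytes long


def member_block_count(file_size: int) -> int:
    """Return the blocks used by one member: 1 header block plus data blocks."""
    if file_size < 0:
        raise ValueError("file_size must be non-negative")
    # Round up so that a partly filled final block still counts as one block.
    data_blocks = (file_size + BLOCK_SIZE - 1) // BLOCK_SIZE
    return 1 + data_blocks


def is_end_marker(data: bytes) -> bool:
    """Return True if data is exactly two blocks filled with binary zeros."""
    return len(data) == 2 * BLOCK_SIZE and not any(data)
```

Under these assumptions, a 1000-byte file occupies three blocks (one header block and two data blocks), and an empty file occupies only its header block.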

The header information in a block is stored in printable ASCII form, so that tar archives are easily ported to different environments. If the contents of all files in the archive are ASCII, the entire archive is ASCII.

Table 1 shows the format of the header block for a file, in the older UNIX-compatible TAR format.

Field Width    Field Name    Meaning

100            name          name of file
8              mode          file mode
8              uid           owner user ID
8              gid           owner group ID
12             size          length of file in bytes
12             mtime         modify time of file
8              chksum        checksum for header
1              link          indicator for links
100            linkname      name of linked file

Table 1: tar Header Block (TAR Format)

The link field is 1 for a linked file, 2 for a symbolic link, and 0 otherwise. A directory is indicated by a trailing slash (/) in its name.
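The rules for the older format can be expressed as a small classifier. This is a sketch only: the function name classify_old_entry is hypothetical, and it assumes that the link field holds the ASCII digits "1" and "2" and that any other value (including "0" or a null byte) means "not a link".

```python
def classify_old_entry(name: bytes, link: bytes) -> str:
    """Classify an entry from the raw name and link fields of an old-format header."""
    if link == b"1":
        return "link"
    if link == b"2":
        return "symbolic link"
    # Not a link: a trailing slash in the name marks a directory.
    if name.rstrip(b"\0").endswith(b"/"):
        return "directory"
    return "file"
```

The name field is stripped of trailing nulls before the slash test, because names shorter than the field are padded with binary zeros.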

For the new USTAR format, headers take on the format shown in Table 2. Note that tar can determine that the USTAR format is being used by the presence of the null-terminated string "ustar" in the magic field. All fields before the magic field correspond to those of the older format described earlier, except that the typeflag replaces the link field.

Field Width    Field Name    Meaning

100            name          name of file
8              mode          file mode
8              uid           owner user ID
8              gid           owner group ID
12             size          length of file in bytes
12             mtime         modify time of file
8              chksum        checksum for header
1              typeflag      type of file
100            linkname      name of linked file
6              magic         USTAR indicator
2              version       USTAR version
32             uname         owner user name
32             gname         owner group name
8              devmajor      device major number
8              devminor      device minor number
155            prefix        prefix for file name

Table 2: tar Header Block (USTAR Format)
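Byte offsets for each field follow from the widths in Table 2, assuming the fields are stored back to back in the order listed, starting at byte 0 of the header block. The sketch below derives the offsets and uses them to test for the "ustar" magic string; USTAR_FIELDS, field_offsets, and is_ustar are hypothetical names chosen for this illustration.

```python
# (field name, width in bytes), in the order given in Table 2
USTAR_FIELDS = [
    ("name", 100), ("mode", 8), ("uid", 8), ("gid", 8),
    ("size", 12), ("mtime", 12), ("chksum", 8), ("typeflag", 1),
    ("linkname", 100), ("magic", 6), ("version", 2),
    ("uname", 32), ("gname", 32), ("devmajor", 8), ("devminor", 8),
    ("prefix", 155),
]


def field_offsets(fields):
    """Map each field name to its (start, end) byte range within the header."""
    offsets = {}
    position = 0
    for name, width in fields:
        offsets[name] = (position, position + width)
        position += width
    return offsets


OFFSETS = field_offsets(USTAR_FIELDS)


def is_ustar(block: bytes) -> bool:
    """Return True if the magic field holds the null-terminated string "ustar"."""
    start, end = OFFSETS["magic"]
    return block[start:end] == b"ustar\0"
```

With this layout the magic field occupies bytes 257 through 262, and the listed fields use 500 of the 512 bytes in the block; the remainder is unused space.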

This information is compatible with that returned by the UNIX stat() function; see also stat. The magic, uname, and gname fields are null-terminated character strings. The fields name, linkname, and prefix are null-terminated unless the full field is used to store a name (that is, the last character is not null). All other fields are zero-filled octal numbers, in ASCII. Trailing nulls are present for these numbers, except for the size, mtime, and version fields.
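The two kinds of field encoding described above (null-terminated strings and zero-filled octal numbers) can be decoded as follows. This is a sketch with hypothetical helper names parse_string and parse_octal; treating an empty numeric field as 0 and tolerating padding spaces are assumptions of the sketch, not statements about the tar utility.

```python
def parse_string(field: bytes) -> str:
    """Decode a string field, stopping at the first null if one is present."""
    # A field that is completely full has no terminating null, so the
    # whole field is the value.
    return field.split(b"\0", 1)[0].decode("ascii")


def parse_octal(field: bytes) -> int:
    """Decode a zero-filled ASCII octal number, with or without a trailing null."""
    text = field.rstrip(b"\0 ").decode("ascii").strip()
    return int(text, 8) if text else 0
```

For example, a mode field containing "0000644" followed by a null decodes to 420 (octal 644), and a 12-byte size field containing "000000001750" with no trailing null decodes to 1000.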

The name field contains the name of the archived file. On USTAR format archives, the value of the prefix field, if non-null, is prefixed to the name field to allow names longer than 100 characters. For compatibility with older tar commands, the PTC MKS Toolkit version of tar leaves prefix null unless the file name exceeds 100 characters.
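Reassembling the full path name from the two fields might look like the sketch below. The function name full_name is hypothetical. The "/" inserted between prefix and name follows the POSIX definition of the USTAR format; this page does not state the separator, so treat it as an assumption here.

```python
def full_name(name: bytes, prefix: bytes) -> str:
    """Combine the raw prefix and name fields into one path name."""
    # Both fields are null-terminated unless completely full.
    name_text = name.split(b"\0", 1)[0].decode("ascii")
    prefix_text = prefix.split(b"\0", 1)[0].decode("ascii")
    if prefix_text:
        return prefix_text + "/" + name_text
    return name_text
```

When prefix is null, the result is simply the name field, which matches what an older tar command would see.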

The size field is 0 if the header describes a link.

The chksum field holds the sum of all the bytes in the header, computed as if the chksum field itself were filled with blanks (ASCII spaces).
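A checksum calculation along these lines can be sketched as follows. The sketch assumes the POSIX definition, in which the checksum is the simple sum of the unsigned byte values of the 512-byte header, and it assumes the chksum field occupies bytes 148 through 155 (the sum of the preceding field widths). The names header_checksum and checksum_matches are hypothetical.

```python
CHKSUM_START, CHKSUM_END = 148, 156  # byte range of the 8-byte chksum field


def header_checksum(block: bytes) -> int:
    """Sum all header bytes, counting the chksum field as eight blanks."""
    if len(block) != 512:
        raise ValueError("a header block must be exactly 512 bytes")
    blanks = 8 * ord(" ")
    return sum(block[:CHKSUM_START]) + blanks + sum(block[CHKSUM_END:])


def checksum_matches(block: bytes) -> bool:
    """Compare the stored octal checksum with the computed one."""
    stored = block[CHKSUM_START:CHKSUM_END].rstrip(b"\0 ").decode("ascii").strip()
    return bool(stored) and int(stored, 8) == header_checksum(block)
```

Because the chksum field is counted as blanks, the stored value does not affect the result, so the same function serves for both writing and verifying a header.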

For USTAR, the typeflag field is a compatible extension of the link field of the older TAR format. Table 3 shows the values that are recognized.

Type Flag    File Type

0 or null    Regular file
1            Link to another file already archived
2            Symbolic link
3            Character special device
4            Block special device
5            Directory
6            FIFO special file
7            Reserved
A-Z          Available for custom usage

Table 3: Type Flag Values for USTAR Format Files
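Table 3 maps directly onto a lookup table. The following sketch uses hypothetical names (TYPEFLAGS, describe_typeflag) and reports any value outside the table as "unknown", which is a choice made for this illustration rather than documented tar behavior.

```python
TYPEFLAGS = {
    "0": "regular file",
    "\0": "regular file",  # a null byte is treated the same as "0"
    "1": "link",
    "2": "symbolic link",
    "3": "character special device",
    "4": "block special device",
    "5": "directory",
    "6": "FIFO special file",
    "7": "reserved",
}


def describe_typeflag(flag: str) -> str:
    """Return a description for a one-character typeflag value."""
    if flag in TYPEFLAGS:
        return TYPEFLAGS[flag]
    if len(flag) == 1 and "A" <= flag <= "Z":
        return "custom"
    return "unknown"
```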

In USTAR format, the uname and gname fields contain the name of the owner and group of the file respectively.


PORTABILITY

tar archives are fully compatible between UNIX and Windows systems because all header information is represented in ASCII.

Because the NTFS file system has fewer features than UNIX file systems, much of the information stored in the archive is ignored on Windows systems.

The ASCII digit 7 is commonly used in the typeflag field to indicate contiguous files. The use of 2 to indicate a symbolic link is particular to some UNIX versions. These common extensions are mentioned in the POSIX (IEEE P1003.1) standard.


AVAILABILITY

PTC MKS Toolkit for System Administrators
PTC MKS Toolkit for Developers
PTC MKS Toolkit for Interoperability
PTC MKS Toolkit for Professional Developers
PTC MKS Toolkit for Professional Developers 64-Bit Edition
PTC MKS Toolkit for Enterprise Developers
PTC MKS Toolkit for Enterprise Developers 64-Bit Edition


SEE ALSO

Commands:
cpio, pax, tar

File Formats:
cpio


PTC MKS Toolkit 10.4 Documentation Build 39.