C++ Class: InputFile

From Genome Analysis Wiki
Revision as of 17:50, 29 September 2010 by Mktrost (talk | contribs)

InputFile / IFILE

This is our class for file operations. It hides the underlying file format from the user, so code can open and operate on a file through a single interface without knowing whether the file is standard (uncompressed), gzip, or bgzf. This automatic detection applies only when reading from a file; when reading from stdin or when writing, the user has to specify which type to open.
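One way a reader can tell these three formats apart is by inspecting the first bytes of the file. The sketch below is only an illustration under that assumption and is not taken from the InputFile source: `SniffedType` and `sniffCompression` are hypothetical names. It relies on the gzip magic bytes (0x1f 0x8b) and on the 'B','C' extra subfield that BGZF blocks carry in their gzip header.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for InputFile::ifileCompression; not the library's enum.
enum class SniffedType { Uncompressed, Gzip, Bgzf };

// Classifies a file from its first bytes (hypothetical helper).
SniffedType sniffCompression(const unsigned char* header, std::size_t size)
{
    assert(header != nullptr || size == 0);

    // Every gzip member starts with the two magic bytes 0x1f 0x8b.
    if (size < 2 || header[0] != 0x1f || header[1] != 0x8b)
        return SniffedType::Uncompressed;

    // BGZF is gzip with an extra field (FLG bit 0x04) whose first subfield
    // is tagged 'B','C'. After the 10-byte gzip header and the 2-byte XLEN,
    // that tag sits at offsets 12 and 13.
    const bool hasExtraField = size >= 14 && (header[3] & 0x04) != 0;
    if (hasExtraField && header[12] == 'B' && header[13] == 'C')
        return SniffedType::Bgzf;

    return SniffedType::Gzip;
}
```

A caller would read the first 14 or more bytes of the file and pass them in. Such a check cannot work for a stream that cannot be re-read, which is consistent with the requirement that the type be given explicitly for stdin.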

The IFILE class is made to mimic FILE. The typical way to use IFILE is to call the set of global methods contained in InputFile.h that take an IFILE as a parameter.

To use IFILE, add the following include to your file.

#include "InputFile.h"

Global IFILE Methods

Method Name
    Description

IFILE ifopen(const char * filename, const char * mode, InputFile::ifileCompression compressionMode = InputFile::DEFAULT)
    Open the file with the specified name with the specified mode.
    A filename = "-" indicates stdin/stdout.
    When reading a file (not from stdin), the file compression type is determined by reading the file.
    When reading from stdin and writing anywhere the file compression type is determined by the passed in parameter.

int ifclose(IFILE file)
    Close the file.

unsigned int ifread(IFILE file, void * buffer, unsigned int size)
    Reads up to size bytes from the file into the buffer. Returns the number of bytes read.

int ifgetc(IFILE file)
    Reads and returns 1 character from the file. Returns EOF on error/end of file.

void ifrewind(IFILE file)
    Go to the beginning of the file (cannot be done for stdin/stdout).

int ifeof(IFILE file)
    Return 0 if it is not the end of the file, otherwise returns non-zero.

unsigned int ifwrite(IFILE file, const void * buffer, unsigned int size)
    Write the size bytes from buffer into the file.

long int iftell(IFILE file)
    Tell the current position in the file. Can be fed back into ifseek.

bool ifseek(IFILE file, long int offset, int origin)
    Go to the specified position (result from an iftell) in the file (cannot be done for stdin/stdout).
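Since IFILE is made to mimic FILE, the tell/seek round trip described above can be sketched with the plain `<cstdio>` calls that the IFILE methods are modelled on. This is an analogy only, not IFILE code: `readTwiceFromMark` is a hypothetical helper, and the comments note which IFILE method each stdio call corresponds to in the list above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: writes `contents` to a temporary file, skips `skip`
// bytes, then reads up to `count` bytes twice from the same remembered position.
std::string readTwiceFromMark(const std::string& contents,
                              std::size_t skip, std::size_t count)
{
    std::FILE* fp = std::tmpfile();                  // plays the role of ifopen
    if (fp == nullptr)
        return std::string();

    std::string skipped(skip, '\0');
    std::string first(count, '\0');
    std::string second(count, '\0');

    bool ok = std::fwrite(contents.data(), 1, contents.size(), fp) == contents.size();
    std::rewind(fp);                                 // like ifrewind
    ok = ok && std::fread(&skipped[0], 1, skip, fp) == skip;   // like ifread

    const long mark = std::ftell(fp);                // like iftell
    first.resize(std::fread(&first[0], 1, count, fp));
    ok = ok && std::fseek(fp, mark, SEEK_SET) == 0;  // like ifseek with the saved position
    second.resize(std::fread(&second[0], 1, count, fp));

    std::fclose(fp);                                 // like ifclose
    assert(!ok || first == second);                  // same position, same bytes
    return ok ? second : std::string();
}
```

For example, `readTwiceFromMark("0123456789", 3, 4)` skips three bytes, remembers the position, and returns the four bytes read again after seeking back to it. Note that `fseek` reports success as 0, whereas ifseek is listed above as returning a bool.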

Class Enums

Enum: ifileCompression

Value
    Description

DEFAULT
    Use the default method for determining file type.
    Opens as UNCOMPRESSED unless the filename ends in ".gz", then opened as GZIP.

UNCOMPRESSED
    Standard, Uncompressed File Type.

GZIP
    Gzip File Type.

BGZF
    bgzf File Type.
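The DEFAULT rule described above (UNCOMPRESSED unless the filename ends in ".gz") amounts to a suffix test. The following is a minimal sketch of that rule as stated, not the library's implementation; `CompressionChoice` and `defaultChoiceFor` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for InputFile::ifileCompression (not the library's enum).
// Bgzf is listed for completeness; the DEFAULT rule as described never selects it.
enum class CompressionChoice { Uncompressed, Gzip, Bgzf };

// Applies the filename rule described for DEFAULT.
CompressionChoice defaultChoiceFor(const std::string& filename)
{
    assert(!filename.empty());  // a filename is required to apply the rule

    const std::string suffix = ".gz";
    const bool endsInGz =
        filename.size() >= suffix.size() &&
        filename.compare(filename.size() - suffix.size(), suffix.size(), suffix) == 0;

    return endsInGz ? CompressionChoice::Gzip : CompressionChoice::Uncompressed;
}
```

Under this rule a name such as "reads.fastq.gz" maps to gzip while "sample.bam" maps to uncompressed, so a BGZF file would have to be requested explicitly wherever the filename rule applies.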

BGZF Notes

Newer BGZF files have an empty BGZF block at the end to mark the EOF. By default, when opening BGZF files for reading, the software requires the empty block, and the open fails if it is not there. To support files without the empty block, the following call must first be made:


With that call, the empty block is not checked for when opening the file.
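To make the empty-block check concrete, here is a minimal sketch of what such a test could look like on bytes already in memory. It assumes the 28-byte empty block given as the EOF marker in the SAM/BAM specification. The function names and the `requireEofBlock` parameter are hypothetical and are not the library's API; a real reader would inspect only the last 28 bytes of the file rather than hold the whole file in memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// The 28-byte empty BGZF block used as the EOF marker (SAM/BAM specification).
static const unsigned char kBgzfEofBlock[28] = {
    0x1f, 0x8b, 0x08, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0x06, 0x00,
    0x42, 0x43, 0x02, 0x00, 0x1b, 0x00, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00
};

// Returns true when the data ends with the empty EOF block (hypothetical helper).
bool endsWithEofBlock(const unsigned char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    const std::size_t n = sizeof(kBgzfEofBlock);
    return size >= n &&
           std::equal(kBgzfEofBlock, kBgzfEofBlock + n, data + (size - n));
}

// Hypothetical open-time check: the block is only required when asked for.
bool passesOpenCheck(const unsigned char* data, std::size_t size,
                     bool requireEofBlock)
{
    return !requireEofBlock || endsWithEofBlock(data, size);
}
```

With `requireEofBlock` set to false, the check accepts older files that lack the marker, which mirrors the behaviour described above for files without the empty block.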


This class can be used to read/write to stdin/stdout.

To use stdin/stdout, specify the filename as "-" and use the ifileCompression parameter in ifopen to specify the compression type.

NOTE: even when reading from stdin, you MUST specify the file type. It does not read from stdin to determine the compression type like it does for other files.

When writing programs that will use stdout and pipes to send input from one program to the next, make sure that all error, debug, or info messages are written to stderr/cerr and not stdout/cout.


 // Requires <iostream> and "InputFile.h".
 // Open stdin for reading ("-" means stdin/stdout).
 // Replace the "-" with a filename in order to read from a file.
 const char* filename = "-";
 IFILE myFilePtr = ifopen(filename, "rb", InputFile::BGZF);
 if (myFilePtr == NULL)
 {
     std::cerr << "Failed to open the file\n";
 }
 else
 {
     // File was successfully opened.
     // Read the magic string.
     char magic[4];
     if (ifread(myFilePtr, magic, 4) != 4)
     {
         std::cerr << "Could not read 4 bytes from the file\n";
     }
     // Close the file before discarding the pointer.
     ifclose(myFilePtr);
     myFilePtr = NULL;
 }