source: trunk/libdjvu/IFFByteStream.h @ 81

Last change on this file since 81 was 17, checked in by Eugene Romanenko, 16 years ago

update makefiles, remove absolute paths, update djvulibre to version 3.5.17

//C-  -*- C++ -*-
//C- -------------------------------------------------------------------
//C- DjVuLibre-3.5
//C- Copyright (c) 2002  Leon Bottou and Yann Le Cun.
//C- Copyright (c) 2001  AT&T
//C-
//C- This software is subject to, and may be distributed under, the
//C- GNU General Public License, Version 2. The license should have
//C- accompanied the software or you may obtain a copy of the license
//C- from the Free Software Foundation at http://www.fsf.org .
//C-
//C- This program is distributed in the hope that it will be useful,
//C- but WITHOUT ANY WARRANTY; without even the implied warranty of
//C- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
//C- GNU General Public License for more details.
//C-
//C- DjVuLibre-3.5 is derived from the DjVu(r) Reference Library
//C- distributed by Lizardtech Software.  On July 19th 2002, Lizardtech
//C- Software authorized us to replace the original DjVu(r) Reference
//C- Library notice by the following text (see doc/lizard2002.djvu):
//C-
//C-  ------------------------------------------------------------------
//C- | DjVu (r) Reference Library (v. 3.5)
//C- | Copyright (c) 1999-2001 LizardTech, Inc. All Rights Reserved.
//C- | The DjVu Reference Library is protected by U.S. Pat. No.
//C- | 6,058,214 and patents pending.
//C- |
//C- | This software is subject to, and may be distributed under, the
//C- | GNU General Public License, Version 2. The license should have
//C- | accompanied the software or you may obtain a copy of the license
//C- | from the Free Software Foundation at http://www.fsf.org .
//C- |
//C- | The computer code originally released by LizardTech under this
//C- | license and unmodified by other parties is deemed "the LIZARDTECH
//C- | ORIGINAL CODE."  Subject to any third party intellectual property
//C- | claims, LizardTech grants recipient a worldwide, royalty-free,
//C- | non-exclusive license to make, use, sell, or otherwise dispose of
//C- | the LIZARDTECH ORIGINAL CODE or of programs derived from the
//C- | LIZARDTECH ORIGINAL CODE in compliance with the terms of the GNU
//C- | General Public License.   This grant only confers the right to
//C- | infringe patent claims underlying the LIZARDTECH ORIGINAL CODE to
//C- | the extent such infringement is reasonably necessary to enable
//C- | recipient to make, have made, practice, sell, or otherwise dispose
//C- | of the LIZARDTECH ORIGINAL CODE (or portions thereof) and not to
//C- | any greater extent that may be necessary to utilize further
//C- | modifications or combinations.
//C- |
//C- | The LIZARDTECH ORIGINAL CODE is provided "AS IS" WITHOUT WARRANTY
//C- | OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
//C- | TO ANY WARRANTY OF NON-INFRINGEMENT, OR ANY IMPLIED WARRANTY OF
//C- | MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
//C- +------------------------------------------------------------------
//
// $Id: IFFByteStream.h,v 1.10 2003/11/07 22:08:21 leonb Exp $
// $Name:  $

#ifndef _IFFBYTESTREAM_H_
#define _IFFBYTESTREAM_H_
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#if NEED_GNUG_PRAGMAS
# pragma interface
#endif


/** @name IFFByteStream.h

    Files #"IFFByteStream.h"# and #"IFFByteStream.cpp"# implement a parser for
    files structured according to the Electronic Arts ``EA IFF 85 Interchange
    File Format''.  IFF files are composed of a sequence of data {\em chunks}.
    Each chunk is identified by a four character {\em chunk identifier}
    describing the type of the data stored in the chunk.  A few special chunk
    identifiers, for instance #"FORM"#, are reserved for {\em composite
    chunks} which themselves contain a sequence of data chunks.  This
    convention effectively gives IFF files a convenient hierarchical
    structure.  Composite chunks are further identified by a secondary chunk
    identifier.

    We found it convenient to define an {\em extended chunk identifier}.  In
    the case of a regular chunk, the extended chunk identifier is simply the
    chunk identifier, as in #"PM44"#. In the case of a composite chunk, the
    extended chunk identifier is composed by concatenating the main chunk
    identifier, a colon, and the secondary chunk identifier, as in
    #"FORM:DJVU"#.
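
    The naming rule above can be sketched as follows.  This is only an
    illustration: the helper #extended_id# is hypothetical and is not part of
    this library.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of DjVuLibre): builds the extended chunk
// identifier from a four-character main identifier and, for composite
// chunks, a four-character secondary identifier.
std::string extended_id(const std::string &primary,
                        const std::string &secondary = "")
{
  assert(primary.size() == 4);
  assert(secondary.empty() || secondary.size() == 4);
  if (secondary.empty())
    return primary;                  // regular chunk, e.g. "PM44"
  return primary + ":" + secondary;  // composite chunk, e.g. "FORM:DJVU"
}
```

    For instance, #extended_id("FORM", "DJVU")# yields #"FORM:DJVU"#, while
    #extended_id("PM44")# yields #"PM44"# unchanged.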

    Class \Ref{IFFByteStream} provides a way to read or write IFF structured
    files.  Member functions provide an easy means to position the underlying
    \Ref{ByteStream} at the beginning of each chunk and to read or write the
    data until reaching the end of the chunk.  The utility program
    \Ref{djvuinfo} demonstrates how to use class #IFFByteStream#.

    {\bf IFF Files and ZP-Coder} ---
    Class #IFFByteStream# repositions the underlying ByteStream whenever a new
    chunk is accessed.  It is possible to code chunk data with the ZP-Coder
    without worrying about the final file position. See class \Ref{ZPCodec}
    for more details.

    {\bf DjVu IFF Files} --- We had initially planned to exactly follow the
    IFF specifications.  Then we realized that certain versions of MSIE
    recognize any IFF file as an AIFF sound file and pop a message
    box "Cannot play that sound".  It appears that the structure of AIFF files
    is entirely modeled after the IFF standard, with small variations
    regarding the endianness of numbers and the padding rules.  We eliminate
    this problem by casting the octet protection spell.  Our IFF files always
    start with the four octets #0x41,0x54,0x26,0x54# followed by the fully
    conformant IFF byte stream.  Class #IFFByteStream# silently skips these
    four octets when it encounters them.
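
    The check for the four leading octets can be sketched as follows.  The
    helper #magic_length# is hypothetical; it only illustrates the test
    described above and is not the code used by class #IFFByteStream#.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The four octets 0x41,0x54,0x26,0x54 spell "AT&T" in ASCII.
static const unsigned char djvu_magic[4] = { 0x41, 0x54, 0x26, 0x54 };

// Hypothetical helper (not part of DjVuLibre): returns the number of
// bytes to skip, 4 or 0, before the conformant IFF byte stream begins.
std::size_t magic_length(const unsigned char *data, std::size_t size)
{
  assert(data != 0 || size == 0);
  if (size >= 4 && std::memcmp(data, djvu_magic, 4) == 0)
    return 4;
  return 0;
}
```

    A reader would advance by the returned count and then parse the first
    chunk header as usual.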

    {\bf References} --- EA IFF 85 Interchange File Format specification:\\
    \URL{http://www.cica.indiana.edu/graphics/image_specs/ilbm.format.txt} or
    \URL{http://www.tnt.uni-hannover.de/soft/compgraph/fileformats/docs/iff.pre}

    @memo
    IFF file parser.
    @author
    L\'eon Bottou <leonb@research.att.com>

// From: Leon Bottou, 1/31/2002
// This has been changed by Lizardtech to fit better
// with their re-implementation of ByteStreams.

    @version
    #$Id: IFFByteStream.h,v 1.10 2003/11/07 22:08:21 leonb Exp $# */
//@{


#include "DjVuGlobal.h"
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "GException.h"
#include "GString.h"
#include "ByteStream.h"


#ifdef HAVE_NAMESPACES
namespace DJVU {
# ifdef NOT_DEFINED // Just to fool emacs c++ mode
}
#endif
#endif

/** ByteStream interface for an IFF file.

    Class #IFFByteStream# augments the #ByteStream# interface with
    functions for navigating from chunk to chunk.  It works in relation
    with a ByteStream specified at construction time.

    {\bf Reading an IFF file} --- You can read an IFF file by constructing an
    #IFFByteStream# object attached to the ByteStream containing the IFF file.
    Calling function \Ref{get_chunk} positions the file pointer at the
    beginning of the first chunk.  You can then use \Ref{ByteStream::read} to
    access the chunk data.  Function #read# will return #0# if you attempt to
    read past the end of the chunk, just as if you were trying to read past
    the end of a file. You can at any time call function \Ref{close_chunk} to
    terminate reading data in this chunk.  The following chunks can be
    accessed by calling #get_chunk# and #close_chunk# repeatedly until you
    reach the end of the file.  Function #read# is not very useful when
    accessing a composite chunk.  You can instead make nested calls to
    functions #get_chunk# and #close_chunk# in order to access the chunks
    located inside the composite chunk.
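
    The chunk layout that such a reader walks through can be sketched
    without this library.  The sketch assumes the EA IFF 85 layout: a
    four-character identifier, a 32-bit big-endian data size, the data, and
    one pad byte when the size is odd.  The helper #list_chunks# is
    hypothetical, works on a memory buffer rather than a ByteStream, and
    descends only into #"FORM"# chunks for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Reads a 32-bit big-endian integer, as used for IFF chunk sizes.
static unsigned long read_be32(const unsigned char *p)
{
  return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16)
       | ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
}

// Hypothetical helper (not part of DjVuLibre): appends the extended
// identifiers of all chunks found in data[0..size) to ids, descending into
// "FORM" composite chunks.  Returns false when a header or its data is
// truncated.
bool list_chunks(const unsigned char *data, std::size_t size,
                 std::vector<std::string> &ids)
{
  assert(data != 0 || size == 0);
  std::size_t pos = 0;
  while (pos < size)
    {
      if (size - pos < 8)
        return false;                     // truncated chunk header
      std::string id((const char *)data + pos, 4);
      std::size_t len = read_be32(data + pos + 4);
      pos += 8;                           // now at the chunk data
      if (len > size - pos)
        return false;                     // truncated chunk data
      if (id == "FORM")
        {
          if (len < 4)
            return false;                 // no room for the secondary id
          ids.push_back(id + ":" + std::string((const char *)data + pos, 4));
          if (!list_chunks(data + pos + 4, len - 4, ids))
            return false;                 // nested chunks are malformed
        }
      else
        ids.push_back(id);
      pos += len + (len & 1);             // skip the data and the pad byte
    }
  return true;
}
```

    The recursion plays the role of the nested #get_chunk# and
    #close_chunk# calls described above.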

    {\bf Writing an IFF file} --- You can write an IFF file by constructing an
    #IFFByteStream# object attached to the seekable ByteStream object that
    will contain the IFF file.  Calling function \Ref{put_chunk} creates a
    first chunk header and positions the file pointer at the beginning of the
    chunk.  You can then use \Ref{ByteStream::write} to store the chunk data.
    Calling function \Ref{close_chunk} terminates the current chunk.  You can
    append more chunks by calling #put_chunk# and #close_chunk# repeatedly.
    Function #write# is not very useful for writing a composite chunk.  You
    can instead make nested calls to functions #put_chunk# and #close_chunk#
    in order to create chunks located inside the composite chunk.
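
    One plausible reason for requiring a seekable stream is that the size
    field of a chunk header precedes data whose length is only known once
    the chunk is closed.  The sketch below illustrates this with a
    hypothetical in-memory #ChunkWriter#; it is not the implementation in
    #"IFFByteStream.cpp"#, and unlike #put_chunk# it expects the caller to
    write the secondary identifier of a composite chunk explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical in-memory writer standing in for a seekable ByteStream.
struct ChunkWriter
{
  std::vector<unsigned char> out;    // bytes written so far
  std::vector<std::size_t> starts;   // data start offsets of open chunks

  // Writes the identifier followed by a zero placeholder for the size.
  void put_chunk(const std::string &id)
  {
    assert(id.size() == 4);
    out.insert(out.end(), id.begin(), id.end());
    out.insert(out.end(), std::size_t(4), (unsigned char)0);
    starts.push_back(out.size());
  }

  // Appends chunk data at the current position.
  void write(const void *buffer, std::size_t size)
  {
    const unsigned char *p = (const unsigned char *)buffer;
    out.insert(out.end(), p, p + size);
  }

  // Goes back to patch the big-endian size, then pads to an even length.
  void close_chunk()
  {
    assert(!starts.empty());
    std::size_t start = starts.back();
    starts.pop_back();
    std::size_t len = out.size() - start;
    for (int i = 0; i < 4; i++)
      out[start - 4 + i] = (unsigned char)(len >> (8 * (3 - i)));
    if (len & 1)
      out.push_back(0);
  }
};
```

    Patching #out[start - 4 + i]# corresponds to seeking back to the chunk
    header, which a non-seekable stream cannot do.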

    Writing an IFF file requires a seekable ByteStream (see
    \Ref{ByteStream::is_seekable}).  This is not much of a problem because you
    can always create the IFF file in a \Ref{MemoryByteStream} and then use
    \Ref{ByteStream::copy} to transfer the IFF file into a non-seekable
    ByteStream.  */

class IFFByteStream : protected ByteStream::Wrapper
{
protected: 
  IFFByteStream(const GP<ByteStream> &bs, const int pos);
public:
  /** Constructs an IFFByteStream object attached to ByteStream #bs#.
      Any ByteStream can be used when reading an IFF file.  Writing
      an IFF file however requires a seekable ByteStream. */
  static GP<IFFByteStream> create(const GP<ByteStream> &bs);
  // --- BYTESTREAM INTERFACE
  ~IFFByteStream();
  virtual size_t read(void *buffer, size_t size);
  virtual size_t write(const void *buffer, size_t size);
  virtual long tell(void) const;
  // -- NAVIGATING CHUNKS
  /** Enters a chunk for reading.  Function #get_chunk# returns zero when the
      last chunk has already been accessed.  Otherwise it parses a chunk
      header, positions the IFFByteStream at the beginning of the chunk data,
      stores the extended chunk identifier into string #chkid#, and returns
      the nonzero chunk size.  The file offset of the chunk data may be
      retrieved using function #tell#.  The chunk data can then be read using
      function #read# until reaching the end of the chunk.  Advanced users may
      supply two pointers to integer variables using arguments #rawoffsetptr#
      and #rawsizeptr#. These variables will be overwritten with the offset
      and the length of the file segment containing both the chunk header and
      the chunk data. */
  int get_chunk(GUTF8String &chkid, int *rawoffsetptr=0, int *rawsizeptr=0);
  /** Enters a chunk for writing.  Function #put_chunk# prepares a chunk
      header and positions the IFFByteStream at the beginning of the chunk
      data.  Argument #chkid# defines an extended chunk identifier for this
      chunk.  The chunk data can then be written using function #write#.  The
      chunk is terminated by a matching call to function #close_chunk#.  When
      #insertmagic# is nonzero, function #put_chunk# inserts the bytes
      0x41, 0x54, 0x26, 0x54 before the chunk header, as discussed in
      \Ref{IFFByteStream.h}. */
  void put_chunk(const char *chkid, int insertmagic=0);
  /** Leaves the current chunk.  This function leaves the chunk previously
      entered by a matching call to #get_chunk# or #put_chunk#.  The
      IFFByteStream is then ready to process the next chunk at the same
      hierarchical level. */
  void close_chunk();
  /** This is identical to the above, plus it adds a seek to the start of
      the next chunk.  This way we catch EOF errors with the current chunk.*/
  void seek_close_chunk();
  /** Returns true when it is legal to call #read# or #write#. */
  int ready();
  /** Returns true when the current chunk is a composite chunk. */
  int composite();
  /** Returns the extended chunk identifier of the current chunk.  String
      #chkid# is overwritten with the {\em extended chunk identifier} of the
      current chunk.  The extended chunk identifier of a regular chunk is
      simply the chunk identifier, as in #"PM44"#.  The extended chunk
      identifier of a composite chunk is the concatenation of the chunk
      identifier, a colon #":"#, and the secondary chunk identifier,
      as in #"FORM:DJVU"#. */
  void short_id(GUTF8String &chkid);
  /** Returns the qualified chunk identifier of the current chunk.  String
      #chkid# is overwritten with the {\em qualified chunk identifier} of the
      current chunk.  The qualified chunk identifier of a composite chunk is
      equal to the extended chunk identifier.  The qualified chunk identifier
      of a regular chunk is composed by concatenating the secondary chunk
      identifier of the closest #"FORM"# or #"PROP"# composite chunk
      containing the current chunk, a dot #"."#, and the current chunk
      identifier, as in #"DJVU.INFO"#.  According to the EA IFF 85 identifier
      scoping rules, the qualified chunk identifier uniquely defines how the
      chunk data should be interpreted. */
  void full_id(GUTF8String &chkid);
  /** Checks a potential chunk identifier.  This function categorizes the
      chunk identifier formed by the first four characters of string #chkid#.
      It returns #0# if this is a legal identifier for a regular chunk.  It
      returns #+1# if this is a reserved composite chunk identifier.  It
      returns #-1# if this is an illegal or otherwise reserved identifier
      which should not be used.  */
  static int check_id(const char *id);
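  /* The three categories returned by check_id can be sketched as follows.
     This sketch assumes the identifier rules of EA IFF 85: printable ASCII
     characters only, group identifiers "FORM", "LIST", "PROP" and "CAT ",
     and "FOR1".."FOR9", "LIS1".."LIS9", "CAT1".."CAT9" reserved for future
     use.  The function classify_id is hypothetical; the rules applied by
     the actual check_id implementation may differ.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for check_id (not the DjVuLibre implementation).
// Returns 0 for a regular identifier, +1 for a composite identifier,
// and -1 for an illegal or reserved identifier.
int classify_id(const char *id)
{
  assert(id != 0);
  // All four characters must be printable ASCII.  A null byte fails this
  // test, so the loop never reads past the end of a short string.
  for (int i = 0; i < 4; i++)
    if (id[i] < 0x20 || id[i] > 0x7e)
      return -1;
  // Group identifiers used for composite chunks.
  static const char *composite[] = { "FORM", "LIST", "PROP", "CAT " };
  for (int i = 0; i < 4; i++)
    if (std::memcmp(id, composite[i], 4) == 0)
      return 1;
  // Identifiers reserved for future versions of the format.
  static const char *reserved[] = { "FOR", "LIS", "CAT" };
  for (int i = 0; i < 3; i++)
    if (std::memcmp(id, reserved[i], 3) == 0 && id[3] >= '1' && id[3] <= '9')
      return -1;
  return 0;
}
```

     Under these assumptions "PM44" is a regular identifier, "FORM" is a
     composite identifier, and "FOR1" is rejected as reserved.
  */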
  GP<ByteStream> get_bytestream(void) {return this;}
  /** Copy data from another ByteStream.  A maximum of #size# bytes are read
      from the ByteStream #bsfrom# and are written to the ByteStream #*this#
      at the current position.  Less than #size# bytes may be written if an
      end-of-file mark is reached on #bsfrom#.  This function returns the
      total number of bytes copied.  Setting argument #size# to zero (the
      default value) has a special meaning: the copying process will continue
      until reaching the end-of-file mark on ByteStream #bsfrom#, regardless
      of the number of bytes transferred.  */
  size_t copy(ByteStream &bsfrom, size_t size=0)
  { return get_bytestream()->copy(bsfrom,size); }
  /** Flushes all buffers in the ByteStream.  Calling this function
      guarantees that pending data have been actually written (i.e. passed to
      the operating system). Class #ByteStream# provides a default
      implementation which does nothing. */
  virtual void flush(void)
  { ByteStream::Wrapper::flush(); }
  /** This is a simple compare method.  The IFFByteStream may be read for
      the sake of the comparison.  Since IFFByteStreams are non-seekable,
      the stream is not valid for use after comparing, regardless of the
      result. */
  bool compare(IFFByteStream &iff);
  /** #has_magic# is true if the stream has the DjVu file magic.
   */
  bool has_magic;
private:
  // private datatype
  struct IFFContext
  {
    IFFContext *next;
    long offStart;
    long offEnd;
    char idOne[4];
    char idTwo[4];
    char bComposite;
  };
  // Implementation
  IFFContext *ctx;
  long offset;
  long seekto;
  int dir;
  // Cancel C++ default stuff
  IFFByteStream(const IFFByteStream &);
  IFFByteStream & operator=(const IFFByteStream &);
  static GP<IFFByteStream> create(ByteStream *bs);
};

//@}



#ifdef HAVE_NAMESPACES
}
# ifndef NOT_USING_DJVU_NAMESPACE
using namespace DJVU;
# endif
#endif
#endif