source: trunk/libdjvu/IFFByteStream.h @ 209

Last change on this file since 209 was 206, checked in by Eugene Romanenko, 14 years ago

DJVU plugin: djvulibre updated to version 3.5.19

File size: 15.0 KB
//C-  -*- C++ -*-
//C- -------------------------------------------------------------------
//C- DjVuLibre-3.5
//C- Copyright (c) 2002  Leon Bottou and Yann Le Cun.
//C- Copyright (c) 2001  AT&T
//C-
//C- This software is subject to, and may be distributed under, the
//C- GNU General Public License, either Version 2 of the license,
//C- or (at your option) any later version. The license should have
//C- accompanied the software or you may obtain a copy of the license
//C- from the Free Software Foundation at http://www.fsf.org .
//C-
//C- This program is distributed in the hope that it will be useful,
//C- but WITHOUT ANY WARRANTY; without even the implied warranty of
//C- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
//C- GNU General Public License for more details.
//C-
//C- DjVuLibre-3.5 is derived from the DjVu(r) Reference Library from
//C- Lizardtech Software.  Lizardtech Software has authorized us to
//C- replace the original DjVu(r) Reference Library notice by the following
//C- text (see doc/lizard2002.djvu and doc/lizardtech2007.djvu):
//C-
//C-  ------------------------------------------------------------------
//C- | DjVu (r) Reference Library (v. 3.5)
//C- | Copyright (c) 1999-2001 LizardTech, Inc. All Rights Reserved.
//C- | The DjVu Reference Library is protected by U.S. Pat. No.
//C- | 6,058,214 and patents pending.
//C- |
//C- | This software is subject to, and may be distributed under, the
//C- | GNU General Public License, either Version 2 of the license,
//C- | or (at your option) any later version. The license should have
//C- | accompanied the software or you may obtain a copy of the license
//C- | from the Free Software Foundation at http://www.fsf.org .
//C- |
//C- | The computer code originally released by LizardTech under this
//C- | license and unmodified by other parties is deemed "the LIZARDTECH
//C- | ORIGINAL CODE."  Subject to any third party intellectual property
//C- | claims, LizardTech grants recipient a worldwide, royalty-free,
//C- | non-exclusive license to make, use, sell, or otherwise dispose of
//C- | the LIZARDTECH ORIGINAL CODE or of programs derived from the
//C- | LIZARDTECH ORIGINAL CODE in compliance with the terms of the GNU
//C- | General Public License.   This grant only confers the right to
//C- | infringe patent claims underlying the LIZARDTECH ORIGINAL CODE to
//C- | the extent such infringement is reasonably necessary to enable
//C- | recipient to make, have made, practice, sell, or otherwise dispose
//C- | of the LIZARDTECH ORIGINAL CODE (or portions thereof) and not to
//C- | any greater extent that may be necessary to utilize further
//C- | modifications or combinations.
//C- |
//C- | The LIZARDTECH ORIGINAL CODE is provided "AS IS" WITHOUT WARRANTY
//C- | OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED
//C- | TO ANY WARRANTY OF NON-INFRINGEMENT, OR ANY IMPLIED WARRANTY OF
//C- | MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
//C- +------------------------------------------------------------------
//
// $Id: IFFByteStream.h,v 1.12 2007/03/25 20:48:32 leonb Exp $
// $Name: release_3_5_19 $

#ifndef _IFFBYTESTREAM_H_
#define _IFFBYTESTREAM_H_
#ifdef HAVE_CONFIG_H
#include "config.h"
#endif
#if NEED_GNUG_PRAGMAS
# pragma interface
#endif


/** @name IFFByteStream.h

    Files #"IFFByteStream.h"# and #"IFFByteStream.cpp"# implement a parser for
    files structured according to the Electronic Arts ``EA IFF 85 Interchange
    File Format''.  IFF files are composed of a sequence of data {\em chunks}.
    Each chunk is identified by a four character {\em chunk identifier}
    describing the type of the data stored in the chunk.  A few special chunk
    identifiers, for instance #"FORM"#, are reserved for {\em composite
    chunks} which themselves contain a sequence of data chunks.  This
    convention effectively provides IFF files with a convenient hierarchical
    structure.  Composite chunks are further identified by a secondary chunk
    identifier.

    We found it convenient to define an {\em extended chunk identifier}.  In
    the case of a regular chunk, the extended chunk identifier is simply the
    chunk identifier, as in #"PM44"#. In the case of a composite chunk, the
    extended chunk identifier is composed by concatenating the main chunk
    identifier, a colon, and the secondary chunk identifier, as in
    #"FORM:DJVU"#.
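
    As an illustration only, the naming rule above can be sketched in a few
    lines of standalone C++.  The helper #extended_id# is hypothetical: it is
    not declared in this file and merely restates the rule.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not part of this library: builds an extended chunk
// identifier from a primary identifier and an optional secondary one.
std::string extended_id(const std::string &primary,
                        const std::string &secondary = "")
{
  // IFF chunk identifiers are always four characters long.
  assert(primary.size() == 4);
  assert(secondary.empty() || secondary.size() == 4);
  // Regular chunk: the extended identifier is the chunk identifier itself.
  if (secondary.empty())
    return primary;
  // Composite chunk: primary identifier, a colon, secondary identifier.
  return primary + ":" + secondary;
}
```

    Under these assumptions, a regular chunk keeps its four character name
    and a composite chunk gets a nine character name.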

    Class \Ref{IFFByteStream} provides a way to read or write IFF structured
    files.  Member functions provide an easy means to position the underlying
    \Ref{ByteStream} at the beginning of each chunk and to read or write the
    data until reaching the end of the chunk.  The utility program
    \Ref{djvuinfo} demonstrates how to use class #IFFByteStream#.

    {\bf IFF Files and ZP-Coder} ---
    Class #IFFByteStream# repositions the underlying ByteStream whenever a new
    chunk is accessed.  It is possible to code chunk data with the ZP-Coder
    without worrying about the final file position. See class \Ref{ZPCodec}
    for more details.

    {\bf DjVu IFF Files} --- We had initially planned to exactly follow the
    IFF specifications.  Then we realized that certain versions of MSIE
    recognize any IFF file as a Microsoft AIFF sound file and pop a message
    box "Cannot play that sound".  It appears that the structure of AIFF files
    is entirely modeled after the IFF standard, with small variations
    regarding the endianness of numbers and the padding rules.  We eliminate
    this problem by casting the octet protection spell.  Our IFF files always
    start with the four octets #0x41,0x54,0x26,0x54# followed by the fully
    conformant IFF byte stream.  Class #IFFByteStream# silently skips these
    four octets when it encounters them.
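
    A minimal sketch of the skipping step, assuming the file contents are
    already in memory.  The helper #iff_start_offset# is hypothetical and is
    not how class #IFFByteStream# is actually implemented; it only shows the
    test on the four octets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of this library: returns the offset at
// which the conformant IFF byte stream starts inside a memory buffer.
std::size_t iff_start_offset(const unsigned char *data, std::size_t size)
{
  assert(data != 0 || size == 0);
  // The four protection octets 0x41,0x54,0x26,0x54 spell "AT&T" in ASCII.
  static const unsigned char magic[4] = { 0x41, 0x54, 0x26, 0x54 };
  if (size >= 4 && std::memcmp(data, magic, 4) == 0)
    return 4;   // skip the protection octets
  return 0;     // plain IFF data starts immediately
}
```

    A reader built this way accepts both protected files and plain IFF data.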

    {\bf References} --- EA IFF 85 Interchange File Format specification:\\
    \URL{http://www.cica.indiana.edu/graphics/image_specs/ilbm.format.txt} or
    \URL{http://www.tnt.uni-hannover.de/soft/compgraph/fileformats/docs/iff.pre}

    @memo
    IFF file parser.
    @author
    L\'eon Bottou <leonb@research.att.com>

// From: Leon Bottou, 1/31/2002
// This has been changed by Lizardtech to fit better
// with their re-implementation of ByteStreams.

    @version
    #$Id: IFFByteStream.h,v 1.12 2007/03/25 20:48:32 leonb Exp $# */
//@{


#include "DjVuGlobal.h"
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include "GException.h"
#include "GString.h"
#include "ByteStream.h"


#ifdef HAVE_NAMESPACES
namespace DJVU {
# ifdef NOT_DEFINED // Just to fool emacs c++ mode
}
#endif
#endif

/** ByteStream interface for an IFF file.

    Class #IFFByteStream# augments the #ByteStream# interface with
    functions for navigating from chunk to chunk.  It works in relation
    with a ByteStream specified at construction time.

    {\bf Reading an IFF file} --- You can read an IFF file by constructing an
    #IFFByteStream# object attached to the ByteStream containing the IFF file.
    Calling function \Ref{get_chunk} positions the file pointer at the
    beginning of the first chunk.  You can then use \Ref{ByteStream::read} to
    access the chunk data.  Function #read# will return #0# if you attempt to
    read past the end of the chunk, just as if you were trying to read past
    the end of a file. You can at any time call function \Ref{close_chunk} to
    terminate reading data in this chunk.  The following chunks can be
    accessed by calling #get_chunk# and #close_chunk# repeatedly until you
    reach the end of the file.  Function #read# is not very useful when
    accessing a composite chunk.  You can instead make nested calls to
    functions #get_chunk# and #close_chunk# in order to access the chunks
    located inside the composite chunk.
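
    The nested traversal described above can be sketched without this
    library.  The code below is an assumption-laden illustration: it walks a
    memory buffer laid out according to EA IFF 85 (four character identifier,
    32-bit big-endian size, data padded to an even length) and recurses into
    composite chunks.  The names #Bytes#, #read_be32# and #list_chunks# are
    hypothetical; each recursion plays the role of a nested #get_chunk# and
    #close_chunk# pair.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef std::vector<unsigned char> Bytes;

// Reads a 32-bit big-endian integer, as EA IFF 85 stores chunk sizes.
static unsigned long read_be32(const Bytes &b, std::size_t pos)
{
  return ((unsigned long)b[pos] << 24) | ((unsigned long)b[pos + 1] << 16) |
         ((unsigned long)b[pos + 2] << 8) | (unsigned long)b[pos + 3];
}

// Hypothetical walker, not the library code: visits the chunks stored in
// b[pos, end) and appends one indented extended identifier per chunk.
void list_chunks(const Bytes &b, std::size_t pos, std::size_t end,
                 int depth, std::vector<std::string> &out)
{
  assert(end <= b.size());
  while (pos + 8 <= end)
    {
      std::string id(b.begin() + pos, b.begin() + pos + 4);
      std::size_t size = read_be32(b, pos + 4);
      std::size_t data = pos + 8;
      if (size > end - data)
        break;  // truncated chunk: stop instead of reading past the end
      bool composite = (id == "FORM" || id == "LIST" ||
                        id == "PROP" || id == "CAT ");
      std::string line((std::size_t)(2 * depth), ' ');
      if (composite && size >= 4)
        {
          // The data of a composite chunk starts with the secondary id.
          line += id + ":" + std::string(b.begin() + data,
                                         b.begin() + data + 4);
          out.push_back(line);
          list_chunks(b, data + 4, data + size, depth + 1, out);
        }
      else
        out.push_back(line + id);
      pos = data + size + (size & 1);  // chunks are padded to even sizes
    }
}
```

    Calling #list_chunks(b, 0, b.size(), 0, out)# on a buffer holding a
    #"FORM:DJVU"# chunk fills #out# with the composite chunk followed by its
    children, indented by two spaces per nesting level.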

    {\bf Writing an IFF file} --- You can write an IFF file by constructing an
    #IFFByteStream# object attached to the seekable ByteStream object that
    will contain the IFF file.  Calling function \Ref{put_chunk} creates a
    first chunk header and positions the file pointer at the beginning of the
    chunk.  You can then use \Ref{ByteStream::write} to store the chunk data.
    Calling function \Ref{close_chunk} terminates the current chunk.  You can
    append more chunks by calling #put_chunk# and #close_chunk# repeatedly.
    Function #write# is not very useful for writing a composite chunk.  You
    can instead make nested calls to functions #put_chunk# and #close_chunk#
    in order to create chunks located inside the composite chunk.
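
    The following standalone sketch shows one plausible way such a writer can
    work, and why it needs random access to its output: the size of a chunk
    is only known when the chunk is closed, so the size field written earlier
    must be patched afterwards.  Structure #ChunkWriter# and its members are
    hypothetical.  They mimic the calling sequence described above on a
    memory buffer and are not the implementation found in
    #"IFFByteStream.cpp"#.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical in-memory writer, not the library code.
struct ChunkWriter
{
  std::vector<unsigned char> bytes;    // the IFF data produced so far
  std::vector<std::size_t> open;       // offsets of pending size fields

  // Appends n raw bytes to the chunk being written.
  void write(const void *p, std::size_t n)
  {
    const unsigned char *c = static_cast<const unsigned char *>(p);
    bytes.insert(bytes.end(), c, c + n);
  }
  // Starts a chunk.  id is "XXXX" or, for a composite chunk, "XXXX:YYYY".
  void put_chunk(const char *id)
  {
    std::size_t len = std::strlen(id);
    assert(len == 4 || (len == 9 && id[4] == ':'));
    write(id, 4);
    open.push_back(bytes.size());      // remember where the size goes
    write("\0\0\0\0", 4);              // placeholder, patched on close
    if (len == 9)
      write(id + 5, 4);                // secondary identifier
  }
  // Ends the innermost chunk: patches its size and pads to an even length.
  void close_chunk()
  {
    assert(!open.empty());
    std::size_t at = open.back();
    open.pop_back();
    std::size_t size = bytes.size() - (at + 4);
    for (int i = 0; i < 4; i++)        // big-endian 32-bit size
      bytes[at + i] = (unsigned char)((size >> (8 * (3 - i))) & 0xff);
    if (size & 1)
      bytes.push_back(0);              // pad byte is not counted in size
  }
};
```

    Calls to #put_chunk# and #close_chunk# nest exactly as in the text: the
    inner chunk is closed first, and its padded length then counts towards
    the size of the enclosing composite chunk.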

    Writing an IFF file requires a seekable ByteStream (see
    \Ref{ByteStream::is_seekable}).  This is not much of a problem because you
    can always create the IFF file into a \Ref{MemoryByteStream} and then use
    \Ref{ByteStream::copy} to transfer the IFF file into a non-seekable
    ByteStream.  */

class IFFByteStream : protected ByteStream::Wrapper
{
protected:
  IFFByteStream(const GP<ByteStream> &bs, const int pos);
public:
  /** Constructs an IFFByteStream object attached to ByteStream #bs#.
      Any ByteStream can be used when reading an IFF file.  Writing
      an IFF file however requires a seekable ByteStream. */
  static GP<IFFByteStream> create(const GP<ByteStream> &bs);
  // --- BYTESTREAM INTERFACE
  ~IFFByteStream();
  virtual size_t read(void *buffer, size_t size);
  virtual size_t write(const void *buffer, size_t size);
  virtual long tell(void) const;
  // -- NAVIGATING CHUNKS
  /** Enters a chunk for reading.  Function #get_chunk# returns zero when the
      last chunk has already been accessed.  Otherwise it parses a chunk
      header, positions the IFFByteStream at the beginning of the chunk data,
      stores the extended chunk identifier into string #chkid#, and returns
      the nonzero chunk size.  The file offset of the chunk data may be
      retrieved using function #tell#.  The chunk data can then be read using
      function #read# until reaching the end of the chunk.  Advanced users may
      supply two pointers to integer variables using arguments #rawoffsetptr#
      and #rawsizeptr#. These variables will be overwritten with the offset
      and the length of the file segment containing both the chunk header and
      the chunk data. */
  int get_chunk(GUTF8String &chkid, int *rawoffsetptr=0, int *rawsizeptr=0);
  /** Enters a chunk for writing.  Function #put_chunk# prepares a chunk
      header and positions the IFFByteStream at the beginning of the chunk
      data.  Argument #chkid# defines an extended chunk identifier for this
      chunk.  The chunk data can then be written using function #write#.  The
      chunk is terminated by a matching call to function #close_chunk#.  When
      #insertmagic# is nonzero, function #put_chunk# inserts the bytes
      0x41, 0x54, 0x26, 0x54 before the chunk header, as discussed in
      \Ref{IFFByteStream.h}. */
  void put_chunk(const char *chkid, int insertmagic=0);
  /** Leaves the current chunk.  This function leaves the chunk previously
      entered by a matching call to #get_chunk# or #put_chunk#.  The
      IFFByteStream is then ready to process the next chunk at the same
      hierarchical level. */
  void close_chunk();
  /** This is identical to the above, plus it adds a seek to the start of
      the next chunk.  This way we catch EOF errors with the current chunk.*/
  void seek_close_chunk();
  /** Returns true when it is legal to call #read# or #write#. */
  int ready();
  /** Returns true when the current chunk is a composite chunk. */
  int composite();
  /** Returns the extended chunk identifier of the current chunk.  String
      #chkid# is overwritten with the {\em extended chunk identifier} of the
      current chunk.  The extended chunk identifier of a regular chunk is
      simply the chunk identifier, as in #"PM44"#.  The extended chunk
      identifier of a composite chunk is the concatenation of the chunk
      identifier, of a colon #":"#, and of the secondary chunk identifier,
      as in #"FORM:DJVU"#. */
  void short_id(GUTF8String &chkid);
  /** Returns the qualified chunk identifier of the current chunk.  String
      #chkid# is overwritten with the {\em qualified chunk identifier} of the
      current chunk.  The qualified chunk identifier of a composite chunk is
      equal to the extended chunk identifier.  The qualified chunk identifier
      of a regular chunk is composed by concatenating the secondary chunk
      identifier of the closest #"FORM"# or #"PROP"# composite chunk
      containing the current chunk, a dot #"."#, and the current chunk
      identifier, as in #"DJVU.INFO"#.  According to the EA IFF 85 identifier
      scoping rules, the qualified chunk identifier uniquely defines how the
      chunk data should be interpreted. */
  void full_id(GUTF8String &chkid);
  /** Checks a potential chunk identifier.  This function categorizes the
      chunk identifier formed by the first four characters of string #chkid#.
      It returns #0# if this is a legal identifier for a regular chunk.  It
      returns #+1# if this is a reserved composite chunk identifier.  It
      returns #-1# if this is an illegal or otherwise reserved identifier
      which should not be used.  */
  static int check_id(const char *id);
  GP<ByteStream> get_bytestream(void) {return this;}
  /** Copy data from another ByteStream.  A maximum of #size# bytes are read
      from the ByteStream #bsfrom# and are written to the ByteStream #*this#
      at the current position.  Less than #size# bytes may be written if an
      end-of-file mark is reached on #bsfrom#.  This function returns the
      total number of bytes copied.  Setting argument #size# to zero (the
      default value) has a special meaning: the copying process will continue
      until reaching the end-of-file mark on ByteStream #bsfrom#, regardless
      of the number of bytes transferred.  */
  size_t copy(ByteStream &bsfrom, size_t size=0)
  { return get_bytestream()->copy(bsfrom,size); }
  /** Flushes all buffers in the ByteStream.  Calling this function
      guarantees that pending data have been actually written (i.e. passed to
      the operating system). Class #ByteStream# provides a default
      implementation which does nothing. */
  virtual void flush(void)
  { ByteStream::Wrapper::flush(); }
  /** This is a simple compare method.  The IFFByteStream may be read for
      the sake of the comparison.  Since IFFByteStreams are non-seekable,
      the stream is not valid for use after comparing, regardless of the
      result. */
  bool compare(IFFByteStream &iff);
  /** #has_magic_att# is true if the stream has
      the DjVu magic 'AT&T' marker. */
  bool has_magic_att;
  /** #has_magic_sdjv# is true if the stream has
      the Celartem magic 'SDJV' marker. */
  bool has_magic_sdjv;
private:
  // private datatype
  struct IFFContext
  {
    IFFContext *next;
    long offStart;
    long offEnd;
    char idOne[4];
    char idTwo[4];
    char bComposite;
  };
  // Implementation
  IFFContext *ctx;
  long offset;
  long seekto;
  int dir;
  // Cancel C++ default stuff
  IFFByteStream(const IFFByteStream &);
  IFFByteStream & operator=(const IFFByteStream &);
  static GP<IFFByteStream> create(ByteStream *bs);
};

//@}



#ifdef HAVE_NAMESPACES
}
# ifndef NOT_USING_DJVU_NAMESPACE
using namespace DJVU;
# endif
#endif
#endif