Jari Oksanen jarioksa at sun3.oulu.fi
Wed Nov 27 09:12:19 CET 2002

ligges at statistik.uni-dortmund.de said:
> Matej Cepl wrote:
> Carlos Ortega wrote:
> > Well you can produce JPEG, PNG or BMP images directly from
> > R and later you can transform the image to the desired type of
> > file format with another second program (one that provide the
> > feature you need is, ImageMagick).
> On slightly related theme -- is there anyway how to make these
> EPS pictures less bulky? I am writing a paper with couple of
> pairs() and they are all 600k, 800k, etc. And it is really heck
> of waiting to work with them (moreover, currently it is three
> pages, but PDF file is already 1MB). However, I would prefer to
> use EPS before PNG/BMP (which would be probably smaller and
> faster) because of device independency.
> Any thoughts on that?
> These are vector format and each plotted data point, each line etc. is
> represented in the file, so size can be modelled as lm(size ~
> number.of.elements) (yes, including an intercept). ;-) 

Years ago I struggled with PostScript file sizes when I wrote a program
producing PS graphics with the DISLIN library (which is an excellent graphics
library, by the way). My initial files were too big for my PC and, in
particular, for the tiny memory of the printers of those days. There are indeed
some tricks you can use, but unfortunately for you, Matej, most of them are
already done in R, so you cannot improve much. However, you can influence `size'
in lm(size ~ number.of.elements) by cutting the number of elements each symbol
needs (you cannot change the intercept). I guess that in pairs() plots the bulk
of the file size comes from the plotting symbols for the observations. Circles
use a larger number of line segments than, say, triangles (fortunately they are
not real circles, which would need an infinite number of line segments each in
vector graphics). Depending on the way the symbols are drawn, a triangle may be
the best choice, perhaps better than a cross or an x. A point may be better
still, if it really is a point and not a crumbled vector. Depending on the
implementation, symbols taken from recognized PostScript fonts may be smaller
still. This, of course, requires that the driver uses PostScript fonts instead
of drawing the symbols as vector (or pixel) graphics.
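
One way to check this guess on your own system is to measure it. The sketch
below is an illustration, not a tested recipe for pairs(): it writes the same
random scatterplot to a temporary EPS file with different plotting symbols and
numbers of points, then fits lm(size ~ number.of.elements) for each symbol. The
helper name eps_size and the choice of symbols and sample sizes are my own
assumptions; the actual byte counts depend on the R version and on how the
postscript() device draws each symbol.

```r
# Write one scatterplot to a temporary EPS file and return its size in bytes.
# `eps_size` is a hypothetical helper, not part of R.
eps_size <- function(n, pch) {
  f <- tempfile(fileext = ".eps")
  on.exit(unlink(f))               # remove the file once the size is known
  set.seed(1)                      # same data for every symbol
  x <- rnorm(n)
  y <- rnorm(n)
  postscript(f, onefile = FALSE, horizontal = FALSE,
             paper = "special", width = 5, height = 5)
  plot(x, y, pch = pch)
  dev.off()
  file.info(f)$size
}

# Symbols to compare (an arbitrary selection).
symbols <- list(circle = 1, triangle = 2, plus = 3, cross = 4, dot = ".")
n_points <- c(100, 500, 1000, 2000)

for (s in names(symbols)) {
  d <- data.frame(number.of.elements = n_points,
                  size = sapply(n_points, eps_size, pch = symbols[[s]]))
  fit <- lm(size ~ number.of.elements, data = d)
  # The slope estimates the bytes added per plotted point for this symbol.
  cat(sprintf("%-9s intercept %8.0f  bytes per point %6.1f\n",
              s, coef(fit)[1], coef(fit)[2]))
}
```

The symbol with the smallest slope is the cheapest one per observation; the
same pch value can then be passed on to pairs().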

In my old DISLIN graphics I could reduce the file sizes by 70% (!) from the
original 1M+ when I used PostScript fonts in labels instead of HP Plotter
glyphs, and used triangles instead of circles as plotting characters. The former
is already done in R, so you can influence only the plotting character. 

A further trick that I used was an "adaptive" curve plotting function, which
didn't represent the curve as a sequence of, say, 101 vectors, but started with
a sparse grid of points and checked whether the curve would change if I added a
point between two existing points. So I used a denser grid where the curve
changed rapidly (in effect, where it had high second derivatives) and a sparse
grid elsewhere. The result looked better (smoother) than with 101 points,
although I used on average only some 35 rationally positioned points. I guess
you can't easily have this added to your pairs() function, though, since that
would require hacking the R sources. How much this would help depends on the
number of segments originally used for curves, and the gain is probably much
smaller than what you get from the plotting symbols.
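
The idea can be sketched in a few lines of R. This is not my original DISLIN
code, only a minimal illustration under my own assumptions: the function name
adaptive_points and its arguments (n_start, tol, max_depth) are made up for the
example, and the test for "would the curve change" is simply how far the true
midpoint lies from the straight chord between two neighbouring points. Such a
midpoint test can be fooled by curves that happen to cross the chord exactly at
the midpoint, so treat it as a starting point only.

```r
# Adaptive sampling sketch: an interval is split only when the chord between
# its end points misses the curve at the midpoint by more than `tol`.
# `f` must be a vectorised function of one argument.
adaptive_points <- function(f, from, to, n_start = 9, tol = 1e-3,
                            max_depth = 10) {
  refine <- function(x0, x1, depth) {
    xm <- (x0 + x1) / 2
    # distance between the curve and the chord at the midpoint
    err <- abs(f(xm) - (f(x0) + f(x1)) / 2)
    if (depth >= max_depth || err <= tol) return(x1)
    c(refine(x0, xm, depth + 1), refine(xm, x1, depth + 1))
  }
  grid <- seq(from, to, length.out = n_start)   # sparse starting grid
  x <- grid[1]
  for (i in seq_len(n_start - 1)) {
    x <- c(x, refine(grid[i], grid[i + 1], 0))
  }
  list(x = x, y = f(x))
}

# Example: a narrow peak on an otherwise flat curve.
peak <- function(x) exp(-200 * (x - 0.5)^2)
pts <- adaptive_points(peak, 0, 1)
cat("points used:", length(pts$x), "\n")
```

Drawing the result with plot(pts$x, pts$y, type = "l") shows the points
crowding around the peak and staying sparse on the flat parts; tol sets the
trade-off between smoothness and the number of segments written to the file.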

I hope this helps, but it probably won't help very much. Big files are big
files, whatever you do.

cheers, jari oksanen
Jari Oksanen -- Dept Biology, Univ Oulu, 90014 Oulu, Finland
Ph. +358 8 5531526, cell +358 40 5136529, fax +358 8 5531061
email jari.oksanen at oulu.fi, homepage http://cc.oulu.fi/~jarioksa/
