Text merged together after Optimizing PDF with Acrobat Pro 9

AlanZ

Member
Hello,

I recently switched to Acrobat Pro 9 for Mac (version included with Design Premium CS4). I used to use Acrobat Pro 7 on Windows XP.

After optimizing a few PDFs, I noticed the first 2-3 words on most pages had been merged together. The affected PDFs were originally created with PDFlib, but I've optimized hundreds of PDFs created with PDFlib in the past, all without incident. I only began to see the problem after switching to CS4 for Mac.

Has anybody else encountered text merging together after optimizing a PDF?

These are the optimizer settings that led to the problem.

My Optimizer Settings #1:
- PDF v1.5 (Acrobat 6.x) - set to retain existing
- Images - Downsample (bi-cubic) all color & grayscale images above 300ppi to 300ppi, jpeg maximum
- Fonts - checkbox selected "Do not unembed any font", all fonts fully embedded
- Transparency - un-selected
- Discard objects - "discard all alternate images" selected - all others un-selected
- Discard user data selected - but none of the subsequent check boxes were selected
- Clean up - "remove compression", "discard invalid bookmarks", "discard invalid links", "optimize the PDF for fast web view" - were all selected, the others were left unchecked.

My only reason for optimizing these files in the first place was to downsample the images to 300 ppi max before submitting them to our printer. So after noticing the problems that optimizer settings #1 had caused, I tried deselecting every option in the PDF Optimizer except the following:

- Images - Downsample (bi-cubic) all color & grayscale images above 300ppi to 300ppi, jpeg maximum

Unfortunately, the problem continued to show up.

Eventually I was able to downsample the images by using a custom preflight task and selecting "analyze and fix". I'm glad I found a workaround, but it is not ideal and I'd like to get to the bottom of this.

Thanks!
 

Attachments

  • MergedTextBefore.pdf
    40.2 KB · Views: 266
  • MergedTextAfter.pdf
    40.1 KB · Views: 260
Wow, that's odd. What's even odder, though, is that if I click after the first word to type a space using PitStop, the space is there automatically before I even type it! One of the many reasons I'm not using CS4 unless I'm forced to.
 
Why is someone creating these PDFs in such a crazy way as to use negative scaling all over the place, like -100 Tz or a font size of -12?

Admittedly the problem that occurs to you should still not happen (Acrobat's PDF rewriting of content streams has an occasional problem with kerning between partial text strings, which I believe kicks in here).

Sorry I can be of no more help....

Olaf Drümmer
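
For readers unfamiliar with the operators mentioned above: `Tz` sets horizontal scaling and the font size is set with `Tf`. A minimal Python sketch of the glyph displacement formula from the PDF specification (ISO 32000-1, section 9.4.4), tx = ((w0 - Tj/1000) * Tfs + Tc + Tw) * Th, shows that a font size of -12 with -100 Tz advances glyphs exactly like 12 with 100 Tz, while character and word spacing flip sign. The function name and sample numbers are illustrative assumptions, and the last comparison only shows why rewriting such text is delicate. Nothing in this thread establishes what Acrobat actually does internally.

```python
def horizontal_advance(w0, tj, font_size, char_space, word_space, h_scale_percent):
    """Horizontal displacement after showing one glyph (ISO 32000-1, 9.4.4).

    w0              glyph width in text space units (glyph units / 1000)
    tj              TJ array adjustment, in thousandths of a text space unit
    font_size       Tfs, the size operand of Tf
    char_space      Tc
    word_space      Tw (only applies to the space character in a real PDF)
    h_scale_percent the operand of Tz, e.g. 100 or -100
    """
    th = h_scale_percent / 100.0
    return ((w0 - tj / 1000.0) * font_size + char_space + word_space) * th


# Same glyph and same TJ kerning adjustment in both setups (sample values).
upright = horizontal_advance(0.5, -250, 12, 0, 0, 100)
flipped = horizontal_advance(0.5, -250, -12, 0, 0, -100)
print(upright == flipped)  # glyph width and TJ kerning are unaffected by the double flip

# Tc and Tw are multiplied by Th only, so their effect changes sign.
# A hypothetical rewriter that replaced (-12, -100) with (12, 100) but kept
# Tw unchanged would therefore move the following glyph to a different place.
with_tw_flipped = horizontal_advance(0.5, 0, -12, 0, 3, -100)
with_tw_naive = horizontal_advance(0.5, 0, 12, 0, 3, 100)
print(with_tw_flipped, with_tw_naive)
```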
 
Flattening more important than down sampling

My only thought on your post is that if your only reason to optimize is to downsample images, I wouldn't waste my time doing that at all. Today's computers/RIPs are powerful enough to do the downsampling on the fly and it takes very little time. Flattening images would be far more of an issue and that is, unfortunately, rarely considered by most designers.

Also, I would definitely turn off optimization for web viewing as that has nothing to do with printing and could do odd things that are of no use to you. (I think it primarily is so it can display the first few pages while downloading the rest of the file in the background).

– Michael
 
Why is someone creating these PDFs in such a crazy way as to use negative scaling all over the place, like -100 Tz or a font size of -12?

Admittedly the problem that occurs to you should still not happen (Acrobat's PDF rewriting of content streams has an occasional problem with kerning between partial text strings, which I believe kicks in here).

Sorry I can be of no more help....

Olaf Drümmer

Thanks for the insight, it may turn out to be more help than you think. I'm closer to a novice than an expert when it comes to PDFs in general but I had always wondered why the font size would show up as negative when I ran preflight. I wasn't having any output problems, so I never raised too many red flags.

We generate these PDFs on a web server using PDFlib with PHP. Fortunately, we are in the midst of rebuilding our PDF generation app, so now is the perfect time for me to get to the bottom of your question: "Why is someone creating these PDFs in such a crazy way...". I passed it along and I'm waiting to hear back from our developers. I've also forwarded these files to PDFlib support to get their feedback. I'm quite curious to find out what is going on.

Thanks for the help!
 
My only thought on your post is that if your only reason to optimize is to downsample images, I wouldn't waste my time doing that at all. Today's computers/RIPs are powerful enough to do the downsampling on the fly and it takes very little time. Flattening images would be far more of an issue and that is, unfortunately, rarely considered by most designers.

Also, I would definitely turn off optimization for web viewing as that has nothing to do with printing and could do odd things that are of no use to you. (I think it primarily is so it can display the first few pages while downloading the rest of the file in the background).

– Michael

To answer your questions: we make these PDFs available to our customers on demand, usually fewer than 5 copies printed at a time, but mostly 1 or 2. The files are packed with both images and text, and the file sizes tend to be quite large. Our print supplier has asked me to keep the file size below 700MB; otherwise the RIP takes too long and backs up their print line. That's the main reason we stick to 300ppi max.

As for flattening the pages, we had given this some thought but encountered two problems. First, we were told we would see a noticeable reduction in the sharpness of the text, which is a problem because these are books that contain lots of text. Second, PDFlib doesn't support flattening, and as we move forward we want the PDFs ready for print right from our web server.

Also, about turning off "Optimize for Web View". I also realized this was an unnecessary step, which is why you'll see in my original post that while troubleshooting the problem I de-selected all the options in Optimizer except for "downsample color & grayscale images" but I continued to encounter the kerning problems.

Thanks for your help!
 
Wow, that's odd. What's even odder, though, is that if I click after the first word to type a space using PitStop, the space is there automatically before I even type it! One of the many reasons I'm not using CS4 unless I'm forced to.

That is weird!
 
We had a PHP project, and unfortunately my experience with the PDFs was not good. Try requiring that the PDFs adhere to a standard, e.g. PDF/X-1a (or PDF/X-4 if you need live transparency and maximum fixability).
 
I'm happy to report I've gotten to the bottom of my problem. With the help of a very clear and concise response from PDFlib, we were able to isolate and replicate the problem. We can now re-design/re-build our PDF generation application which should help us avoid these problems in the future.

It turns out that our use of top-down coordinates when building our PDFs, combined with the use of PostScript fonts, exposed a bug in Acrobat's Optimizer. We isolated and confirmed this by building two sets of test PDFs and running them through Optimizer. One set used a PostScript font, while the other set used a TrueType font. In each set, one PDF was defined using the top-down coordinate system while the other was defined using the more standard bottom-up coordinate approach. After running all four samples through Acrobat's Optimizer, only the sample using both a PostScript font and top-down coordinates resulted in an optimized PDF containing the kerning issue. (I've attached all of the relevant PDFs.)
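
The four-way test described above can be written down as a small matrix. This is only a sketch in Python for illustration (our real generator is PHP with PDFlib); the file names follow the attachments to this post, and the recorded outcomes are simply the results reported here, not something the script measures.

```python
from itertools import product

FONT_TYPES = ("PostScript", "TrueType")
COORD_SYSTEMS = ("BottomUp", "TopDown")


def test_matrix():
    """Return one entry per font/coordinate combination, with its file names."""
    return [
        {
            "font": font,
            "coords": coords,
            "source": f"{font}_{coords}.pdf",
            "optimized": f"{font}_{coords}_OPTIMIZER.pdf",
        }
        for font, coords in product(FONT_TYPES, COORD_SYSTEMS)
    ]


# Whether the optimized file showed the merged-text (kerning) issue,
# entered by hand from the results reported in this post.
OBSERVED_KERNING_ISSUE = {
    ("PostScript", "BottomUp"): False,
    ("PostScript", "TopDown"): True,
    ("TrueType", "BottomUp"): False,
    ("TrueType", "TopDown"): False,
}


def failing_combinations(observed):
    """List the (font, coords) pairs whose optimized PDF was broken."""
    return [key for key, broken in observed.items() if broken]


for case in test_matrix():
    status = OBSERVED_KERNING_ISSUE[(case["font"], case["coords"])]
    print(case["source"], "->", case["optimized"], "issue" if status else "ok")
print(failing_combinations(OBSERVED_KERNING_ISSUE))
```

Changing one factor at a time this way is what let us say the problem needs both conditions together, rather than blaming the font type or the coordinate system alone.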

I just wanted to say thanks to everyone who chimed in. All your thoughts definitely helped us get to the bottom of this!
- Alan

In case anyone is interested in a more detailed explanation, here are portions of the email I received from PDFlib:

I don't like to discuss the question whether or not negative numbers are crazy, so let's stick to the facts:

- The PDFlib-generated page is perfectly legal according to all versions of Adobe's PDF reference as well as ISO 32000. There is no doubt that the original page is displayed correctly.

- Acrobat completely rewrites the page content stream; in doing so, it screws up and instead of optimizing generates operators which create a different page appearance. We are simply facing a bug in Acrobat, triggered by certain geometric constructs on the page. Note that page display in all Acrobat versions, printing, viewing with alternative PDF viewers, RIPping etc. work well - it's just this particular operation in Acrobat which fails.

Speculation: it seems quite obvious that Acrobat cannot properly rewrite page contents which have been created in a "topdown" coordinate system. As you know, the "topdown" option in PDFlib is a convenient way to set up the coordinate system in a way which is familiar to many developers.

You could probably work around the bug in Acrobat by rewriting your application to avoid the "topdown" option, but I doubt it will be worth the effort.

You could also report the Acrobat bug to Adobe.

You may want to tell the expert that negative font size and scaling are the only way to implement a coordinate system with the origin in the top left corner (with the y axis increasing downwards) and text standing upright without having to explicitly set the transformation matrix for each line of text. Since PDFlib gives the user full control over page content creation and coordinate system transformations while still creating serial output (i.e. not having to rewrite the page content stream after creation) there are some requirements which may not seem obvious at first glance - hence the "crazy" output.

Regards,
T


- Olaf, hope this helped answer your questions. Thanks for your help!
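
To make the last point of the email concrete, here is a rough Python sketch of the text rendering matrix from the PDF specification (ISO 32000-1, section 9.4.4), which multiplies the text state by the text matrix and the current transformation matrix (CTM). It assumes an identity text matrix and a page height of 792 points (US Letter); those values, and the function names, are illustrative assumptions and are not taken from our files or from PDFlib's code.

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def text_rendering_matrix(font_size, h_scale_percent, ctm, rise=0.0):
    """Text state matrix times CTM, with the text matrix Tm assumed identity."""
    th = h_scale_percent / 100.0
    text_state = [
        [font_size * th, 0, 0],
        [0, font_size, 0],
        [0, rise, 1],
    ]
    return matmul(text_state, ctm)


def is_upright(m):
    """Glyphs run left to right and stand the right way up."""
    return m[0][0] > 0 and m[1][1] > 0


PAGE_HEIGHT = 792  # assumed US Letter page, in points
BOTTOM_UP = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
TOP_DOWN = [[1, 0, 0], [0, -1, 0], [0, PAGE_HEIGHT, 1]]  # origin top left, y downwards

print(is_upright(text_rendering_matrix(12, 100, BOTTOM_UP)))   # usual setup
print(is_upright(text_rendering_matrix(12, 100, TOP_DOWN)))    # flipped axis, positive size
print(is_upright(text_rendering_matrix(-12, -100, TOP_DOWN)))  # flipped axis, -12 and -100 Tz
```

With the y axis flipped, a positive font size would draw the glyphs mirrored vertically. The negative font size undoes that flip, and because the font size also scales the horizontal direction, the negative horizontal scaling is needed to keep the text running left to right.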

Why is someone creating these PDFs in such a crazy way as to use negative scaling all over the place, like -100 Tz or a font size of -12?

Admittedly the problem that occurs to you should still not happen (Acrobat's PDF rewriting of content streams has an occasional problem with kerning between partial text strings, which I believe kicks in here).


We had a PHP project, and unfortunately my experience with the PDFs was not good. Try requiring that the PDFs adhere to a standard, e.g. PDF/X-1a (or PDF/X-4 if you need live transparency and maximum fixability).

We've been using PHP in concert with PDFlib for over a year now and have been quite pleased with the results.

In the past we gave some thought to switching our PDFs over to one of the PDF/X standards. We just never really followed through and ended up shelving the change for a later date. Maybe that day has come...
 

Attachments

  • PostScript_BottomUp.pdf
    37.6 KB · Views: 256
  • PostScript_BottomUp_OPTIMIZER.pdf
    41.1 KB · Views: 241
  • PostScript_TopDown.pdf
    37.6 KB · Views: 253
  • PostScript_TopDown_OPTIMIZER.pdf
    41.1 KB · Views: 242
Here are the other 4 PDFs (TrueType).
 

Attachments

  • TrueType_BottomUp.pdf
    22.2 KB · Views: 254
  • TrueType_BottomUp_OPTIMIZER.pdf
    27 KB · Views: 256
  • TrueType_TopDown.pdf
    22.5 KB · Views: 246
  • TrueType_TopDown_OPTIMIZER.pdf
    27.3 KB · Views: 240
Hi AlanZ

I am having the exact same problem with the PDF Optimizer in Acrobat Pro 9. However, we generate our PDFs from a prepress workflow and then need to optimize them in Acrobat to make them small enough to email.

I see you have a solution, but it is based on your custom PHP software. How can a prepress guy like me, using off-the-shelf software, work around the issue?

Thanks!
 
Actually, I found a workaround using Acrobat 9. We're currently using this method, at least until we can re-build our PDF generation software.

Instead of using Acrobat Optimizer, I started using the "Analyze and Fix" feature in Acrobat Preflight. I created a custom preflight profile that downsamples all color & grayscale images above 300 ppi to 300 ppi. Then I just click the "Analyze and Fix" button instead of the "Analyze" button whenever I run that profile and voilà... images downsampled, no kerning issue!
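
For anyone setting up a similar rule, the arithmetic behind "above 300 ppi, reduce to 300 ppi" can be sketched in Python as below. This is not how Acrobat's preflight fix is implemented; the function names and sample image are assumptions for illustration, and the actual resampling (bicubic in my settings) is not shown. The only fixed fact used is that there are 72 points to an inch.

```python
POINTS_PER_INCH = 72.0


def effective_ppi(pixels, placed_points):
    """Resolution of an image dimension as placed on the page."""
    return pixels * POINTS_PER_INCH / placed_points


def downsample_target(width_px, height_px, placed_w_pt, placed_h_pt, max_ppi=300.0):
    """Return the pixel size an image should have after the rule is applied."""
    ppi = max(
        effective_ppi(width_px, placed_w_pt),
        effective_ppi(height_px, placed_h_pt),
    )
    if ppi <= max_ppi:
        return width_px, height_px  # at or below the limit: leave untouched
    scale = max_ppi / ppi
    return max(1, round(width_px * scale)), max(1, round(height_px * scale))


# Hypothetical 3600 x 2400 pixel image placed at 6 x 4 inches (432 x 288 points).
print(effective_ppi(3600, 432))
print(downsample_target(3600, 2400, 432, 288))
print(downsample_target(1200, 800, 432, 288))
```

Halving the resolution of an image cuts its pixel count to a quarter, which is why capping at 300 ppi makes such a difference to the size of image-heavy files.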

Just a note though. After downsampling a PDF I always run it through Preflight one final time to make sure everything checks out; I have a separate preflight profile I use for this, it checks a number of things. While running this final check I noticed the following messages now appear when I run the PDF through the final preflight:

"The PDF document uses features that require at least Acrobat 5.0 (PDF 1.4)"
"Compressed object stream used (except for tags)"


I find this odd because when I run the original PDF through the same preflight check, I don't receive these info messages. I'm guessing they are the result of something Acrobat does during the downsampling preflight. It's also odd because when I test a PDF whose images were downsampled with Optimizer, the messages don't show up. This hasn't affected the output of our documents, but I wanted to give you the whole story before you implement a similar method.

Good luck! Hopefully Adobe will sort this bug out!
 
