Removing stacked images from pdf without layers being used

bcr · Sep 9, 2025

So I have a PDF which is a scan of an old book. It does not use layers; instead, each page contains two images, one atop the other. The lower one is the background image of the page, and the upper one is the scanned image of the text.

If I have several books scanned in this way and I want to rapidly delete the lower/background image, what would be the most efficient way of doing so?

ReproElectroProspero · Sep 9, 2025

Can you provide a sample file? I think a custom preflight fixup profile using 'remove object' or a related type of fixup might work for you. But I've never messed with a PDF that used stacked images instead of layers so I'm not sure how it would work in practice.

bcr · Sep 10, 2025

Example below, thanks.

https://ia801609.us.archive.org/34/items/laconfrenceint00inteuoft/laconfrenceint00inteuoft.pdf

bcr · Sep 10, 2025

So I've managed to get the top image defined as a layer by using the fixup "Put transparent objects as layers", but I've not yet figured out how to quickly get rid of the image behind it, which is not defined as a layer.

bcr · Sep 10, 2025

Update: I figured out using preflight that the images at the back are all the same ppi and different to those at the front, so I was able to select them all using a rule in preflight and delete them in one pass.

Next issue will be trying to reduce file size and sharpen up the text
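
A note for anyone repeating this outside Acrobat: as far as I understand it, the ppi that preflight reports is the effective resolution, i.e. the image's pixel dimensions divided by the size at which it is placed on the page (PDF units are 1/72 inch). Below is a minimal Python sketch of that arithmetic, followed by a tally that makes the two groups of images visible. The image list, the pixel widths and the 432 pt placed width are made-up illustration values, not measurements from the file linked above.

```python
from collections import Counter

POINTS_PER_INCH = 72.0  # PDF user space default: 1 pt = 1/72 inch


def effective_ppi(pixel_width, placed_width_pt):
    """Effective resolution of an image placed at a given width in points."""
    placed_width_in = placed_width_pt / POINTS_PER_INCH
    return pixel_width / placed_width_in


# Hypothetical inventory: (page number, pixel width, placed width in points).
images = [
    (1, 1000, 432.0),  # assumed low-resolution background
    (1, 3000, 432.0),  # assumed high-resolution text scan
    (2, 1000, 432.0),
    (2, 3000, 432.0),
]

# Tally images by rounded ppi; two distinct clusters suggest two stacked images per page.
histogram = Counter(round(effective_ppi(width, placed)) for _, width, placed in images)
print(dict(histogram))
```

If the tally shows two clearly separated values, a rule based on ppi is a reasonable way to tell the two images apart.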

loicaigon · Sep 11, 2025

For info, in PitStop we have an unsharp mask Action to make things sharper, and in the late July release we put effort into file size reduction. You might want to give it a try.

michaelejahn · Sep 11, 2025

bcr said:
Update: I figured out using preflight that the images at the back are all the same ppi and different to those at the front, so I was able to select them all using a rule in preflight and delete them in one pass.

Next issue will be trying to reduce file size and sharpen up the text

I wonder what software they use to scan these "old" books.

There are two applications out there that perform page segmentation: they scan at, for example, 900 ppi, then segment out the images, descreen and downsample them, and then cut out the old images and replace them (so text becomes 1-bit linework and images become 8-, 24- or 32-bit contone). As a result, the pages are small.

VERY old slide show on the subject

IOFlex_presentation

20/04/2008 Presentation The contents of this presentation is confidential and the proprietary property of IO Flex, and cannot be shared or reproduced without permission. Scan to Print Ready Fully automated page segmentation Single or Multipage B&W Color Large Size Scans and/or PDF files IoFlex Bl...

docs.google.com

bcr · Sep 12, 2025

loicaigon said:
For info, in PitStop we have an unsharp mask Action to make things sharper, and in the late July release we put effort into file size reduction. You might want to give it a try.

thanks

bcr · Sep 12, 2025

michaelejahn said:
I wonder what software they use to scan these "old" books.

There are two applications out there that perform page segmentation: they scan at, for example, 900 ppi, then segment out the images, descreen and downsample them, and then cut out the old images and replace them (so text becomes 1-bit linework and images become 8-, 24- or 32-bit contone). As a result, the pages are small.

VERY old slide show on the subject

interesting, thanks!

I know in some instances people will use DSLRs on a rig with lighting to take the photos, and in some cases large machines are used which turn the pages over and scan each page automatically with the book lying down. It can get quite sophisticated.

ReproElectroProspero · Sep 12, 2025

bcr said:
Update: I figured out using preflight that the images at the back are all the same ppi and different to those at the front, so I was able to select them all using a rule in preflight and delete them in one pass.

Next issue will be trying to reduce file size and sharpen up the text

Nice work! I've been MIA, but recreated your success today. For anyone curious:

I used the Output Preview Object Inspector Panel to get an idea of which 'layers' were which:

[Screenshot: Object Inspector panel in Output Preview]

Looks like the background image comes through at 166.556 ppi while the text portion comes in at 500 ppi.

Knowing this, it was simple to write a custom fixup that targets the correct 'layer'. You make a new profile, and then in the profile you set a "custom fixup". In the properties of that custom fixup you can tell Acrobat to remove all objects except for images greater than 450 ppi. This removes the majority of the background images in the document, though from my quick testing it seems like the pages without text fail this check and their backgrounds don't get removed. It needs some more fine-tuning to handle those edge cases.

[Screenshot of the custom fixup settings]

Just wanted to put the "how" out into the ether so people can learn how to do this. I'm only just now starting to learn how these advanced preflight functionalities work, and they are as powerful as they are confusing.
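
The edge case is easier to chase with a list of the affected pages. The sketch below is plain Python, not Acrobat's preflight engine: it applies the same "keep only images above 450 ppi" rule to a hand-written, hypothetical inventory of page objects and reports the pages on which nothing would survive, which are the ones worth opening in the Object Inspector. The `PageObject` type, its fields and the sample pages are all assumptions for illustration; only the 166.556 ppi, 500 ppi and 450 ppi figures come from the post above.

```python
from dataclasses import dataclass


@dataclass
class PageObject:
    """Hypothetical record for one object found on a page."""
    page: int
    kind: str    # "image" or any other object type
    ppi: float   # effective resolution; 0.0 for non-image objects


def keep_only_high_res(objects, min_ppi=450.0):
    """Model of the rule: keep images above min_ppi, drop everything else."""
    return [o for o in objects if o.kind == "image" and o.ppi > min_ppi]


def pages_left_empty(objects, kept):
    """Pages that had objects before the rule but none afterwards."""
    pages_before = {o.page for o in objects}
    pages_after = {o.page for o in kept}
    return sorted(pages_before - pages_after)


# Assumed inventory: page 1 has background plus text, page 2 has only a background.
inventory = [
    PageObject(page=1, kind="image", ppi=166.556),
    PageObject(page=1, kind="image", ppi=500.0),
    PageObject(page=2, kind="image", ppi=166.556),
]

kept = keep_only_high_res(inventory)
suspect_pages = pages_left_empty(inventory, kept)
print(suspect_pages)
```

In this model a page without a text image simply ends up empty; if the real file behaves differently on such pages, those are the pages to inspect by hand before adjusting the rule.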

bcr · Sep 15, 2025

Thanks! This is pretty much how I did it, except I told it to remove images within the relevant ppi range, rather than "remove everything except" as you did.
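
For comparison, the range-based rule can be sketched in a few lines of Python. This is only a model of the logic, not a preflight profile: the `objects` list, the dictionary keys and the 150 to 200 ppi bounds are assumptions (the actual range was not stated in the thread). The point it illustrates is that a range rule touches only images inside the range and leaves every other object on the page alone.

```python
def remove_images_in_range(objects, low_ppi, high_ppi):
    """Return what remains after dropping images whose ppi lies in [low_ppi, high_ppi]."""
    remaining = []
    for obj in objects:
        is_target = obj["kind"] == "image" and low_ppi <= obj["ppi"] <= high_ppi
        if not is_target:
            remaining.append(obj)
    return remaining


# Hypothetical page contents; the non-image entry stands in for any other object type.
objects = [
    {"name": "background", "kind": "image", "ppi": 166.556},
    {"name": "text scan", "kind": "image", "ppi": 500.0},
    {"name": "page number", "kind": "text", "ppi": None},
]

remaining = remove_images_in_range(objects, 150.0, 200.0)
remaining_names = [obj["name"] for obj in remaining]
print(remaining_names)
```

A "remove everything except" rule applied to the same page would also drop the non-image object, so the range rule is the more conservative choice when pages contain anything besides the two stacked images.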
