Renaming PDF page labels with information on the page, then extracting each page with that label as the file name

rcreveli

Well-known member
Sorry for the long title.

We have a variety of documents that we want to separate into individual pages. Each document is several thousand pages long. Ideally, here's what we'd like to do:

1. Rename each page label with the folio name. The folio names are not consistent, but their placement on the page is.
2. Extract each page as a separate PDF, using that folio name as the file name.
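
Because the folio always sits in the same place on the page, step 1 could be approached by picking out the words that fall inside a fixed rectangle. Below is a minimal sketch of that idea, not a tested production script. The helper names (quadsBounds, folioFromWords) and the region object are hypothetical. Inside Acrobat, the word list would come from the Doc methods getPageNumWords, getPageNthWord and getPageNthWordQuads, as shown in the closing comment. Note that getPageNthWord strips punctuation by default, so a folio such as "A-12" may come back as separate words.

```javascript
// Bounding box of one word from its quads (each quad is 8 numbers: four x,y pairs).
// Min/max are used so the corner order inside a quad does not matter.
function quadsBounds(quads) {
    var xs = [], ys = [];
    for (var q = 0; q < quads.length; q++) {
        for (var k = 0; k < 8; k += 2) {
            xs.push(quads[q][k]);
            ys.push(quads[q][k + 1]);
        }
    }
    return {
        left: Math.min.apply(null, xs), right: Math.max.apply(null, xs),
        bottom: Math.min.apply(null, ys), top: Math.max.apply(null, ys)
    };
}

// Join the words that lie entirely inside the region, in the order supplied.
// region is a hypothetical {left, bottom, right, top} object in the same units as the quads.
function folioFromWords(words, region) {
    var picked = [];
    for (var i = 0; i < words.length; i++) {
        var b = quadsBounds(words[i].quads);
        if (b.left >= region.left && b.right <= region.right &&
            b.bottom >= region.bottom && b.top <= region.top) {
            picked.push(words[i].text);
        }
    }
    return picked.join(' ');
}

// Inside Acrobat, the word list for page p could be collected like this:
//   var words = [];
//   for (var n = 0; n < this.getPageNumWords(p); n++) {
//       words.push({ text: this.getPageNthWord(p, n), quads: this.getPageNthWordQuads(p, n) });
//   }
```

The region coordinates would have to be measured once from a sample page. The returned text could then feed a page label or a file name.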

Tools: we have Acrobat Pro, PitStop Pro and Enfocus Switch, as well as tools like Excel.

I understand this may be impossible, but this is the first step in a very involved project, so anywhere we can automate is a huge plus.
 

abc

Well-known member
OK, do you have any sample files you can send me, please?
You can send them to andrewb@enfocus.com.
I only need one or two for a proof of concept.

I guess the files can be split into single pages first, and then have the page labels adjusted in a second step?
 

ReproElectroProspero

Well-known member
What is your plan? Can you script a solution using PitStop? If so, what language does it use? I'd just like to learn how to implement custom solutions to problems such as this.
 

bcr

Well-known member
I don't know what "folio name" refers to, but I'm assuming it's just text on the page?

I've used EverMap AutoBookmark before. I assume you're in the legal market?

It can create bookmarks from text on pages.

And then you can export the bookmarked pages. I think you can easily do that with the bookmark name as the file name, but I'm not at my desk to check.

This software is cheap, too.
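
Whichever tool does the export, text lifted from a page or bookmark may contain characters that are not valid in a file name. The sketch below shows one way to clean such a name first. The function toSafeFileName and its fallback parameter are hypothetical, and the character list is an assumption based on what Windows rejects in file names.

```javascript
// Turn page or bookmark text into a file name that is safe on Windows and macOS.
// fallback is used when nothing usable is left after cleaning.
function toSafeFileName(label, fallback) {
    var name = String(label)
        .replace(/[\\\/:*?"<>|]/g, '_')   // characters Windows does not allow in file names
        .replace(/\s+/g, ' ')             // collapse runs of whitespace, including line breaks
        .replace(/^[ .]+|[ .]+$/g, '');   // trim leading and trailing spaces and dots
    if (name.length === 0) {
        name = fallback;
    }
    return name + '.pdf';
}
```

For example, toSafeFileName(' A/12: cover ', 'page_1') gives 'A_12_ cover.pdf', while a label made up only of spaces falls back to 'page_1.pdf'.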
 

Stephen Marsh

Well-known member
As I had to remove the code from my blog to stop Google taking the site down, here are the two scripts:

JavaScript:
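// Minimal CSV reader: the first row is the header and fields are split on
// plain commas (quoted fields containing commas are not supported).
var CSV = function(data) {
    // Split on any line ending (CRLF, LF or CR) and skip empty lines.
    var lines = data.split(/\r\n|\n|\r/);
    var _data = [];

    for (var i = 0; i < lines.length; i++) {
        if (lines[i].length > 0) {
            _data.push(lines[i].split(','));
        }
    }

    var _head = _data.shift();

    return {
        // Number of data rows, excluding the header.
        length: function() {
            return _data.length;
        },
        getRow: function(row) {
            return _data[row];
        },
        // col is either a zero-based index or a header name (case-insensitive).
        getRowAndColumn: function(row, col) {
            if (typeof col !== 'string') {
                return _data[row][col];
            }
            col = col.toLowerCase();
            for (var i = 0; i < _head.length; i++) {
                if (_head[i].toLowerCase() === col) {
                    return _data[row][i];
                }
            }
        }
    };
};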

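// Prompt for the CSV file and attach it to this PDF as a data object named "CSV Data".
this.importDataObject("CSV Data");
var dataObject = this.getDataObjectContents("CSV Data");

var csvData = new CSV(util.stringFromStream(dataObject));

// Expect exactly one CSV data row per page.
if (this.numPages != csvData.length()) {
    app.alert("Number of pages & CSV row count inconsistent");
} else {
    // Save each page as its own PDF, named after that row's 'PartnerHQ_Id' column.
    for (var i = 0; i < this.numPages; i++) {
        this.extractPages({nStart: i, cPath: csvData.getRowAndColumn(i, 'PartnerHQ_Id') + '.pdf'});
    }
}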
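
Since the loop above writes one file per row, an empty or repeated value in the name column would produce a bad or colliding file name. With several thousand pages, it may be worth checking the names before extracting anything. The sketch below is one possible check. The function checkNames is hypothetical, and comparing names case-insensitively is an assumption, made because Windows and macOS file systems usually ignore case.

```javascript
// Report empty names and names used more than once (compared case-insensitively).
// Returns { empty: [row indexes], duplicates: { name: [row indexes] } }.
function checkNames(names) {
    var has = function(obj, key) {
        return Object.prototype.hasOwnProperty.call(obj, key);
    };
    var firstSeen = {}, empty = [], duplicates = {};

    for (var i = 0; i < names.length; i++) {
        var key = String(names[i] || '').toLowerCase();
        if (key.length === 0) {
            empty.push(i);
        } else if (has(firstSeen, key)) {
            if (!has(duplicates, key)) {
                duplicates[key] = [firstSeen[key]];
            }
            duplicates[key].push(i);
        } else {
            firstSeen[key] = i;
        }
    }
    return { empty: empty, duplicates: duplicates };
}
```

The names array could be built by calling getRowAndColumn for every row. If either part of the result is non-empty, the row numbers can be reported with app.alert instead of starting the extraction.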
 

Stephen Marsh

Well-known member
Attached is a .zip archive of various Acrobat Pro Action Wizard .sequ files for splitting files.
 

Attachments

  • Archive.zip
    15.7 KB