How to find out WHERE a specific font is used in a PDF document

Given a PDF that uses a number of fonts (in Acrobat Reader, the fonts used are listed under File > Properties > Fonts), how can I find out where in the document a certain font is used, using Adobe Acrobat 7, Reader, or a free PDF tool?

Asked by: Guest | Views: 326
Total answers/comments: 4
bert [Entry]

I have used Enfocus' Pitstop Pro plugin for this, but it's not cheap.
bert [Entry]

The following script accomplishes this on Linux or similar operating systems, using only open-source software (qpdf and pdffonts).

#!/usr/bin/ruby

# usage:
#   find_page_where_font_is_used.rb file.pdf Nimbus
# Finds the first page in file.pdf where a font with a name containing Nimbus is used.
# Font names are matched in a case-insensitive way.
# Requires pdffonts, qpdf.

require 'shellwords'

# Print an error message to stderr and exit with a nonzero status.
def die(message)
  $stderr.print "error in find_page_where_font_is_used.rb: #{message}\n"
  exit(1)
end

# Run a shell command and return its trimmed output; abort if it fails.
def shell_out(command)
  output = `#{command}`
  result = $?
  if !(result.success?) then
    die("error in command #{command}")
  end
  return output.strip
end

# True if pdffonts lists a matching font anywhere in pages from..to.
def is_used_in_page_range(font,pdf,from,to)
  # Quote the file name so that paths containing spaces survive the shell.
  table = shell_out("pdffonts -f #{from} -l #{to} #{Shellwords.escape(pdf)}")
  # The character class lets the match start anywhere inside a font name,
  # including after a subset prefix such as ABCDEF+. The search string is
  # escaped so that it is matched literally.
  if table=~/^[a-zA-Z0-9\+\-]*#{Regexp.escape(font)}/i then
    return true
  else
    return false
  end
end

# Binary search: halve the page range until a single page remains.
# Assumes the font is known to occur somewhere in pages from..to.
def search_for_font(font,pdf,from,to)
  print "Searching pages #{from}-#{to}.\n"
  if from==to then
    return from
  else
    mid = (from+to)/2
    if mid==to then mid=to-1 end
    if is_used_in_page_range(font,pdf,from,mid) then
      return search_for_font(font,pdf,from,mid)
    else
      return search_for_font(font,pdf,mid+1,to)
    end
  end
end

def main
  if ARGV.length != 2 then
    die("usage: find_page_where_font_is_used.rb file.pdf fontname")
  end
  pdf = ARGV[0]
  font = ARGV[1] # can be a substring, e.g., Deja or Nimbus
  n = shell_out("qpdf --show-npages #{Shellwords.escape(pdf)}").to_i
  print "total pages = #{n}\n"
  if !is_used_in_page_range(font,pdf,1,n) then
    print "No font in #{pdf} has a name containing the string #{font} (case-insensitive).\n"
    exit(0)
  end
  p = search_for_font(font,pdf,1,n)
  print "The font first occurs on page #{p}.\nOutput of pdffonts for this page:\n"
  print shell_out("pdffonts -f #{p} -l #{p} #{Shellwords.escape(pdf)}")+"\n"
end

main
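The script above stops at the first page that uses the font. If you want every page, a plain page-by-page scan is one option. The following is only a rough sketch along those lines: the helper names `table_mentions_font?` and `pages_using_font` are made up for this example, and it assumes that the first two lines of pdffonts output are a header and a dashed separator and that the font name is the first whitespace-separated field of each row.

```ruby
require 'shellwords'

# True if any font row of a pdffonts table has a name containing `font`
# (case-insensitive). Assumption: the first two lines are the column header
# and the dashed separator, and the name is the first field of each row.
def table_mentions_font?(table, font)
  rows = table.lines.drop(2)
  rows.any? { |row| row.split.first.to_s.downcase.include?(font.downcase) }
end

# Returns the 1-based page numbers whose pdffonts table mentions the font.
# The block receives a page number and must return the pdffonts output for
# that single page, which keeps the shell call out of the search logic.
def pages_using_font(font, page_count)
  (1..page_count).select { |page| table_mentions_font?(yield(page), font) }
end

# Command-line use: ruby list_pages_with_font.rb file.pdf Nimbus
# (the script name is hypothetical; requires pdffonts and qpdf)
if __FILE__ == $PROGRAM_NAME && ARGV.length == 2
  pdf, font = ARGV
  quoted = Shellwords.escape(pdf)
  page_count = `qpdf --show-npages #{quoted}`.to_i
  pages = pages_using_font(font, page_count) do |page|
    `pdffonts -f #{page} -l #{page} #{quoted}`
  end
  puts "Pages using #{font}: #{pages.join(', ')}"
end
```

This runs pdffonts once per page, so it is slower than the binary search on long documents, but it reports all occurrences rather than only the first.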
bert [Entry]

I have found a way that isn't very automatic, but it involves only free software and tells you exactly which text uses a specific font:

Identify the font using pdffonts, and the page where it is used, as explained in the other answers.
Open the PDF in Inkscape (selecting the page you want to look at in detail).
Save the file as SVG.
Open the SVG file in your favorite text editor and search for the font name. SVG is XML-based, so you should be able to see which text the font is used for.

I also found Inkscape useful for the reverse problem: if you have a particular snippet of text, it can tell you what font it is. Open the PDF as above, then use the text tool and select the text whose font you want to know. Inkscape may not render the font correctly, but it does display the name of the font in the font selector.
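If you would rather not read the SVG by hand, the search for the font name can be sketched in Ruby with REXML. This is only an illustration under some assumptions: that the exported SVG records fonts either in a `font-family:` entry inside a `style` attribute or in a `font-family` attribute, and that child elements inherit the font of their parent. The function names `font_family_of`, `collect_text_by_font` and `text_using_font` are invented for this example.

```ruby
require 'rexml/document'

# Font family declared on the element itself, or nil if it declares none.
# Looks at a "font-family:" entry in the style attribute first, then at a
# plain font-family attribute.
def font_family_of(element)
  style = element.attributes['style'].to_s
  from_style = style[/(?:^|;)\s*font-family\s*:\s*([^;]+)/i, 1]
  (from_style || element.attributes['font-family'])&.strip
end

# Walks the tree, tracking the inherited font family, and collects the text
# that sits directly inside elements whose effective family contains `font`.
def collect_text_by_font(element, font, inherited = nil, hits = [])
  family = font_family_of(element) || inherited
  direct_text = element.texts.map(&:value).join.strip
  if !direct_text.empty? && family && family.downcase.include?(font.downcase)
    hits << direct_text
  end
  element.elements.each { |child| collect_text_by_font(child, font, family, hits) }
  hits
end

# Returns the text snippets in an SVG source string that use the given font
# (case-insensitive substring match on the font family).
def text_using_font(svg_source, font)
  collect_text_by_font(REXML::Document.new(svg_source).root, font)
end

# Command-line use: ruby svg_font_text.rb page.svg Arial
# (the script name is hypothetical)
if __FILE__ == $PROGRAM_NAME && ARGV.length == 2
  path, font = ARGV
  text_using_font(File.read(path), font).each { |snippet| puts snippet }
end
```

Text that the importer has converted to outlines will not show up this way, since it is no longer stored as text in the SVG.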
bert [Entry]

This doesn't meet all the OP's restrictions, but I've found many of the other methods suggested here less useful if you're looking for hidden text. My workaround was to use Adobe Illustrator.
For example, Print > PDF > Save as PDF in MS Word 2016 for macOS will insert hidden exclamation marks in Arial after each ordinal in a numbered list. If you preflight the PDF in Acrobat, it will report embedding "Arial MT", but Show in snap will just give you a blank gray box, so there's no text to search for.
Opening the PDF instead in Illustrator, if you have it, will:

make the hidden text visible, and
allow you to use the Type > Find Font… command to locate the text and/or replace the font.