
Google Converts PDFs, DOCs, XLS etc into HTML for Indexing

A discussion came up on Twitter about different content types and how Google determines what type a file is. The discussion then moved to PDFs in the Google search results and how Google handles them.

John Mueller commented that Google automatically converts PDFs and similar document types into HTML format for indexing and ranking purposes.

For those who are active in PDF SEO, this won’t be a surprise. Google has converted PDFs into HTML for quite some time and has included a link to the HTML version directly in the search results. So while you may have what you think is an awesome PDF, your users might actually prefer the HTML version and click that link instead.

Do note that for larger files, Google will not convert the entire PDF document into HTML. So there could still be important content within the PDF that is simply not indexed because of the PDF’s size.
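If you want to see which of your PDFs might be affected, a quick size audit is one place to start. The snippet below is only a minimal sketch: the function name `find_large_pdfs` and the `threshold_bytes` parameter are made up for illustration, and no specific cutoff has been stated here for where Google stops converting, so the threshold is something you would have to choose yourself.

```python
from pathlib import Path


def find_large_pdfs(root, threshold_bytes):
    """Return (path, size_in_bytes) pairs for PDFs under `root` that are
    larger than `threshold_bytes`, sorted with the largest file first.

    `threshold_bytes` is an assumption chosen by the caller; it is not a
    documented Google limit.
    """
    results = []
    for path in Path(root).rglob("*"):
        # Match the extension case-insensitively so "Manual.PDF" is included.
        if path.is_file() and path.suffix.lower() == ".pdf":
            size = path.stat().st_size
            if size > threshold_bytes:
                results.append((path, size))
    results.sort(key=lambda item: item[1], reverse=True)
    return results
```

Calling `find_large_pdfs("site_export", 5_000_000)` on a hypothetical local copy of your site would list every PDF over roughly 5 MB, which gives you a shortlist of documents worth checking by hand or republishing as HTML.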

There’s also a lot of evidence that while PDF files can rank very well, they tend to do so for the types of queries where someone is looking for something like a PDF, such as a search for a manual.

If you do have a large number of important PDFs indexed that you want to rank well, it is worth considering whether keeping that content within a PDF is the best solution for your users as well. For example, PDFs are hard to open and read on many mobile devices. PDFs are also often much larger than the corresponding HTML version of the page would be, which is a further limitation on slower connections.

PDFs aren’t the only file type that Google converts to HTML for indexing. Google also does it for .doc files (Word documents), .xls files (spreadsheets) and other similar non-HTML content types.


Jennifer Slegg is a longtime speaker and expert in search engine marketing, working in the industry for almost 20 years. When she isn’t sitting at her desk writing and working, she can be found grabbing a latte at her local Starbucks or planning her next trip to Disneyland. She regularly speaks at Pubcon, SMX, State of Search, Brighton SEO and more, and has been presenting at conferences for over a decade.
