Myth: If the Bot and User Versions of Your Site Have Different Tech Specs, You Are Cloaking
Other versions of this myth include: "If you're detecting user agents for bots, then you are cloaking," or "If you are taking scripts out of your pages for bots, then you are cloaking."
The reality is that none of these "indicators" tells you whether a site is cloaking, because they aren't directly related to cloaking at all. To understand why, let's start with the definition of cloaking.
Cloaking
Per Google:
Cloaking refers to the practice of presenting different content to users and search engines with the intent to manipulate search rankings and mislead users. Examples of cloaking include:
Showing a page about travel destinations to search engines while showing a page about discount drugs to users
Inserting text or keywords into a page only when the user agent that is requesting the page is a search engine, not a human visitor [source]
And Wikipedia gives additional examples:
Cloaking is often used as a spamdexing technique to attempt to sway search engines into giving the site a higher ranking. By the same method, it can also be used to trick search engine users into visiting a site that is substantially different from the search engine description, including delivering pornographic content cloaked within non-pornographic search results. [source]
The key to all of these examples of cloaking is that you are modifying the content (keywords, copywriting, meta tags) that the search engine sees, making it different from what the regular user sees: either by adding extra material for the search engine that the user can't see, or by duping the user with something different from what the search engine result led them to expect.
None of the examples of cloaking have anything to do with removing things the user doesn't even know are there. For example, the user is unaware of all the scripts for analytics or pixel tags. Removing those for the bot doesn't change the content and is not considered cloaking.
Google goes so far as to say you can remove resources from the page, even decorative images, as long as they are "unimportant".
Here's how you can optimize your pages and resources for crawling:
Prevent large but unimportant resources from being loaded by Googlebot using robots.txt. Be sure to block only non-critical resources—that is, resources that aren't important to understanding the meaning of the page (such as decorative images). [source]
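As an illustrative sketch (the paths here are hypothetical, not drawn from any particular site), blocking large-but-unimportant resources in robots.txt might look something like this:

```
# Hypothetical robots.txt sketch: block large, decorative assets that
# don't affect the meaning of the page.
User-agent: *
Disallow: /assets/decorative/
Disallow: /media/background-video/

# Don't block resources the crawler needs to understand the page,
# such as the CSS and JavaScript that render primary content.
```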
By now you can see that the test isn't simply whether the bot page is different from a technical perspective, or even whether it is different from a user-visible perspective. The test is whether it is different in a way that would change the meaning of the content between what the search engine sees and what the user sees.
Optimizing for Crawlers
This myth is thrown around to incite fear in those without a technical understanding of how websites work for bots and users. In reality, Google and the other large companies whose bots crawl your site want you to modify it for optimal crawl performance. Google provides developers with extensive documentation on optimizing a site for its crawler; it's in Google's best interest for you to produce an easy-to-crawl version that minimizes crawler resources. Sometimes these optimizations benefit both the user and the crawler.
Many of Nostra's customers didn't write their e-commerce platform or all of the apps installed on it, so they lack the control to optimize the user-facing version of the site the way they can optimize the bot version. This common scenario is exactly what the recommended best practice of Dynamic Rendering is for, especially when you didn't build the underlying application.
Dynamic Rendering
We call it dynamic because your site dynamically detects whether or not the request there is a search engine crawler, like Googlebot, and only then sends the server-side rendered content directly to the client. You can include other web services here, as well, that can’t deal with rendering. For example, maybe social media services, chat services, anything that tries to extract structured information from your pages. And for all other requesters, so your normal users, you would serve your normal hybrid or client-side rendered code. [source: John Mueller]
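As a rough sketch of what Mueller describes, assuming an Express-style Node server and stub pre-rendering helpers (none of which are specified in the quote), the request handling might look like this:

```typescript
import express from "express";

const app = express();

// Illustrative bot tokens only; real deployments typically use a maintained
// list or verify crawlers via reverse DNS rather than substring matching.
const BOT_TOKENS = ["googlebot", "bingbot", "duckduckbot", "twitterbot"];

function isBot(userAgent = ""): boolean {
  const ua = userAgent.toLowerCase();
  return BOT_TOKENS.some((token) => ua.includes(token));
}

// Stub helpers standing in for your own infrastructure.
async function renderToStaticHtml(path: string): Promise<string> {
  // In a real setup: return a cached, fully rendered snapshot produced by a
  // headless browser or pre-rendering service.
  return `<html><body><h1>Pre-rendered content for ${path}</h1></body></html>`;
}

async function clientAppShell(path: string): Promise<string> {
  // In a real setup: return the normal app shell that loads the client-side
  // JavaScript bundle for the requested path.
  return `<html><body><div id="app"></div><script src="/bundle.js"></script></body></html>`;
}

app.get("*", async (req, res) => {
  if (isBot(req.get("user-agent"))) {
    // Crawlers get static, already-rendered HTML: same content, no scripts to run.
    res.type("html").send(await renderToStaticHtml(req.path));
  } else {
    // Regular users get the usual client-side rendered experience.
    res.type("html").send(await clientAppShell(req.path));
  }
});

app.listen(3000);
```

The key point is that both branches serve the same store content; only the delivery mechanism differs.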
Google made the diagram below to illustrate how Dynamic Rendering works.
As the diagram shows, the technical structure of the page delivered to the user's browser differs significantly from the one delivered to the bot: the bot receives static HTML that has already been rendered, with no JavaScript left to execute. The confusion over this myth is so widespread that Google had to dedicate a section of its developer documentation to stating unequivocally that Dynamic Rendering is not cloaking.
Dynamic rendering is not cloaking
Googlebot generally doesn't consider dynamic rendering as cloaking. As long as your dynamic rendering produces similar content, Googlebot won't view dynamic rendering as cloaking. [source]
Nostra's products properly implement Dynamic Rendering: they produce an easy-to-crawl, cached, static HTML version of your site for bots while serving client-side rendered pages with the same store content to your users.