Google Now Indexing Flash – What’s New?
I’ve had many clients contact me about the recent announcement that Google is now indexing Flash (see Google’s announcement on its blog, Google’s Webmaster Blog announcement, and Adobe’s official announcement). The overriding question is this: What does this mean for my Flash-based site?
The short answer is: Not much. While it’s fantastic that Google can now read the text on Flash sites, it faces significant challenges in making sense of that text. The reason is that Flash differs from HTML in a couple of key areas:
- Markup: HTML has tags which mark bits of text and code on the site. Google uses a variety of these tags in order to make sense of the page (what’s a link, what’s a title, what’s an important bit of text). Flash has no such markup — the code you use to register a “click” can vary widely from programmer to programmer. While it’s true that this code is eventually compiled into a movie that the Flash Player plays, there are still significant challenges in being able to interpret the text that lies within a Flash movie.
- Standardization: Flash code varies dramatically from developer to developer. For example, Flash designates certain movie clips as “buttons,” but many developers (including myself) never actually use technical “buttons” because other movie clips offer much more flexible ways of animating and handling user interaction. As a result, the code and structural elements that make up a Flash site vary widely. In contrast, HTML and the manner in which pages are displayed are very standardized: there’s only one way to indicate that you’ve got a link. Even if you use a dynamic language like PHP, the PHP will eventually spit out standardized HTML. Granted, AJAX and JavaScript can yield less standardized results, but for most sites, HTML is far more standardized than Flash. The end result is that HTML offers Google a much more standard framework for interpreting the meaning of pages than Flash does.
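To make the markup point concrete, here is a minimal sketch of how little work it takes to pull a title and a list of links out of an HTML page, precisely because the tags are standardized. It uses Python's standard html.parser module; the class name PageOutline and the sample page are my own illustrations, and this is in no way a description of how Google's crawler actually works.

```python
from html.parser import HTMLParser


class PageOutline(HTMLParser):
    """Collect the title and the links from an HTML document (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []        # list of (href, anchor text) pairs
        self._in_title = False
        self._href = None      # href of the <a> tag currently open, if any
        self._text = []        # pieces of anchor text for the open <a>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# Hypothetical sample page, made up for this example.
SAMPLE_HTML = """
<html>
  <head><title>Acme Widgets</title></head>
  <body>
    <h1>Welcome</h1>
    <a href="/products">Our products</a>
    <a href="/contact">Contact us</a>
  </body>
</html>
"""

outline = PageOutline()
outline.feed(SAMPLE_HTML)
outline.close()
print(outline.title)
print(outline.links)
```

Nothing comparable exists for a compiled Flash movie: there is no agreed-upon "title" or "link" construct to look for, so a parser like this has nothing reliable to hook into.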
I’m sure that there are numerous other factors that go into it as well. As Google themselves point out, there are additional things which present problems for interpreting Flash content:
- Images (and video): Google points out that it cannot index text that appears inside images or video. Nor does it mention a way to tag images and/or video in order to make them even partially visible (as you could with the “alt” attribute in HTML).
- Potential problems with JavaScript: Google does not execute some types of JavaScript and therefore won’t be able to read some Flash files that are embedded via JavaScript. Since pretty much every developer worth his/her salt uses JavaScript to embed Flash files, this could be a problem for many sites.
- Externally-loaded content: Any content that is loaded by Flash (via XML or an additional SWF) will be considered a separate page. Depending on how your site is structured, this could present significant interpretation problems for Google.
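If you want a rough idea of whether your own pages are exposed to the JavaScript embedding problem, a quick heuristic is to check whether the SWF is referenced in static markup or only inside a script. The sketch below, again using Python's html.parser, is a simplification built on my own assumptions: the class and function names are hypothetical, the sample pages are made up, and a "script-only" result only means a crawler that does not run that script would never see the embed.

```python
import re
from html.parser import HTMLParser


class SwfEmbedFinder(HTMLParser):
    """Find SWF references in static tags and in inline scripts (heuristic)."""

    def __init__(self):
        super().__init__()
        self.static_swfs = []   # found in <object>, <embed>, or <param name="movie">
        self.script_swfs = []   # found as quoted strings inside <script> bodies
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script":
            self._in_script = True
            return
        candidate = None
        if tag == "object":
            candidate = attrs.get("data")
        elif tag == "embed":
            candidate = attrs.get("src")
        elif tag == "param" and (attrs.get("name") or "").lower() == "movie":
            candidate = attrs.get("value")
        if candidate and candidate.lower().endswith(".swf"):
            self.static_swfs.append(candidate)

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_script:
            # Look for quoted strings ending in .swf inside the script text.
            self.script_swfs.extend(re.findall(r"""["']([^"']+\.swf)["']""", data))


def classify_embed(html):
    """Return 'static', 'script-only', or 'none' for a page's Flash embed."""
    finder = SwfEmbedFinder()
    finder.feed(html)
    finder.close()
    if finder.static_swfs:
        return "static"
    if finder.script_swfs:
        return "script-only"
    return "none"


# Hypothetical pages: one static embed, one SWFObject-style script embed.
STATIC_PAGE = (
    '<object type="application/x-shockwave-flash" data="intro.swf">'
    "<p>Fallback text</p></object>"
)
SCRIPT_PAGE = (
    '<div id="stage"></div>'
    '<script>swfobject.embedSWF("intro.swf", "stage", "800", "600", "9.0.0");</script>'
)

print(classify_embed(STATIC_PAGE))
print(classify_embed(SCRIPT_PAGE))
```

This only inspects inline scripts. An embed driven from an external .js file would show up as "none" here, so treat the result as a starting point rather than a verdict.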
The reality of the situation is that, now that Google is indexing Flash sites, the text it extracts could confuse it more than help it, and that could adversely affect your search engine ranking (see John Andrews’ blog entry on why you might want to block Flash from search engines).
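If you do decide to block your Flash files, one common approach is a robots.txt rule, and it is worth checking that the rule does what you expect before relying on it. The sketch below uses Python's standard urllib.robotparser for that check. The /flash/ directory, the example.com URLs, and the helper name is_crawlable are all assumptions made for illustration. Note that this standard-library parser matches simple path prefixes only, so I keep the SWFs in their own directory rather than relying on wildcard patterns.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps crawlers out of a /flash/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /flash/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("http://www.example.com/flash/intro.swf"))
print(is_crawlable("http://www.example.com/index.html"))
```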
In my opinion, all this boils down to the following strategy: watch and wait. No doubt standards for search engine optimization inside Flash will begin to emerge as Flash developers start experimenting with code and SEO, as Google continues to tweak algorithms on their end, and as Adobe keeps working on its Searchable SWF API. Until solid standards start emerging, Flash should still be treated as a system which requires additional work and solid strategies in order to be properly optimized for search engines.
