Schema, JSON-LD, structured markup, rich snippets, and other big words.
What does all this mean? I hope to give you a better understanding of it in this post, because it is a vital part of how Googlebot and other search engine bots index your website, so that you show up where you’re supposed to in various keyword searches.
Take a look at Examples Three and Four on schema.org’s LocalBusiness documentation (as of January 2017): https://schema.org/LocalBusiness
The “Without Markup” versions are two chunks of information about two businesses: a gift shop close to a beach, and a restaurant that serves Mediterranean food, with a little more detail about the food and service. As a human with reading comprehension skills, it’s really not that difficult to figure out what you’re looking at:
Ex 3:
<h1>Beachwalk Beachwear & Giftware</h1>
A superb collection of fine gifts and clothing
to accent your stay in Mexico Beach.
3102 Highway 98
Mexico Beach, FL
Phone: 850-648-4200
Ex 4:
GreatFood
4 stars - based on 250 reviews
1901 Lemur Ave
Sunnyvale, CA 94086
(408) 714-1489
<a href="http://www.greatfood.com">www.greatfood.com</a>
Hours:
  Mon-Sat 11am - 2:30pm
  Mon-Thur 5pm - 9:30pm
  Fri-Sat 5pm - 10pm
Categories: Middle Eastern, Mediterranean
Price Range: $$
Takes Reservations: Yes
However, for a search engine crawler (or “bot”), which only knows as much as its programmer told it to recognize, it’s not so easy. The first example has the business name in a heading tag, h1, but the second example does not. The second example includes the business’s ZIP code, but the first does not. The phone numbers are formatted differently: one has a label, the other does not. One has a link to the company website, categories, price range, hours, etc.; the other does not. If you were trying to apply a formula to extract specific bits of information from these two examples, you might be able to, but consider how many variations you’d have to accommodate for such a formula to work in the real world. How would you determine which letters and numbers correspond to which part of the information?
Maybe something like: anything that starts with “Phone:” is a phone number. But what if it starts with “Telephone”?
What about “Phone #” or “Phone number” or “#” or “(P)”, or a phone icon instead of a label? Does the number include a prefix or not? Is the area code in parentheses or not? Is there a line break after the number, or is there more information on that line?
What about the description in Example Three? Is that part of the address? Or something else? Just trying to think about it is giving me a headache.
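To make the headache concrete, here is a minimal sketch of the “anything that starts with Phone:” rule applied to the plain text of both examples. The regular expression and the `find_phone` helper are illustrative assumptions, not how any real crawler works.

```python
import re

# Hypothetical rule: anything that follows a "Phone:" label is a phone number.
LABELLED_PHONE = re.compile(r"Phone:\s*([\d\-\(\) ]+)")


def find_phone(text):
    """Return the labelled phone number in text, or None if the rule misses."""
    match = LABELLED_PHONE.search(text)
    return match.group(1).strip() if match else None


gift_shop = (
    "Beachwalk Beachwear & Giftware\n"
    "A superb collection of fine gifts and clothing "
    "to accent your stay in Mexico Beach.\n"
    "3102 Highway 98\n"
    "Mexico Beach, FL\n"
    "Phone: 850-648-4200"
)

restaurant = (
    "GreatFood\n"
    "4 stars - based on 250 reviews\n"
    "1901 Lemur Ave\n"
    "Sunnyvale, CA 94086\n"
    "(408) 714-1489\n"
    "www.greatfood.com"
)

print(find_phone(gift_shop))   # 850-648-4200
print(find_phone(restaurant))  # None, because there is no "Phone:" label
```

The rule works for the gift shop and silently misses the restaurant. Every extra pattern you add to catch the second case is another guess that some other site will break.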
Schema takes the guesswork out of the picture by giving developers specific ways to label each piece of this information. Schema.org provides a shared vocabulary of the types of things that can be described and the properties of those things that search engines look for. This information can be included in several formats, with varying degrees of flexibility and ease of implementation. Schema.org’s documentation as of late 2016 mentions microdata:
Your web pages have an underlying meaning that people understand when they read the web pages. But search engines have a limited understanding of what is being discussed on those pages. By adding additional tags to the HTML of your web pages—tags that say, “Hey search engine, this information describes this specific movie, or place, or person, or video”—you can help search engines and other applications better understand your content and display it in a useful, relevant way. Microdata is a set of tags, introduced with HTML5, that allows you to do this.
Microdata and RDFa require modifying the markup itself to include these additional details. JSON-LD is a newer format that lets you just “plop in” an extra chunk of code alongside the existing HTML. Sure, there’s some repetition, since you’re telling the bot what you just told the user, but considering how much easier it is to add and maintain, and the greater potential for expansion, it’s probably worth the minor overhead of a few extra bytes in your document.
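One reason the “plop in” approach can be easier to maintain is that the block can be generated from data you already have, without touching the visible markup. The sketch below assumes the business details live in a Python dictionary; `json_ld_script` is a hypothetical helper name, not part of any library.

```python
import json


def json_ld_script(data):
    """Wrap a dict in a JSON-LD <script> tag (hypothetical helper)."""
    payload = json.dumps(data, indent=2)
    # Escape "</" so a stray "</script>" inside the data cannot close the tag early.
    # "\/" is a valid JSON escape for "/", so the parsed data is unchanged.
    payload = payload.replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


business = {
    "@context": "http://schema.org",
    "@type": "LocalBusiness",
    "name": "Beachwalk Beachwear & Giftware",
    "telephone": "850-648-4200",
}

print(json_ld_script(business))
```

If the phone number changes, you update the dictionary (or the database behind it) and the markup for the bot follows along.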
Here’s that same gift shop description in both Microdata and JSON-LD:
<div itemscope itemtype="http://schema.org/LocalBusiness">
  <h1><span itemprop="name">Beachwalk Beachwear & Giftware</span></h1>
  <span itemprop="description"> A superb collection of fine gifts and clothing
  to accent your stay in Mexico Beach.</span>
  <div itemprop="address" itemscope itemtype="http://schema.org/PostalAddress">
    <span itemprop="streetAddress">3102 Highway 98</span>
    <span itemprop="addressLocality">Mexico Beach</span>,
    <span itemprop="addressRegion">FL</span>
  </div>
  Phone: <span itemprop="telephone">850-648-4200</span>
</div>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "LocalBusiness",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Mexico Beach",
    "addressRegion": "FL",
    "streetAddress": "3102 Highway 98"
  },
  "description": "A superb collection of fine gifts and clothing to accent your stay in Mexico Beach.",
  "name": "Beachwalk Beachwear & Giftware",
  "telephone": "850-648-4200"
}
</script>
The important thing to remember is that a search engine gets the same information from either format. The main difference is ease of implementation and maintenance for developers and programmers.
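To see why labelled data removes the guesswork, here is a minimal sketch of how a consumer could read a JSON-LD block using only Python’s standard library. The `JsonLdExtractor` class and the shortened sample page are assumptions for illustration; real crawlers are far more involved.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD script block on a page."""

    def __init__(self):
        super().__init__()
        self.in_json_ld = False
        self.buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.in_json_ld = True
            self.buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so collect them all.
        if self.in_json_ld:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self.in_json_ld:
            self.items.append(json.loads("".join(self.buffer)))
            self.in_json_ld = False


page = """
<h1>Beachwalk Beachwear & Giftware</h1>
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "LocalBusiness",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Mexico Beach",
    "addressRegion": "FL"
  },
  "name": "Beachwalk Beachwear & Giftware",
  "telephone": "850-648-4200"
}
</script>
"""

extractor = JsonLdExtractor()
extractor.feed(page)
business = extractor.items[0]

print(business["name"])                        # Beachwalk Beachwear & Giftware
print(business["telephone"])                   # 850-648-4200
print(business["address"]["addressLocality"])  # Mexico Beach
```

There is no pattern matching on labels or number formats here. The phone number is whatever sits under the `telephone` key.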
Schema is the thing you’re adding to the page. JSON-LD, RDFa, and Microdata are the ways a developer can add Schema.