Changing the Game with R&D in Technical SEO
This is a sponsored post written by OnCrawl. The opinions expressed in this article are the sponsor’s own.
What Is R&D & Why Is It Important?
Research and development (R&D) uses resources – time, energy, creativity – to find novel solutions to tough problems. It is essential to scientific study, to innovation, and in developing new technologies.
In SEO, a marketing field where the popularity of programming and machine learning is growing, R&D is key to leveling the playing field.
From BERT to AI analysis of SERPs, to core algorithm updates with no concrete instructions to webmasters, search engine technology is becoming increasingly complex.
Today, keeping up with algorithm changes and search query processing in SEO requires the type of innovation offered by R&D.
Who Is Carrying Out SEO Research?
Much of R&D in technical SEO is carried out by individuals in their own time. It’s easy to cite pioneering SEO pros such as Hamlet Batista, whose articles have appeared in Search Engine Journal, or JR Oakes, whose experiments with search engine analysis received attention at TechSEO Boost this past year.
A few SEO companies also carry out R&D work. R&D can be difficult to implement because it doesn’t bring direct revenue.
What it does provide is often linked to long-term growth through innovation:
- Exploration of a subject without the constraint of being required to deliver a marketable feature on a deadline.
- Testing new ways of doing things, which can circumvent what we previously thought were hard limits.
- Creative experimentation and testing of multiple solutions to a broad problem.
OnCrawl, an award-winning technical SEO platform, is one of the few SEO crawlers and data platforms on the market today that invests strongly in R&D.
Thanks in part to strong innovators such as its Product Director, Vincent Terrasi, who won 2nd place at TechSEO Boost this year, OnCrawl has been able to showcase the results of innovation and research, from the early example of its near-duplicate detection to its incorporation of big data technology.
But what goes on behind the scenes?
What OnCrawl Labs Reveals About R&D for SEO
Using the new OnCrawl Labs as an example, here is a glimpse under the hood of an R&D lab.
OnCrawl Labs is a platform that makes available R&D work in technical SEO in the form of Jupyter Notebooks, a text-and-code format used for presenting Python-based algorithms and their explanations.
The platform provides a portfolio of solutions based on R&D work, along with an explanation of each step, to predict online performance.
Feeding a Product Roadmap Through a Diversity of Subjects
The different solutions available on OnCrawl Labs show the breadth and richness of research in any R&D department.
An R&D department usually has many subjects underway at the same time. From internal link structures to prediction, and from indexing to high-quality text generation, OnCrawl’s laboratory covers a broad range of technical SEO subjects and techniques.
This feeds OnCrawl’s product roadmaps and uses innovation to improve product quality.
Building Technical Know-How Through Background Research
Almost all of the solutions offer extensive background reading, including GitHub resources, scientific articles, academic papers, and courses – both paid and free – on the subjects or techniques handled by the solution.
In fact, a significant part of R&D is spent on documenting the current state of the art. Without an in-depth understanding of subjects such as machine learning techniques, natural language text generation, or calculating normal values in a system, it is impossible to adapt these techniques to SEO.
Adapting to SEO Needs Through Proof of Concept, Tests & Examples
The solutions all begin with an example, or suggest you run a test (on a limited volume of data, or using a shorter calculation time). This not only illustrates how the solution works, but also proves that it is viable with SEO data.
R&D involves a significant amount of testing; what looks good on paper might not work as expected when implemented. Even scientific studies often aim only to prove that their solution can theoretically work, but might not have tested it on the type of data that SEO can offer.
How R&D Drives Advances in Technical SEO
By looking at two of the current available solutions in OnCrawl Labs, “Anomaly detection” and “Real-time indexing”, we can see different R&D strategies at work.
Anomaly Detection: What Constitutes Abnormal Website Performance?
Some solutions are the result of experimentation with an interesting algorithm that R&D engineers wanted to use for SEO purposes. Understanding how an algorithm works can give ideas as to the type of input it is good at processing, which can in turn inspire a way to produce that information from SEO work.
This is the case with the “Anomaly detection” solution in OnCrawl Labs.
Starting with a machine learning algorithm used to detect anomalies in a complex dataset, OnCrawl applied it to SEO. The solution “learns” what a website’s normal performance looks like based on a series of inputs. The baseline for fluctuations it has learned during this training allows it to decide whether some of the data is unusual.
Using OnCrawl technical audit results over multiple crawls as input, the algorithm can point out crawls in which the website over- or under-performed. This input can be too complicated to analyze easily by hand.
Rank, as reported by Google Search Console, is a good example of a worthwhile metric for this sort of analysis.
When we ask, “how does a site’s ranking vary over time?”, most SEO professionals are aware that some pages will lose positions on the search engine results pages, while others gain positions. This fluctuation is normal.
- How much fluctuation is abnormal?
- When should we worry?
The result is a solution offered in OnCrawl Labs that can identify the audits that reveal when a website’s ranking performance has dipped below what should be considered as normal variations in ranking.
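To make the idea of a learned baseline concrete, here is a minimal sketch. The article does not name the algorithm OnCrawl uses, so this example swaps in a simple statistical stand-in (a modified z-score built on the median and the median absolute deviation) rather than a machine learning model. The function name, the threshold of 3.5, and the sample positions are all hypothetical.

```python
from statistics import median


def flag_abnormal_crawls(avg_positions, threshold=3.5):
    """Return the indexes of crawls whose average position looks abnormal.

    Assumption: one average ranking position per crawl. The "normal"
    baseline is the median, and the usual amount of fluctuation is the
    median absolute deviation (MAD). This is a stand-in technique, not
    the algorithm used in OnCrawl Labs.
    """
    baseline = median(avg_positions)
    mad = median(abs(p - baseline) for p in avg_positions)
    if mad == 0:
        # Too little variation to estimate what normal fluctuation is
        return []

    flagged = []
    for index, position in enumerate(avg_positions):
        # 0.6745 rescales the MAD so the score is comparable to a z-score
        score = 0.6745 * (position - baseline) / mad
        if abs(score) > threshold:
            flagged.append(index)
    return flagged


# Hypothetical average positions over eight successive crawls
positions = [12.1, 11.8, 12.4, 12.0, 11.9, 18.7, 12.2, 12.3]
print(flag_abnormal_crawls(positions))
```

With this sample data, only the sixth crawl (index 5) is flagged: the small movements between 11.8 and 12.4 count as normal fluctuation, while the jump to 18.7 does not. Because the score is checked in both directions, the same function would flag a crawl that over-performed.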
Real-Time Indexing: Preventing Indexing Delays for New Pages
Another R&D strategy begins with a strategic SEO problem, such as getting new URLs indexed quickly, and attempts to find one or more solutions.
Getting new URLs indexed can be key for websites that have a lot of fluctuation in their pages: online publishers with new articles every day, ecommerce sites with variations in stock, etc.
- What happens if you can’t get a list of new pages?
- What if this list is too long to manually request a crawl for each new page?
- What if you suspect search engines are taking too long to discover your new pages?
In this case, there is a technical problem in getting pages indexed: identifying and submitting new pages is too complex to be done by normal methods.
An R&D engineer will break this problem down into smaller pieces:
- First, how can I automatically create a list of new pages?
- Then, how can I submit them to a search engine? Do all search engines work in the same way?
An R&D answer will attempt to cover as many of the individual pieces of this puzzle as possible, as OnCrawl Labs’ “Real-time indexing” does.
Using OnCrawl’s crawl comparison feature, the solution is able to construct a list of new URLs as soon as the website is analyzed.
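At its simplest, comparing two crawls to find new pages is a set difference. The sketch below assumes each crawl is available as a plain list of URLs; it does not use OnCrawl's crawl comparison feature or API, and the function name and sample URLs are hypothetical.

```python
def find_new_urls(previous_crawl, current_crawl):
    """Return URLs found in the current crawl but not in the previous one.

    Keeps the order in which URLs appear in the current crawl and
    drops duplicates.
    """
    seen_before = set(previous_crawl)
    already_added = set()
    new_urls = []
    for url in current_crawl:
        if url not in seen_before and url not in already_added:
            new_urls.append(url)
            already_added.add(url)
    return new_urls


# Hypothetical URL lists from two successive crawls
previous = ["https://example.com/", "https://example.com/a"]
current = [
    "https://example.com/",
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/b",
]
print(find_new_urls(previous, current))
```

Here the result contains only the URLs ending in `/b` and `/c`, each listed once.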
It then uses different methods of URL submission for each search engine: the new URL Submission API from Bing, and a combination of the Indexing API and sitemap updates for Google.
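The sketch below shows why each search engine needs its own handling: the request bodies differ in shape. It only builds the payloads and sends nothing. The endpoint URLs and field names are written as recalled from the public Google and Bing documentation and should be checked against the current documentation before use. Authentication (OAuth credentials for Google, an API key for Bing) is left out, and the function names are hypothetical.

```python
import json
import xml.etree.ElementTree as ET

# Assumed endpoints, as recalled from public documentation; verify before use
GOOGLE_INDEXING_ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"
BING_SUBMIT_ENDPOINT = "https://ssl.bing.com/webmaster/api.svc/json/SubmitUrlbatch"


def google_payloads(urls):
    """Build one JSON body per URL: the Indexing API takes a single URL per notification."""
    return [json.dumps({"url": url, "type": "URL_UPDATED"}) for url in urls]


def bing_payload(site_url, urls):
    """Build a single JSON body carrying the whole batch of URLs for one site."""
    return json.dumps({"siteUrl": site_url, "urlList": list(urls)})


def sitemap_xml(urls):
    """Build a minimal sitemap document that lists the given URLs."""
    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical list of new URLs
new_urls = ["https://example.com/b", "https://example.com/c"]
print(google_payloads(new_urls))
print(bing_payload("https://example.com", new_urls))
print(sitemap_xml(new_urls))
```

Two new URLs produce two separate Google notifications but a single Bing batch, plus one sitemap that lists both.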
By scheduling both website analyses and this script to run regularly, it’s possible to automate crawl requests for new URLs – even when you don’t know which URLs are new.
OnCrawl Labs: Open R&D for SEO Features Not Yet Available to the Public
Not all R&D solutions can be built into full-fledged tools or features, but that doesn’t mean that they have no value for SEO professionals.
However, R&D work is not usually released to the public for multiple reasons.
First, it is often “unfinished”: it might work only in certain circumstances, or might break easily, or might not be fully documented. Public users may need to provide finishing touches in order to be able to apply R&D to their own situations.
Additionally, R&D work often requires users to have advanced knowledge of technical elements, algorithms, and techniques, which the general public might not possess.
Finally, R&D work has immense value for innovation but can be very difficult and expensive to market to users who are used to plug-and-play solutions.
Because of this, R&D is rarely visible to the public until it has matured into a marketable product or feature.
This is what is so unusual about OnCrawl Labs: OnCrawl has chosen to make its R&D work available to users of its platform who already have access to the OnCrawl API.
It provides access to SEO solutions that aren’t available on the market yet – including some that may never become available, commercially speaking.