“The web activity in terms of CO2 production is equivalent to the production of CO2 from the aviation industry worldwide. And Google is responsible for almost half of it (45%).” - Gautier Dorva/
What if you could reduce your site’s ecological footprint and, at your level, the greenhouse gas (CO2) emissions inherent to its existence on the Web?
The idea is original, even crazy some would say, but for more than six months it has gradually been gaining ground. How could a simple website help fight global warming?
To understand it, you have to immerse yourself in the world of the Web, search engines and how they work. Every day, new websites appear on the Internet and, whatever their content, enrich the information available online. Search engines crawl and index this content to make it accessible through simple queries, so that anyone, anywhere in the world, can reach the most varied content on their computer or cell phone. This is the beauty of search engines, of which Google is the most effective representative.
However, crawling a site is particularly energy consuming, because it is carried out almost continuously so that published information, including the most recent, stays accessible to anyone. Crawlers (spiders) visit your website, moving from link to link, to detect new pages, new content and changes, and to represent them as accurately as possible in search engine result pages (SERPs). Google, which in a few years has become the most widely used search engine in the world, excels at providing the most appropriate and relevant information for the queries its users make. It is a clever mix of technology, algorithms, servers, power, etc. that allows the latest article published on your site to be read, organized, referenced and, within a few days, made available in a few clicks to the first visitor interested in your subject.
But what can you do about it? After all, you and your website cannot be held responsible for this. You are doing your part and, a priori, you have not asked Google for anything, even if in reality you depend on it considerably for traffic to your site. And this work is titanic. To give you an idea, Google handles more than 3.5 billion searches per day on behalf of its users, making it the main culprit, responsible for up to 40% of the carbon footprint of the Web in general. In 2015, a study established that the CO2 produced by web activity (through the use of millions of servers, cooling systems, etc.) was equivalent to that of the aviation industry worldwide.
That’s why, very early on, as part of the continuous improvement of its crawling system, Google defined a limit on the number of links a robot can explore in a session. The reason is simple: the power required by indexing robots to explore your site directly impacts the performance and efficiency of your website.
However, this limitation, called the “crawl budget”, is not a general rule applied by all (known) search engines, and certainly not by the thousands of web robots (“scrapers”) that continuously visit, copy and analyze the Web in general.
So, if you could accurately tell crawlers, whoever they are (Google, Bing, Yahoo, Baidu, etc.), what they can explore and what is not necessary for your visibility, you could both ensure better performance for your website and, ABOVE ALL, significantly reduce the energy (and therefore the power consumption) required by your hosting server, Google and all other exploration entities on the Web.
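To make the idea of a crawl budget concrete, here is a toy sketch in Python. It is not how Google or any real crawler works: the site map, the paths and the `crawl` function are all invented for illustration. It only shows that, with the same limit on the number of page visits, a robot told to skip some sections spends its visits on the pages that matter.

```python
from collections import deque

# Hypothetical site: each path maps to the internal links found on that page.
SITE = {
    "/": ["/search/a", "/search/b", "/blog/"],
    "/search/a": ["/search/c"],
    "/search/b": [],
    "/blog/": ["/blog/post-1", "/blog/post-2"],
}


def crawl(start, links, budget, disallowed=()):
    """Breadth-first crawl that stops after `budget` page visits.

    `disallowed` is a tuple of path prefixes the robot agrees not to follow
    (an assumption standing in for the instructions a site could give).
    """
    seen = {start}
    queue = deque([start])
    fetched = []
    while queue and len(fetched) < budget:
        page = queue.popleft()
        fetched.append(page)  # one visit = one unit of budget (and of energy)
        for target in links.get(page, []):
            blocked = any(target.startswith(prefix) for prefix in disallowed)
            if target not in seen and not blocked:
                seen.add(target)
                queue.append(target)
    return fetched


print(crawl("/", SITE, budget=4))
print(crawl("/", SITE, budget=4, disallowed=("/search/",)))
```

With a budget of four visits, the unrestricted crawl spends two of them on search pages and never reaches the blog posts, while the restricted crawl reaches both posts with the same number of visits.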
In fact, what you could do is simplify the work that Google’s indexing robots (crawlers) have to do when they visit your website. If every site owner did the same, this could reduce, on a global scale, the CO2 emitted by Google to read the Web, organize the information and give users access to it.
You may not know it, but your website is not limited to the pages you create with your content, nor to what is visible in search results. Your site contains an astronomical number of internal links, intended strictly for its operation: to generate interactions between pages, to filter results, to organize content, to allow access to certain restricted information, etc. And when your site is made available for crawling by search engines, crawlers systematically try to visit all the links it contains in order to identify information, index it and make it available. But, and this part is important, crawling your website requires a lot of energy, both from the crawlers (search engine servers) and from your own hosting server.
This optimization mission is the one that PAGUP, a Canadian agency specialized in search engine optimization (SEO), has taken on by creating a plugin dedicated to WordPress that makes it possible, very simply and in a few clicks, to optimize a file called robots.txt.
The “robots.txt”. As incredible as it may seem, this whole crawling operation is governed by a small file placed in the root directory of a website (on the hosting server). This file has one simple role: communicating with search engines. In fact, it is so important that when indexing robots explore your website, it is the very first file they look for and read, to know exactly what to do with your site.
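Because the file always sits at the root of the site, its address can be derived from any page address. The short Python sketch below illustrates this; the function name `robots_txt_url` and the example addresses are hypothetical and chosen only for the demonstration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt address for the site that serves `page_url`."""
    parts = urlsplit(page_url)
    # Keep the scheme and host (including any port), replace the path,
    # and drop the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_txt_url("https://example.com/blog/2019/my-article/?utm=x"))
# prints https://example.com/robots.txt
```

Note that each host has its own file: a subdomain such as shop.example.com would be governed by its own robots.txt, not by the one on example.com.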
It is by inserting precise “instructions” into this file that it is possible to tell crawlers what they can read and index, and what they cannot. The plugin in question, called Better Robots.txt, which had more than 10k downloads in 6 months, makes it quite easy to produce a robots.txt file optimized for any WordPress site, refining the work that exploration robots have to perform for most search engines and a large number of other entities.
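As an illustration of what such instructions look like, the Python sketch below parses a small robots.txt and asks which addresses a crawler may visit. The rules are hypothetical examples for a WordPress-style site and are not the output of Better Robots.txt; `is_allowed` and the example.com addresses are likewise invented for the demonstration. It relies on Python's standard `urllib.robotparser`, which applies rules in file order, which is why the `Allow` line comes first here; other crawlers may resolve conflicting rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-admin/admin-ajax.php
Disallow: /wp-admin/
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(url: str, user_agent: str = "*") -> bool:
    """Return True if the parsed rules let `user_agent` fetch `url`."""
    return parser.can_fetch(user_agent, url)


for url in (
    "https://example.com/blog/my-article/",
    "https://example.com/wp-admin/options.php",
    "https://example.com/wp-admin/admin-ajax.php",
    "https://example.com/search/robots",
):
    print(url, "->", "crawl" if is_allowed(url) else "skip")
```

A robot that respects these rules would skip the administration and internal search pages and keep its visits for the published content. The file is only a request: robots that choose to ignore it are not technically prevented from crawling.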