How search engines work


Post by subornaakter24 »

The citation index (CI) is a measure of how many citations (references to an original source) a document receives. It helps determine which recently created documents refer to earlier publications. The CI is used to evaluate both articles and authors, for example in the scientific community.

In Yandex, as in other search engines, the citation index is the number of backlinks to a resource, not counting links from sources where the number of links can keep growing without any involvement of the resource owner: unmoderated directories, bulletin boards, online conferences, server statistics pages, XSS links, and so on.
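The exclusion rule described above can be sketched in a few lines of Python. This is only an illustration: the source-type labels, the `citation_index` function, the example domains, and the choice to count each referring site once per target are all assumptions made for the sketch, not Yandex's actual rules or data.

```python
# Hypothetical labels for the excluded source types named in the post.
EXCLUDED_SOURCE_TYPES = {
    "unmoderated_directory",
    "bulletin_board",
    "online_conference",
    "server_statistics",
    "xss",
}

def citation_index(backlinks):
    """Count backlinks per target site, skipping excluded source types.

    backlinks: iterable of (source_site, source_type, target_site) tuples.
    Assumption: each referring site is counted once per target, and a site
    linking to itself does not count.
    """
    referrers = {}
    for source, source_type, target in backlinks:
        if source_type in EXCLUDED_SOURCE_TYPES or source == target:
            continue
        referrers.setdefault(target, set()).add(source)
    return {target: len(sources) for target, sources in referrers.items()}

# Made-up example data.
backlinks = [
    ("a.example", "editorial", "target.example"),
    ("b.example", "bulletin_board", "target.example"),  # excluded type
    ("c.example", "editorial", "target.example"),
    ("a.example", "editorial", "target.example"),       # repeat, counted once
    ("d.example", "xss", "other.example"),              # excluded type
]
ci = citation_index(backlinks)
print(ci)
```

In this sketch, only the links from a.example and c.example count toward target.example, and other.example gets no entry at all because its only backlink comes from an excluded source type.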

Note that in the Aport catalogue, the CI is treated as a weighted citation index.

To calculate this index, a link graph is used: sites are the nodes of the graph, and links between sites are its edges. The link graph then looks like the diagram in the figure:

[Figure: link graph of sites A, B, …, F]

Here A, B, …, F are specific sites in the Yandex search engine index, and the arrows show the direction of the links between them (one-way or two-way).
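As a rough illustration of such a link graph, here is a minimal Python sketch that stores sites as nodes and links as directed edges. The figure from the original post is not available, so the edge list below is invented for the example, and the helper names `build_link_graph` and `two_way_pairs` are hypothetical.

```python
# Hypothetical edges between sites A..F; the real figure's edges are unknown.
links = [
    ("A", "B"), ("B", "A"),  # a two-way connection
    ("A", "C"),
    ("C", "D"),
    ("D", "E"),
    ("E", "F"),
    ("F", "D"),
]

def build_link_graph(edges):
    """Return (outgoing, incoming) adjacency maps for a directed link graph."""
    outgoing, incoming = {}, {}
    for source, target in edges:
        outgoing.setdefault(source, set()).add(target)
        incoming.setdefault(target, set()).add(source)
        # Make sure every node appears in both maps, even with no links.
        outgoing.setdefault(target, set())
        incoming.setdefault(source, set())
    return outgoing, incoming

def two_way_pairs(outgoing):
    """List pairs of sites that link to each other (two-way arrows)."""
    return sorted(
        (source, target)
        for source, targets in outgoing.items()
        for target in targets
        if source < target and source in outgoing[target]
    )

outgoing, incoming = build_link_graph(links)
print(sorted(incoming["D"]))   # sites that link to D
print(two_way_pairs(outgoing))
```

With this representation, the plain citation index of a site is just the size of its `incoming` set, which is why it says nothing about how significant the referring sites themselves are.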

The citation index plays a major role in how a search engine ranks documents, but the final results do not depend on this indicator alone.

The citation index is believed to characterize the significance of a publication, but it does not reflect the structure of a site's links. As a result, resources with different numbers of external links can receive the same index value.

To eliminate this drawback, a weighted citation index is used, which reflects not only the quantity but also the quality of the referring resources. Link-based search and static link popularity make the work of search engines easier by helping them resist various kinds of text spam. The Google search engine uses the PageRank indicator, which is similar to the weighted citation index.
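To make the idea of weighting by the quality of referring resources concrete, here is a minimal sketch of the simplified, publicly described PageRank iteration. It is not the formula used by Yandex or Aport for their weighted citation index (the post does not give one), and it is not Google's production algorithm. The damping factor of 0.85, the iteration count, and the example graph are assumptions chosen for illustration.

```python
def pagerank(outgoing, damping=0.85, iterations=50):
    """Simplified PageRank by repeated iteration.

    outgoing: dict mapping every node to a list of nodes it links to.
    Every link target must also be a key of the dict.
    """
    nodes = list(outgoing)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not outgoing[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n
            for node in nodes
        }
        # Each page passes its rank on in equal shares to the pages it links to.
        for node in nodes:
            targets = outgoing[node]
            for target in targets:
                new_rank[target] += damping * rank[node] / len(targets)
        rank = new_rank
    return rank

# Made-up graph: X and Y each have exactly one backlink, but X is linked
# from H, which itself has three backlinks, while Y is linked from L,
# which has none.
graph = {
    "P1": ["H"], "P2": ["H"], "P3": ["H"],
    "H": ["X"],
    "L": ["Y"],
    "X": [], "Y": [],
}
ranks = pagerank(graph)
print(ranks["X"] > ranks["Y"])
```

In this example X and Y have the same plain citation index (one backlink each), yet X ends up with the higher weight because its single referrer is itself well linked. That is the distinction the weighted index is meant to capture.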

To calculate the weighted citation index (WCI), as well as other factors that affect ranking, a link graph is used. Site owners can roughly estimate the WCI of their own resource by checking its PageRank value with any of the available online services. Keep in mind, however, that the Yandex index contains only Russian-language documents plus some popular foreign ones, so the Yandex WCI value will differ from Google's PageRank.