Contents:
  • Why is text generation needed?
  • Reproduction of articles - what is it?
  • Basic designs in generators
  • What algorithms are used to evaluate the text
  • What is the difference between product and article templates?
  • How to create your first template
  • The most common mistakes in creating templates and multiplying articles

Text generation is a process that produces many texts meeting given conditions from a single template. A good example is any CMS (content management system) such as Joomla, WordPress, OpenCart and others. A "static" page acts as a template into which information from the database is inserted. For a product card in an online store, that information is the description, blocks, attributes, options, and so on. On article sites, it is the article text, publication data, related blocks, and so on. This approach significantly reduces the time spent on maintaining the site.

However, text generation is not limited to this example. It also covers the generation of pseudo-unique texts. But let's take things in order.

Why is text generation needed?



As you have probably gathered, almost every site today is a product of text generation. However, there are areas where generation is minimal, such as article sites, where the main text of each article page is written by a person and is unique (relatively unique). And there are areas where it is simply not possible to do without generating the main text, because writing an interesting and original text for every case is not justified (it takes too much time for too little result). Examples are software catalogs, online stores, article promotion, and others.

Just imagine that you have 1,000 products that are virtually identical and differ in only a few parameters. Writing 1,000 unique articles by hand is simply unrealistic. Anyone who has written a decent article at least once knows that it can take anywhere from an hour to, seemingly, forever. The math is simple. If you write 8 articles a day, each taking at least an hour including formatting, you will need about 125 days. That is more than a third of a year, which could be spent on something more useful.

However, it is important to understand that text generation requires care and thoughtful use, because search engines have no wish to clutter their results. The outcome depends on how you approach the process. The site's search positions may rise, traffic may increase, behavioral factors may improve, and so on. Or, on the contrary, it may lead search engines to apply filters and other sanctions to the site.

From here on, "text generation" means creating main texts from a single template.

Reproduction of articles - what is it?

There are more than 1 billion websites today. Just think about this number. And each of them has far more than one page. Search engines need to rank all these sites in their results for a number of queries of the same order of magnitude. The task is enormous. Therefore, a great many factors are taken into account, and their number keeps growing. For example, the number of links a site needs in order to reach a given TIC (Yandex's thematic citation index) increases from year to year.

For this and some other reasons, a process known as "multiplication of articles" has become very popular for promoting sites and generating pages. With sufficient skill, it lets you get hundreds of pseudo-unique articles in literally an hour, that is, articles that are unique from the point of view of search engines but similar from the point of view of a human reader.

What is meant by reproduction of articles? In simple terms, the process consists of several steps:

  1. Writing a regular article
  2. Inserting specialized constructions that allow the text to be modified
  3. Setting the parameters for similarity assessment and the number of required articles
  4. Generating the texts

Those who reproduce texts often usually write templates straight away and reuse saved constructions from other templates. However, if you are just starting out with generation, I strongly advise against jumping straight into templates. You need to get a feel for the process in practice. Over time, once you get the hang of it, a well-made template will let you quickly get the required number of articles, none of which resembles another.

At the same time, it is important to understand that in a world where rewrites and duplicates in search results are a normal phenomenon, the reproduction of articles is a completely natural process (it is neither bad nor good, just the way it is).

It is also important to understand that the reproduction of texts is not a panacea and should be used carefully. In addition, the articles themselves must be human-readable. More on mistakes near the end of the article.

Basic designs in generators

Many generator programs and sites offer their own set of unique constructions, but there are basic constructions that are found most often.

There are two of them; let's consider them first:

1. Synonymization. This term refers to replacing words with words of similar meaning, or simply to random substitution (there is no clear criterion here). The construction itself is an opening curly brace " { ", followed by words or sentences separated by a vertical bar " | ", and finally a closing curly brace " } ".

Consider the following pattern:

    You will get this product together with { a discount | a gift | a special offer | a 10% bonus card}

The output will contain lines like the following, chosen at random:

    ...
    You will get this product together with a discount
    You will get this product together with a 10% bonus card
    You will get this product together with a gift
    You will get this product together with a special offer
    ...

As you can see, using this construction to replace words or sentences gives you different texts. It is also important to know that such constructions can be nested inside each other to save space, so that you do not have to repeat whole phrases that differ by only one word.
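
As a rough illustration of how such a construction can be expanded, here is a minimal Python sketch. The function name `spin` and the regular expression are assumptions made for this example and do not come from any particular generator. It resolves the innermost groups first, which is one simple way to support nesting.

```python
import random
import re

# Innermost {...|...} group: a brace pair with no other braces inside it.
INNERMOST_GROUP = re.compile(r"\{([^{}]*)\}")


def spin(template, rng=random):
    """Replace every {a|b|c} group with one randomly chosen option.

    The leftmost innermost group is resolved on each pass, so nested
    groups are handled without a dedicated parser.
    """
    while True:
        match = INNERMOST_GROUP.search(template)
        if match is None:
            return template
        options = [option.strip() for option in match.group(1).split("|")]
        template = (
            template[:match.start()] + rng.choice(options) + template[match.end():]
        )


# Hypothetical template with one nested group.
template = "You will get this product together with {a discount|a {small|big} gift}"
print(spin(template))
```

Note that this simple approach does not give every final variant the same probability; a real generator may weight the options differently.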

Since such constructions were originally used to replace words with synonyms, many synonymization and reproduction programs and services ship with their own ready-made databases of them. For this reason, you can in principle get completely unique texts right after installing the program, although you will have to check them manually, because automatic replacement sometimes produces nonsensical texts.

2. Permutation. This construction lets you shuffle words and phrases. It is found in almost all text generation programs and services, although it is not mandatory. Permutation is very useful when you need to reorder sentences or the items of a description. The construction is similar, with some differences. It starts with an opening square bracket " [ ", then the words and phrases to be permuted are listed, separated by a vertical bar " | ", and at the end comes a closing square bracket " ] ". An important note: depending on the program or service version, the construction may be slightly modified. For example, some add the ability to specify the characters or words used as separators, so that they do not have to be repeated in every item.

Consider an example:

    This program lets you [ watch video, | listen to audio, | insert comments,] edit highlighting

The output will contain phrases like the following, in random order:

    ...
    This program lets you watch video, listen to audio, insert comments, edit highlighting
    This program lets you listen to audio, insert comments, watch video, edit highlighting
    This program lets you watch video, insert comments, listen to audio, edit highlighting
    ...

As you can see, this construction gives you relatively different fragments of text. It is important to understand that rearranging the text also affects the similarity of the texts, although the meaning itself usually stays the same.

Now consider some specialized constructions:

1. Insertion. When you have ready-made pieces of text, or information can be taken from a database, you can use them through insertion constructions. Usually an insertion is a special word surrounded on both sides by brackets or combinations of brackets, for example " [name] ", " {family} ", " [[nick]] " and others. The format varies by program, but the idea is usually the same.
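
The permutation construction can be sketched in a similar way. The code below is only an illustration under assumed rules: items are separated by " | ", shuffled, and joined with a single space. The names `permute` and `joiner` are hypothetical, and real generators may treat separators differently.

```python
import random
import re

# A [...|...] group with no nested square brackets inside it.
PERMUTATION_GROUP = re.compile(r"\[([^\[\]]*)\]")


def permute(template, rng=random, joiner=" "):
    """Shuffle the items of every [a|b|c] group and join them with `joiner`."""

    def shuffled(match):
        parts = [part.strip() for part in match.group(1).split("|")]
        rng.shuffle(parts)  # in-place random reordering
        return joiner.join(parts)

    return PERMUTATION_GROUP.sub(shuffled, template)


template = (
    "This program lets you "
    "[watch video,|listen to audio,|insert comments,] edit highlighting"
)
print(permute(template))
```

Because the commas belong to the items themselves, the result stays punctuated correctly in any order; a separator option like the one mentioned above would remove the need to repeat them.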

Let's consider an example. Say you need to generate texts for users:

    Dear [name], please confirm your order [order_num]

For each user, the output will be a text of the form:

    ...
    Dear Vasily, please confirm your order No. 123
    Dear Proskin, please confirm your order No. 444
    ...

As you have probably guessed, such insertions are especially useful when there is a large amount of data of the same type, such as products in one category, programs in catalogs, and so on.

2. Conditional functions. These are specialized constructions that decide logically which text to insert (or whether to insert any). Examples are functions that check values: equal to, greater than, less than, and so on. The format of these functions is unique to each generator, so there is no point in listing them. They are very useful when a template is written for several subject areas that differ from one another. However, such functions are quite rare.
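
A minimal sketch of insertion, assuming the " [name] " placeholder style and records stored as plain dictionaries. The function name `fill` and the sample records are made up for illustration.

```python
import re

# A placeholder is a word in square brackets, for example [name].
PLACEHOLDER = re.compile(r"\[(\w+)\]")


def fill(template, record):
    """Replace each [key] with record[key]; unknown keys are left untouched."""

    def lookup(match):
        key = match.group(1)
        return str(record.get(key, match.group(0)))

    return PLACEHOLDER.sub(lookup, template)


template = "Dear [name], please confirm your order [order_num]"
users = [  # hypothetical rows, e.g. fetched from a database
    {"name": "Vasily", "order_num": "No. 123"},
    {"name": "Proskin", "order_num": "No. 444"},
]
for user in users:
    print(fill(template, user))
```

Leaving unknown placeholders in place makes missing data easy to spot when the generated texts are reviewed.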
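
Since every generator has its own syntax for conditions, the sketch below does not imitate any of them. It only shows the idea in plain Python: a list of rules is checked in order, and the text of the first rule that holds is inserted. The rule format and the names `CHECKS` and `pick_text` are assumptions for this example.

```python
import operator

# The three checks mentioned in the text: equal to, greater than, less than.
CHECKS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}


def pick_text(value, rules, default=""):
    """Return the text of the first rule whose check holds for `value`."""
    for check, threshold, text in rules:
        if CHECKS[check](value, threshold):
            return text
    return default


# Hypothetical rules for a price-dependent sentence in a product template.
price_rules = [
    ("lt", 1000, "A budget-friendly choice."),
    ("gt", 5000, "A premium model for demanding users."),
]
print(pick_text(500, price_rules))
print(pick_text(9000, price_rules))
print(pick_text(2000, price_rules, default="A balanced mid-range option."))
```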

What algorithms are used to evaluate the text

Many algorithms are used to evaluate text similarity. The best known are direct comparison and the shingle method. There are others, but these two are usually more than enough for most common tasks.

1. Direct comparison. As the name suggests, it measures the extent to which the texts are literally identical. Note that adding a word at the beginning of a text does not make the text unique, because the rest still matches completely. The advantage of this method is that it is easy to understand; the disadvantage is that it is a weak indicator from the point of view of search engines. For example, you can simply rearrange fragments of the text and get an article that counts as unique under direct comparison, yet a search engine will not perceive it as unique.

2. Shingle method. This algorithm is one of the methods search engines use to evaluate text. It is not the whole picture, since search engines do not disclose their algorithms in order to keep their results in good shape. Still, this method is often used to evaluate text similarity and gives good results.

Its essence is as follows. A number of consecutive words is chosen (the shingle length). The entire text is then broken into fragments of that many words. The window shifts not by the whole shingle length but by one word each time, so the fragments overlap. The resulting shingles are hashed to save space. Two texts are then compared by the number of shingles they share rather than by the raw text. This approach cancels out the effect of permuting phrases and sentences: if you swap two sentences, the shingles hardly change.
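
The article does not define direct comparison precisely, so the sketch below is just one possible reading: it compares the two texts word by word in order, using `difflib.SequenceMatcher` from the Python standard library. The function name `direct_similarity` and the sample texts are assumptions.

```python
from difflib import SequenceMatcher


def direct_similarity(text_a, text_b):
    """Return difflib's similarity ratio over the two word sequences (0.0 to 1.0)."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()


original = "The price includes delivery. Payment is made on receipt."
prefixed = "Attention! " + original  # one word added at the start
swapped = "Payment is made on receipt. The price includes delivery."

print(direct_similarity(original, prefixed))  # stays close to 1
print(direct_similarity(original, swapped))   # drops noticeably
```

This mirrors the two observations above: an extra word at the start barely changes the score, while swapping the sentences lowers it a lot even though the content is the same.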

Consider the text:

    Product price is X with the promotion.

For example, let's take a shingle length of 3 words. The following shingles are obtained:

1. Product price is
2. price is X
3. is X with
4. X with the
5. with the promotion.

Now, if you move the "with the promotion" part to the front, add a few words, and get "With the promotion, the super product price is only X", that phrase will still have some percentage of similarity, because some of the shingles still occur in it. Under a direct comparison, these two sentences would be almost completely different.
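
The shingle method can be sketched in a few lines of Python. This is a simplified illustration, not what any search engine actually runs: punctuation and case are ignored, hashing is omitted for readability, and the similarity score (shared shingles divided by all distinct shingles) is an assumed choice, since the text only says that texts are compared by their shingles.

```python
import re


def word_shingles(text, size=3):
    """Split text into overlapping runs of `size` words, shifting by one word."""
    words = re.findall(r"\w+", text.lower())
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def shingle_similarity(text_a, text_b, size=3):
    """Shared shingles divided by all distinct shingles of both texts."""
    shingles_a = set(word_shingles(text_a, size))
    shingles_b = set(word_shingles(text_b, size))
    if not shingles_a or not shingles_b:
        return 0.0
    return len(shingles_a & shingles_b) / len(shingles_a | shingles_b)


first = "Product price is X with the promotion."
second = "With the promotion, the super product price is only X"

print(word_shingles(first))
print(shingle_similarity(first, second))
```

The two sentences share the shingles "with the promotion" and "product price is", so the score is above zero even though the wording was rearranged and extended.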

It is important to understand that this is a simple example and that there are many variations of the shingle method. The text may be cleaned of stop words, that is, uninformative words such as the prepositions "in", "on" and so on. Word endings may be stripped. Word order within a shingle may or may not matter. Words may be evaluated together with their synonyms. And so on.

Therefore, when composing the text it is very important to paraphrase sentences, fill them with non-standard insertions and words, and add or, on the contrary, remove paragraphs in order to dilute the shingles. In general, make the text diverse.
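
Two of these variations, stop-word removal and order-insensitive shingles, are easy to show in code. The sketch below is an assumption-laden illustration: the stop-word list is tiny and arbitrary, and stripping endings and handling synonyms are left out because they need language-specific resources.

```python
import re

# A tiny illustrative stop-word list; real lists are much longer.
STOP_WORDS = frozenset({"a", "an", "the", "in", "on", "of", "with"})


def normalized_shingles(text, size=3, stop_words=STOP_WORDS, ordered=True):
    """Build shingles after dropping stop words.

    With ordered=False the words inside each shingle are sorted,
    so the same words in a different order give the same shingle.
    """
    words = [w for w in re.findall(r"\w+", text.lower()) if w not in stop_words]
    shingles = set()
    for i in range(len(words) - size + 1):
        window = words[i:i + size]
        shingles.add(tuple(window) if ordered else tuple(sorted(window)))
    return shingles


print(normalized_shingles("The price of the product", size=2))
print(normalized_shingles("product price", size=2, ordered=False))
```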

What is the difference between product and article templates?

While search engines set fairly high requirements for articles, the requirements for product descriptions are lower. The reasons are simple. Competing online stores mostly sell the same products. Products easily number in the thousands. Not everyone needs walls of text; many shoppers go by price and specifications. And in principle it is hard to make product descriptions very different from one another: recall the beginning of the article, where I described how long it would take to write 1,000 articles, one per product.

The concessions for product descriptions are usually as follows:

  • The minimum text length is lower (according to various sources, the minimum is 300 to 1,000 characters, while for articles the threshold today is 1,500-2,500)
  • Search engines are more tolerant of duplicated content (not only across different sites but also within one site, for example similar products with similar descriptions)
  • Search engines focus more on other indicators and specific data, such as keywords (manufacturer, specifications, model, etc.)

Of course, this does not mean that you can be careless when creating product templates. It just means that templates for product texts are easier to create, and a lot can be taken from the characteristics and metadata of the product itself.

How to create your first template

First of all, if you are creating a template for a website or online store, make a backup of the site. You will always have time to write templates, but restoring the descriptions of hundreds of products after an experiment is a very difficult task. The next thing to know is that if you have never made templates before, you should start with small jobs or small volumes. Do not take on all the products on the site at once. You first need to see with your own eyes what the process looks like and what the result is.

Now that the warnings are out of the way, let's move on to a short algorithm for creating your first template:

1. Open a search engine and look for descriptions of similar products and articles. Based on this material, write your own article, and make it a good one rather than a carbon copy of the sources.

2. Fill the text with specialized constructions, such as synonymization, permutation, insertion, and conditional functions (check which ones your generator supports).

3. Run the text generation.

4. Check how unique the texts are. You can use plagiarism-checking programs or sites, of which there are plenty on the Internet. If the checker uses the shingle method, set the shingle length to around 5-7 words; 5 is preferable, though it is not always suitable.

5. If the texts are not unique enough, go back to step 2 and rework the template (add to it, change parts, and so on). If you measure with analyzers, aim for uniqueness of at least 80%. If you judge "by eye", look at similar stores at the top of the search results to see how similar their product descriptions are. The latter is, of course, no longer a gold standard, since search engines shape their results according to many factors, but at least it is a reference point.

6. You now have a ready-made template. Be sure to save it somewhere.

At first, creating templates will be slow, but once you get the hang of it, they will not take much time.
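
Steps 3 to 5 can be sketched as a small loop: generate a variant, compare it with the variants already accepted, and keep it only if it is different enough. Everything here is an assumption for illustration: the "{a|b}" syntax, the shingle-based score, the function names, and the threshold value, which should be tuned to whatever checker you actually use.

```python
import random
import re

GROUP = re.compile(r"\{([^{}]*)\}")


def spin(template, rng):
    """Expand {a|b|c} groups, innermost first, picking options at random."""
    while True:
        match = GROUP.search(template)
        if match is None:
            return template
        choice = rng.choice([o.strip() for o in match.group(1).split("|")])
        template = template[:match.start()] + choice + template[match.end():]


def shingles(text, size):
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size):
    """Shared shingles divided by all distinct shingles (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def generate_unique(template, wanted, max_similarity=0.2, size=5,
                    attempts=1000, seed=0):
    """Collect up to `wanted` variants that are pairwise different enough."""
    rng = random.Random(seed)
    kept = []
    for _ in range(attempts):
        if len(kept) == wanted:
            break
        candidate = spin(template, rng)
        if all(similarity(candidate, other, size) <= max_similarity
               for other in kept):
            kept.append(candidate)
    return kept


# Hypothetical template; a real one would be much longer.
template = (
    "{Buy|Order|Get} this {reliable|sturdy|handy} kettle {today|now} and "
    "{enjoy|receive} {free|fast} delivery {across the country|to your door}"
)
for text in generate_unique(template, wanted=3, max_similarity=0.5, size=2):
    print(text)
```

If the loop returns fewer texts than requested, the template does not have enough variation for the chosen threshold, which is the signal to go back to step 2.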

The most common mistakes in creating templates and multiplying articles

Now it is worth learning about the most frequent mistakes beginners make when composing templates and multiplying articles, so that you can avoid as many pitfalls as possible:

  • "I will make a universal template for all occasions." In fact this is possible and the results can be good, but beginners had better not try it at first. There are several pitfalls. First, the template will grow and you will get lost in the constructions. Do not go by the examples in this article: real templates usually look like a chaos of symbols and individual constructions. Second, if the requirements for part of the texts change, you will have to either complicate the template even further or copy and modify the universal template. Third, with little experience it is easy to lose track of the context and end up with readable but meaningless text. In general, there will be many more problems at the early stages.
  • "I will just take a ready-made database of synonyms." A synonym database can be used, but not thoughtlessly. All such texts must be checked for readability after generation. "And your dairy product went fast" is no longer the line from the cartoon about Carlson (where it is the milk that has boiled over).
  • "I will write a template straight away." Seeing the finished text behind a template is a skill that has to be learned. A beginner will get confused halfway through and lose track of what kind of text is being composed. As a result, the template will not only have to be brought to the desired percentage of similarity but also turned back into a coherent text.
  • "I will stuff in key phrases and other tricks." Remember that reproduced texts must comply with the norms for SEO texts. The fact that the articles turned out unique only means that they can enter the search results normally. So keep an eye on the other aspects of SEO as well. For example, do not overdo keywords, use inexact keyword occurrences, and so on.
  • "I came up with something else, so I will regenerate the entire assortment." Remember that frequently changing text, especially in huge quantities, is a signal for search engines. In addition, if the templates use synonymization, random words and phrases will be substituted anew each time, which changes the text. Approach this thoughtfully. For example, if you need to add something to the end of the texts, check whether your generator can build a template that first inserts the existing text and then appends your addition. Search engines treat such changes much more leniently, because it is understood that any description can be supplemented over time (but not rewritten completely, let alone with synonyms).
  • "Oh, traffic shot up right after I generated texts from one template, so I will quickly do the rest." It is important to understand that search engines evaluate texts and sites quite slowly. So it is quite possible that the rise was caused by something else. Do not rush to run all the texts at once, especially if you are not sure about the templates. Watch what happens. I also recommend that you do not roll everything back at the first sign of a decline. A temporary drop in traffic can occur when the search results are updated.

Now you know more about text generation and article reproduction, have been warned about a number of trouble spots, and know various subtleties.

