some thoughts on the structure of the Help article

tagezi · December 3, 2015, 9:24pm

Hi,

This letter was prompted by bug report #95968 [1]. It concerns my work in the
documentation team, where I want to do high-quality work without causing
inconvenience to other community members. I would like to clarify and discuss
some issues that came up in reports #95968 and #95496 [2].
Everything said in this letter applies to any Help topic. But I will talk only
about Calc functions, because the 486 pages in the Help may share related
mistakes and, in addition, the letter will be considerably shorter.
My professional activities and my education give me a notion of the full life
cycle of software, including documentation. However, I can also be partly
wrong. In addition, I tend to believe that in open societies all decisions
should be made together, and if we cannot agree with the standards, we need to
change them. Therefore, I would like to open a discussion on how we should
proceed with the Help development. I'm not a native English speaker, so please
ask questions if I do not get my idea across clearly enough. I would like to
get your opinion as well, if possible.

This is my vision of the structure of Calc function articles.

1. Article title
The title of the article should be brief but give enough information about
what is discussed below. It should include specific terms that let users know
at once that they are on the page they want.
For example, if the page is called TIME, users cannot be sure that they are
reading the right page until they start reading further text on the page.
Although it seems a trifle, landing on the wrong article by mistake costs time
and can irritate the user. Our goal as the documentation team is to help users
find solutions to their issues without overheating their brains. This
situation arises when we use TIME, DATE, DAY, TRANSPOSE, STYLE, and others as
article titles.
In my opinion, instead of the title “TIME” alone we should provide “TIME
function”.

2. Brief description of why this function is needed, and what it returns
This item is most needed by people who have experience with Calc but have not
used this particular function. It should therefore state what the function
does and what results it returns.
For example (IMCOS function):
In theory (but not in Calc), a complex number can be represented as a special
type of number (e.g. one that looks like the Scientific format in Calc) or as
an array of two cells (e.g. like the result of some array functions in Calc).
But in Calc a complex number is returned as a string “a+bi” in one cell. You
can check this with the ISNONTEXT function. If the Help article says only that
the function returns the cosine of a complex number, but not in which form,
then a user who has not used this function may not know that the result is
returned as a string, because that is unusual for Calc.
That is why the type of the returned value must be clearly specified,
especially if it can be misinterpreted.
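
A minimal sketch of the point above, in Python rather than Calc: the cosine is
computed numerically, but the value handed back is text in the “a+bi” form, so
a numeric check on it fails. The helper name imcos_text and the output
formatting are assumptions made for illustration, not Calc's implementation.

```python
import cmath

def imcos_text(z: complex) -> str:
    """Hypothetical helper: cosine of z, returned as text like 'a+bi'."""
    w = cmath.cos(z)
    sign = "+" if w.imag >= 0 else "-"
    # Six significant digits is an arbitrary choice for this sketch.
    return f"{w.real:.6g}{sign}{abs(w.imag):.6g}i"

result = imcos_text(complex(1, 2))
print(result)
print(isinstance(result, str))           # True: the result is text ...
print(isinstance(result, (int, float)))  # False: ... not a number
```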

3. Way of working for complex functions
This item is required for functions whose behavior may vary depending on the
implementation. A user needs to understand how the function works, because in
some situations it can produce a surprising result. This is especially
important in fields such as sociology, economics, marketing and statistics,
which, depending on the theory and the school (e.g. Western vs. Soviet school
of statistics [3]), use different mathematical formulas under the same or
similar names. If we cannot assume that a formula is unique, we have to tell
the user which one is used. Otherwise, the Help will only confuse the user. In
addition, this item lets mathematically literate users understand the function
quickly.

4. Compliance with the ODF
In fact, I believe that it is necessary to state both compliance and
incompatibility with the standard. At present, some functions are not only
incompatible with the standard, but the developers have decided not to bring
them into compliance. [4] The reason why a function is incompatible with the
standard is not important; the fact itself is.
For example, take the AGGREGATE and INFO functions.
The first is not included in the ODF standard. The second supports only 5 of
the 10 required categories. [5]

5. Syntax
I would like to point out a common mistake that occurs even in the Help of
proprietary software: an optional argument separator is grouped with a
mandatory argument. If an argument separator is optional, it must be bracketed
together with the optional argument it introduces.
For example, MIN(Num_1 [; Num_2 [; Num_N ]])
=MIN(5)
returns 5, while
=MIN(5;)
returns 0, because the second argument is implicitly specified and equal to
zero.
If a formula has many such arguments, a user may not notice this mistake and
may get a wrong result.
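
The effect can be sketched in Python. The function calc_min below is a
hypothetical toy model, not Calc's parser: it assumes that every “;” introduces
an argument and that an empty argument counts as 0, as described above.

```python
def calc_min(formula: str) -> float:
    """Hypothetical model of =MIN(...) with ';' as the argument separator."""
    inner = formula.strip()[len("=MIN("):-1]
    # Every separator introduces an argument; an empty one is assumed to be 0.
    args = [float(a) if a.strip() else 0.0 for a in inner.split(";")]
    return min(args)

print(calc_min("=MIN(5)"))   # 5.0
print(calc_min("=MIN(5;)"))  # 0.0
```

This is why the syntax line should bracket the separator together with the
optional argument: a trailing “;” is not harmless decoration.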

6. Detailed explanation of each input argument
Each argument of the function should be described as fully as possible. This
is not only because people sometimes cannot picture the working logic of the
function, but also because they can draw wrong conclusions from previous
experience with other functions. For example, some functions do not convert
their arguments. Others could logically be applied more widely but are
functionally limited for compatibility with other spreadsheets. An example of
the latter is the function AVERAGEIFS, which does not work for a 3D range
(ranges of the same size, located on adjacent sheets and combined into one in
a formula).
For example, the input arguments for the IMCOS function can be:
- string with a complex number;
- string with a real number;
- a real number in numerical format;
- reference to a cell containing one of the above cases.
So it should be written in the Help as “Complex_number - A string representing
a complex number, a real number in either string or number format, or a
reference to a cell containing a number whose cosine needs to be calculated.”
I have seen a comment that this description looks like a definition of a
complex number, but there are several objections. On the one hand, it really
is not a definition of a complex number, because as a definition it would be
incorrect. On the other hand, neither we nor users can be sure that a function
is mathematically correct. I would like to recall the words of the
distinguished Markus Mohrhard: «First a spreadsheet is not a math
program and is not mathematically correct and most likely will never be.» [6]
In addition, I would like to point out that spreadsheets are used primarily by
economists and marketing specialists, who unfortunately often do not know
mathematics and have no idea about abstractions such as complex numbers, yet
use them and even more difficult things in their work. Most often, they simply
put values into the formula without any understanding of the calculation
process.
But even if we set aside people who do not know mathematics, I can’t imagine
that a user will quickly grasp that, for example, some functions accept
complex numbers only as strings, but accept real numbers in both string and
numeric formats.
That is why, although we do not need to talk about mathematics, we have to
explain those aspects of the input arguments that are not necessarily
obvious.
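
The accepted input forms listed above can be sketched in Python. The function
to_complex is a hypothetical illustration of the coercion rule, not Calc code;
cell references are left out because they have no Python counterpart.

```python
from typing import Union

def to_complex(arg: Union[str, int, float]) -> complex:
    """Hypothetical coercion of an IMCOS-style argument to a complex value."""
    if isinstance(arg, (int, float)):
        return complex(arg)               # real number in numeric format
    text = arg.strip().replace("i", "j")  # Python writes the imaginary unit as 'j'
    return complex(text)                  # raises ValueError for malformed text

for value in ("1+2i", "3", 3, "i"):
    print(repr(value), "->", to_complex(value))
```

Note the asymmetry the Help text has to spell out: a real number may arrive as
text or as a number, while a complex number can only arrive as text.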

7. Necessary notes on using the function
This item should include all the features that are not critical to the
function but need to be noted. I see it as a further clarification of the
input arguments. As in the item above, we cannot rely on the working logic of
a function or on the competence of a user. We need to describe the function so
that no doubt is left after reading.
For example, take trigonometric functions such as SIN, ASIN etc.
Although any technically educated person knows that trigonometric functions
conventionally use radians, in Russian schools degrees are used.
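
A short Python illustration of the pitfall (Python's math.sin expects radians,
as Calc's SIN does; the angle of 30 is just sample input):

```python
import math

angle = 30
print(math.sin(angle))                # 30 is read as radians
print(math.sin(math.radians(angle)))  # 30 degrees converted to radians first
```

In Calc the corresponding conversion is =SIN(RADIANS(30)). A note on the SIN
page would spare users who think in degrees a silent wrong result.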

8. Necessary warnings
Most often, it makes sense to use this item to show common mistakes. Of
course, we have a separate table of errors, but the ones that occur frequently
with a given function need to be noted on the function page, because the Help
is used when a user has problems. Here we can also point to the specific place
where a mistake occurs.

9. Data for an example
This item is very difficult for me. On the one hand, examples should be the
same for similar functions, so that the user can easily see the differences
between them. On the other hand, this complicates translation: either members
of the l10n team need to translate the data repeatedly, or they need to find
their way around the structure of the Help.
For example, the functions SUM, SUMIF, SUMIFS, COUNT, COUNTIF, COUNTIFS,
AVERAGE, AVERAGEIF and AVERAGEIFS can be illustrated with the same table of
data. This table should be placed before the examples in each function
article, because the user needs to have it in front of his eyes.
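
As a rough Python analogue of this idea, one small table can serve several
related functions, so that only the function changes between examples. The
data below is invented sample data, not the table used in the Help.

```python
# Hypothetical sample data shared by all examples: (product, quantity)
rows = [("pen", 10), ("pencil", 20), ("pen", 30), ("eraser", 40)]

quantities = [q for _, q in rows]
pens = [q for name, q in rows if name == "pen"]

print(sum(quantities))        # like SUM: 100
print(sum(pens))              # like SUMIF with criterion "pen": 40
print(len(pens))              # like COUNTIF with criterion "pen": 2
print(sum(pens) / len(pens))  # like AVERAGEIF with criterion "pen": 20.0
```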

10. Examples
Everyone knows that there should be some examples, but somehow many examples
in our Help are trivial. In my opinion, examples should fall into three
categories:
- trivial – these examples clarify anything that is not explained clearly
enough;
- FAQ or frequent use – these examples answer users' common questions;
- revealing the function's potential – these examples help a user improve
skills.
Each example needs a description of how it works, not just the returned
result. Except for trivial examples, it may not be clear to a beginner what is
happening inside the function.
Each explanation should begin on a new line, so that an example and its
explanation are on separate lines. It is assumed that the user will copy the
example into Calc, and it is inconvenient to pick an example out of a
sentence. It is much easier to select the example with a mouse when it is on a
separate line.

11. See also
Every article should contain this item to give the user a chance to get more
information quickly, for example error codes in Calc, the list of regular
expressions, or similar functions.
The Help has an item “Related articles”, which fits very naturally in the
local Help. As I understand it, it is intended to provide links to additional
information. But I do not understand the name of the item, as it is more
suitable for a personal blog than for technical documentation. Maybe I
overlooked something.

I would like to note once again that although we should not treat the user as
a baby who does not understand anything, we cannot be sure that the user has
extensive experience with functions and Calc. We need to make the user's work
with the Help easier, without counting on the user to work things out
independently.

This is how I see a good Help article. But I also have some questions about
the internal structure.

As I said, this letter arose from bug report #95968 and a discussion with
Olivier Hallot.
I have to agree that the string "<embedvar
href=\"text/scalc/01/ful_func.xhp#func_define_complex\"/> whose cotangent is to
be calculated." is a bad idea. Although I do not agree that it is hard to
translate into languages with case endings [7], I think it should be changed,
because there is a mistake in the Wiki online help [8]. However, fixing this
error does not settle the question of what to do about including information
from other files. There are a large number of duplicate lexemes, which greatly
increase the work of translators. Of course, Pootle gives tips, but not
everyone has a good internet connection; some people prefer to download a file
and translate it off-line, and in this case the tips in Pootle cannot help.
If this reference confuses translators, can we expect that examples of regular
expressions based on a table from another file (e.g. the COUNTIFS function
[9]) will be translated? If not, I cannot move the table with the data into a
separate file to avoid unnecessary translations (about 100 lexemes for the
functions listed above alone). I would like to note that although this figure
does not look impressive for languages that have an almost complete
translation, it will grow like an avalanche for new languages. What should I
do in this situation?

1. https://bugs.documentfoundation.org/show_bug.cgi?id=95968
2. https://bugs.documentfoundation.org/show_bug.cgi?id=95496
3. https://bugs.documentfoundation.org/show_bug.cgi?id=93445
4. https://bugs.documentfoundation.org/show_bug.cgi?id=95010
5. https://docs.oasis-open.org/office/v1.2/OpenDocument-v1.2-part2.html
6. http://lists.freedesktop.org/archives/libreoffice-qa/2014-November/008117.html
7. http://listarchives.libreoffice.org/global/l10n/msg09356.html
8. https://help.libreoffice.org/Calc/IMCOS_function
9. https://help.libreoffice.org/Calc/COUNTIFS_function

Best regards,
Lera

ohallot · December 7, 2015, 11:03am

Hello Lera

I think your scheme can turn into a wiki page to guide authors on how to
improve the description of Calc functions in Help content.

There will be a trade-off between being prolific and being effective.

In my opinion, the help page on a Calc function must show the user how
to use the function in his work. I'll leave the teaching of the theory
behind the function to a link to the entry at Wikipedia, to guide the
user into the theory.

I can't make a statement on the level of detail of your regular
expression explanations, as it is very subjective, but I decided to
shorten it in my translation.

On lexemes, be aware that both Pootle and off-line translation tools have
the tips you mention (translation memory), and which to use is a matter of
choice for the translator(*). These tools work paragraph by paragraph and
are very effective in assisting translators. I think this is a non-issue,
and the rule of thumb is to never build a sentence from an assemblage of
individual words.

On the example table, I could not translate the words without breaking
the example just below. So I changed the words to and changed the
function examples. The issue is that it adds noise to the translation
memory, because pen, pencil were translated to other, unrelated objects.
I have no solution to that (except implementing a manual control to
prevent memorization).

regards

Olivier
(*) offline tools were used when Pootle was not so performant (or for
bulk changes). Now I consider Pootle my first choice (minus some UI bugs).