Human genome scientists renamed some genes "because Excel incorrectly converts them to dates"

Microsoft can't be expected to fix it.

Engadget JP (Translation), @Engadget_MT
August 8, 2020, 9:06 AM
KTSDESIGN/SCIENCE PHOTO LIBRARY via Getty Images

This article is based on an article from the Japanese edition of Engadget and was created using the translation tool DeepL.


The human genome contains a vast number of genes, and subtle combinations of DNA and RNA give each of us unique characteristics. Genome researchers assign each gene a symbol, a short code made up of letters and numbers, so that genes can be catalogued and referenced in research.

However, some of the names assigned to genes have recently become a problem because they are very difficult to handle. The reason is that Microsoft's Excel spreadsheet software mistakes cells containing these gene names for dates.

Excel's auto-formatting is set up by default to make entering dates easy: if you enter "12/1", for example, it is converted to December 1, not 12 divided by 1. The HUGO Gene Nomenclature Committee (HGNC), part of the Human Genome Organisation (HUGO), ran into this with the gene symbol "MARCH1", an abbreviation for "Membrane Associated Ring-CH-Type Finger 1". When typed into Excel, it was automatically converted to "March 1".
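To illustrate the kind of input that triggers the problem, here is a minimal Python sketch that flags gene symbols resembling a month name followed by a day number. The month list, the pattern, and the function name `looks_like_date` are assumptions made for illustration; Excel's actual date-recognition rules are more extensive and are not reproduced here.

```python
import re

# Month names and abbreviations a spreadsheet might read as the start of a date.
# This list is an illustrative assumption, not Excel's actual parsing rules.
_MONTHS = (
    "JAN", "FEB", "MAR", "MARCH", "APR", "APRIL", "MAY", "JUN", "JUNE",
    "JUL", "JULY", "AUG", "SEP", "SEPT", "OCT", "NOV", "DEC",
)

# A month name, an optional separator, then one or two digits, and nothing else.
_DATE_LIKE = re.compile(
    r"^(?:%s)[-/ ]?(\d{1,2})$" % "|".join(_MONTHS), re.IGNORECASE
)


def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol is a month name followed by a day number 1-31."""
    match = _DATE_LIKE.match(symbol.strip())
    return bool(match) and 1 <= int(match.group(1)) <= 31


for name in ["MARCH1", "SEPT1", "MARCHF1", "SEPTIN1"]:
    print(name, looks_like_date(name))
```

Under this heuristic the old symbols "MARCH1" and "SEPT1" are flagged, while the replacement symbols "MARCHF1" and "SEPTIN1" are not, because the extra letters break the month-plus-number shape.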

Annoyingly, Microsoft does not provide a setting to turn off this automatic conversion.

Many gene symbols can be turned into dates in the same way, and Excel's automatic conversion is said to have affected about one-fifth of the genetics papers examined in a study published in 2016. To prevent such errors, HGNC has renamed a total of 27 genes over the past year. For example, the symbol "MARCH1" has been changed to "MARCHF1" and "SEPT1" to "SEPTIN1". However, HGNC has not yet finished renaming every symbol that could be automatically converted to a date.
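Applying such renames to an existing gene list amounts to a simple lookup, sketched here in Python. Only the two pairs quoted in this article are included; the table name `RENAMED_SYMBOLS` and the function `update_symbols` are hypothetical, and a complete table would have to come from HGNC's own published data.

```python
# Old-to-new symbol pairs quoted in the article; the full set of renames is not listed here.
RENAMED_SYMBOLS = {
    "MARCH1": "MARCHF1",
    "SEPT1": "SEPTIN1",
}


def update_symbols(symbols):
    """Return a new list with known old symbols replaced by their current names."""
    return [RENAMED_SYMBOLS.get(s, s) for s in symbols]


# Symbols without an entry in the table pass through unchanged.
print(update_symbols(["TP53", "MARCH1", "SEPT1"]))
```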

Genes have been renamed before, but only because the old names caused false positives in searches or raised concerns for certain people. Until now, no gene had been renamed because of a standard feature of the application software used to handle the data.

Scientists use Excel as a matter of course, but it is an application designed for general use, so it seems unlikely that Microsoft will issue a patch or any other fix for a problem confined to a specific field such as this one. Elspeth Bruford, HGNC's coordinator, also described the problem as "quite a limited use case" and said that an option to disable automatic format conversion would help only a small percentage of Excel users.

Incidentally, although Excel offers no setting to turn off automatic conversion of cell contents, there is a rule that if you put a single quote at the beginning of the text you type, what follows is treated as a string. For example, typing 'MARCH1 instead of MARCH1 should make Excel treat the entry as a string rather than a date. Presumably, though, there were circumstances in which even this did not solve the problem.

Source: Guidelines for human gene nomenclature (Nature)
Via: The Verge


The Japanese edition of Engadget does not guarantee the accuracy or reliability of this article.

 
 