mersenneforum.org  

2021-08-05, 02:30   #1
Uncwilly

Language localization for GIMPS software.

Based upon this post: https://www.mersenneforum.org/showpost.php?p=584857
I searched for a previous thread suggesting language localization for GIMPS client software and found none. So, let's talk about it.

I can think of a few topics:
  1. Are any of the current apps set up already to make localization possible?
  2. For those that aren't, are all the menu titles, messages, etc. hardcoded in the body of the program, or are they in a table?
  3. Roughly how much text is there in the program?
  4. How much work would it be to convert the text to a table?
  5. Would a right to left language be handled properly by the program?
  6. Can PrimeNet send its messages along with a code such that the clients could use the code to interpret the message to localize it?
  7. Do we have volunteers to do the translation?
  8. How much context will the translators need to get the right translation?
  9. How much value is there in doing the initial work?
  10. What about the help files and undoc?

2021-08-05, 02:43   #2
Uncwilly

I would think all numbers should be in Arabic numerals. Localizing the number format (decimal point, thousands separators, etc.) need not occur in the first localization attempt. (But, see below*)

Dates should be in an ISO format.

Units, such as GHz-days and MB, etc. would not need to be done in the first pass of localization. But including them (and number formats) in the localization table on the first release would be a good thing. That would allow the translation volunteers to do their work ahead of time, in preparation for when the software is ready.
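
A minimal sketch of how number formats could sit in a localization table while dates stay in ISO format. It is written in Python for brevity; the table name `NUMBER_FORMATS`, the language tags, and both helper functions are assumptions made for illustration and do not come from any GIMPS client.

```python
from datetime import date

# Hypothetical per-language format table; keys and values are illustrative only.
NUMBER_FORMATS = {
    "en-US": {"decimal": ".", "thousands": ","},
    "de-DE": {"decimal": ",", "thousands": "."},
}

def format_number(value, lang, places=2):
    """Format value with the separators listed for lang in NUMBER_FORMATS."""
    fmt = NUMBER_FORMATS[lang]
    # Start from Python's fixed "1,234,567.89" layout, then swap separators
    # through a placeholder so the two replacements cannot collide.
    text = f"{value:,.{places}f}"
    text = text.replace(",", "\0").replace(".", fmt["decimal"])
    return text.replace("\0", fmt["thousands"])

def format_date(d):
    """Return the ISO 8601 calendar date, which is the same in every locale."""
    return d.isoformat()
```

With such a table, translators could supply the separators long before the program consults them; until then the client would simply keep using its existing format.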

2021-08-05, 03:20   #3
Viliam Furik

Regarding point 6:

I think it doesn't need to. There aren't that many types of messages the server sends. A short clever piece of code can easily find out what type of message the server sent based on its highly regular structure.

E.g. if there is a number in the message, then you're already done with a bunch of different possibilities. You can then check what the first word of the message is, and that could potentially be the entire code - check the first word, and if there are still multiple possibilities, check a bit further.
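
A minimal sketch of client-side message classification, in Python. It uses regular expressions in place of the step-by-step "first word, then a bit further" check, which amounts to the same idea. The three message shapes and the code names are invented for illustration; the real PrimeNet wording is not reproduced here.

```python
import re

# Hypothetical server messages and codes, for illustration only.
PATTERNS = [
    ("assignment_received",
     re.compile(r"^Got assignment (?P<key>\w+): (?P<work>.+)$")),
    ("result_accepted",
     re.compile(r"^Result for M(?P<exponent>\d+) accepted$")),
    ("credit_awarded",
     re.compile(r"^CPU credit is (?P<credit>[\d.]+) GHz-days\.?$")),
]

def classify(message):
    """Return (code, fields) for the first matching pattern.

    Unrecognized messages yield ("unknown", {}), so a client could fall
    back to showing the original English text.
    """
    text = message.strip()
    for code, pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return code, match.groupdict()
    return "unknown", {}
```

The returned code would select a translated template, and the captured fields (exponent, credit, and so on) would be substituted into it.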

2021-08-05, 04:04   #4
kriesel

What is proposed will require significant buy-in, cooperation, and prioritization by the authors, or others stepping in to add the capability to software that is not currently being actively maintained or updated by its authors.

Mprime/prime95: actively updated by Prime95 (George) about annually

Mlucas: actively updated by Ernst Mayer about annually

Gpuowl: was very actively developed by Mihai Preda, but no updates (github commits) since March 2021

Mfaktc: no update releases for years, but there are occasional signs of private development activity by the author (The Judger); occasional recompiles for new CUDA levels or larger GPUSieveSize by others

Mfakto: no longer being updated or maintained by Bdot since ~2016? kracker provided an update in May 2020

CUDALucas: original author "msft" no longer active; flashjh ceased maintenance ~May 2017

Cllucas: original author "msft" no longer active, no updates or maintenance since Jan 2016

CUDAPm1: alpha software, no maintenance ongoing; last maintenance by Aaron Haviland's fork 2018?


Additional titles to consider include Mfactor, Factor5, MMFF.

Last fiddled with by kriesel on 2021-08-05 at 04:06

2021-08-05, 10:38   #5
ET_

I'm here if you need a localization in Italian.


Last fiddled with by ET_ on 2021-08-05 at 10:38

2021-08-05, 13:55   #6
kriesel

There's also the included Primenet.py to consider for Mlucas and Gpuowl, and the separate app-specific client management software listed in the attachment of https://www.mersenneforum.org/showpo...2&postcount=3: Misfit, GPU72's submit_spider, mfloop, llloop, gimps_getter, Mersenne Manager, and mersenne.ca command line scripts.

https://blog.languageline.com/what-is-localization
https://en.wikipedia.org/wiki/Intern...d_localization

In the case of prime95, currently the sole GIMPS number-crunching application with a GUI, it would potentially involve window titles, window content, all the doc files, the ini files local.txt and prime.txt, and output log files. Conceivably also file names.

Some languages involve differing character sets; Cyrillic comes to mind.

Time zones are more complicated than one might guess.
In an application I have under development, I output both UTC and local time. While researching time zones, I found that ~20% of the world's population lives in time zones whose offset from UTC is some number of hours plus a fraction of an hour. See https://www.worldtimezone.com/; India, Iran, Nepal, etc.
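
A minimal sketch, in Python, of printing one instant both in UTC and in a zone with a fractional-hour offset. The fixed offsets for India (UTC+5:30) and Nepal (UTC+5:45) are hardcoded for illustration, and the function name `utc_and_local` is an assumption. Real software would more likely take its rules from a time zone database so that daylight-saving changes are handled.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen for illustration; neither zone observes daylight saving.
INDIA = timezone(timedelta(hours=5, minutes=30), "IST")
NEPAL = timezone(timedelta(hours=5, minutes=45), "NPT")

def utc_and_local(moment_utc, local_zone):
    """Return ISO 8601 strings for the same instant in UTC and in local_zone."""
    local = moment_utc.astimezone(local_zone)
    return moment_utc.isoformat(), local.isoformat()
```

Because the ISO 8601 string carries its own offset, a log line written this way stays unambiguous whichever language the rest of the output uses.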

Something to consider is how much or little localization effort makes sense for software titles used by thousands or fewer globally.
PRP proof addition in Mlucas seems of higher value in my opinion.

Something else to consider is accessibility for those with disabilities, and how localization effort may incorporate, support, or divert resources from accessibility.

Finally, how will it impact support provided by the author? (How multilingual is George, to respond to inquiries about issues experienced with mprime or prime95, with symptoms posted with program output for a language other than English, for example? How often will George need to type a screen capture's text into Google Translate or the equivalent?)

Last fiddled with by kriesel on 2021-08-05 at 14:51

2021-08-05, 14:21   #7
Uncwilly

Quote:
Originally Posted by kriesel
Mprime/prime95: actively updated by Prime95 (George) about annually
Mlucas: actively updated by Ernst Mayer about annually
Gpuowl: was very actively developed by Mihai Preda, but no updates (github commits) since March 2021
Mfaktc: no update releases for years, but there are occasional signs of private development activity by the author (The Judger); occasional recompiles for new CUDA levels or larger GPUSieveSize by others
Mfakto: no longer being updated or maintained by Bdot since ~2016? kracker provided an update in May 2020
CUDALucas: original author "msft" no longer active; flashjh ceased maintenance ~May 2017

Quote:
Originally Posted by kriesel
There's also the included Primenet.py to consider for

I think that we need only worry about Prime95 and GPUowl first and foremost. Since most of the code for Prime95 is available openly, work on compiling the list of messages, menu entries, etc. could be done by someone else to free up George's time. That table could then be passed to volunteers.

Has anyone done localization set-ups before?

2021-08-05, 14:50   #8
kriesel

Quote:
Originally Posted by Uncwilly
Prime95 and GPUowl first and foremost. ...

Has anyone done localization set-ups before?

My sense is prime95/mprime is the foremost; many users start there and a fraction progress to other apps or onto GPUs. The big 3 in my opinion are prime95/mprime, gpuowl, and mfaktc. The 80/20 rule applies, and it would be good to have all computation types covered.

Where I used to work we designed, produced, and shipped custom equipment internationally, and all documentation was customized to each unit, but only produced in American English. The end users were scientists who typically knew and used English in their careers.

Localization has lower penetration in small volume markets. https://english2.thesaigontimes.vn/t...-small-market/ The active GIMPS user base is currently ~4200, a very small market.

The big 3 languages for total speakers are English, Mandarin, and Hindi https://www.visualcapitalist.com/100...ken-languages/. GIMPS current participants or potential participants may not have the same distribution, due to economics, internet availability, and perhaps other considerations.

Last fiddled with by kriesel on 2021-08-05 at 15:25

2021-08-05, 16:31   #9
xilman

Personally I just use the language native to the application. If that means using Google Translate so be it. I can generally get around in most Romance and Germanic languages without too much trouble.

That's why I use Spanish keyboards, Spanish websites, etc when I am here in La Palma, despite never having had any formal tuition in the language. English and French is almost always good enough for, say, Italian, Spanish and Portuguese. Ditto English and German for, say, Dutch and Danish. I can often make out Romanian because it is fundamentally a Romance language, though with heavy Slavic influences.

I'm never going to be able to read and appreciate literature in those languages but technical documentation generally has a very limited vocabulary and shares jargon terms with all the other languages.

2021-08-06, 00:18   #10
Prime95

Back in the day when I wrote Windows programs for a living, all localization was put in a program's resource file. The resource file was added to the .exe as the last step in linking. One could then translate the resource file, replace the resource file in .exe and voila -- a new fully translated .exe.

I have no idea if that is the preferred localization methodology in Windows today. Even if it is, how does that help mprime? We'd need a Linux library that reads Windows resource files to get properly localized text.

Prime95/mprime "gave up" on localization as all text messages are hard-wired into the code. Fixed messages would not be terribly hard to place in a central location, but printf messages are much tougher. For example, "roundoff error of %g at iteration %d, rolling back to last save file %s". When localized, the %g, %d, and %s replacements might need to be in a different order.
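
A minimal sketch of the reordering problem, in Python. Named placeholders let each translation place the values in whatever order its grammar needs. POSIX printf offers numbered conversions such as %3$s for the same purpose in C, though whether that is portable enough for Prime95/mprime is not settled here. The `TEMPLATES` table and the reordered "xx" entry are invented for illustration; "xx" is a pseudo-translation, not a real language.

```python
# Message templates keyed by language code. The "xx" entry is an invented
# pseudo-translation that only demonstrates a different argument order.
TEMPLATES = {
    "en": ("roundoff error of {error:g} at iteration {iteration:d}, "
           "rolling back to last save file {savefile}"),
    "xx": "[{savefile}] <- iteration {iteration:d}: roundoff error {error:g}",
}

def roundoff_message(lang, error, iteration, savefile):
    """Fill the template for lang, falling back to English if it is missing."""
    template = TEMPLATES.get(lang, TEMPLATES["en"])
    return template.format(error=error, iteration=iteration, savefile=savefile)
```

The calling code passes the same three values every time; only the template decides where each one lands.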

If we can come up with a standard, I'm willing to migrate towards it over time. First step, I think, is to investigate which tools and standards are in wide use today. Solution must be open source and not encumbered by the noxious GNU license.

2021-08-06, 01:19   #11
kriesel

There is at least the coding, translating, and testing. Here's a link about testing (in which it seems to me a lot of verbs are missing, perhaps lost in translation)
I was actually web searching for os-independent localization when I found that instead.

One ugly way to do the printfs OS-independently is to replace them manually, line by line. In place of "roundoff error of %g at iteration %d, rolling back to last save file %s", use
"%s %g %s %d, %s %s", rndofferrorstring, roundofferror, atiterationstring, iteration, rollbacktostring, savefilename, etc. All those phrase strings are input from a language-specific text file. Then apply toupper to the first or last char when applicable. One printf for LTR languages, else the other printf for RTL.

Something like file name localization.amereng.txt
Code:
Language American English
Scandirection LTR
rndofferrorstring "roundoff error of"
atiterationstring "at iteration"
rollbacktostring "rolling back to last save file"
etc, or German, localization.german.txt
Code:
Language German
Scandirection LTR
rndofferrorstring "Rundungsfehler von"
atiterationstring "bei Iteration"
rollbacktostring "Rollback zur letzten gespeicherten Datei"
(I make no claims of correctness of the Google translate results, and don't understand the seemingly random capitalization it included. It doesn't quite pass the a->b->a forward and reverse translation test: "rolling back to last save file" becomes "Rollback to the last saved file")
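
A minimal sketch, in Python, of reading a table in the format shown above and assembling the roundoff line from it. The function names and the reliance on `shlex` for the quoted values are assumptions made for illustration. A value containing an apostrophe would need escaping with this parser.

```python
import shlex

def parse_localization(text):
    """Parse lines of the form `key value` or `key "quoted value"` into a dict."""
    table = {}
    for line in text.splitlines():
        parts = shlex.split(line)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        table[parts[0]] = " ".join(parts[1:])
    return table

# Contents of the German example file from the post above.
GERMAN = '''Language German
Scandirection LTR
rndofferrorstring "Rundungsfehler von"
atiterationstring "bei Iteration"
rollbacktostring "Rollback zur letzten gespeicherten Datei"
'''

def roundoff_line(table, error, iteration, savefile):
    """Build the LTR form of the message from the phrase strings in table."""
    return "%s %g %s %d, %s %s" % (
        table["rndofferrorstring"], error,
        table["atiterationstring"], iteration,
        table["rollbacktostring"], savefile,
    )
```

This keeps the fixed phrase order of the printf shown above, so it inherits the same limitation: a language that needs the values in a different order would require its own format string.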

All that assumes Unicode-character-supported languages. Which can lead to lots of fun depending on fonts also.

Open source localization web search didn't yield much applicable to C or C++, but C# is useful maybe? One developer's experience described at https://dev.to/yeah69/the-road-to-lo...e-project-1o09

Last fiddled with by kriesel on 2021-08-06 at 02:15

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.