mersenneforum.org  

Old 2020-04-29, 21:35   #353
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

23656₈ Posts
Default

Does anyone have a clue how pasted text in Windows Notepad could change its look? I guess there's another once-trusted utility to replace.
Old 2020-04-29, 22:55   #354
Uncwilly
6809 > 6502
 
"""""""""""""""""""
Aug 2003
101×103 Posts

23·1,223 Posts
Default

I suspect that you pasted it to notepad and then did not clear the clipboard and pasted the original swill to the forum. Paste to notepad. Copy and paste something else (to see that you cleared the old). Then copy from notepad to the forum.

Might even try pasting it into Notepad++
Old 2020-04-30, 04:28   #355
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

10011110101110₂ Posts
Default

That's kind of what I thought. But attached are screen shots of the body text pasted into Notepad. BTW, I take your suggestion of Notepad++ seriously. I could not remember the exact name, difficult as it is. (/s)
I guess I have been mistaken in thinking of Notepad as a formatting filter. The bizarre thing is that I just pasted one of these into Notepad++ and it seems the encoding is UTF-8. However, asking it to convert to another encoding brings up all question marks except for the Bitcoin address, which remains in clear text.

I am getting too tired and confused, I guess. I am not making sense to myself.
Attached Thumbnails: Scammer_A.JPG, Scammer_B.JPG
Old 2020-04-30, 05:02   #356
retina
Undefined
 
"The unspeakable one"
Jun 2006
My evil lair

2²×1,549 Posts
Default

It is using characters high up in the Unicode set. Converting the encoding won't change the characters, it only changes how they are encoded.
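A minimal Python sketch of that point, assuming a made-up sample string (four Mathematical Monospace letters followed by plain digits; it is not the text from the screenshots). Re-encoding changes the bytes while the characters survive the round trip, and a target encoding with no room for those code points substitutes "?", which would be consistent with the question marks seen in Notepad++.

```python
# Hypothetical sample: monospace A, Z, a, z, then plain ASCII.
text = "\U0001D670\U0001D689\U0001D68A\U0001D6A3 123"

# Re-encoding changes the byte sequence, not the characters.
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")
same_after_utf8 = utf8_bytes.decode("utf-8") == text
same_after_utf16 = utf16_bytes.decode("utf-16-le") == text

# ASCII cannot represent these code points, so each one becomes "?".
lossy = text.encode("ascii", errors="replace").decode("ascii")

print(len(utf8_bytes), len(utf16_bytes))    # 20 24
print(same_after_utf8, same_after_utf16)    # True True
print(lossy)                                # ???? 123
```

Only the characters that are already plain ASCII come through the lossy conversion unchanged.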

If for some bizarre reason you wanted to convert it to plain ASCII characters then you need to search and replace each character.

e.g. replace 𝙰 with A, 𝚣 with z, etc.
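The search-and-replace could be sketched in Python roughly as follows. The function name `monospace_to_ascii` and the sample string are illustrative assumptions, and the table relies on the monospace capitals and small letters each occupying 26 consecutive code points starting at 0x1d670 and 0x1d68a. As a different technique that happens to give the same result here, Unicode NFKC normalization also folds these letters to ASCII.

```python
import unicodedata

def monospace_to_ascii(text):
    """Replace Mathematical Monospace letters with plain A-Z and a-z."""
    table = {}
    for i in range(26):
        table[0x1D670 + i] = ord("A") + i   # monospace capitals
        table[0x1D68A + i] = ord("a") + i   # monospace small letters
    return text.translate(table)

# Hypothetical sample: monospace "Abc" followed by plain "xyz".
sample = "\U0001D670\U0001D68B\U0001D68C xyz"

print(monospace_to_ascii(sample))               # Abc xyz
print(unicodedata.normalize("NFKC", sample))    # Abc xyz
```

The explicit table touches only the 52 monospace letters. NFKC is broader and will also rewrite other compatibility characters (ligatures, full-width forms, and so on), which may or may not be wanted.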
Old 2020-04-30, 16:47   #357
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

23656₈ Posts
Default

Quote:
Originally Posted by retina View Post
It is using characters high up in the Unicode set. Converting the encoding won't change the characters, it only changes how they are encoded.

If for some bizarre reason you wanted to convert it to plain ASCII characters then you need to search and replace each character.

e.g. replace 𝙰 with A, 𝚣 with z, etc.
Thanks. I was just poking at things from an ignorant standpoint. I've probably taken up more collective attention than this crap deserves, but I have been puzzled by the seeming oddities. The explanation is much appreciated.
Old 2020-04-30, 18:10   #358
Dr Sardonicus
 
Feb 2017
Nowhere

4,643 Posts
Default

Quote:
Originally Posted by retina View Post
It is using characters high up in the Unicode set. Converting the encoding won't change the characters, it only changes how they are encoded.

If for some bizarre reason you wanted to convert it to plain ASCII characters then you need to search and replace each character.

e.g. replace 𝙰 with A, 𝚣 with z, etc.
I actually figured out a way to convert the Unicode version to plain ASCII text. I copy-pasted the "little boxes" text into my text editor. It still displayed as little boxes.

I search-and-replaced all line feeds with spaces. I then prepended s=" and appended " to turn the text into a string called s. I then opened Pari-GP, copy-pasted in s="little-box-text" and invoked Vecsmall(s).

The result was fascinating. Each little box got turned into four numbers, which were all negative. Plain text characters got their usual ASCII code numbers.

Using a small sample of text transcribed from the screen grab, I quickly figured out how to convert most of the 4-tuples. By looking at the few characters that were converted incorrectly, I homed in on the recalcitrant 4-tuples and figured out a refinement.
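The four negative numbers per box are what one would expect if each character's UTF-8 bytes were read as signed 8-bit values; that reading is an assumption about what Vecsmall(s) returned, not something confirmed in the post. A rough Python sketch under that assumption (the helper names `signed_bytes` and `decode_signed` are made up, and this is not necessarily the conversion described above):

```python
def signed_bytes(text):
    """UTF-8 bytes of text, reinterpreted as signed 8-bit values."""
    return [b - 256 if b >= 128 else b for b in text.encode("utf-8")]

def decode_signed(values):
    """Undo signed_bytes: restore unsigned bytes, then decode as UTF-8."""
    return bytes(v & 0xFF for v in values).decode("utf-8")

# Hypothetical sample: one monospace "A" followed by a plain "b".
codes = signed_bytes("\U0001D670b")
print(codes)                                   # [-16, -99, -103, -80, 98]
print(decode_signed(codes) == "\U0001D670b")   # True
```

Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so all four come out negative, while plain ASCII characters keep their usual codes.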
Old 2020-05-01, 04:10   #359
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2·3·1,693 Posts
Default

Wow. Just Wow.
Old 2020-05-01, 04:27   #360
retina
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

2²·1,549 Posts
Default

For reference in case anyone cares enough to do a conversion.

𝙰 is Unicode character 0x1d670
πš‰ is Unicode character 0x1d689
𝚊 is Unicode character 0x1d68a
𝚣 is Unicode character 0x1d6a3

In UTF-8 encoding it looks like this:
𝙰: 0xf0 0x9d 0x99 0xb0
πš‰: 0xf0 0x9d 0x9a 0x89
𝚊: 0xf0 0x9d 0x9a 0x8a
𝚣: 0xf0 0x9d 0x9a 0xa3

If you wanted to encode it in UTF-16 then those characters need to use the surrogate pairs to extend beyond the 0xffff limit.

Compare this to a normal English alphabet encoding:
A is Unicode character 0x41
Z is Unicode character 0x5a
a is Unicode character 0x61
z is Unicode character 0x7a

In UTF-8 encoding it looks like this:
A: 0x41
Z: 0x5a
a: 0x61
z: 0x7a
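The byte values above, and the UTF-16 surrogate pairs mentioned in passing, can be reproduced with a short Python sketch. The helper name `utf16_surrogates` is an assumption for illustration; the arithmetic is the standard UTF-16 rule (subtract 0x10000, then split the remaining 20 bits into two 10-bit halves).

```python
def utf16_surrogates(code_point):
    """Split a code point above 0xFFFF into its (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point does not need a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low

for cp in (0x1D670, 0x1D689, 0x1D68A, 0x1D6A3):
    utf8 = " ".join(f"0x{b:02x}" for b in chr(cp).encode("utf-8"))
    high, low = utf16_surrogates(cp)
    print(f"0x{cp:x}: UTF-8 {utf8}; UTF-16 0x{high:04x} 0x{low:04x}")
```

For 0x1d670 this gives the pair 0xd835 0xde70, which matches what Python's own "utf-16-be" codec produces for that character.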
Old 2020-05-01, 11:33   #361
Dr Sardonicus
 
Feb 2017
Nowhere

4,643 Posts
Default

Quote:
Originally Posted by retina View Post
For reference in case anyone cares enough to do a conversion.

𝙰 is Unicode character 0x1d670
πš‰ is Unicode character 0x1d689
𝚊 is Unicode character 0x1d68a
𝚣 is Unicode character 0x1d6a3

In UTF-8 encoding it looks like this:
𝙰: 0xf0 0x9d 0x99 0xb0
πš‰: 0xf0 0x9d 0x9a 0x89
𝚊: 0xf0 0x9d 0x9a 0x8a
𝚣: 0xf0 0x9d 0x9a 0xa3

If you wanted to encode it in UTF-16 then those characters need to use the surrogate pairs to extend beyond the 0xffff limit.

Compare this to a normal English alphabet encoding:
A is Unicode character 0x41
Z is Unicode character 0x5a
a is Unicode character 0x61
z is Unicode character 0x7a

In UTF-8 encoding it looks like this:
A: 0x41
Z: 0x5a
a: 0x61
z: 0x7a
Not fully awake yet. (Goes for milk to add to morning brew, spills some. "Don't cry over spilled milk!" OK, no crying, but it doesn't say anything about not cursing a blue streak...)

(blears at screen)

Oh, goody, actual numeric codes. Great! Thanks, retina!

Huh, four identical-looking little boxes with different Unicodes, then four identical-looking little boxes with different UTF-8 codes. Huh? Wha'?

OK, I'm assuming it's A,Z,a,z throughout.

Let's see here. UTF-8 codes have four numbers. That's what I got. Check. The first two are always the same. Check. The last two look to vary about right. OK, I'm too sleepy to analyze further, but I'm satisfied I was getting some version of the UTF-8 encoding.

Last fiddled with by Dr Sardonicus on 2020-05-01 at 11:35 Reason: xigfin posty
Old 2020-05-01, 11:45   #362
retina
Undefined
 
"The unspeakable one"
Jun 2006
My evil lair

6196₁₀ Posts
Default

Quote:
Originally Posted by Dr Sardonicus View Post
Huh, four identical-looking little boxes with different Unicodes, then four identical-looking little boxes with different UTF-8 codes. Huh? Wha'?
From your description I assume that your browser doesn't render those Unicode characters. For me I can clearly see them as monospaced AZaz in that order.
Quote:
Originally Posted by Dr Sardonicus View Post
OK, I'm assuming it's A,Z,a,z throughout.
Yes.

BTW: Get your browser fixed. It should be able to display those characters.
Or, um, thinking about it more. It might be your OS not providing glyphs for those characters. Perhaps choosing a different, more comprehensive, font will help.


ETA: See attached
Attached Thumbnails: AZaz.png

Last fiddled with by retina on 2020-05-01 at 11:50
Old 2020-05-01, 13:07   #363
kladner
 
"Kieren"
Jul 2011
In My Own Galaxy!

2×3×1,693 Posts
Default

As I have told the good Dr, this has been educational and enlightening. Many thanks, Your Evility!

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.