mersenneforum.org  

Go Back   mersenneforum.org > Fun Stuff > Lounge

Reply
 
Thread Tools
Old 2021-06-11, 16:26   #1
Uncwilly
6809 > 6502
 
Uncwilly's Avatar
 
"""""""""""""""""""
Aug 2003
101Γ—103 Posts

2×4,909 Posts
Default Database conversion, dbf & dbt to SQL

I have a database (actually just a single table at the moment) that is in a dBase format. It has the dbf that contains the bulk of the data and a dbt that contains the "Notes" field.
(In dBase and other compatible programs a "Note" field type had a pointer within in the table that pointed to a section of the dbt. Each section was 512 B and they would be chained when there were larger notes. Many of the standard functions could not be done on note fields.)
It contains over 10,000 entries and over 130 fields per record (although I could kill off about a third of them, they were used internally to ease data entry and report generation and label printing.) That is beyond the capability of free on-line converters. Many of the other converters that I have looked into don't mention that they handle dbt's as well as the dbf. I can get it it into an xml file. I have used an online converter to take some smaller tables (~1000 or less records) that are related from xml to a mySQL file that creates the table in SQL.

Any pointers, suggestions, or offers? The table does contain personal contact info, but nothing like CC# or government ID#.
Uncwilly is offline   Reply With Quote
Old 2021-06-11, 18:01   #2
Nick
 
Nick's Avatar
 
Dec 2012
The Netherlands

22·7·61 Posts
Default

If I remember rightly, the format was simple enough to reverse engineer with ease up to dBase III, but got somewhat messier after that.
The index files were separate, however, so using a hexdump to examine the .dbf and writing a C program should probably suffice.
Nick is offline   Reply With Quote
Old 2021-06-11, 18:25   #3
Uncwilly
6809 > 6502
 
Uncwilly's Avatar
 
"""""""""""""""""""
Aug 2003
101Γ—103 Posts

2·4,909 Posts
Default

Quote:
Originally Posted by Nick View Post
The index files were separate, however, so using a hexdump to examine the .dbf and writing a C program should probably suffice.
Correct the index files are separate. The main issue at hand is to get the data out of that format where the software has basically gone away and into a format that can be used more cross platform and the new tech support can handle. The xml conversion is one step. I could cut the file into smaller chunks and covert each of those. The xml to sql converter I used spits out a file that creates the table. It might be something where I do 5 passes through the free converter and cut and paste them into one file one the back endo with Notepadd++. Some of the converters that I tried choked on logic (YES/NO) fields. Also there is a weird character that mark a CR or LF in the note field of those files that some times causes problems when working on the database in Access. (I had a database that was created in the program that this one was run in. I later had to use Access to maintain it.)
Uncwilly is offline   Reply With Quote
Old 2021-06-11, 18:30   #4
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts
Default

Quote:
Originally Posted by Uncwilly View Post
Any pointers, suggestions, or offers? The table does contain personal contact info, but nothing like CC# or government ID#.
Hmmm... Tricky... PII within an ancient format... This means you need to do this yourself, and not hand it off to "experts".

Knowing you, you already have many copies of the binary data.

Have you thought about using virtual machines to get to the executable level where you can run the DB?

Nick's suggestion is also sound. The data is very likely within a reverse-engineered format.

There might even be a Perl module for that.
chalsall is offline   Reply With Quote
Old 2021-06-12, 00:55   #5
retina
Undefined
 
retina's Avatar
 
"The unspeakable one"
Jun 2006
My evil lair

24×389 Posts
Default

Quote:
Originally Posted by chalsall View Post
PII within an ancient format... This means you need to do this yourself, and not hand it off to "experts".
The OP already mentioned that online converters were an option if they worked so the PII thing is probably not an issue. Since uploading the DB to a "free" online service is obviously going to have that data mined and exploited for all it is worth with no care about confidentiality.
retina is offline   Reply With Quote
Old 2021-06-12, 01:46   #6
Uncwilly
6809 > 6502
 
Uncwilly's Avatar
 
"""""""""""""""""""
Aug 2003
101Γ—103 Posts

265A16 Posts
Default

The previously converted files were adjunct files: a postal code routing to local depot table and an item number to name, mass, vendor, etc. table. When those were converted no data of value was released. The free online sites were trials to get one buy the service for 'serious' work. The software trial download that I tried worked fine, except for in about 30% of the fields it substituted "*trial*" instead of the data. The base price for the software is ~700USD. I only need it once. The software that was used to maintain the data lives on another person's machine. I had limited time to retrieve the data (this was with the permission of the computer's owner and main guardian/user of the data. I had an old USB drive that the machine was able to read. (There were multiple complications getting the data retrieved.) I will have to try to get a hold the help file or physical manual for the software and see if there are options for exporting the data to a different format.
Uncwilly is offline   Reply With Quote
Old 2021-06-12, 03:45   #7
LaurV
Romulan Interpreter
 
LaurV's Avatar
 
Jun 2011
Thailand

25B516 Posts
Default

I don't know sql, but if the files are not compressed dbf, i.e. you can see your data inside when looking with a hex editor, then any regex-aware tool (like notepad++, or pn2), or a simple perl script, could transform them in csv files in minutes (i.e. separating fields by placing commas in the right place). Then csv files can be handled by excel, or any db tools.

In the old age, FoxPro was able to handle dbf files very well. Old versions of VS which still contained FoxPro may still be downloadable.

Old CAD tools (like Protel95, before it became Altium) and Corel Draw 5 (before introducing VBA as scripting language) could also handle dbf files (even compressed).

Last fiddled with by LaurV on 2021-06-12 at 03:46
LaurV is offline   Reply With Quote
Old 2021-06-12, 05:01   #8
Uncwilly
6809 > 6502
 
Uncwilly's Avatar
 
"""""""""""""""""""
Aug 2003
101Γ—103 Posts

2·4,909 Posts
Default

Quote:
Originally Posted by LaurV View Post
Then csv files can be handled by excel, or any db tools.
I have been able to get the files to xml which Excel can open. I suspect if needed I can write some VBA to spit the records out. During the coming week I will have time to work on this. I doubt that I will do anything more this weekend, because of other obligations. I do have other ideas floating around in my head.

To Chalsall's question. Once I got into the machine in question, before I knew that I was going to be able to transfer the data to the USB drive, I zipped the files (with a strong PW) in DOS (the machine boots to DOS and only runs Win98SE if you call it) and wrote the file to a 1.44. Then, once I got it on the USB, I saved it on to my laptop that I brought, then immediately emailed it to the data owner/creator and the person that will be using the SQL. (If my laptop had a parallel port I would have brought my LapLink cable.) So, the data does exist in multiple physical locations now.
Uncwilly is offline   Reply With Quote
Old 2021-06-12, 15:01   #9
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

9,767 Posts
Default

Quote:
Originally Posted by Uncwilly View Post
So, the data does exist in multiple physical locations now.
Excellent. That is absolutely the first critical step.

Did you, by chance, also clone the software (likely binaries)? A full ZIP of the software's directory is ideal (of the full filesystem is even better if that's an option).

I was serious about using VMs to get the software stack and the data running in its native environment. Sorry if I'm telling you how to chew gum, but Virtual Box might be your very good friend here.

P.S. I don't like Oracle as a company; way too litigious for my comfort. But they acquired VirtualBox when they bought Sun, and fortunately continue to support it. VMWare is another option.
chalsall is offline   Reply With Quote
Old 2021-06-12, 17:02   #10
xilman
Bamboozled!
 
xilman's Avatar
 
"π’‰Ίπ’ŒŒπ’‡·π’†·π’€­"
May 2003
Down not across

250348 Posts
Default

Quote:
Originally Posted by Uncwilly View Post
I have a database (actually just a single table at the moment) that is in a dBase format. It has the dbf that contains the bulk of the data and a dbt that contains the "Notes" field.
(In dBase and other compatible programs a "Note" field type had a pointer within in the table that pointed to a section of the dbt. Each section was 512 B and they would be chained when there were larger notes. Many of the standard functions could not be done on note fields.)
It contains over 10,000 entries and over 130 fields per record (although I could kill off about a third of them, they were used internally to ease data entry and report generation and label printing.) That is beyond the capability of free on-line converters. Many of the other converters that I have looked into don't mention that they handle dbt's as well as the dbf. I can get it it into an xml file. I have used an online converter to take some smaller tables (~1000 or less records) that are related from xml to a mySQL file that creates the table in SQL.

Any pointers, suggestions, or offers? The table does contain personal contact info, but nothing like CC# or government ID#.
I have some elderly machines in the attic, some of which still work just fine.

I can't handle anything pre-CP/M, sorry, but there is an excellent chance that MS-DOG 3.x will still boot. Whether dBase can be installed on them remains to be seen. If it does, perhaps an ASCII dump might be possible? All this is way beyond my previous experience.

OTOH, if you could provide (a representive sub-set) of your files I see a possibly interesting cryptanalytical challenge.

How big are they? Anything less than a gigabyte, which seems rather plausible given their age, would be acceptable.
xilman is offline   Reply With Quote
Old 2021-06-12, 18:49   #11
chalsall
If I May
 
chalsall's Avatar
 
"Chris Halsall"
Sep 2002
Barbados

976710 Posts
Default

Quote:
Originally Posted by xilman View Post
OTOH, if you could provide (a representive sub-set) of your files I see a possibly interesting cryptanalytical challenge.
Agreed.

How many can decode the Golden Record using first principles?
chalsall is offline   Reply With Quote
Reply

Thread Tools


Similar Threads
Thread Thread Starter Forum Replies Last Post
Conversion GHz to GFLOPS? preda PrimeNet 11 2017-12-02 21:51
v5 Server Conversion compusion Software 3 2008-11-14 19:22
PS3 programmers(program conversion for pay?) jasong Programming 5 2007-12-16 00:10
Units Conversion Puzzle JHagerson Lounge 19 2005-11-24 05:38
conversion to GF(2) bigbud Math 9 2005-04-16 01:13

All times are UTC. The time now is 11:15.


Sun Aug 1 11:15:44 UTC 2021 up 9 days, 5:44, 0 users, load averages: 1.13, 1.19, 1.31

Powered by vBulletin® Version 3.8.11
Copyright ©2000 - 2021, Jelsoft Enterprises Ltd.

This forum has received and complied with 0 (zero) government requests for information.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.